fastp Workflow
fastp is an all-in-one FASTQ preprocessing tool that handles adapter trimming, quality filtering, optional deduplication, and report generation in a single pass.
Basic Usage
Single-End
fastp -i input.fastq.gz -o output.fastq.gz
Paired-End
fastp -i R1.fastq.gz -I R2.fastq.gz -o R1_clean.fastq.gz -O R2_clean.fastq.gz
With Custom HTML/JSON Reports
fastp -i R1.fq.gz -I R2.fq.gz \
    -o R1_clean.fq.gz -O R2_clean.fq.gz \
    -h sample_report.html \
    -j sample_report.json
Adapter Trimming
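fastp trims adapters by default. For single-end data it detects the adapter sequence from the reads; for paired-end data it relies on per-read overlap analysis unless --detect_adapter_for_pe or an explicit adapter sequence is given.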
Auto-detect (default)
fastp -i in.fq -o out.fq
Specify adapters manually
fastp -i in.fq -o out.fq \
    --adapter_sequence AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
Paired-end with manual adapters
fastp -i R1.fq -I R2.fq -o R1.out.fq -O R2.out.fq \
    --adapter_sequence AGATCGGAAGAGCACACGTCTGAACTCCAGTCA \
    --adapter_sequence_r2 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
Disable adapter trimming
fastp -i in.fq -o out.fq --disable_adapter_trimming
Adapter FASTA file
fastp -i in.fq -o out.fq --adapter_fasta adapters.fa
Quality Filtering
Per-base quality threshold (default Q15)
fastp -i in.fq -o out.fq -q 20
Mean read quality threshold
fastp -i in.fq -o out.fq -e 25
Max unqualified bases percent (default 40)
fastp -i in.fq -o out.fq -q 20 --unqualified_percent_limit 30
Disable quality filtering
fastp -i in.fq -o out.fq --disable_quality_filtering
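To make the interaction between -q, --unqualified_percent_limit, and -e concrete, here is a minimal Python sketch of the filtering decision. It is an illustration and not fastp's actual implementation: the function names are hypothetical, and the boundary handling (strictly greater than the limit fails) is an assumption.

```python
def phred33_to_scores(quality_string):
    """Convert a Phred+33 encoded FASTQ quality string to integer scores."""
    return [ord(c) - 33 for c in quality_string]


def passes_quality_filter(quals, qualified_phred=15,
                          unqualified_percent_limit=40, average_qual=0):
    """Simplified read-level quality decision (hypothetical helper).

    quals: list of integer Phred scores for one read.
    qualified_phred: per-base threshold, as set by -q.
    unqualified_percent_limit: max percent of bases allowed below the threshold.
    average_qual: minimum mean quality, as set by -e (0 means no requirement).
    """
    if not quals:
        return False
    # Count bases below the per-base threshold.
    unqualified = sum(1 for q in quals if q < qualified_phred)
    unqualified_percent = 100.0 * unqualified / len(quals)
    # Assumption: the read fails only when the percentage exceeds the limit.
    if unqualified_percent > unqualified_percent_limit:
        return False
    # Optional mean-quality requirement.
    if average_qual > 0 and sum(quals) / len(quals) < average_qual:
        return False
    return True
```

With -q 20, a 10-base read containing three Q10 bases has 30% unqualified bases, so under this sketch it passes a limit of 30 but fails a limit of 20.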
Quality Trimming
Sliding window moving from 5' to 3' that cuts the first failing window and everything after it (recommended)
fastp -i in.fq -o out.fq \
    --cut_right \
    --cut_right_window_size 4 \
    --cut_right_mean_quality 20
Sliding window from 5' end
fastp -i in.fq -o out.fq \
    --cut_front \
    --cut_front_window_size 4 \
    --cut_front_mean_quality 20
Both ends
fastp -i in.fq -o out.fq \
    --cut_front --cut_tail \
    --cut_front_window_size 4 \
    --cut_front_mean_quality 20 \
    --cut_tail_window_size 4 \
    --cut_tail_mean_quality 20
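The idea behind --cut_right can be sketched in a few lines of Python: slide a fixed-size window from the 5' end toward the 3' end and, at the first window whose mean quality drops below the threshold, discard that window and everything to its right. This is a simplified illustration under stated assumptions, not fastp's code: the function name is hypothetical, and reads shorter than the window are returned unchanged here.

```python
def cut_right(seq, quals, window_size=4, mean_quality=20):
    """Sliding-window trim scanning 5' -> 3' (simplified sketch).

    seq: read sequence as a string.
    quals: list of integer Phred scores, same length as seq.
    Returns the trimmed (seq, quals) pair.
    """
    n = len(seq)
    for start in range(0, n - window_size + 1):
        window = quals[start:start + window_size]
        # First window whose mean falls below the threshold ends the read.
        if sum(window) / window_size < mean_quality:
            return seq[:start], quals[:start]
    # No failing window found (or the read is shorter than the window).
    return seq, quals


seq = "ACGTACGTAC"
quals = [30, 30, 30, 30, 30, 30, 10, 10, 10, 10]
trimmed_seq, trimmed_quals = cut_right(seq, quals)
print(trimmed_seq)  # ACGTA
```

In the example, the window starting at position 4 has a mean of exactly 20 and passes, while the window starting at position 5 has a mean of 15 and triggers the cut.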
Length Filtering
Minimum length (default 15)
fastp -i in.fq -o out.fq -l 36
Maximum length
fastp -i in.fq -o out.fq --length_limit 150
Exact length (discard reads shorter or longer than 100 bp)
fastp -i in.fq -o out.fq -l 100 --length_limit 100
Poly-X Trimming
Trim poly-G (NovaSeq/NextSeq artifacts) - auto-enabled for these platforms
fastp -i in.fq -o out.fq --trim_poly_g
Disable poly-G trimming
fastp -i in.fq -o out.fq --disable_trim_poly_g
Trim poly-X (any homopolymer)
fastp -i in.fq -o out.fq --trim_poly_x
Custom poly-G minimum length (default 10)
fastp -i in.fq -o out.fq --trim_poly_g --poly_g_min_len 5
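As a rough illustration of what poly-G trimming does, the sketch below removes a trailing run of G bases once it reaches the minimum length. It is deliberately stricter than a real trimmer: it requires an uninterrupted run of G, whereas fastp may tolerate some mismatches within the tail. The function name is hypothetical.

```python
def trim_poly_g(seq, qual, min_len=10):
    """Remove a trailing run of 'G' if it is at least min_len long.

    seq and qual are equal-length strings (bases and Phred+33 qualities).
    Exact-match sketch only; mismatches inside the tail are not tolerated.
    """
    run = len(seq) - len(seq.rstrip("G"))
    if run >= min_len:
        keep = len(seq) - run
        return seq[:keep], qual[:keep]
    return seq, qual


read = "ACGT" + "G" * 12
print(trim_poly_g(read, "I" * len(read)))  # ('ACGT', 'IIII')
```

Lowering min_len, as --poly_g_min_len 5 does on the command line, makes the trimmer more aggressive and increases the chance of removing genuine G-rich sequence.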
N Base Handling
Max N bases (default 5)
fastp -i in.fq -o out.fq -n 3
Effectively disable N filtering by raising the limit
fastp -i in.fq -o out.fq --n_base_limit 50
Deduplication
Enable deduplication
fastp -i in.fq -o out.fq --dedup
Accuracy level (1-6, higher = more memory, default 3)
fastp -i in.fq -o out.fq --dedup --dup_calc_accuracy 4
Base Correction (Paired-End Only)
Enable overlap-based correction
fastp -i R1.fq -I R2.fq -o R1.out.fq -O R2.out.fq --correction
Required overlap length (default 30)
fastp -i R1.fq -I R2.fq -o R1.out.fq -O R2.out.fq \
    --correction --overlap_len_require 20
Paired-End Merge
Merge overlapping paired reads
fastp -i R1.fq -I R2.fq \
    --merge --merged_out merged.fq \
    -o R1_unmerged.fq -O R2_unmerged.fq
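Conceptually, merging looks for the region where the end of read 1 overlaps the reverse complement of read 2 and joins the pair into one fragment. The sketch below shows that idea with exact matching only. It is an assumption-laden illustration, not fastp's algorithm: real overlap detection tolerates mismatches and uses base qualities, and the function names and the reuse of 30 as the default minimum overlap are choices made for this example.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(r1, r2, min_overlap=30):
    """Merge a read pair when a suffix of r1 exactly matches a prefix of rc(r2).

    Returns the merged fragment, or None when no overlap of at least
    min_overlap bases is found (such pairs would stay unmerged).
    """
    r2_rc = reverse_complement(r2)
    max_overlap = min(len(r1), len(r2_rc))
    # Try the longest candidate overlap first.
    for overlap in range(max_overlap, min_overlap - 1, -1):
        if r1[-overlap:] == r2_rc[:overlap]:
            return r1 + r2_rc[overlap:]
    return None


fragment = "ACGTTGCAAGGCTA"
r1 = fragment[:10]                      # forward read
r2 = reverse_complement(fragment[4:])   # reverse read
print(merge_pair(r1, r2, min_overlap=4))  # ACGTTGCAAGGCTA
```

Pairs for which merge_pair returns None correspond to the reads written to the -o and -O files in the command above.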
UMI Processing
UMI in read (extract to header)
fastp -i in.fq -o out.fq \
    --umi --umi_loc read1 --umi_len 8
UMI in separate read
fastp -i R1.fq -I R2.fq -o R1.out.fq -O R2.out.fq \
    --umi --umi_loc index1
UMI locations: index1, index2, read1, read2, per_index, per_read
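For the read1 case, "extract to header" means the first umi_len bases are removed from the read and recorded in its name. The following sketch shows that transformation on a single FASTQ record. The function name is hypothetical, and the exact header layout (UMI appended to the first header token with a colon) is an assumption made for illustration; check the output of your fastp version before parsing headers downstream.

```python
def extract_umi(header, seq, qual, umi_len=8):
    """Move the first umi_len bases of a read into its header.

    header: FASTQ name line, e.g. '@read1 1:N:0:ATCACG'.
    seq, qual: equal-length sequence and quality strings.
    Returns (new_header, trimmed_seq, trimmed_qual).
    """
    umi = seq[:umi_len]
    # Append the UMI to the read name, keeping any comment field intact.
    name, _, comment = header.partition(" ")
    new_header = f"{name}:{umi}"
    if comment:
        new_header += f" {comment}"
    return new_header, seq[umi_len:], qual[umi_len:]


record = extract_umi("@read1 1:N:0:ATCACG", "AAAACCCCGGGGTTTT", "I" * 16)
print(record[0])  # @read1:AAAACCCC 1:N:0:ATCACG
```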
Complete Workflow Example
Standard Illumina Pipeline
fastp \
    -i raw_R1.fastq.gz -I raw_R2.fastq.gz \
    -o clean_R1.fastq.gz -O clean_R2.fastq.gz \
    --detect_adapter_for_pe \
    --cut_right --cut_right_window_size 4 --cut_right_mean_quality 20 \
    -q 20 -l 36 \
    --thread 8 \
    -h sample_fastp.html -j sample_fastp.json
NovaSeq/NextSeq Pipeline
fastp \
    -i raw_R1.fastq.gz -I raw_R2.fastq.gz \
    -o clean_R1.fastq.gz -O clean_R2.fastq.gz \
    --detect_adapter_for_pe \
    --trim_poly_g \
    --cut_right --cut_right_window_size 4 --cut_right_mean_quality 20 \
    -q 20 -l 36 \
    --thread 8 \
    -h sample_fastp.html -j sample_fastp.json
RNA-seq Pipeline
fastp \
    -i raw_R1.fastq.gz -I raw_R2.fastq.gz \
    -o clean_R1.fastq.gz -O clean_R2.fastq.gz \
    --detect_adapter_for_pe \
    --cut_right --cut_right_window_size 4 --cut_right_mean_quality 20 \
    -q 20 -l 50 \
    --thread 8 \
    -h sample_fastp.html -j sample_fastp.json
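The three pipelines above differ only in the minimum length and the poly-G flag, so when processing many samples it can help to build the command from a small helper. The sketch below assembles the argument list using only the flags shown in this section; the function name, its parameters, and the file names in the example are hypothetical.

```python
import shlex


def build_fastp_command(r1, r2, out1, out2, report_prefix,
                        min_length=36, threads=8, poly_g=False):
    """Assemble a paired-end fastp command as an argument list."""
    cmd = [
        "fastp",
        "-i", r1, "-I", r2,
        "-o", out1, "-O", out2,
        "--detect_adapter_for_pe",
    ]
    if poly_g:
        # Explicit poly-G trimming, as in the NovaSeq/NextSeq pipeline.
        cmd.append("--trim_poly_g")
    cmd += [
        "--cut_right",
        "--cut_right_window_size", "4",
        "--cut_right_mean_quality", "20",
        "-q", "20",
        "-l", str(min_length),
        "--thread", str(threads),
        "-h", f"{report_prefix}.html",
        "-j", f"{report_prefix}.json",
    ]
    return cmd


# Example: RNA-seq settings (-l 50) for one hypothetical sample.
cmd = build_fastp_command(
    "raw_R1.fastq.gz", "raw_R2.fastq.gz",
    "clean_R1.fastq.gz", "clean_R2.fastq.gz",
    "sample_fastp", min_length=50,
)
print(shlex.join(cmd))
```

The returned list can be passed directly to subprocess.run(cmd, check=True), which avoids shell quoting problems with unusual file names.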
Output Files
File            Description
*.html          Interactive HTML report
*.json          Machine-readable statistics
Output FASTQ    Processed reads
JSON Report Structure
import json

with open('sample_fastp.json') as f:
    report = json.load(f)

summary = report['summary']
print(f"Total reads: {summary['before_filtering']['total_reads']}")
print(f"Passed reads: {summary['after_filtering']['total_reads']}")
print(f"Q20 rate: {summary['after_filtering']['q20_rate']:.2%}")
print(f"Q30 rate: {summary['after_filtering']['q30_rate']:.2%}")
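Building on the same summary fields, a small helper can turn a report into a per-sample retention figure, which is convenient for spotting samples that lost an unusual share of reads. This is a sketch: the function names and the 0.8 cutoff are illustrative assumptions, and the example dictionary is synthetic data shaped like the fields used above, not real fastp output.

```python
def summarize_report(report):
    """Extract read counts and the retained fraction from a fastp report dict."""
    summary = report["summary"]
    before = summary["before_filtering"]["total_reads"]
    after = summary["after_filtering"]["total_reads"]
    return {
        "reads_before": before,
        "reads_after": after,
        # Guard against empty input files.
        "retained_fraction": after / before if before else 0.0,
        "q30_rate": summary["after_filtering"]["q30_rate"],
    }


def low_retention_samples(reports, min_retained=0.8):
    """Return names of samples whose retained fraction is below min_retained.

    reports: dict mapping sample name -> parsed fastp JSON report.
    """
    return sorted(
        name for name, report in reports.items()
        if summarize_report(report)["retained_fraction"] < min_retained
    )


# Synthetic example data (not real fastp output).
example = {
    "summary": {
        "before_filtering": {"total_reads": 1000},
        "after_filtering": {"total_reads": 900, "q20_rate": 0.98, "q30_rate": 0.95},
    }
}
print(summarize_report(example)["retained_fraction"])  # 0.9
```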
Performance
Set threads (default 3)
fastp -i in.fq -o out.fq --thread 8
Discard the HTML report by writing it to /dev/null
fastp -i in.fq -o out.fq --html /dev/null
Process from stdin
zcat in.fq.gz | fastp --stdin -o out.fq
Related Skills
- quality-reports - MultiQC can aggregate fastp JSON reports
- adapter-trimming - Cutadapt for complex adapter scenarios
- quality-filtering - Trimmomatic alternative