Long-Read Alignment with minimap2
Oxford Nanopore Alignment
Basic ONT alignment
minimap2 -ax map-ont reference.fa reads.fastq.gz |
samtools sort -o aligned.bam
samtools index aligned.bam
PacBio HiFi Alignment
PacBio HiFi reads (high accuracy)
minimap2 -ax map-hifi reference.fa reads.fastq.gz |
samtools sort -o aligned.bam
samtools index aligned.bam
PacBio CLR Alignment
PacBio CLR (continuous long reads, lower accuracy)
minimap2 -ax map-pb reference.fa reads.fastq.gz |
samtools sort -o aligned.bam
samtools index aligned.bam
Pre-Build Index for Multiple Runs
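Build index once (k-mer and window size are fixed at indexing time, so use the same preset you will align with)
minimap2 -x map-ont -d reference.mmi reference.fa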
Use index for alignment
minimap2 -ax map-ont reference.mmi reads.fastq.gz | samtools sort -o aligned.bam
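When the same reference is reused across runs, it can help to rebuild the index only when it is missing or older than the FASTA. The following is a minimal sketch of such a check; the helper name `index_is_stale` and the file names are assumptions made for illustration, and the sketch only reports what it would do rather than calling minimap2.

```shell
# Hypothetical helper: succeeds (exit 0) when the index should be (re)built,
# i.e. the index file is missing/empty or older than the reference FASTA.
index_is_stale() {
    ref="$1"
    idx="$2"
    [ ! -s "$idx" ] || [ "$ref" -nt "$idx" ]
}

ref="reference.fa"
idx="reference.mmi"
if index_is_stale "$ref" "$idx"; then
    echo "would rebuild $idx from $ref"
else
    echo "index is up to date: $idx"
fi
```

Replacing the first `echo` with the indexing command from this section turns the sketch into a working guard.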
Common Options
Options shown: -t 8 (threads), -R (read group header), --secondary=no (suppress secondary alignments), --MD (emit the MD tag, which some variant callers need), -Y (soft-clip supplementary alignments instead of hard-clipping them).
minimap2 -ax map-ont \
    -t 8 \
    -R '@RG\tID:sample\tSM:sample' \
    --secondary=no \
    --MD \
    -Y \
    reference.fa reads.fastq.gz |
    samtools sort -@ 4 -o aligned.bam
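The read group string passed to -R is easy to mistype, so it can be generated instead. This is a minimal sketch: the helper name `make_read_group`, the sample name `sample01`, and the added PL field are assumptions for illustration. The output keeps the literal two-character sequence `\t`, which is the form shown in the -R example above.

```shell
# Hypothetical helper: build a read group line suitable for minimap2 -R.
# $1 = sample name (used for both ID and SM), $2 = SAM PL value (e.g. ONT, PACBIO).
make_read_group() {
    sample="$1"
    platform="$2"
    # '\\t' in the printf format prints a literal backslash followed by 't'.
    printf '@RG\\tID:%s\\tSM:%s\\tPL:%s\n' "$sample" "$sample" "$platform"
}

rg="$(make_read_group sample01 ONT)"
# Print the resulting command line instead of running it.
echo "minimap2 -ax map-ont -R '$rg' reference.fa reads.fastq.gz"
```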
Splice-Aware Alignment (RNA)
For direct RNA or cDNA sequencing
minimap2 -ax splice reference.fa reads.fastq.gz |
samtools sort -o aligned.bam
With Junction BED (Known Splice Sites)
Provide known splice junctions
minimap2 -ax splice --junc-bed junctions.bed \
    reference.fa reads.fastq.gz | samtools sort -o aligned.bam
Assembly to Reference Alignment
Assembly with ~0.1% divergence
minimap2 -ax asm5 reference.fa assembly.fa > aligned.sam
Assembly with higher divergence (~5%)
minimap2 -ax asm20 reference.fa assembly.fa > aligned.sam
Output PAF (Faster, No BAM)
PAF format (the default when -a is omitted; base-level alignment is skipped unless -c is given, which makes it faster for quick analysis)
minimap2 -x map-ont reference.fa reads.fastq.gz > alignments.paf
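PAF is plain tab-separated text, so quick summaries need only awk. The sketch below estimates per-alignment identity as column 10 (matching bases) divided by column 11 (alignment block length). The helper name `paf_identity` and the two records are made up for illustration, and the ratio is most meaningful when the PAF was produced with -c, since without base-level alignment these columns are approximate.

```shell
# Hypothetical helper: print query, target, identity estimate, and MAPQ per PAF record.
paf_identity() {
    awk -F'\t' '$11 > 0 { printf "%s\t%s\t%.4f\t%d\n", $1, $6, $10 / $11, $12 }' "$@"
}

# Made-up PAF records (12 mandatory columns); real input would be alignments.paf.
paf_sample="$(
    printf 'read1\t1000\t0\t1000\t+\tchr1\t50000\t100\t1100\t950\t1000\t60\n'
    printf 'read2\t2000\t100\t1900\t-\tchr2\t80000\t5000\t6800\t1620\t1800\t12\n'
)"

printf '%s\n' "$paf_sample" | paf_identity
```

On a real file the call would be `paf_identity alignments.paf`.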
Keep Secondary and Supplementary
Keep all alignments (for SV calling); -N 5 caps secondary alignments at 5 per read
minimap2 -ax map-ont \
    --secondary=yes \
    -N 5 \
    reference.fa reads.fastq.gz | samtools sort -o aligned.bam
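To check which alignment types actually ended up in the output, the SAM FLAG field can be tallied. This is a rough sketch that works on SAM text (for a BAM file, something like `samtools view -h aligned.bam` would supply it). The helper name `count_alignment_types` and the sample records are invented for illustration; flag bits 0x4, 0x100, and 0x800 mark unmapped, secondary, and supplementary records.

```shell
# Hypothetical helper: tally records by type using the SAM FLAG field (column 2).
# Integer division stands in for bitwise AND so that any POSIX awk works.
count_alignment_types() {
    awk -F'\t' '
        /^@/ { next }                                  # skip header lines
        {
            flag = $2
            if (int(flag / 2048) % 2)     supp++       # 0x800 supplementary
            else if (int(flag / 256) % 2) sec++        # 0x100 secondary
            else if (int(flag / 4) % 2)   unmapped++   # 0x4 unmapped
            else                          primary++
        }
        END {
            printf "primary=%d secondary=%d supplementary=%d unmapped=%d\n",
                   primary, sec, supp, unmapped
        }
    ' "$@"
}

# Made-up SAM records for demonstration only.
sam_sample="$(
    printf '@SQ\tSN:chr1\tLN:50000\n'
    printf 'r1\t0\tchr1\t101\t60\t4M\t*\t0\t0\tACGT\t*\n'
    printf 'r1\t256\tchr1\t901\t0\t4M\t*\t0\t0\t*\t*\n'
    printf 'r2\t2048\tchr1\t301\t60\t4M\t*\t0\t0\tACGT\t*\n'
    printf 'r2\t2064\tchr1\t701\t60\t4M\t*\t0\t0\tACGT\t*\n'
    printf 'r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t*\n'
)"

printf '%s\n' "$sam_sample" | count_alignment_types
```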
Filter Alignments
During alignment pipeline (keep only alignments with mapping quality of at least 10)
minimap2 -ax map-ont reference.fa reads.fastq.gz |
    samtools view -b -q 10 |
    samtools sort -o aligned.bam
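To make the effect of the mapping-quality threshold concrete, the same rule can be applied to SAM text with awk: header lines pass through, and records are kept only when MAPQ (column 5) reaches the minimum. This is an illustrative sketch, not a replacement for samtools; the helper name `filter_mapq` and the sample records are assumptions.

```shell
# Hypothetical helper: keep SAM header lines and records with MAPQ >= $1.
filter_mapq() {
    min="$1"
    shift
    awk -F'\t' -v min="$min" '/^@/ || $5 >= min' "$@"
}

# Made-up SAM text: one header line, one confident alignment, one ambiguous one.
sam_sample="$(
    printf '@SQ\tSN:chr1\tLN:50000\n'
    printf 'read1\t0\tchr1\t101\t60\t4M\t*\t0\t0\tACGT\t*\n'
    printf 'read2\t0\tchr1\t201\t3\t4M\t*\t0\t0\tACGT\t*\n'
)"

printf '%s\n' "$sam_sample" | filter_mapq 10
```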
Multiple FASTQ Files
Pass several input files on one command line
minimap2 -ax map-ont reference.fa reads1.fastq.gz reads2.fastq.gz |
samtools sort -o aligned.bam
Or use file list
cat file_list.txt | xargs minimap2 -ax map-ont reference.fa |
samtools sort -o aligned.bam
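Before launching a long alignment from a file list, it can be worth previewing the command that xargs will assemble. The sketch below does a dry run by putting `echo` in front of the command; the temporary list and its file names are made up for illustration.

```shell
# Throwaway file list with made-up names; a real run would use file_list.txt.
list="$(mktemp)"
printf 'reads1.fastq.gz\nreads2.fastq.gz\n' > "$list"

# 'echo' makes xargs print the assembled command instead of executing it.
cmd="$(xargs echo minimap2 -ax map-ont reference.fa < "$list")"
echo "$cmd"

rm -f "$list"
```

If the printed command looks right, dropping `echo` runs the alignment itself.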
Output Statistics
Get alignment statistics
samtools flagstat aligned.bam
Detailed stats
samtools stats aligned.bam | grep ^SN
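The SN lines are tab-separated (`SN`, a label ending in a colon, then the value), so a single number such as the mapped fraction can be pulled out with awk. The following is a small sketch under that assumption; the helper name `mapped_fraction` and the counts in the sample input are invented for illustration.

```shell
# Hypothetical helper: compute reads mapped / raw total sequences from SN lines.
mapped_fraction() {
    awk -F'\t' '
        $1 == "SN" && $2 == "raw total sequences:" { total = $3 }
        $1 == "SN" && $2 == "reads mapped:"        { mapped = $3 }
        END {
            if (total > 0) printf "%.4f\n", mapped / total
            else           print "NA"
        }
    ' "$@"
}

# Made-up summary lines standing in for: samtools stats aligned.bam | grep ^SN
printf 'SN\traw total sequences:\t1000\nSN\treads mapped:\t950\n' | mapped_fraction
```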
Convert PAF to BED
Extract alignments to BED (target name, target start, target end, query name, MAPQ, strand)
awk 'BEGIN { OFS = "\t" } { print $6, $8, $9, $1, $12, $5 }' alignments.paf > alignments.bed
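As a worked example of the column mapping, the sketch below converts one made-up PAF record and sorts the result by target name and start, which many BED consumers expect. The helper name `paf_to_bed` and the record are assumptions for illustration. PAF coordinates are already 0-based with an exclusive end, so no offset is applied.

```shell
# Hypothetical helper: PAF -> 6-column BED, sorted by target name then start.
paf_to_bed() {
    awk 'BEGIN { FS = OFS = "\t" } { print $6, $8, $9, $1, $12, $5 }' "$@" |
        sort -k1,1 -k2,2n
}

# Made-up PAF record: read1 aligned to chr1:100-1100 on the + strand, MAPQ 60.
printf 'read1\t1000\t0\t1000\t+\tchr1\t50000\t100\t1100\t950\t1000\t60\n' | paf_to_bed
```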
Key Presets
| Preset | Description | Best For |
|--------|-------------|----------|
| map-ont | ONT reads | Nanopore genomic |
| map-hifi | PacBio HiFi | PacBio genomic |
| map-pb | PacBio CLR | PacBio CLR |
| splice | Long RNA reads | cDNA, direct RNA |
| asm5 | Low divergence | Same species assembly |
| asm20 | High divergence | Cross-species assembly |
| sr | Short reads | Illumina (basic) |
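In a wrapper script, the preset table can be encoded as a small lookup so that the preset follows from the data type. This is a minimal sketch; the helper name `preset_for_platform` and the platform labels (`ont`, `hifi`, `clr`, `rna`, `illumina`) are invented for illustration and are not minimap2 options.

```shell
# Hypothetical helper: map a platform label to a minimap2 preset name.
preset_for_platform() {
    case "$1" in
        ont)      echo "map-ont" ;;
        hifi)     echo "map-hifi" ;;
        clr)      echo "map-pb" ;;
        rna)      echo "splice" ;;
        illumina) echo "sr" ;;
        *)        echo "unknown platform: $1" >&2; return 1 ;;
    esac
}

platform="hifi"
preset="$(preset_for_platform "$platform")"
# Print the command line that would be used, without running it.
echo "minimap2 -ax $preset reference.fa reads.fastq.gz"
```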
Key Parameters
| Parameter | Default | Description |
|-----------|---------|-------------|
| -t | 3 | CPU threads |
| -k | 15 | K-mer size |
| -w | 10 | Minimizer window |
| -a | off | Output SAM |
| -x | none | Preset |
| --secondary | yes | Output secondary |
| -N | 5 | Max secondary alignments |
| --MD | off | Generate MD tag |
| -R | none | Read group header |
| -Y | off | Soft clipping for supplementary |
Output Formats
| Format | Flag | Description |
|--------|------|-------------|
| PAF | (default) | Pairwise mApping Format |
| SAM | -a | Sequence Alignment/Map |
| BAM | -a, piped to samtools | Binary SAM |
Related Skills
- medaka-polishing - Polish consensus with medaka
- structural-variants - Call SVs from alignments
- alignment-files - BAM manipulation