Bismark Alignment
Prepare Genome Index
One-time genome preparation (creates bisulfite-converted index)
bismark_genome_preparation --bowtie2 /path/to/genome_folder/
Genome folder should contain FASTA files (e.g., hg38.fa, chr1.fa, etc.)
Creates Bisulfite_Genome/ subdirectory with CT and GA converted indices
Basic Single-End Alignment
bismark --genome /path/to/genome_folder/ reads.fastq.gz -o output_dir/
Paired-End Alignment
bismark --genome /path/to/genome_folder/
-1 reads_R1.fastq.gz
-2 reads_R2.fastq.gz
-o output_dir/
Common Options
bismark --genome /path/to/genome_folder/
--bowtie2 \ # Use bowtie2 (default)
--parallel 4 \ # Number of parallel instances
--temp_dir /tmp/ \ # Temporary directory
--non_directional \ # For non-directional libraries
--nucleotide_coverage \ # Generate nucleotide coverage report
-o output_dir/
reads.fastq.gz
RRBS Mode
Reduced Representation Bisulfite Sequencing
bismark --genome /path/to/genome_folder/
--pbat \ # For PBAT libraries (post-bisulfite adapter tagging)
reads.fastq.gz
MspI digestion (RRBS standard)
Bismark handles MspI-digested libraries automatically
PBAT Libraries
Post-Bisulfite Adapter Tagging (e.g., scBS-seq)
bismark --genome /path/to/genome_folder/ --pbat reads.fastq.gz
Non-Directional Libraries
For libraries where all 4 strands are present
bismark --genome /path/to/genome_folder/ --non_directional reads.fastq.gz
With Quality/Adapter Trimming (Pre-alignment)
Trim adapters first with Trim Galore (recommended)
trim_galore --illumina --paired reads_R1.fastq.gz reads_R2.fastq.gz
Then align
bismark --genome /path/to/genome_folder/
-1 reads_R1_val_1.fq.gz
-2 reads_R2_val_2.fq.gz
Multicore Processing
--parallel sets instances per alignment direction
Total threads = parallel * 2 (for directional) or parallel * 4 (non-directional)
bismark --genome /path/to/genome_folder/
--parallel 4
reads.fastq.gz
Output Files
Bismark produces:
- reads_bismark_bt2.bam # Aligned reads
- reads_bismark_bt2_SE_report.txt # Alignment report
View alignment report
cat output_dir/reads_bismark_bt2_SE_report.txt
Sort and Index BAM
Bismark output is unsorted
samtools sort output.bam -o output.sorted.bam samtools index output.sorted.bam
Deduplicate (Optional)
Remove PCR duplicates (recommended for WGBS, not RRBS)
deduplicate_bismark --bam output_bismark_bt2.bam
For paired-end
deduplicate_bismark --paired --bam output_bismark_bt2_pe.bam
Check Alignment Statistics
Bismark generates detailed report
cat *_SE_report.txt
Key metrics:
- Sequences analyzed
- Unique alignments
- Mapping efficiency
- C methylated in CpG context
Genome Preparation with HISAT2 (Recommended for Large Genomes)
HISAT2 is faster and uses less memory for large mammalian genomes
bismark_genome_preparation --hisat2 /path/to/genome_folder/
Align with HISAT2
bismark --genome /path/to/genome_folder/ --hisat2 reads.fastq.gz
HISAT2 paired-end
bismark --genome /path/to/genome_folder/ --hisat2
-1 reads_R1.fastq.gz
-2 reads_R2.fastq.gz
Key Parameters
Parameter Description
--genome Path to genome folder
--bowtie2 Use Bowtie2 aligner (default)
--hisat2 Use HISAT2 aligner
--parallel Parallel alignment instances
--non_directional Non-directional library
--pbat PBAT library protocol
-o Output directory
--temp_dir Temporary file directory
--nucleotide_coverage Generate nuc coverage report
-N Mismatches in seed (0 or 1, default 0)
-L Seed length (default 20)
Library Types
Type Parameter Description
Directional (default) Standard WGBS/RRBS
Non-directional --non_directional All 4 strands
PBAT --pbat Post-bisulfite adapter tagging
Related Skills
-
methylation-calling - Extract methylation from Bismark BAM
-
methylkit-analysis - Import Bismark output to R
-
sequence-io/read-sequences - FASTQ handling
-
alignment-files/sam-bam-basics - BAM manipulation