bio-workflows-chipseq-pipeline

Complete workflow from raw ChIP-seq FASTQ files to annotated peaks.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "bio-workflows-chipseq-pipeline" with this command: npx skills add gptomics/bioskills/gptomics-bioskills-bio-workflows-chipseq-pipeline

ChIP-seq Pipeline

Complete workflow from raw ChIP-seq FASTQ files to annotated peaks.

Workflow Overview

FASTQ files (IP + Input) | v [1. QC & Trimming] -----> fastp | v [2. Alignment] ---------> Bowtie2 | v [3. BAM Processing] ----> sort, markdup, filter | v [4. Peak Calling] ------> MACS3 | v [5. QC] ----------------> FRiP, fingerprint plots | v [6. Annotation] --------> ChIPseeker | v Annotated peaks + QC report

Primary Path: Bowtie2 + MACS3 + ChIPseeker

Step 1: Quality Control with fastp

Process both IP and Input samples

for sample in IP_rep1 IP_rep2 Input_rep1 Input_rep2; do fastp -i ${sample}_R1.fastq.gz -I ${sample}_R2.fastq.gz
-o trimmed/${sample}_R1.fq.gz -O trimmed/${sample}_R2.fq.gz
--detect_adapter_for_pe
--qualified_quality_phred 20
--length_required 25
--html qc/${sample}_fastp.html done

Step 2: Alignment with Bowtie2

Build index (once)

bowtie2-build genome.fa bt2_index/genome

Align

for sample in IP_rep1 IP_rep2 Input_rep1 Input_rep2; do bowtie2 -p 8 -x bt2_index/genome
-1 trimmed/${sample}_R1.fq.gz
-2 trimmed/${sample}_R2.fq.gz
--no-mixed --no-discordant
--maxins 1000
2> aligned/${sample}.log |
samtools view -@ 4 -bS -q 30 - |
samtools sort -@ 4 -o aligned/${sample}.bam done

QC Checkpoint: Check alignment rate

  • Overall alignment >80%

  • Unique mapping >70%

Step 3: BAM Processing

for sample in IP_rep1 IP_rep2 Input_rep1 Input_rep2; do # Mark and remove duplicates samtools fixmate -m aligned/${sample}.bam - |
samtools sort - |
samtools markdup -r - aligned/${sample}.dedup.bam

# Index
samtools index aligned/${sample}.dedup.bam

# Remove chrM reads (high mitochondrial is common)
samtools view -h aligned/${sample}.dedup.bam | \
    grep -v chrM | \
    samtools view -b - > aligned/${sample}.final.bam
samtools index aligned/${sample}.final.bam

done

Step 4: Peak Calling with MACS3

Narrow peaks (TFs, sharp histone marks like H3K4me3)

macs3 callpeak
-t aligned/IP_rep1.final.bam aligned/IP_rep2.final.bam
-c aligned/Input_rep1.final.bam aligned/Input_rep2.final.bam
-f BAMPE
-g hs
-n experiment
--outdir peaks
-q 0.01

Broad peaks (H3K27me3, H3K36me3)

macs3 callpeak
-t aligned/IP_rep1.final.bam aligned/IP_rep2.final.bam
-c aligned/Input_rep1.final.bam aligned/Input_rep2.final.bam
-f BAMPE
-g hs
-n experiment_broad
--outdir peaks
--broad
--broad-cutoff 0.1

Step 5: QC Metrics

Calculate FRiP (Fraction of Reads in Peaks)

total_reads=$(samtools view -c aligned/IP_rep1.final.bam) reads_in_peaks=$(bedtools intersect -a aligned/IP_rep1.final.bam -b peaks/experiment_peaks.narrowPeak -u | samtools view -c) frip=$(echo "scale=4; $reads_in_peaks / $total_reads" | bc) echo "FRiP: $frip"

Generate bigWig for visualization

bamCoverage -b aligned/IP_rep1.final.bam
-o bigwig/IP_rep1.bw
--normalizeUsing RPKM
-p 8

Fingerprint plot (assess enrichment)

plotFingerprint
-b aligned/IP_rep1.final.bam aligned/Input_rep1.final.bam
--labels IP Input
-o qc/fingerprint.pdf

QC Checkpoint: Assess enrichment quality

  • FRiP >1% (ideally >5% for good enrichment)

  • Fingerprint shows clear separation between IP and Input

Step 6: Peak Annotation with ChIPseeker

library(ChIPseeker) library(TxDb.Hsapiens.UCSC.hg38.knownGene) library(org.Hs.eg.db)

txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene

Read peaks

peaks <- readPeakFile('peaks/experiment_peaks.narrowPeak')

Annotate

peak_anno <- annotatePeak(peaks, TxDb = txdb, annoDb = 'org.Hs.eg.db', tssRegion = c(-3000, 3000))

Visualize

plotAnnoPie(peak_anno) plotDistToTSS(peak_anno)

Export

write.csv(as.data.frame(peak_anno), 'peaks/annotated_peaks.csv')

Get genes with peaks in promoter

promoter_peaks <- as.data.frame(peak_anno) promoter_genes <- unique(promoter_peaks$SYMBOL[grepl('Promoter', promoter_peaks$annotation)]) write.table(promoter_genes, 'peaks/promoter_genes.txt', row.names = FALSE, col.names = FALSE, quote = FALSE)

Parameter Recommendations

Step Parameter Narrow Peaks Broad Peaks

MACS3 --broad No Yes

MACS3 -q 0.01

MACS3 --broad-cutoff

0.1

MACS3 -g hs/mm/ce/dm Same

Bowtie2 -q (samtools) 30 30

Troubleshooting

Issue Likely Cause Solution

Few peaks Low enrichment, wrong parameters Check fingerprint, adjust -q threshold

Many peaks High noise, PCR duplicates Remove duplicates, use stricter -q

Low FRiP Poor antibody, low enrichment Check antibody, increase sequencing

Peaks in blacklist Technical artifacts Filter against ENCODE blacklist

Complete Pipeline Script

#!/bin/bash set -e

THREADS=8 GENOME="genome.fa" INDEX="bt2_index/genome" IP_SAMPLES="IP_rep1 IP_rep2" INPUT_SAMPLES="Input_rep1 Input_rep2" OUTDIR="results"

mkdir -p ${OUTDIR}/{trimmed,aligned,peaks,qc,bigwig}

Step 1: QC

for sample in $IP_SAMPLES $INPUT_SAMPLES; do fastp -i ${sample}_R1.fastq.gz -I ${sample}_R2.fastq.gz
-o ${OUTDIR}/trimmed/${sample}_R1.fq.gz
-O ${OUTDIR}/trimmed/${sample}_R2.fq.gz
--html ${OUTDIR}/qc/${sample}_fastp.html -w ${THREADS} done

Step 2-3: Align and process

for sample in $IP_SAMPLES $INPUT_SAMPLES; do bowtie2 -p ${THREADS} -x ${INDEX}
-1 ${OUTDIR}/trimmed/${sample}_R1.fq.gz
-2 ${OUTDIR}/trimmed/${sample}_R2.fq.gz
--no-mixed --no-discordant 2> ${OUTDIR}/qc/${sample}_align.log |
samtools view -@ ${THREADS} -bS -q 30 - |
samtools fixmate -m - - |
samtools sort -@ ${THREADS} - |
samtools markdup -r - ${OUTDIR}/aligned/${sample}.bam samtools index ${OUTDIR}/aligned/${sample}.bam done

Step 4: Peak calling

ip_bams=$(for s in $IP_SAMPLES; do echo "${OUTDIR}/aligned/${s}.bam"; done | tr '\n' ' ') input_bams=$(for s in $INPUT_SAMPLES; do echo "${OUTDIR}/aligned/${s}.bam"; done | tr '\n' ' ')

macs3 callpeak -t ${ip_bams} -c ${input_bams}
-f BAMPE -g hs -n experiment
--outdir ${OUTDIR}/peaks -q 0.01

echo "Pipeline complete. Peaks: ${OUTDIR}/peaks/experiment_peaks.narrowPeak"

Related Skills

  • chip-seq/peak-calling - MACS3 parameters and options

  • chip-seq/peak-annotation - ChIPseeker annotation details

  • chip-seq/differential-binding - Compare conditions with DiffBind

  • chip-seq/chipseq-qc - Comprehensive QC metrics

  • chip-seq/motif-analysis - Find enriched motifs in peaks

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Automation

bio-read-qc-fastp-workflow

No summary provided by upstream source.

Repository SourceNeeds Review
Automation

bio-workflow-management-snakemake-workflows

No summary provided by upstream source.

Repository SourceNeeds Review
Automation

bio-workflow-management-cwl-workflows

No summary provided by upstream source.

Repository SourceNeeds Review
Automation

bio-workflows-rnaseq-to-de

No summary provided by upstream source.

Repository SourceNeeds Review