alphafold

Validate protein designs using AlphaFold2 structure prediction. Use this skill when: (1) validating that designed sequences fold correctly, (2) predicting binder-target complex structures, (3) calculating confidence metrics (pLDDT, pTM, ipTM), (4) running self-consistency validation of designs, (5) predicting multi-chain complexes with AlphaFold-Multimer. For faster single-chain prediction, use esm. For QC thresholds, use protein-qc.

Install skill "alphafold" with this command: npx skills add adaptyvbio/protein-design-skills/adaptyvbio-protein-design-skills-alphafold

AlphaFold2 Structure Validation

Prerequisites

| Requirement | Minimum | Recommended |
|---|---|---|
| Python | 3.8+ | 3.10 |
| CUDA | 11.0+ | 12.0+ |
| GPU VRAM | 32GB | 40GB (A100) |
| RAM | 32GB | 64GB |
| Disk | 100GB | 500GB (for databases) |

How to run

First time? See Installation Guide to set up Modal and biomodals.

Option 1: ColabFold (recommended for multimer)

cd biomodals
modal run modal_colabfold.py \
  --input-faa sequences.fasta \
  --out-dir output/

GPU: A100 (40GB) | Timeout: 3600s default

Option 2: Local installation

git clone https://github.com/deepmind/alphafold.git
cd alphafold

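# Note: a local run also needs the genetic databases and their path flags
# (e.g. --data_dir); see the AlphaFold README for the full list.
python run_alphafold.py \
  --fasta_paths=query.fasta \
  --output_dir=output/ \
  --model_preset=monomer \
  --max_template_date=2026-01-01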

Option 3: ESMFold (fast single-chain)

modal run modal_esmfold.py \
  --sequence "MKTAYIAKQRQISFVK..."

Key parameters

| Parameter | Default | Options | Description |
|---|---|---|---|
| --model_preset | monomer | monomer/multimer | Model type |
| --num_recycle | 3 | 1-20 | Recycling iterations |
| --max_template_date | - | YYYY-MM-DD | Template cutoff |
| --use_templates | True | True/False | Use template search |

Output format

output/
├── ranked_0.pdb           # Best model
├── ranked_1.pdb           # Second best
├── ranking_debug.json     # Confidence scores
├── result_model_1.pkl     # Full results
├── msas/                  # MSA files
└── features.pkl           # Input features
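
A minimal sketch for reading the ranking file. It assumes `ranking_debug.json` holds an `"order"` list of model names plus one dictionary of per-model scores (for example keyed `"plddts"` or `"iptm+ptm"`, depending on the preset). That layout is an assumption and may differ between versions, so inspect your own file first. The helper name `best_model` is hypothetical.

```python
import json
from pathlib import Path


def best_model(ranking_path):
    """Return (model_name, score, score_key) for the top-ranked model.

    Assumes the JSON has an "order" list and one other key holding a
    {model_name: score} mapping. This layout is an assumption; check
    your own ranking_debug.json before relying on it.
    """
    data = json.loads(Path(ranking_path).read_text())
    order = data["order"]
    # Pick the first key other than "order" whose value is a dict of scores.
    score_key = next(
        k for k, v in data.items() if k != "order" and isinstance(v, dict)
    )
    top = order[0]
    return top, data[score_key][top], score_key
```

Usage might look like `name, score, key = best_model("output/ranking_debug.json")`, where the path depends on how your runner lays out its results.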

Extracting metrics

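import pickle

# Load the raw results for one model (adjust the path to your output directory)
with open('result_model_1.pkl', 'rb') as f:
    result = pickle.load(f)

plddt = result['plddt']  # Per-residue confidence, 0-100 scale
# pTM and PAE are only present for pTM-capable presets (e.g. multimer);
# .get() avoids a KeyError when they are missing.
ptm = result.get('ptm', None)
iptm = result.get('iptm', None)  # Multimer only
pae = result.get('predicted_aligned_error', None)  # N x N matrix, in Å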
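
To turn the raw arrays into summary numbers like those quoted in this guide, something along these lines may help. It is only a sketch: `summarize` and `chain_a_len` are hypothetical names, and the interface PAE here is simply the mean of the two inter-chain (off-diagonal) blocks of a two-chain complex. Other tools may define interface PAE differently.

```python
from statistics import fmean


def summarize(plddt, pae, chain_a_len):
    """Mean pLDDT and mean inter-chain PAE for a two-chain complex.

    plddt: per-residue scores (0-100).
    pae: square matrix (list of rows or NumPy array).
    chain_a_len: number of residues in the first chain (assumed known).
    """
    n = len(plddt)
    if not 0 < chain_a_len < n:
        raise ValueError("chain_a_len must split the sequence into two chains")
    a = range(chain_a_len)
    b = range(chain_a_len, n)
    # Off-diagonal blocks: chain A rows vs chain B columns, and the reverse.
    inter = [float(pae[i][j]) for i in a for j in b]
    inter += [float(pae[i][j]) for i in b for j in a]
    return {
        "mean_plddt": fmean(float(x) for x in plddt),
        "interface_pae": fmean(inter),
    }
```

Plain indexing is used instead of NumPy slicing so the function works on nested lists as well as arrays.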

Sample output

Successful run

$ python run_alphafold.py --fasta_paths complex.fasta --model_preset multimer
[INFO] Running MSA search...
[INFO] Running model 1/5...
[INFO] Running model 5/5...
[INFO] Relaxing structures...

Results:
  ranked_0.pdb:
    pLDDT: 87.3 (mean)
    pTM: 0.78
    ipTM: 0.62
    PAE (interface): 8.5

Saved to output/

What good output looks like:

  • pLDDT: > 85 (mean, on 0-100 scale) or > 0.85 (normalized)
  • pTM: > 0.70
  • ipTM: > 0.50 for complexes
  • PAE_interface: < 10
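
The cutoffs above can be applied programmatically. The following is a sketch only: the function name `passes_qc` is hypothetical, and the thresholds are copied from the list above rather than from any library default.

```python
def passes_qc(mean_plddt, ptm, iptm=None, interface_pae=None):
    """Apply the cutoffs listed above; returns (ok, list_of_failed_metrics).

    iptm and interface_pae are only checked when provided (complexes).
    """
    # Accept pLDDT on either the normalized (0-1) or the 0-100 scale.
    if mean_plddt <= 1.0:
        mean_plddt *= 100.0
    failures = []
    if not mean_plddt > 85:
        failures.append("pLDDT")
    if not ptm > 0.70:
        failures.append("pTM")
    if iptm is not None and not iptm > 0.50:
        failures.append("ipTM")
    if interface_pae is not None and not interface_pae < 10:
        failures.append("PAE_interface")
    return (not failures, failures)
```

With the values from the sample run (pLDDT 87.3, pTM 0.78, ipTM 0.62, interface PAE 8.5), every check passes.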

Decision tree

Should I use AlphaFold?
│
├─ What are you predicting?
│  ├─ Single protein → ESMFold (faster)
│  ├─ Protein-protein complex → AlphaFold/ColabFold ✓
│  ├─ Protein + ligand → Chai or Boltz
│  └─ Batch of sequences → ColabFold ✓
│
├─ What do you need?
│  ├─ Highest accuracy → AlphaFold/ColabFold ✓
│  ├─ Fast screening → ESMFold
│  └─ MSA-free prediction → Chai or ESMFold
│
└─ Which AF2 option?
   ├─ Local installation → Full control, slow setup
   ├─ ColabFold → Easier, MSA server
   └─ Modal → Recommended for batch

Typical performance

| Campaign Size | Time (A100) | Cost (Modal) | Notes |
|---|---|---|---|
| 100 complexes | 1-2h | ~$8 | With MSA server |
| 500 complexes | 5-10h | ~$40 | Standard campaign |
| 1000 complexes | 10-20h | ~$80 | Large campaign |

Per-complex: ~30-60s with MSA server.
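
For campaign sizes not listed, a back-of-the-envelope estimate can be derived from the figures above. This sketch assumes time and cost scale linearly, using ~30-60 s per complex and ~$8 per 100 complexes. The function name `estimate_campaign` is hypothetical, and real runs will vary.

```python
def estimate_campaign(n_complexes, sec_per_complex=(30, 60), usd_per_100=8.0):
    """Return ((min_hours, max_hours), approx_usd) for a campaign.

    Defaults come from the figures above; linear scaling is an assumption.
    """
    lo, hi = sec_per_complex
    hours = (n_complexes * lo / 3600, n_complexes * hi / 3600)
    cost = n_complexes * usd_per_100 / 100
    return hours, cost
```

For 500 complexes this gives roughly 4.2-8.3 hours and about $40. The table's time ranges are somewhat more conservative than the raw per-complex arithmetic.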


Verify

find output -name "ranked_0.pdb" | wc -l  # Should match input count
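
When the count does not match, it helps to know which inputs are missing. A minimal sketch, assuming each prediction is written to `output/<name>/ranked_0.pdb`. That layout is an assumption, so adapt the path pattern to your runner. The helper name `missing_predictions` is hypothetical.

```python
from pathlib import Path


def missing_predictions(output_dir, expected_names):
    """Return the expected names that have no ranked_0.pdb under output_dir.

    Assumes the layout output_dir/<name>/ranked_0.pdb (an assumption).
    """
    out = Path(output_dir)
    return [
        name
        for name in expected_names
        if not (out / name / "ranked_0.pdb").is_file()
    ]
```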

Troubleshooting

Low pLDDT regions: May indicate disorder or poor design
Low ipTM: Interface not confident, check hotspots
High PAE off-diagonal: Chains may not interact
OOM errors: Use ColabFold with MSA server instead
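
To locate low pLDDT regions rather than relying on the mean, a sketch like the following can list contiguous low-confidence stretches. The cutoff of 70 and the minimum run length of 3 are illustrative assumptions, not values from this skill, and `low_confidence_segments` is a hypothetical helper.

```python
def low_confidence_segments(plddt, cutoff=70.0, min_len=3):
    """Return (start, end) pairs (0-based, end exclusive) of runs below cutoff.

    cutoff and min_len are illustrative defaults (assumptions).
    """
    segments = []
    start = None
    for i, score in enumerate(plddt):
        if score < cutoff:
            if start is None:
                start = i  # A new low-confidence run begins here.
        elif start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(plddt) - start >= min_len:
        segments.append((start, len(plddt)))
    return segments
```

The returned index pairs can be mapped back to residue numbers to decide whether a region is an expected flexible loop or a sign of a poor design.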

Error interpretation

| Error | Cause | Fix |
|---|---|---|
| RuntimeError: CUDA out of memory | Sequence too long | Use A100 or split prediction |
| KeyError: 'iptm' | Running monomer on complex | Use multimer preset |
| FileNotFoundError: database | Missing MSA databases | Use ColabFold MSA server |
| TimeoutError | MSA search slow | Reduce --num_recycle |

Next: protein-qc for filtering and ranking.
