Chemical Safety & Toxicology Assessment
Comprehensive chemical safety and toxicology analysis integrating predictive AI models, curated toxicogenomics databases, regulatory safety data, and chemical-biological interaction networks. Generates structured risk assessment reports with evidence grading.
When to Use This Skill
Triggers:
- "Is this chemical toxic?" / "What are the toxicity endpoints for [compound]?"
- "Assess the safety profile of [drug/chemical]"
- "What are the ADMET properties of [SMILES]?"
- "What genes does [chemical] interact with?"
- "What diseases are linked to [chemical] exposure?"
- "Predict toxicity for these molecules"
- "Drug safety assessment for [drug name]"
- "Environmental health risk of [chemical]"
- "Chemical hazard profiling"
- "Toxicogenomic analysis of [compound]"
Use Cases:
- Predictive Toxicology: AI-predicted toxicity endpoints (AMES mutagenicity, DILI, LD50, carcinogenicity, skin reactions) for novel compounds via SMILES
- ADMET Profiling: Full absorption, distribution, metabolism, excretion, toxicity characterization
- Toxicogenomics: Chemical-gene interaction mapping, gene-disease associations from CTD
- Regulatory Safety: FDA label warnings, boxed warnings, contraindications, adverse reactions
- Drug Safety Assessment: Combined DrugBank safety + FDA labels + adverse event data
- Chemical-Protein Interactions: STITCH-based chemical-protein binding and interaction networks
- Environmental Toxicology: Chemical-disease associations for environmental contaminants
KEY PRINCIPLES
- Report-first approach: Create report file FIRST, then populate progressively
- Tool parameter verification: Verify params via get_tool_info before calling unfamiliar tools
- Evidence grading: Grade all safety claims by evidence strength (T1-T4)
- Citation requirements: Every toxicity finding must have inline source attribution
- Mandatory completeness: All sections must exist with data minimums or explicit "No data" notes
- Disambiguation first: Resolve compound identity (name -> SMILES, CID, ChEMBL ID) before analysis
- Negative results documented: "No toxicity signals found" is data; empty sections are failures
- Conservative risk assessment: When evidence is ambiguous, flag as "requires further investigation"
- English-first queries: Always use English chemical/drug names in tool calls
Evidence Grading System (MANDATORY)
Grade every toxicity claim by evidence strength:
| Tier | Symbol | Criteria | Examples |
|---|---|---|---|
| T1 | [T1] | Direct human evidence, regulatory finding | FDA boxed warning, clinical trial toxicity, human case reports |
| T2 | [T2] | Animal studies, validated in vitro | Nonclinical toxicology, AMES positive, animal LD50 |
| T3 | [T3] | Computational prediction, association data | ADMET-AI prediction, CTD association, QSAR model |
| T4 | [T4] | Database annotation, text-mined | Literature mention, database entry without validation |
Required Evidence Grading Locations
Evidence grades MUST appear in:
- Executive Summary: Key toxicity findings graded
- Toxicity Predictions: Every ADMET-AI endpoint with confidence note
- Regulatory Safety: FDA findings marked [T1]
- Chemical-Gene Interactions: CTD data marked by curation status
- Risk Assessment: Final risk classification with supporting evidence tiers
Core Strategy: 8 Research Dimensions
Chemical/Drug Query
|
+-- PHASE 0: Compound Disambiguation (ALWAYS FIRST)
|   +-- Resolve name -> SMILES, PubChem CID, ChEMBL ID
|   +-- Get molecular formula, weight, canonical structure
|
+-- PHASE 1: Predictive Toxicology (ADMET-AI)
|   +-- Mutagenicity (AMES)
|   +-- Hepatotoxicity (DILI, ClinTox)
|   +-- Carcinogenicity
|   +-- Acute toxicity (LD50)
|   +-- Skin reactions
|   +-- Stress response pathways
|   +-- Nuclear receptor activity
|
+-- PHASE 2: ADMET Properties
|   +-- Absorption: BBB penetrance, bioavailability
|   +-- Distribution: clearance, volume of distribution
|   +-- Metabolism: CYP interactions (1A2, 2C9, 2C19, 2D6, 3A4)
|   +-- Physicochemical: solubility, lipophilicity, pKa
|
+-- PHASE 3: Toxicogenomics (CTD)
|   +-- Chemical-gene interactions
|   +-- Chemical-disease associations
|   +-- Affected biological pathways
|
+-- PHASE 4: Regulatory Safety (FDA Labels)
|   +-- Boxed warnings (Black Box)
|   +-- Contraindications
|   +-- Adverse reactions
|   +-- Warnings and precautions
|   +-- Nonclinical toxicology
|
+-- PHASE 5: Drug Safety Profile (DrugBank)
|   +-- Toxicity data
|   +-- Contraindications
|   +-- Drug interactions affecting safety
|
+-- PHASE 6: Chemical-Protein Interactions (STITCH)
|   +-- Direct chemical-protein binding
|   +-- Interaction confidence scores
|   +-- Off-target effects
|
+-- PHASE 7: Structural Alerts (ChEMBL)
|   +-- Known toxic substructures (PAINS, Brenk)
|   +-- Structural alert flags
|
+-- SYNTHESIS: Integrated Risk Assessment
    +-- Aggregate all evidence tiers
    +-- Risk classification (Low/Medium/High/Critical)
    +-- Data gaps and recommendations
Phase 0: Compound Disambiguation (ALWAYS FIRST)
CRITICAL: Resolve compound identity before any analysis.
Input Types Handled
| Input Format | Resolution Strategy |
|---|---|
| Drug name (e.g., "Aspirin") | PubChem_get_CID_by_compound_name -> get SMILES from properties |
| SMILES string | Use directly for ADMET-AI; resolve to CID for other tools |
| PubChem CID | PubChem_get_compound_properties_by_CID -> get SMILES + name |
| ChEMBL ID | ChEMBL_get_molecule -> get SMILES + properties |
Resolution Steps
- Input detection: Determine whether the input is a name, SMILES, CID, or ChEMBL ID
  - SMILES: no internal spaces and contains SMILES syntax characters (=, #, [, ], (, )) or aromatic atom symbols (c, n, o); letters alone are weak evidence because ordinary names contain them too
  - CID: numeric only
  - ChEMBL: starts with "CHEMBL"
  - Otherwise: treat as compound name
- Name to CID: PubChem_get_CID_by_compound_name(name=<compound_name>)
- CID to properties: PubChem_get_compound_properties_by_CID(cid=<cid>)
- Extract SMILES: Get SMILES from PubChem properties (field: ConnectivitySMILES, CanonicalSMILES, or IsomericSMILES, depending on response format)
- Store resolved IDs: Maintain a dict with name, smiles, cid, formula, weight, inchi
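The input-detection step above can be sketched as a small classifier. This is a heuristic illustration only, not the skill's actual implementation: the function name detect_input_type and the choice to test the unambiguous forms (ChEMBL ID, numeric CID) before the SMILES check are assumptions made here.

```python
import re

# Characters that belong to SMILES syntax but are uncommon in plain names.
_SMILES_SYNTAX = set("=#[]()@/\\")


def detect_input_type(raw: str) -> str:
    """Classify a query as 'chembl', 'cid', 'smiles', or 'name' (heuristic)."""
    text = raw.strip()
    # Unambiguous identifier formats are tested first.
    if re.fullmatch(r"CHEMBL\d+", text, flags=re.IGNORECASE):
        return "chembl"
    if text.isdigit():
        return "cid"
    # SMILES: no internal spaces, plus a syntax character or a
    # ring-closure digit directly after an atom letter (e.g. "c1").
    has_syntax = bool(set(text) & _SMILES_SYNTAX)
    has_ring_digit = re.search(r"[A-Za-z]\d", text) is not None
    if " " not in text and (has_syntax or has_ring_digit):
        return "smiles"
    return "name"


print(detect_input_type("CC(=O)Nc1ccc(O)cc1"))  # smiles
print(detect_input_type("Aspirin"))             # name
```

The heuristic is fallible in both directions: a bare SMILES such as CCO has no syntax characters and falls through to "name", while a name containing parentheses or a letter-digit pair is classified as SMILES. Ambiguous inputs should be confirmed against PubChem, as the CID verification in the steps above already requires.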
Disambiguation Output
Compound Identity
| Property | Value |
|---|---|
| Name | Acetaminophen |
| PubChem CID | 1983 |
| SMILES | CC(=O)Nc1ccc(O)cc1 |
| Formula | C8H9NO2 |
| Molecular Weight | 151.16 |
| InChI | InChI=1S/C8H9NO2/... |
Phase 1: Predictive Toxicology (ADMET-AI)
When: SMILES is available (from Phase 0 or provided directly)
Objective: Run comprehensive AI-predicted toxicity endpoints
Tools Used
All ADMET-AI tools take the same parameter format:
| Tool | Predicted Endpoints | Parameter |
|---|---|---|
| ADMETAI_predict_toxicity | AMES, Carcinogens_Lagunin, ClinTox, DILI, LD50_Zhu, Skin_Reaction, hERG | smiles: list[str] |
| ADMETAI_predict_stress_response | Stress response pathway activation (ARE, ATAD5, HSE, MMP, p53) | smiles: list[str] |
| ADMETAI_predict_nuclear_receptor_activity | AhR, AR, ER, PPARg, Aromatase nuclear receptor activity | smiles: list[str] |
Workflow
- Call ADMETAI_predict_toxicity(smiles=[resolved_smiles])
- Call ADMETAI_predict_stress_response(smiles=[resolved_smiles])
- Call ADMETAI_predict_nuclear_receptor_activity(smiles=[resolved_smiles])
- For each endpoint, interpret prediction:
  - Classification endpoints: Active (1) = toxic signal, Inactive (0) = no signal
  - Regression endpoints (LD50): Report numerical value with context
- All predictions graded [T3] (computational prediction)
Decision Logic
- Multiple SMILES: Can batch up to ~10 SMILES in single call
- Failed prediction: If ADMET-AI fails, note "prediction unavailable" (don't fail entire report)
- Confidence: Note that AI predictions are [T3] evidence, not definitive
- hERG flag: If hERG = Active, flag prominently (cardiac safety risk)
- AMES flag: If AMES = Active, flag prominently (mutagenicity concern)
- DILI flag: If DILI = Active, flag prominently (liver toxicity concern)
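A minimal sketch of this decision logic follows. It assumes the prediction has already been reduced to a flat dict of endpoint name to value, where classification endpoints hold 0/1 or a probability; the real response shape should be checked with get_tool_info. The function name summarize_toxicity and the 0.5 cutoff are illustrative assumptions.

```python
# Endpoints that the decision logic asks to flag prominently when active.
PRIORITY_FLAGS = {
    "hERG": "cardiac safety risk",
    "AMES": "mutagenicity concern",
    "DILI": "liver toxicity concern",
}


def summarize_toxicity(predictions, threshold=0.5):
    """Return (rows, flags); predictions=None means the tool call failed."""
    if predictions is None:
        # A failed prediction is recorded, it does not abort the report.
        return [("All endpoints", "prediction unavailable", "[T3]")], []
    rows, flags = [], []
    for endpoint, value in predictions.items():
        if endpoint == "LD50_Zhu":
            # Regression endpoint: report the number, not Active/Inactive.
            rows.append((endpoint, f"{value:.2f}", "[T3]"))
            continue
        active = value >= threshold  # assumed cutoff for classification
        rows.append((endpoint, "Active" if active else "Inactive", "[T3]"))
        if active and endpoint in PRIORITY_FLAGS:
            flags.append(f"{endpoint} active: {PRIORITY_FLAGS[endpoint]} [T3]")
    return rows, flags


rows, flags = summarize_toxicity(
    {"AMES": 0, "DILI": 1, "hERG": 0.8, "LD50_Zhu": 2.45}
)
for flag in flags:
    print(flag)
```

Every row carries the [T3] tag so the grading requirement is met even when nothing is flagged.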
Output Table
Toxicity Predictions [T3]
| Endpoint | Prediction | Interpretation | Concern Level |
|---|---|---|---|
| AMES Mutagenicity | Inactive | No mutagenic signal | Low |
| Carcinogenicity | Inactive | No carcinogenic signal | Low |
| ClinTox | Active | Clinical toxicity signal | HIGH |
| DILI | Active | Drug-induced liver injury risk | HIGH |
| LD50 (Zhu) | 2.45 log(mg/kg) | ~282 mg/kg (moderate) | Medium |
| Skin Reaction | Inactive | No skin sensitization signal | Low |
| hERG Inhibition | Active | Cardiac arrhythmia risk | HIGH |
All predictions from ADMET-AI. Evidence tier: [T3] (computational prediction)
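The LD50 row converts a log-scale value to mg/kg. The sketch below reproduces that arithmetic under the table's stated unit, log10(mg/kg). Some acute-toxicity datasets instead report -log10(mol/kg), which needs the molecular weight to convert; which convention the tool output uses is not confirmed here and should be checked before reporting. Both function names are hypothetical.

```python
def ld50_from_log_mg_per_kg(log_value: float) -> float:
    """Convert log10(mg/kg) to mg/kg (the unit shown in the example row)."""
    return 10 ** log_value


def ld50_from_neg_log_mol_per_kg(neg_log_value: float, mol_weight: float) -> float:
    """Convert -log10(mol/kg) to mg/kg, given molecular weight in g/mol."""
    mol_per_kg = 10 ** (-neg_log_value)
    return mol_per_kg * mol_weight * 1000.0  # g -> mg


print(round(ld50_from_log_mg_per_kg(2.45)))  # 282, matching the example row
```

The same raw number gives a different dose under the second convention, so the unit should be stated next to every reported LD50 value.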
Phase 2: ADMET Properties
When: SMILES is available
Objective: Full ADMET characterization beyond toxicity
Tools Used
| Tool | Properties Predicted | Parameter |
|---|---|---|
| ADMETAI_predict_BBB_penetrance | Blood-brain barrier crossing probability | smiles: list[str] |
| ADMETAI_predict_bioavailability | Oral bioavailability (F20%, F30%) | smiles: list[str] |
| ADMETAI_predict_clearance_distribution | Clearance, VDss, half-life, PPB | smiles: list[str] |
| ADMETAI_predict_CYP_interactions | CYP1A2, 2C9, 2C19, 2D6, 3A4 inhibition/substrate | smiles: list[str] |
| ADMETAI_predict_physicochemical_properties | LogP, LogD, LogS, MW, pKa | smiles: list[str] |
| ADMETAI_predict_solubility_lipophilicity_hydration | Aqueous solubility, lipophilicity, hydration free energy | smiles: list[str] |
Workflow
- Call all 6 ADMET tools in parallel (independent calls)
- Compile results into Absorption / Distribution / Metabolism / Excretion sections
- Assess Lipinski Rule of 5 compliance from physicochemical properties
- Flag drug-drug interaction risks from CYP inhibition profiles
Decision Logic
- BBB penetrant + toxicity: If BBB = Yes and any CNS toxicity endpoint active, flag as neurotoxicity risk
- Low bioavailability: If F20% = Low, note absorption concerns
- CYP inhibitor: If CYP3A4 inhibitor = Yes, flag high DDI risk
- Lipinski violations: Count violations and report drug-likeness assessment
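The Lipinski check can be sketched as below, using the standard Rule of 5 thresholds (molecular weight over 500, LogP over 5, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors). The function name is hypothetical, and the sketch assumes donor and acceptor counts are available from the physicochemical output, which the tool table above does not confirm.

```python
def lipinski_violations(mol_weight, logp, h_bond_donors, h_bond_acceptors):
    """Return the list of Rule of 5 criteria that the compound violates."""
    checks = {
        "MW > 500": mol_weight > 500,
        "LogP > 5": logp > 5,
        "H-bond donors > 5": h_bond_donors > 5,
        "H-bond acceptors > 10": h_bond_acceptors > 10,
    }
    return [rule for rule, violated in checks.items() if violated]


# Illustrative values for a small, polar molecule.
violations = lipinski_violations(151.16, 0.5, 2, 2)
print(len(violations), "violation(s):", violations)
```

Reporting the named violations, rather than only the count, lets the drug-likeness note cite which property is out of range.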
Output Format
ADMET Profile [T3]
Absorption
| Property | Value | Interpretation |
|---|---|---|
| BBB Penetrance | Yes | Crosses blood-brain barrier |
| Bioavailability (F20%) | 85% | Good oral absorption |
Distribution
| Property | Value | Interpretation |
|---|---|---|
| VDss | 1.2 L/kg | Moderate tissue distribution |
| PPB | 92% | Highly protein bound |
Metabolism
| CYP Enzyme | Substrate | Inhibitor |
|---|---|---|
| CYP1A2 | No | No |
| CYP2C9 | Yes | No |
| CYP2C19 | No | No |
| CYP2D6 | No | No |
| CYP3A4 | Yes | Yes (DDI risk) |
Excretion
| Property | Value | Interpretation |
|---|---|---|
| Clearance | 8.5 mL/min/kg | Moderate clearance |
| Half-life | 6.2 h | Moderate half-life |
Phase 3: Toxicogenomics (CTD)
When: Compound name is resolved
Objective: Map chemical-gene-disease relationships from curated CTD data
Tools Used
| Tool | Function | Parameter |
|---|---|---|
| CTD_get_chemical_gene_interactions | Genes affected by chemical | input_terms: str (chemical name) |
| CTD_get_chemical_diseases | Diseases linked to chemical exposure | input_terms: str (chemical name) |
Workflow
- Call CTD_get_chemical_gene_interactions(input_terms=compound_name)
- Call CTD_get_chemical_diseases(input_terms=compound_name)
- Parse gene interactions: extract gene symbols, interaction types (increases/decreases expression, binding, etc.)
- Parse disease associations: extract disease names, evidence types (marker/mechanism/therapeutic)
- Identify most affected biological processes from gene list
Decision Logic
- Direct evidence vs inferred: CTD separates curated direct evidence from inferred associations
- Therapeutic vs toxic: Disease associations can be therapeutic (drug treats disease) or adverse (chemical causes disease)
- Gene interaction types: Distinguish between expression changes, binding, and activity modulation
- Prioritize marker/mechanism: These indicate stronger causal evidence than simple associations
- Grade curated as [T2]: Direct curated CTD evidence from literature
- Grade inferred as [T3]: Computationally inferred associations
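The grading rules above can be sketched as a per-record function. The field name DirectEvidence is an assumption about how the tool exposes CTD's curated evidence label; the function name is hypothetical, and the actual response keys should be verified before use.

```python
def grade_ctd_association(record):
    """Return (tier, role) for one chemical-disease record.

    Assumes a 'DirectEvidence' key holding 'marker/mechanism',
    'therapeutic', or an empty value for inferred associations.
    """
    evidence = (record.get("DirectEvidence") or "").strip().lower()
    if not evidence:
        return "[T3]", "inferred"       # computationally inferred
    if evidence == "therapeutic":
        return "[T2]", "therapeutic"    # drug treats the disease
    return "[T2]", "adverse"            # marker/mechanism evidence


print(grade_ctd_association({"DirectEvidence": "marker/mechanism"}))
```

Separating the role from the tier keeps therapeutic associations out of the toxicity findings while still recording them as curated evidence.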
Output Format
Toxicogenomics (CTD) [T2/T3]
Chemical-Gene Interactions (Top 20)
| Gene | Interaction | Type | Evidence |
|---|---|---|---|
| CYP1A2 | increases expression | mRNA | [T2] curated |
| TP53 | affects activity | protein | [T2] curated |
| ... | ... | ... | ... |
Total interactions found: 156
Top affected pathways: Xenobiotic metabolism, Apoptosis, DNA damage response
Chemical-Disease Associations (Top 10)
| Disease | Association Type | Evidence |
|---|---|---|
| Liver Neoplasms | marker/mechanism | [T2] curated |
| Contact Dermatitis | therapeutic | [T2] curated |
| ... | ... | ... |
Phase 4: Regulatory Safety (FDA Labels)
When: Compound has an approved drug name
Objective: Extract regulatory safety information from FDA drug labels
Tools Used
| Tool | Information Retrieved | Parameter |
|---|---|---|
| FDA_get_boxed_warning_info_by_drug_name | Black box warnings (most serious) | drug_name: str |
| FDA_get_contraindications_by_drug_name | Absolute contraindications | drug_name: str |
| FDA_get_adverse_reactions_by_drug_name | Known adverse reactions | drug_name: str |
| FDA_get_warnings_by_drug_name | Warnings and precautions | drug_name: str |
| FDA_get_nonclinical_toxicology_info_by_drug_name | Animal toxicology data | drug_name: str |
| FDA_get_carcinogenic_mutagenic_fertility_by_drug_name | Carcinogenicity/mutagenicity/fertility data | drug_name: str |
Workflow
- Call all 6 FDA tools in parallel (independent queries by drug name)
- Parse and structure each response
- Prioritize: Boxed Warnings > Contraindications > Warnings > Adverse Reactions
- Grade FDA label data as [T1] evidence (regulatory finding), except nonclinical toxicology sections, which are graded [T2] (animal data)
Decision Logic
- Boxed warning present: Flag as CRITICAL safety concern in executive summary
- No FDA data: Chemical may not be an approved drug; note "Not an FDA-approved drug" and continue with other phases
- Multiple warnings: Categorize by organ system (hepatic, cardiac, renal, CNS, etc.)
- Nonclinical toxicology: Grade as [T2] (animal data supporting human risk)
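The parallel-query pattern with graceful handling of empty or failed results might look like the sketch below. The stand-in functions replace the real FDA tools so the example runs on its own; run_label_queries and the fake_* names are hypothetical, and how the tools are actually invoked depends on the host environment.

```python
from concurrent.futures import ThreadPoolExecutor


def run_label_queries(tools, drug_name):
    """Call each tool with drug_name in parallel; map section -> result or note."""

    def safe_call(fn):
        try:
            result = fn(drug_name=drug_name)
        except Exception as exc:  # one failed query must not abort the report
            return f"query failed: {exc}"
        return result if result else "No data"

    with ThreadPoolExecutor(max_workers=len(tools) or 1) as pool:
        futures = {name: pool.submit(safe_call, fn) for name, fn in tools.items()}
        return {name: future.result() for name, future in futures.items()}


# Stand-ins for the FDA_get_* tools, for illustration only.
def fake_boxed_warning(drug_name):
    return {"text": f"Boxed warning text for {drug_name}"}


def fake_contraindications(drug_name):
    return None  # simulates an empty label section


def fake_adverse_reactions(drug_name):
    raise TimeoutError("rate limited")  # simulates a throttled endpoint


results = run_label_queries(
    {
        "boxed_warning": fake_boxed_warning,
        "contraindications": fake_contraindications,
        "adverse_reactions": fake_adverse_reactions,
    },
    "acetaminophen",
)
not_approved = all(value == "No data" for value in results.values())
print(results["adverse_reactions"], "| not an FDA-approved drug:", not_approved)
```

Only when every section comes back empty does the "Not an FDA-approved drug" note apply; a single failed query is recorded in its own section instead.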
Output Format
Regulatory Safety (FDA) [T1]
Boxed Warning
PRESENT - Hepatotoxicity risk with doses >4g/day. Liver failure reported. [T1]
Contraindications
- Severe hepatic impairment [T1]
- Known hypersensitivity [T1]
Adverse Reactions (by frequency)
| Reaction | Frequency | Severity |
|---|---|---|
| Nausea | Common (>1%) | Mild |
| Hepatotoxicity | Rare (<0.1%) | Severe |
| ... | ... | ... |
Nonclinical Toxicology [T2]
- Carcinogenicity: No carcinogenic potential in 2-year rat/mouse studies
- Mutagenicity: Negative in Ames assay and in vivo micronucleus test
- Fertility: No effects on fertility at doses up to 10x human dose
Phase 5: Drug Safety Profile (DrugBank)
When: Compound is a known drug
Objective: Retrieve curated drug safety data from DrugBank
Tools Used
| Tool | Information | Parameters |
|---|---|---|
| drugbank_get_safety_by_drug_name_or_drugbank_id | Toxicity, contraindications | query: str, case_sensitive: bool, exact_match: bool, limit: int |
Workflow
- Call drugbank_get_safety_by_drug_name_or_drugbank_id(query=drug_name, case_sensitive=False, exact_match=False, limit=5)
- Parse toxicity information, overdose data, contraindications
- Cross-reference with FDA data from Phase 4
Decision Logic
- Toxicity field: Contains LD50 values, overdose symptoms, organ toxicity data
- DrugBank ID: Note if found for cross-referencing
- Conflict with FDA: If DrugBank and FDA disagree, note discrepancy and defer to FDA [T1]
- Not found: Chemical may not be in DrugBank; continue with other phases
Phase 6: Chemical-Protein Interactions (STITCH)
When: Compound can be identified by name or SMILES
Objective: Map chemical-protein interaction network for off-target assessment
Tools Used
| Tool | Function | Parameters |
|---|---|---|
| STITCH_resolve_identifier | Resolve chemical name to STITCH ID | identifier: str, species: int (9606=human) |
| STITCH_get_chemical_protein_interactions | Get chemical-protein interactions | identifiers: list[str], species: int, required_score: int |
| STITCH_get_interaction_partners | Get interaction network | identifiers: list[str], species: int, limit: int |
Workflow
- Resolve compound: STITCH_resolve_identifier(identifier=compound_name, species=9606)
- Get interactions: STITCH_get_chemical_protein_interactions(identifiers=[stitch_id], species=9606, required_score=700)
- Identify off-target proteins (not the intended drug target)
- Flag safety-relevant targets: hERG (cardiac), CYP enzymes (metabolism), nuclear receptors (endocrine)
Decision Logic
- High confidence (>900): Well-established interaction [T2]
- Medium confidence (700-900): Probable interaction [T3]
- Low confidence (400-699): Possible interaction, needs validation [T4]; only returned when required_score is set below the 700 used in the workflow
- Safety-relevant targets: Flag interactions with known safety targets
- No STITCH data: Chemical may be too novel; note and continue
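The score bands map directly to evidence tiers. In this sketch a score of exactly 700 is placed in the medium band and exactly 900 stays in the medium band (since high is "above 900"); the handling of those boundaries and the function name stitch_tier are assumptions made for illustration.

```python
def stitch_tier(score: int):
    """Map a STITCH combined score (0-1000) to (evidence tier, label)."""
    if score > 900:
        return "[T2]", "well-established interaction"
    if score >= 700:
        return "[T3]", "probable interaction"
    if score >= 400:
        return "[T4]", "possible interaction, needs validation"
    return None, "below reporting threshold"


for score in (950, 900, 700, 450):
    print(score, stitch_tier(score))
```

Scores below 400 get no tier at all, which keeps them out of the evidence summary rather than grading them [T4].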
Phase 7: Structural Alerts (ChEMBL)
When: ChEMBL molecule ID is available (from Phase 0)
Objective: Check for known toxic substructures
Tools Used
| Tool | Function | Parameters |
|---|---|---|
| ChEMBL_search_compound_structural_alerts | Find structural alert matches | molecule_chembl_id: str, limit: int |
Workflow
- If ChEMBL ID available: ChEMBL_search_compound_structural_alerts(molecule_chembl_id=chembl_id, limit=20)
- Parse alert types: PAINS (pan-assay interference), Brenk (medicinal chemistry), Glaxo (GSK structural alerts)
- Categorize severity: Some alerts are informational, others indicate likely toxicity
Decision Logic
- PAINS alerts: May cause false positives in screening; note for medicinal chemistry
- Brenk alerts: Known problematic substructures; flag if present
- No alerts: Good sign but not definitive proof of safety
- No ChEMBL ID: Skip this phase gracefully; note "structural alert analysis not available"
Synthesis: Integrated Risk Assessment (MANDATORY)
Always the final section. Integrates all evidence into actionable risk classification.
Risk Classification Matrix
| Risk Level | Criteria |
|---|---|
| CRITICAL | FDA boxed warning present OR multiple [T1] toxicity findings OR active DILI + active hERG |
| HIGH | FDA warnings present OR [T2] animal toxicity OR multiple active ADMET endpoints |
| MEDIUM | Some [T3] predictions positive OR CTD disease associations OR structural alerts |
| LOW | All ADMET endpoints negative AND no FDA/DrugBank safety flags AND no CTD concerns |
| INSUFFICIENT DATA | Fewer than 3 phases returned data; cannot make confident assessment |
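One way to encode the matrix is sketched below. The matrix does not say in which order the rows are evaluated or where a DrugBank-only flag lands, so two assumptions are made here: severe tiers are tested first so that sparse data never masks a boxed warning, and a DrugBank flag alone yields MEDIUM. The Evidence fields and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    phases_with_data: int = 0
    boxed_warning: bool = False
    t1_findings: int = 0
    fda_warnings: bool = False
    t2_animal_toxicity: bool = False
    active_admet: frozenset = frozenset()  # names of active ADMET endpoints
    ctd_disease_links: bool = False
    structural_alerts: bool = False
    drugbank_flags: bool = False


def classify_risk(ev: Evidence) -> str:
    # CRITICAL: boxed warning, multiple [T1] findings, or DILI + hERG both active.
    if ev.boxed_warning or ev.t1_findings >= 2 or {"DILI", "hERG"} <= ev.active_admet:
        return "CRITICAL"
    # HIGH: FDA warnings, [T2] animal toxicity, or several active endpoints.
    if ev.fda_warnings or ev.t2_animal_toxicity or len(ev.active_admet) >= 2:
        return "HIGH"
    # Assumed order: only now does sparse coverage block a classification.
    if ev.phases_with_data < 3:
        return "INSUFFICIENT DATA"
    # MEDIUM: any remaining signal (DrugBank placement is an assumption).
    if ev.active_admet or ev.ctd_disease_links or ev.structural_alerts or ev.drugbank_flags:
        return "MEDIUM"
    return "LOW"


print(classify_risk(Evidence(phases_with_data=5, boxed_warning=True)))  # CRITICAL
```

Keeping the inputs in one structure makes it easy to print the exact evidence that produced a classification alongside the result.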
Synthesis Template
Integrated Risk Assessment
Overall Risk Classification: [CRITICAL]
Evidence Summary
| Dimension | Finding | Evidence Tier | Concern |
|---|---|---|---|
| ADMET Toxicity | DILI active, hERG active | [T3] | HIGH |
| FDA Label | Boxed warning for hepatotoxicity | [T1] | CRITICAL |
| CTD Toxicogenomics | 156 gene interactions, liver neoplasms | [T2] | HIGH |
| DrugBank | Known hepatotoxicity at high doses | [T2] | HIGH |
| STITCH | Binds CYP3A4, hERG | [T3] | MEDIUM |
| Structural Alerts | 2 Brenk alerts | [T3] | MEDIUM |
Key Safety Concerns
- Hepatotoxicity [T1]: FDA boxed warning + ADMET-AI DILI prediction + CTD liver disease associations
- Cardiac Risk [T3]: ADMET-AI hERG prediction + STITCH hERG interaction
- Drug Interactions [T3]: CYP3A4 substrate/inhibitor, potential DDI risk
Data Gaps
- No in vivo genotoxicity data available
- STITCH interaction scores moderate (700-900)
- No environmental exposure data
Recommendations
- Avoid doses >4g/day (hepatotoxicity threshold) [T1]
- Monitor liver function in chronic use [T1]
- Screen for CYP3A4 interactions before co-administration [T3]
- Consider cardiac monitoring for at-risk patients [T3]
Mandatory Completeness Checklist
Before finalizing any report, verify:
- Phase 0: Compound fully disambiguated (SMILES + CID at minimum)
- Phase 1: At least 5 toxicity endpoints reported or "prediction unavailable" noted
- Phase 2: ADMET profile with A/D/M/E sections or "not available" noted
- Phase 3: CTD queried; gene interactions and disease associations reported or "no data in CTD"
- Phase 4: FDA labels queried; results or "not an FDA-approved drug" noted
- Phase 5: DrugBank queried; results or "not found in DrugBank" noted
- Phase 6: STITCH queried; results or "no STITCH data available" noted
- Phase 7: Structural alerts checked or "ChEMBL ID not available" noted
- Synthesis: Risk classification provided with evidence summary
- Evidence Grading: All findings have [T1]-[T4] annotations
- Data Gaps: Explicitly listed in synthesis section
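Part of this checklist can be checked mechanically. The sketch below only verifies that each report section exists, is non-empty, and that at least one evidence tag appears; it does not judge content quality. It assumes section titles appear on their own lines (optionally prefixed with markdown "#"), and the function name is hypothetical.

```python
import re

REQUIRED_SECTIONS = [
    "Executive Summary",
    "1. Compound Identity",
    "2. Predictive Toxicology",
    "3. ADMET Profile",
    "4. Toxicogenomics",
    "5. Regulatory Safety",
    "6. Drug Safety Profile",
    "7. Chemical-Protein Interactions",
    "8. Structural Alerts",
    "9. Integrated Risk Assessment",
]
TIER_TAG = re.compile(r"\[T[1-4]\]")


def completeness_problems(report: str):
    """Return a list of structural problems; an empty list means none found."""
    # Strip optional markdown heading markers so titles compare exactly.
    lines = [line.lstrip("#").strip() for line in report.splitlines()]
    problems = []
    for title in REQUIRED_SECTIONS:
        if title not in lines:
            problems.append(f"missing section: {title}")
            continue
        body = []
        for line in lines[lines.index(title) + 1:]:
            if line in REQUIRED_SECTIONS:
                break  # next section starts here
            body.append(line)
        if not "".join(body).strip():
            problems.append(f"empty section: {title}")
    if not TIER_TAG.search(report):
        problems.append("no [T1]-[T4] evidence grades found")
    return problems


draft = "\n".join(f"{title}\nNo data" for title in REQUIRED_SECTIONS)
print(completeness_problems(draft))
```

A section holding only "No data" passes, which matches the rule that an explicit negative result is acceptable while an empty section is not.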
Tool Parameter Reference
Critical Parameter Notes (verified from source code):
| Tool | Parameter Name | Type | Notes |
|---|---|---|---|
| All ADMETAI tools | smiles | list[str] | Always a list, even for single compound |
| All CTD tools | input_terms | str | Chemical name, MeSH name, CAS RN, or MeSH ID |
| All FDA tools | drug_name | str | Brand or generic drug name |
| drugbank_get_safety_* | query, case_sensitive, exact_match, limit | str, bool, bool, int | All 4 required |
| STITCH_resolve_identifier | identifier, species | str, int | species=9606 for human |
| STITCH_get_chemical_protein_interactions | identifiers, species, required_score | list[str], int, int | required_score=400 default |
| PubChem_get_CID_by_compound_name | name | str | Compound name (not SMILES) |
| PubChem_get_compound_properties_by_CID | cid | int | Numeric CID |
| ChEMBL_search_compound_structural_alerts | molecule_chembl_id | str | ChEMBL ID (e.g., "CHEMBL112") |
Response Format Notes
- ADMET-AI: Returns {status: "success", data: {...}} with prediction values
- CTD: Returns list of interaction/association objects
- FDA: Returns {status, data} with label text
- DrugBank: Returns {data: [...]} with drug records
- STITCH: Returns list of interaction objects with scores
- PubChem CID lookup: Returns {IdentifierList: {CID: [...]}} (may or may not have data wrapper)
- PubChem properties: Returns dict with CID, MolecularWeight, ConnectivitySMILES, IUPACName
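Because the PubChem responses may or may not carry a data wrapper, and the SMILES field name varies, defensive accessors are useful. The sketch below follows the shapes listed above; the helper names and the field preference order are assumptions.

```python
# SMILES field names mentioned in this document, in an assumed preference order.
SMILES_FIELDS = ("ConnectivitySMILES", "CanonicalSMILES", "IsomericSMILES")


def unwrap(response: dict) -> dict:
    """Return the payload whether or not it sits under a 'data' key."""
    inner = response.get("data")
    return inner if isinstance(inner, dict) else response


def extract_cid(response: dict):
    """Return the first CID from a CID-lookup response, or None."""
    cids = unwrap(response).get("IdentifierList", {}).get("CID", [])
    return cids[0] if cids else None


def extract_smiles(properties: dict):
    """Return the first non-empty SMILES field from a properties response."""
    payload = unwrap(properties)
    for field in SMILES_FIELDS:
        if payload.get(field):
            return payload[field]
    return None


print(extract_cid({"data": {"IdentifierList": {"CID": [1983]}}}))  # 1983
print(extract_smiles({"CID": 1983, "ConnectivitySMILES": "CC(=O)Nc1ccc(O)cc1"}))
```

Returning None instead of raising lets the caller fall through to the next resolution strategy.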
Fallback Strategies
Compound Resolution
- Primary: PubChem by name -> CID -> properties -> SMILES
- Fallback 1: ChEMBL search by name -> molecule -> SMILES
- Fallback 2: If SMILES provided directly, skip name resolution
Toxicity Prediction
- Primary: All 9 ADMET-AI tools (3 toxicity tools in Phase 1 plus 6 ADMET tools in Phase 2)
- Fallback: If ADMET-AI fails for a compound, note "prediction failed" and continue with database evidence
- Note: ADMET-AI may fail for very large or unusual SMILES
Regulatory Data
- Primary: FDA labels by drug name
- Fallback: If FDA returns no data, try alternative drug names (brand vs generic)
- Note: Non-drug chemicals (pesticides, industrial) will not have FDA labels
CTD Data
- Primary: Search by common chemical name
- Fallback: Try MeSH name if common name fails
- Note: Novel compounds may not be in CTD
Common Use Patterns
Pattern 1: Novel Compound Assessment
Input: SMILES string for new molecule
Workflow: Phase 0 (SMILES->CID) -> Phase 1 (toxicity) -> Phase 2 (ADMET) -> Phase 7 (structural alerts) -> Synthesis
Output: Predictive safety profile for novel compound
Pattern 2: Approved Drug Safety Review
Input: Drug name (e.g., "Acetaminophen")
Workflow: All phases (0-7 + Synthesis)
Output: Complete safety dossier with regulatory + predictive + database evidence
Pattern 3: Environmental Chemical Risk
Input: Chemical name (e.g., "Bisphenol A")
Workflow: Phase 0 -> Phase 1 -> Phase 2 -> Phase 3 (CTD, key for env chemicals) -> Phase 6 -> Synthesis
Output: Environmental health risk assessment focused on gene-disease associations
Pattern 4: Batch Toxicity Screening
Input: Multiple SMILES strings
Workflow: Phase 0 -> Phase 1 (batch) -> Phase 2 (batch) -> Comparative table -> Synthesis
Output: Comparative toxicity table ranking compounds by safety
Pattern 5: Toxicogenomic Deep-Dive
Input: Chemical name + specific gene or disease interest
Workflow: Phase 0 -> Phase 3 (CTD expanded) -> Literature search -> Synthesis
Output: Detailed chemical-gene-disease mechanistic analysis
Output Report Structure
All analyses generate a structured markdown report with progressive sections:
Chemical Safety & Toxicology Report: [Compound Name]
Generated: YYYY-MM-DD HH:MM
Compound: [Name] | SMILES: [SMILES] | CID: [CID]
Executive Summary
[2-3 sentence overview with risk classification and key findings, all graded]
1. Compound Identity
[Phase 0 results - disambiguation table]
2. Predictive Toxicology
[Phase 1 results - ADMET-AI toxicity endpoints]
3. ADMET Profile
[Phase 2 results - absorption, distribution, metabolism, excretion]
4. Toxicogenomics
[Phase 3 results - CTD chemical-gene-disease relationships]
5. Regulatory Safety
[Phase 4 results - FDA label information]
6. Drug Safety Profile
[Phase 5 results - DrugBank data]
7. Chemical-Protein Interactions
[Phase 6 results - STITCH network]
8. Structural Alerts
[Phase 7 results - ChEMBL alerts]
9. Integrated Risk Assessment
[Synthesis - risk classification, evidence summary, data gaps, recommendations]
Appendix: Methods and Data Sources
[Tool versions, databases queried, date of access]
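The report-first principle can be sketched as a function that writes this skeleton before any tool is called, so each phase only replaces its own placeholder. The "[Pending]" marker, the "pending" identifier text, and the function name are assumptions for illustration; every placeholder must be replaced by data or an explicit "No data" note before the report is final.

```python
from datetime import datetime
from pathlib import Path

SECTIONS = [
    "Executive Summary",
    "1. Compound Identity",
    "2. Predictive Toxicology",
    "3. ADMET Profile",
    "4. Toxicogenomics",
    "5. Regulatory Safety",
    "6. Drug Safety Profile",
    "7. Chemical-Protein Interactions",
    "8. Structural Alerts",
    "9. Integrated Risk Assessment",
    "Appendix: Methods and Data Sources",
]


def create_report_skeleton(path, name, smiles="", cid=""):
    """Write the empty report structure and return its Path."""
    lines = [
        f"Chemical Safety & Toxicology Report: {name}",
        f"Generated: {datetime.now():%Y-%m-%d %H:%M}",
        f"Compound: {name} | SMILES: {smiles or 'pending'} | CID: {cid or 'pending'}",
    ]
    for title in SECTIONS:
        lines += [title, "[Pending]"]  # placeholder, replaced phase by phase
    target = Path(path)
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target
```

A call such as create_report_skeleton("acetaminophen_report.md", "Acetaminophen") would produce the file up front, and a remaining "[Pending]" marker at the end signals a phase that was never run.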
Limitations & Known Issues
Tool-Specific
- ADMET-AI: Predictions are computational [T3]; should not replace experimental testing
- CTD: Curated but may lag behind latest literature by 6-12 months
- FDA: Only covers FDA-approved drugs; not applicable to environmental chemicals or supplements
- DrugBank: Primarily drugs; limited coverage of industrial chemicals
- STITCH: Score thresholds affect sensitivity; lower scores increase false positives
- ChEMBL: Structural alerts require ChEMBL ID; not all compounds have one
Analysis
- Novel compounds: May only have ADMET-AI predictions (no database evidence)
- Environmental chemicals: FDA/DrugBank phases will be empty; rely on CTD and ADMET-AI
- Batch mode: ADMET-AI can handle batches; other tools require individual queries
- Species specificity: Most data is human-centric; animal data noted where applicable
Technical
- SMILES validity: Invalid SMILES will cause ADMET-AI failures
- Name ambiguity: Chemical names can be ambiguous; always verify with CID
- Rate limits: Some FDA endpoints may rate-limit for rapid queries
Summary
The Chemical Safety & Toxicology Assessment skill provides comprehensive safety evaluation by integrating:
- Predictive toxicology (ADMET-AI): 9 tools covering toxicity, ADMET, physicochemical properties
- Toxicogenomics (CTD): Chemical-gene-disease relationship mapping
- Regulatory safety (FDA): 6 tools for label-based safety extraction
- Drug safety (DrugBank): Curated toxicity and contraindication data
- Chemical interactions (STITCH): Chemical-protein interaction networks
- Structural alerts (ChEMBL): Known toxic substructure detection
Outputs: Structured markdown report with risk classification, evidence grading, and actionable recommendations
Best for: Drug safety assessment, chemical hazard profiling, environmental toxicology, ADMET characterization, toxicogenomic analysis
Total tools integrated: 25+ tools across 6 databases