medchem

Medchem is a Python library for molecular filtering and prioritization in drug discovery workflows. Apply hundreds of well-established and novel molecular filters, structural alerts, and medicinal chemistry rules to efficiently triage and prioritize compound libraries at scale. Rules and filters are context-specific—use as guidelines combined with domain expertise.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "medchem" with this command: npx skills add jimmc414/kosmos/jimmc414-kosmos-medchem

Medchem

Overview

Medchem is a Python library for molecular filtering and prioritization in drug discovery workflows. Apply hundreds of well-established and novel molecular filters, structural alerts, and medicinal chemistry rules to efficiently triage and prioritize compound libraries at scale. Rules and filters are context-specific—use as guidelines combined with domain expertise.

When to Use This Skill

This skill should be used when:

  • Applying drug-likeness rules (Lipinski, Veber, etc.) to compound libraries

  • Filtering molecules by structural alerts or PAINS patterns

  • Prioritizing compounds for lead optimization

  • Assessing compound quality and medicinal chemistry properties

  • Detecting reactive or problematic functional groups

  • Calculating molecular complexity metrics

Installation

uv pip install medchem

Core Capabilities

  1. Medicinal Chemistry Rules

Apply established drug-likeness rules to molecules using the medchem.rules module.

Available Rules:

  • Rule of Five (Lipinski)

  • Rule of Oprea

  • Rule of CNS

  • Rule of leadlike (soft and strict)

  • Rule of three

  • Rule of Reos

  • Rule of drug

  • Rule of Veber

  • Golden triangle

  • PAINS filters

Single Rule Application:

import medchem as mc

Apply Rule of Five to a SMILES string

smiles = "CC(=O)OC1=CC=CC=C1C(=O)O" # Aspirin passes = mc.rules.basic_rules.rule_of_five(smiles)

Returns: True

Check specific rules

passes_oprea = mc.rules.basic_rules.rule_of_oprea(smiles) passes_cns = mc.rules.basic_rules.rule_of_cns(smiles)

Multiple Rules with RuleFilters:

import datamol as dm import medchem as mc

Load molecules

mols = [dm.to_mol(smiles) for smiles in smiles_list]

Create filter with multiple rules

rfilter = mc.rules.RuleFilters( rule_list=[ "rule_of_five", "rule_of_oprea", "rule_of_cns", "rule_of_leadlike_soft" ] )

Apply filters with parallelization

results = rfilter( mols=mols, n_jobs=-1, # Use all CPU cores progress=True )

Result Format: Results are returned as dictionaries with pass/fail status and detailed information for each rule.

  1. Structural Alert Filters

Detect potentially problematic structural patterns using the medchem.structural module.

Available Filters:

  • Common Alerts - General structural alerts derived from ChEMBL curation and literature

  • NIBR Filters - Novartis Institutes for BioMedical Research filter set

  • Lilly Demerits - Eli Lilly's demerit-based system (275 rules, molecules rejected at >100 demerits)

Common Alerts:

import medchem as mc

Create filter

alert_filter = mc.structural.CommonAlertsFilters()

Check single molecule

mol = dm.to_mol("c1ccccc1") has_alerts, details = alert_filter.check_mol(mol)

Batch filtering with parallelization

results = alert_filter( mols=mol_list, n_jobs=-1, progress=True )

NIBR Filters:

import medchem as mc

Apply NIBR filters

nibr_filter = mc.structural.NIBRFilters() results = nibr_filter(mols=mol_list, n_jobs=-1)

Lilly Demerits:

import medchem as mc

Calculate Lilly demerits

lilly = mc.structural.LillyDemeritsFilters() results = lilly(mols=mol_list, n_jobs=-1)

Each result includes demerit score and whether it passes (≤100 demerits)

  1. Functional API for High-Level Operations

The medchem.functional module provides convenient functions for common workflows.

Quick Filtering:

import medchem as mc

Apply NIBR filters to a list

filter_ok = mc.functional.nibr_filter( mols=mol_list, n_jobs=-1 )

Apply common alerts

alert_results = mc.functional.common_alerts_filter( mols=mol_list, n_jobs=-1 )

  1. Chemical Groups Detection

Identify specific chemical groups and functional groups using medchem.groups .

Available Groups:

  • Hinge binders

  • Phosphate binders

  • Michael acceptors

  • Reactive groups

  • Custom SMARTS patterns

Usage:

import medchem as mc

Create group detector

group = mc.groups.ChemicalGroup(groups=["hinge_binders"])

Check for matches

has_matches = group.has_match(mol_list)

Get detailed match information

matches = group.get_matches(mol)

  1. Named Catalogs

Access curated collections of chemical structures through medchem.catalogs .

Available Catalogs:

  • Functional groups

  • Protecting groups

  • Common reagents

  • Standard fragments

Usage:

import medchem as mc

Access named catalogs

catalogs = mc.catalogs.NamedCatalogs

Use catalog for matching

catalog = catalogs.get("functional_groups") matches = catalog.get_matches(mol)

  1. Molecular Complexity

Calculate complexity metrics that approximate synthetic accessibility using medchem.complexity .

Common Metrics:

  • Bertz complexity

  • Whitlock complexity

  • Barone complexity

Usage:

import medchem as mc

Calculate complexity

complexity_score = mc.complexity.calculate_complexity(mol)

Filter by complexity threshold

complex_filter = mc.complexity.ComplexityFilter(max_complexity=500) results = complex_filter(mols=mol_list)

  1. Constraints Filtering

Apply custom property-based constraints using medchem.constraints .

Example Constraints:

  • Molecular weight ranges

  • LogP bounds

  • TPSA limits

  • Rotatable bond counts

Usage:

import medchem as mc

Define constraints

constraints = mc.constraints.Constraints( mw_range=(200, 500), logp_range=(-2, 5), tpsa_max=140, rotatable_bonds_max=10 )

Apply constraints

results = constraints(mols=mol_list, n_jobs=-1)

  1. Medchem Query Language

Use a specialized query language for complex filtering criteria.

Query Examples:

Molecules passing Ro5 AND not having common alerts

"rule_of_five AND NOT common_alerts"

CNS-like molecules with low complexity

"rule_of_cns AND complexity < 400"

Leadlike molecules without Lilly demerits

"rule_of_leadlike AND lilly_demerits == 0"

Usage:

import medchem as mc

Parse and apply query

query = mc.query.parse("rule_of_five AND NOT common_alerts") results = query.apply(mols=mol_list, n_jobs=-1)

Workflow Patterns

Pattern 1: Initial Triage of Compound Library

Filter a large compound collection to identify drug-like candidates.

import datamol as dm import medchem as mc import pandas as pd

Load compound library

df = pd.read_csv("compounds.csv") mols = [dm.to_mol(smi) for smi in df["smiles"]]

Apply primary filters

rule_filter = mc.rules.RuleFilters(rule_list=["rule_of_five", "rule_of_veber"]) rule_results = rule_filter(mols=mols, n_jobs=-1, progress=True)

Apply structural alerts

alert_filter = mc.structural.CommonAlertsFilters() alert_results = alert_filter(mols=mols, n_jobs=-1, progress=True)

Combine results

df["passes_rules"] = rule_results["pass"] df["has_alerts"] = alert_results["has_alerts"] df["drug_like"] = df["passes_rules"] & ~df["has_alerts"]

Save filtered compounds

filtered_df = df[df["drug_like"]] filtered_df.to_csv("filtered_compounds.csv", index=False)

Pattern 2: Lead Optimization Filtering

Apply stricter criteria during lead optimization.

import medchem as mc

Create comprehensive filter

filters = { "rules": mc.rules.RuleFilters(rule_list=["rule_of_leadlike_strict"]), "alerts": mc.structural.NIBRFilters(), "lilly": mc.structural.LillyDemeritsFilters(), "complexity": mc.complexity.ComplexityFilter(max_complexity=400) }

Apply all filters

results = {} for name, filt in filters.items(): results[name] = filt(mols=candidate_mols, n_jobs=-1)

Identify compounds passing all filters

passes_all = all(r["pass"] for r in results.values())

Pattern 3: Identify Specific Chemical Groups

Find molecules containing specific functional groups or scaffolds.

import medchem as mc

Create group detector for multiple groups

group_detector = mc.groups.ChemicalGroup( groups=["hinge_binders", "phosphate_binders"] )

Screen library

matches = group_detector.get_all_matches(mol_list)

Filter molecules with desired groups

mol_with_groups = [mol for mol, match in zip(mol_list, matches) if match]

Best Practices

Context Matters: Don't blindly apply filters. Understand the biological target and chemical space.

Combine Multiple Filters: Use rules, structural alerts, and domain knowledge together for better decisions.

Use Parallelization: For large datasets (>1000 molecules), always use n_jobs=-1 for parallel processing.

Iterative Refinement: Start with broad filters (Ro5), then apply more specific criteria (CNS, leadlike) as needed.

Document Filtering Decisions: Track which molecules were filtered out and why for reproducibility.

Validate Results: Remember that marketed drugs often fail standard filters—use these as guidelines, not absolute rules.

Consider Prodrugs: Molecules designed as prodrugs may intentionally violate standard medicinal chemistry rules.

Resources

references/api_guide.md

Comprehensive API reference covering all medchem modules with detailed function signatures, parameters, and return types.

references/rules_catalog.md

Complete catalog of available rules, filters, and alerts with descriptions, thresholds, and literature references.

scripts/filter_molecules.py

Production-ready script for batch filtering workflows. Supports multiple input formats (CSV, SDF, SMILES), configurable filter combinations, and detailed reporting.

Usage:

python scripts/filter_molecules.py input.csv --rules rule_of_five,rule_of_cns --alerts nibr --output filtered.csv

Documentation

Official documentation: https://medchem-docs.datamol.io/ GitHub repository: https://github.com/datamol-io/medchem

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Coding

clinical-reports

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

clinical-decision-support

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

zarr-python

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

clinicaltrials-database

No summary provided by upstream source.

Repository SourceNeeds Review