clinical-data-cleaner

Use when cleaning clinical trial data, preparing data for FDA/EMA submission, standardizing SDTM datasets, handling missing values in clinical studies, detecting outliers in lab results, or converting raw CRF data to CDISC format. Cleans and standardizes clinical trial data for regulatory compliance with audit trails.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "clinical-data-cleaner" with this command: npx skills add renhaosu2024/clinical-data-cleaner

Clinical Data Cleaner

Clean, validate, and standardize clinical trial data to meet CDISC SDTM standards for regulatory submissions to FDA or EMA.

Quick Start

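import pandas as pd  # assumption: clean() accepts a pandas DataFrame

from scripts.main import ClinicalDataCleaner

# Load the raw data (file name is a placeholder)
raw_data = pd.read_csv('dm_raw.csv')

# Initialize for the Demographics domain
cleaner = ClinicalDataCleaner(domain='DM')

# Clean data with default settings
cleaned = cleaner.clean(raw_data)

# Save the cleaned data together with its audit trail
cleaner.save_report('output.csv')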

Core Capabilities

1. SDTM Domain Validation

cleaner = ClinicalDataCleaner(domain='DM')  # or 'LB', 'VS'
is_valid, missing = cleaner.validate_domain(data)

Required Fields:

  • DM: STUDYID, USUBJID, SUBJID, RFSTDTC, RFENDTC, SITEID, AGE, SEX, RACE
  • LB: STUDYID, USUBJID, LBTESTCD, LBCAT, LBORRES, LBORRESU, LBSTRESC, LBDTC
  • VS: STUDYID, USUBJID, VSTESTCD, VSORRES, VSORRESU, VSSTRESC, VSDTC
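
To make the check concrete, here is a minimal sketch of a required-field test built only from the lists above. The helper name `check_required_fields` and the `(is_valid, missing)` return shape are assumptions modeled on the `validate_domain` call; the skill's actual implementation may differ.

```python
# Required fields copied from the list above; everything else is illustrative.
REQUIRED_FIELDS = {
    "DM": ["STUDYID", "USUBJID", "SUBJID", "RFSTDTC", "RFENDTC",
           "SITEID", "AGE", "SEX", "RACE"],
    "LB": ["STUDYID", "USUBJID", "LBTESTCD", "LBCAT", "LBORRES",
           "LBORRESU", "LBSTRESC", "LBDTC"],
    "VS": ["STUDYID", "USUBJID", "VSTESTCD", "VSORRES", "VSORRESU",
           "VSSTRESC", "VSDTC"],
}


def check_required_fields(domain, columns):
    """Return (is_valid, missing) for a domain code and its column names."""
    if domain not in REQUIRED_FIELDS:
        raise ValueError(f"Unsupported domain: {domain!r}")
    present = set(columns)
    # Preserve the documented field order in the list of missing names.
    missing = [name for name in REQUIRED_FIELDS[domain] if name not in present]
    return (not missing, missing)


vs_columns = ["STUDYID", "USUBJID", "VSTESTCD", "VSORRES", "VSDTC"]
is_valid, missing = check_required_fields("VS", vs_columns)
print(is_valid, missing)
```

Returning the missing names rather than a bare boolean lets the caller report exactly which columns to add before re-running the validation.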

2. Missing Value Handling

cleaner = ClinicalDataCleaner(
    domain='DM',
    missing_strategy='median'  # mean, median, mode, forward, drop
)
cleaned = cleaner.handle_missing_values(data)
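
The five strategy names can be illustrated on a single column of values. The sketch below is an assumption about what each option means (the function `fill_missing` is hypothetical, and `None` stands in for a missing value); it is not the skill's own code.

```python
from statistics import mean, median, mode

# Strategy names mirror the missing_strategy options listed above.
_REDUCERS = {"mean": mean, "median": median, "mode": mode}


def fill_missing(values, strategy="median"):
    """Return a new list with None entries handled according to `strategy`."""
    if strategy not in _REDUCERS and strategy not in ("forward", "drop"):
        raise ValueError(f"Unknown strategy: {strategy!r}")
    observed = [v for v in values if v is not None]
    if strategy == "drop":
        return observed
    if strategy == "forward":
        filled, last = [], None
        for v in values:
            if v is not None:
                last = v
            filled.append(last)  # a leading gap stays None: nothing to carry
        return filled
    if not observed:
        return list(values)  # no observed values to impute from
    fill = _REDUCERS[strategy](observed)
    return [fill if v is None else v for v in values]


ages = [34, None, 51, 47, None, 62]
median_filled = fill_missing(ages, "median")
forward_filled = fill_missing(ages, "forward")
print(median_filled)
print(forward_filled)
```

In this sketch, `forward` carries the last observed value forward and leaves a leading gap untouched, while `drop` removes the missing entries and therefore shortens the column.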

3. Outlier Detection

cleaner = ClinicalDataCleaner(
    domain='LB',
    outlier_method='domain',  # iqr, zscore, domain
    outlier_action='flag'     # flag, remove, cap
)
flagged = cleaner.detect_outliers(data)
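
For the `iqr` method, a minimal sketch of the usual interquartile-range rule follows. The 1.5 multiplier is the conventional Tukey fence and is an assumption here, as are the helper names; the skill may use a different multiplier or quantile definition.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences at k * IQR beyond the first and third quartiles."""
    q1, _, q3 = quantiles(values, n=4)  # three cut points: Q1, median, Q3
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def flag_iqr_outliers(values, k=1.5):
    """Return one boolean per value: True when it lies outside the fences."""
    low, high = iqr_bounds(values, k)
    return [not (low <= v <= high) for v in values]


readings = [10, 11, 12, 13, 14, 15, 100]
flags = flag_iqr_outliers(readings)
print(iqr_bounds(readings), flags)
```

Because the fences come from the data itself, this method needs no clinical reference ranges, but it can miss implausible values in small or heavily skewed samples.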

Clinical Thresholds:

Parameter      Range     Unit
Glucose        50-500    mg/dL
Hemoglobin     5-20      g/dL
Systolic BP    70-220    mmHg
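
The thresholds above combine with the three `outlier_action` values roughly as follows. This is a sketch under assumptions: the ranges are treated as inclusive, and `apply_threshold` is a hypothetical helper rather than part of the skill's API.

```python
# Ranges copied from the thresholds above; names here are illustrative.
CLINICAL_RANGES = {
    "Glucose": (50, 500),      # mg/dL
    "Hemoglobin": (5, 20),     # g/dL
    "Systolic BP": (70, 220),  # mmHg
}
ACTIONS = ("flag", "remove", "cap")


def apply_threshold(parameter, value, action="flag"):
    """Return (value, is_outlier) after applying `action` to one measurement.

    The returned value is None when removed and clamped to the range when capped.
    """
    if action not in ACTIONS:
        raise ValueError(f"Unknown action: {action!r}")
    low, high = CLINICAL_RANGES[parameter]
    is_outlier = not (low <= value <= high)  # assumption: bounds are inclusive
    if not is_outlier or action == "flag":
        return value, is_outlier
    if action == "remove":
        return None, True
    return min(max(value, low), high), True  # cap


for action in ACTIONS:
    print(action, apply_threshold("Glucose", 620, action))
```

Only `flag` leaves the original value intact, which keeps the source data recoverable for later review.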

4. Date Standardization

standardized = cleaner.standardize_dates(data)
# Converts to ISO 8601: 2023-01-15T09:30:00
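
As an illustration of the conversion, the sketch below tries a short list of input formats in order and emits ISO 8601 text. The format list is an assumption (the formats `standardize_dates` actually recognizes are not documented here), and slash-separated dates are read as day/month/year.

```python
from datetime import datetime

# Hypothetical input formats, tried in order; the flag marks formats with a time.
ASSUMED_FORMATS = [
    ("%Y-%m-%dT%H:%M:%S", True),
    ("%d/%m/%Y %H:%M", True),   # assumption: day comes before month
    ("%Y-%m-%d", False),
    ("%Y%m%d", False),
]


def to_iso8601(text):
    """Convert a date string in one of ASSUMED_FORMATS to ISO 8601."""
    cleaned = text.strip()
    for fmt, has_time in ASSUMED_FORMATS:
        try:
            parsed = datetime.strptime(cleaned, fmt)
        except ValueError:
            continue  # not this format, try the next one
        # Keep the precision of the input: no invented midnight for date-only values.
        return parsed.isoformat() if has_time else parsed.date().isoformat()
    raise ValueError(f"Unrecognized date format: {text!r}")


print(to_iso8601("15/01/2023 09:30"))
print(to_iso8601("20230115"))
```

Raising on an unrecognized format, instead of guessing, surfaces ambiguous dates so they can be resolved against the source records.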

5. Complete Pipeline

cleaner = ClinicalDataCleaner(
    domain='DM',
    missing_strategy='median',
    outlier_method='iqr',
    outlier_action='flag'
)
cleaned_data = cleaner.clean(data)
cleaner.save_report('output.csv')

Output Files:

  • output.csv - Cleaned SDTM data
  • output.report.json - Audit trail for regulatory submission
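
The naming of the two output files suggests the report path is derived from the CSV path. The sketch below shows that derivation together with one possible audit-trail layout. The JSON structure, the field names, and the sample entries are all hypothetical; the real report format is defined by the skill and should be checked against an actual `output.report.json`.

```python
import json
from pathlib import Path


def report_path_for(output_csv):
    """Derive 'output.report.json' from 'output.csv'."""
    return Path(output_csv).with_suffix(".report.json")


def build_report(domain, actions):
    """Bundle cleaning actions into a JSON-serializable audit record."""
    return {"domain": domain, "action_count": len(actions), "actions": actions}


# Hypothetical entries: one record per change made during cleaning.
actions = [
    {"step": "handle_missing_values", "column": "AGE", "row": 1,
     "old": None, "new": 49.0},
    {"step": "standardize_dates", "column": "RFSTDTC", "row": 0,
     "old": "15/01/2023 09:30", "new": "2023-01-15T09:30:00"},
]
report = build_report("DM", actions)
serialized = json.dumps(report, indent=2)
print(report_path_for("output.csv"))
print(serialized)
```

Recording the old and new value for every change is what makes each cleaning action traceable and reversible during review.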

CLI Usage

# Clean demographics
python scripts/main.py \
  --input dm_raw.csv \
  --domain DM \
  --output dm_clean.csv \
  --missing-strategy median \
  --outlier-method iqr \
  --outlier-action flag

# Clean lab data with clinical thresholds
python scripts/main.py \
  --input lb_raw.csv \
  --domain LB \
  --output lb_clean.csv \
  --outlier-method domain

Common Patterns

See references/common-patterns.md for detailed examples:

  • Regulatory Submission Preparation
  • Interim Analysis Data Preparation
  • Database Migration Cleanup
  • External Lab Data Integration

Troubleshooting

See references/troubleshooting.md for solutions to:

  • Validation failures
  • Date parsing errors
  • Memory errors with large datasets
  • Outlier detection issues

Quality Checklist

Pre-Cleaning:

  • IRB/IEC approval obtained (human studies)
  • Sample size adequately powered
  • Randomization method documented

Post-Cleaning:

  • Validate against CDISC SDTM IG
  • Review all cleaning actions in audit trail
  • Test import to analysis software

References

  • references/sdtm_ig_guide.md - CDISC SDTM Implementation Guide
  • references/domain_specs.json - Domain-specific field requirements
  • references/outlier_thresholds.json - Clinical outlier thresholds
  • references/common-patterns.md - Detailed usage patterns
  • references/troubleshooting.md - Problem-solving guide

Skill ID: 189 | Version: 2.0 | License: MIT
