data-quality-validator

Data quality validation and profiling toolkit for tabular data. Use when checking data completeness, detecting anomalies, validating schemas, profiling datasets, or assessing data cleanliness. Triggers on phrases like "data quality", "data validation", "schema validation", "data profiling", "missing data", "anomaly detection", "data completeness", "dirty data".

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "data-quality-validator" with this command: npx skills add kaiyuelv/data-validator-pro

Data Quality Validator

Toolkit for validating and profiling the quality of tabular data.

Features

  • Schema validation - Check column types, constraints, and rules
  • Completeness analysis - Missing value detection and reporting
  • Anomaly detection - Statistical outlier detection
  • Profiling - Summary statistics and distribution analysis
  • Constraint checking - Range checks, uniqueness, regex patterns
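
To make the completeness analysis feature concrete, here is a minimal standalone sketch of missing-value reporting. It is not the toolkit's own code: the helper names `is_missing` and `missing_report` are hypothetical, and it works on plain Python rows (a list of dicts) rather than a DataFrame so that it runs without any dependencies.

```python
from __future__ import annotations

from typing import Any


def is_missing(value: Any) -> bool:
    """Treat None, NaN, and blank strings as missing (an assumed convention)."""
    if value is None:
        return True
    if isinstance(value, float) and value != value:  # NaN never equals itself
        return True
    return isinstance(value, str) and value.strip() == ""


def missing_report(rows: list[dict[str, Any]]) -> dict[str, dict[str, float]]:
    """Return the missing count and missing ratio for every column seen in rows."""
    columns = sorted({key for row in rows for key in row})
    report: dict[str, dict[str, float]] = {}
    for col in columns:
        # row.get() returns None for absent keys, so they count as missing too
        count = sum(1 for row in rows if is_missing(row.get(col)))
        report[col] = {"count": count, "ratio": count / len(rows) if rows else 0.0}
    return report


# Illustrative sample data, not taken from the toolkit
rows = [
    {"id": 1, "age": 34, "email": "a@example.com"},
    {"id": 2, "age": None, "email": ""},
    {"id": 3, "age": 51},  # "email" key absent entirely
    {"id": 4, "age": float("nan"), "email": "d@example.com"},
]
report = missing_report(rows)
for column, stats in report.items():
    print(column, stats)
```

Counting absent keys, None, NaN, and blank strings together is a design choice; a stricter report might track each kind separately.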

Quick Start

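import pandas as pd

from scripts.data_profiler import DataProfiler
from scripts.schema_validator import SchemaValidator

# Small illustrative DataFrame; replace with your own data
df = pd.DataFrame({
    "id": [1, 2, 2],
    "age": [34, 200, 28],
    "email": ["a@example.com", "b@example.com", "not-an-email"],
})

# Profile a dataset
profiler = DataProfiler()
report = profiler.profile(df)  # expects a pandas DataFrame
print(report["missing"])   # missing-value summary
print(report["outliers"])  # detected outliers

# Validate against a schema: one rule dict per column
schema = {
    "age": {"type": "int", "min": 0, "max": 150},
    "email": {"type": "str", "regex": r"^\S+@\S+\.\S+$"},
    "id": {"type": "int", "unique": True},
}
validator = SchemaValidator(schema)
errors = validator.validate(df)
for err in errors:
    print(err)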
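
For readers who want to see what rules like these could mean in practice, the following is a rough sketch of a validator that interprets the same schema dictionary format. It is an assumption about the semantics, not the implementation in scripts/schema_validator.py: the function `validate_rows`, the `TYPES` mapping, and the error message wording are all hypothetical, and it checks plain Python rows instead of a DataFrame.

```python
from __future__ import annotations

import re
from typing import Any

# Assumed mapping from schema type names to Python types
TYPES = {"int": int, "float": float, "str": str}


def validate_rows(rows: list[dict[str, Any]], schema: dict[str, dict[str, Any]]) -> list[str]:
    """Check every row against the schema and return readable error messages."""
    errors: list[str] = []
    # Track values already seen for columns marked unique
    seen: dict[str, set] = {col: set() for col, rule in schema.items() if rule.get("unique")}
    for i, row in enumerate(rows):
        for col, rule in schema.items():
            value = row.get(col)
            if value is None:
                errors.append(f"row {i}: {col} is missing")
                continue
            expected = TYPES[rule["type"]]
            # bool is a subclass of int, so reject it explicitly
            if isinstance(value, bool) or not isinstance(value, expected):
                errors.append(f"row {i}: {col} expected {rule['type']}, got {type(value).__name__}")
                continue
            if "min" in rule and value < rule["min"]:
                errors.append(f"row {i}: {col}={value} is below min {rule['min']}")
            if "max" in rule and value > rule["max"]:
                errors.append(f"row {i}: {col}={value} is above max {rule['max']}")
            if "regex" in rule and not re.search(rule["regex"], value):
                errors.append(f"row {i}: {col}={value!r} does not match the pattern")
            if col in seen:
                if value in seen[col]:
                    errors.append(f"row {i}: {col}={value} is a duplicate")
                seen[col].add(value)
    return errors


schema = {
    "age": {"type": "int", "min": 0, "max": 150},
    "email": {"type": "str", "regex": r"^\S+@\S+\.\S+$"},
    "id": {"type": "int", "unique": True},
}
# Illustrative rows with deliberate violations
rows = [
    {"id": 1, "age": 34, "email": "a@example.com"},
    {"id": 2, "age": 200, "email": "b@example.com"},  # age above max
    {"id": 2, "age": 28, "email": "not-an-email"},    # bad email, duplicate id
]
errors = validate_rows(rows, schema)
for err in errors:
    print(err)
```

Collecting all errors instead of stopping at the first one matches the Quick Start usage, where `validate` returns a collection that the caller iterates over.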

Scripts

  • scripts/data_profiler.py - Dataset profiling and summary stats
  • scripts/schema_validator.py - Schema-based validation engine
  • scripts/anomaly_detector.py - Statistical anomaly detection
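
The listing does not say which statistical method scripts/anomaly_detector.py uses. As one common possibility, the sketch below flags outliers with the interquartile range (IQR) rule using only the Python standard library. The function name `iqr_outliers` and the default multiplier `k=1.5` are assumptions for illustration, not the script's actual interface.

```python
from __future__ import annotations

import statistics


def iqr_outliers(values: list[float], k: float = 1.5) -> list[tuple[int, float]]:
    """Return (index, value) pairs lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # statistics.quantiles with n=4 returns the three quartile cut points
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(i, v) for i, v in enumerate(values) if v < low or v > high]


# Illustrative readings with one obvious outlier at the end
readings = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(readings)
print(outliers)
```

The IQR rule is robust to the outliers it is looking for, which is why it is often preferred over a mean and standard deviation threshold on small or skewed samples.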

References

  • references/validation_rules.md - Common validation patterns

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
