data-toolkit

A toolkit for data conversion, validation, and cleaning in data processing workflows. It converts between JSON, CSV, YAML, and XML, validates data against schemas, and removes duplicates and nulls.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "data-toolkit" with this command: npx skills add atlasnexusops/data-toolkit

Data Toolkit

Complete data processing utilities for OpenClaw agents.

Features

Converters

  • JSON ↔ CSV - Bidirectional conversion with schema inference
  • JSON ↔ YAML - Clean formatting, comment preservation
  • JSON ↔ XML - Configurable root elements and attributes
  • CSV ↔ YAML - Direct conversion without intermediate steps
  • Multi-format batch conversion - Process entire directories
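
To illustrate what "schema inference" can mean for JSON to CSV, here is a minimal standard-library sketch. It assumes the input is a JSON array of flat objects and infers the header as the union of keys in first-seen order. The function name `json_records_to_csv` is hypothetical and is not part of data-toolkit; the toolkit's own inference may work differently.

```python
import csv
import io
import json


def json_records_to_csv(json_text: str) -> str:
    """Convert a JSON array of flat objects to CSV text.

    Hypothetical helper: the header is the union of all keys,
    kept in the order they are first seen.
    """
    records = json.loads(json_text)

    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    # restval fills in columns that a given record does not have
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


sample = '[{"id": 1, "name": "Ada"}, {"id": 2, "email": "b@example.com"}]'
csv_text = json_records_to_csv(sample)
print(csv_text)
```

Records with missing keys produce empty cells rather than errors, which is one reasonable choice for ragged input.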

Validators

  • JSON Schema validation - Validate against JSON Schema specs
  • CSV structure validation - Check headers, columns, data types
  • Data type inference - Automatic type detection and validation
  • Custom rules - Define business logic validations
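
As a rough illustration of data type inference on CSV columns, the sketch below picks the narrowest type that fits every non-empty value. The type names and the helper `infer_column_type` are assumptions made for this example, not the toolkit's actual API or rules.

```python
from typing import Iterable


def _parses_as(cast, text: str) -> bool:
    """Return True if cast(text) succeeds without a ValueError."""
    try:
        cast(text)
        return True
    except ValueError:
        return False


def infer_column_type(values: Iterable[str]) -> str:
    """Hypothetical helper: infer a type name for a column of strings.

    Empty and whitespace-only cells are ignored when inferring.
    """
    non_empty = [v.strip() for v in values if v is not None and v.strip() != ""]
    if not non_empty:
        return "empty"
    if all(_parses_as(int, v) for v in non_empty):
        return "integer"
    if all(_parses_as(float, v) for v in non_empty):
        return "number"
    if all(v.lower() in ("true", "false") for v in non_empty):
        return "boolean"
    return "string"


print(infer_column_type(["1", "2", ""]))
print(infer_column_type(["1", "2.5"]))
```

A single value that does not fit demotes the whole column to "string", so inference of this kind is best paired with an explicit report of the offending rows.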

Cleaners

  • Duplicate removal - Smart deduplication with configurable keys
  • Null/empty handling - Remove or replace null values
  • Data normalization - Standardize formats (dates, numbers, strings)
  • Whitespace cleanup - Trim, collapse multiple spaces
  • Column operations - Remove, rename, reorder columns
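
Two of the cleaning operations above, deduplication on configurable keys and whitespace cleanup, can be sketched in a few lines. This assumes "configurable keys" means that records sharing the same values for the chosen keys count as duplicates and the first occurrence wins. `dedupe_records` and `collapse_whitespace` are hypothetical names, not functions from data-toolkit.

```python
import re
from typing import Any, Dict, List, Sequence


def collapse_whitespace(text: str) -> str:
    """Trim the ends and collapse internal whitespace runs to one space."""
    return re.sub(r"\s+", " ", text).strip()


def dedupe_records(
    records: List[Dict[str, Any]], keys: Sequence[str]
) -> List[Dict[str, Any]]:
    """Keep the first record for each distinct combination of key values."""
    seen = set()
    unique = []
    for record in records:
        signature = tuple(record.get(key) for key in keys)
        if signature not in seen:
            seen.add(signature)
            unique.append(record)
    return unique


records = [
    {"id": 1, "name": "Ada"},
    {"id": 1, "name": "Ada L."},
    {"id": 2, "name": "Bob"},
]
print(dedupe_records(records, ["id"]))
print(collapse_whitespace("  a   b \t c "))
```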

Usage

Convert Data

# JSON to CSV
./src/convert.py --input data.json --output data.csv --format csv

# CSV to JSON
./src/convert.py --input data.csv --output data.json --format json

# JSON to YAML
./src/convert.py --input data.json --output data.yaml --format yaml

# XML to JSON
./src/convert.py --input data.xml --output data.json --format json

# Batch conversion
./src/convert.py --input-dir ./raw --output-dir ./processed --format json
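
The batch mode above takes an input directory and an output directory. A minimal sketch of what such a loop might look like for CSV to JSON, using only the standard library, follows. `batch_csv_to_json` is a hypothetical helper; how the real script matches files, names outputs, or handles errors is not documented here and may differ.

```python
import csv
import json
from pathlib import Path
from typing import List


def batch_csv_to_json(input_dir: Path, output_dir: Path) -> List[Path]:
    """Convert every *.csv file in input_dir to a .json file in output_dir.

    Each CSV row becomes one JSON object; all values stay strings.
    Returns the list of files written.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for csv_path in sorted(input_dir.glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.DictReader(handle))
        json_path = output_dir / (csv_path.stem + ".json")
        json_path.write_text(json.dumps(rows, indent=2), encoding="utf-8")
        written.append(json_path)
    return written


# Example call (paths taken from the command above):
# batch_csv_to_json(Path("./raw"), Path("./processed"))
```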

Validate Data

# Validate against JSON schema
./src/validate.py --input data.json --schema schema.json

# Validate CSV structure
./src/validate.py --input data.csv --check-headers --check-types

# Custom validation rules
./src/validate.py --input data.json --rules validation-rules.yaml
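
To show the kind of check that schema validation performs, the sketch below handles a very small subset of JSON Schema: only `required` and per-property `type` on a flat object. It is an illustration under that assumption, not the validator that `validate.py` uses, and `check_object` is a hypothetical name. A full validator covers many more keywords.

```python
from typing import Any, Dict, List

# Mapping from JSON Schema type names to Python types
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
    "null": type(None),
}


def check_object(instance: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of error messages; an empty list means no errors found."""
    errors = []
    for name in schema.get("required", []):
        if name not in instance:
            errors.append(f"missing required property: {name}")

    for name, subschema in schema.get("properties", {}).items():
        if name not in instance or "type" not in subschema:
            continue
        type_name = subschema["type"]
        value = instance[name]
        # bool is a subclass of int in Python, so reject it for numeric types
        if isinstance(value, bool) and type_name in ("integer", "number"):
            matches = False
        else:
            matches = isinstance(value, JSON_TYPES[type_name])
        if not matches:
            errors.append(f"{name}: expected {type_name}")
    return errors


schema = {
    "required": ["id", "name"],
    "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
}
print(check_object({"id": "1"}, schema))
```

Returning all errors instead of stopping at the first one makes it easier to fix a file in a single pass.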

Clean Data

# Remove duplicates
./src/clean.py --input data.json --dedupe --key id

# Handle nulls
./src/clean.py --input data.csv --remove-nulls
./src/clean.py --input data.csv --replace-nulls "N/A"

# Normalize data
./src/clean.py --input data.json --normalize dates,numbers,strings

# Full cleanup pipeline
./src/clean.py --input messy.csv --dedupe --remove-nulls --normalize all --output clean.csv
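
The difference between removing and replacing nulls can be sketched as follows. This assumes that "remove" drops every row containing a null or empty value and that "replace" substitutes a placeholder in place; the actual semantics of `--remove-nulls` and `--replace-nulls` are not specified above. Both function names are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
NULL_MARKERS: Tuple[Any, ...] = (None, "")


def remove_null_rows(rows: List[Row]) -> List[Row]:
    """Drop every row that has at least one null or empty value."""
    return [
        row
        for row in rows
        if not any(value in NULL_MARKERS for value in row.values())
    ]


def replace_nulls(rows: List[Row], replacement: Any) -> List[Row]:
    """Return new rows with null or empty values swapped for replacement."""
    return [
        {key: (replacement if value in NULL_MARKERS else value)
         for key, value in row.items()}
        for row in rows
    ]


rows = [
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": ""},
    {"id": 3, "city": None},
]
print(remove_null_rows(rows))
print(replace_nulls(rows, "N/A"))
```

Note that a value of `0` is not treated as null here, which matters for numeric columns.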

API Usage (Python)

from data_toolkit import convert, validate, clean

# Convert
convert.json_to_csv('input.json', 'output.csv')
convert.csv_to_yaml('input.csv', 'output.yaml')

# Validate
is_valid = validate.json_schema('data.json', 'schema.json')
errors = validate.csv_structure('data.csv')

# Clean
clean.remove_duplicates('data.json', key='id')
clean.normalize_dates('data.csv', format='ISO8601')
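
For a sense of what normalizing dates to ISO 8601 involves, here is a standard-library sketch for a single value. The list of accepted input formats is an assumption made for this example, and ambiguous dates such as 05/03/2024 are read day-first. `to_iso_date` is a hypothetical helper, not the implementation behind `clean.normalize_dates`.

```python
from datetime import datetime
from typing import Optional

# Assumed input formats, tried in order; extend to match your data
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y", "%Y/%m/%d")


def to_iso_date(text: str) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if no format matches."""
    cleaned = text.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    return None


print(to_iso_date("05/03/2024"))
print(to_iso_date("not a date"))
```

Returning `None` for unparseable values leaves the decision to drop, keep, or flag them to the caller.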

Examples

See examples/ directory for complete workflows:

  • examples/etl-pipeline.sh - Full ETL workflow
  • examples/api-data-processing.py - API response processing
  • examples/batch-conversion.sh - Bulk file conversion

Installation

Dependencies are minimal and common:

  • Python 3.8+
  • PyYAML
  • pandas (optional, for advanced CSV operations)

pip install pyyaml pandas

Requirements

  • Node.js (for JSON/YAML parsing)
  • Python 3.8+
  • 10MB disk space

License

MIT

Support

Issues: https://github.com/forge-agent/data-toolkit
Docs: See docs/ directory

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
