pilot-dataset

Exchange structured datasets with schema negotiation and metadata over Pilot Protocol.

Use this skill when:
1. You need to share CSV, JSON, or Parquet datasets with schema information
2. You want to negotiate data formats and transformations before transfer
3. You need to maintain dataset lineage and provenance metadata

Do NOT use this skill when:
- You need to transfer unstructured files (use pilot-share instead)
- You need real-time data streaming (use pilot-stream-data instead)
- You need ML model files (use pilot-model-share instead)

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "pilot-dataset" with this command: npx skills add vulture-labs/pilot-dataset

Structured dataset exchange with schema negotiation, format conversion, and provenance tracking.

Commands

Publish Dataset Availability

pilotctl --json publish "$PEER" datasets --data '{"type":"dataset_available","name":"sales_data","format":"csv","rows":1000}'

Request Dataset

pilotctl --json send-message "$DEST" --data '{"type":"dataset_request","name":"sales_data","preferred_format":"json"}'

Send Dataset with Metadata

pilotctl --json send-message "$DEST" --data '{"type":"dataset_metadata","name":"sales_data","schema":{"columns":["date","amount"]}}'
pilotctl --json send-file "$DEST" "$DATASET_FILE"
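
The schema in the metadata message can be derived from the file itself rather than typed by hand. The following is a minimal sketch, assuming the dataset is a CSV whose first line is a header of plain column names (no embedded commas, quotes, or backslashes). `metadata_payload` is a hypothetical helper name, not part of pilotctl.

```shell
#!/bin/bash
# Hypothetical helper: build a dataset_metadata payload from a CSV header line.
# Assumes plain column names; for anything more complex, a JSON-aware tool
# such as jq would be the safer choice.
metadata_payload() {
  local file="$1" name="$2"
  local header="" cols="" col
  local -a fields
  # Read the first line, accepting a file with no trailing newline.
  IFS= read -r header < "$file" || [ -n "$header" ] || return 1
  header="${header%$'\r'}"            # tolerate CRLF line endings
  IFS=',' read -r -a fields <<< "$header"
  for col in "${fields[@]}"; do
    cols="${cols:+$cols,}\"$col\""    # append "col", comma-separated
  done
  printf '{"type":"dataset_metadata","name":"%s","schema":{"columns":[%s]}}\n' \
    "$name" "$cols"
}

# Possible usage with the commands shown above:
# pilotctl --json send-message "$DEST" --data "$(metadata_payload "$DATASET_FILE" sales_data)"
```

Deriving the column list from the header keeps the announced schema and the transferred file from drifting apart.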

Validate Schema

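EXPECTED="date,amount,customer_id"
# Compare the header row; strip CR so CRLF files still match
ACTUAL=$(head -n 1 "$DATASET_FILE" | tr -d '\r')
[ "$ACTUAL" = "$EXPECTED" ] && echo "Schema validated" \
  || echo "Schema mismatch: got '$ACTUAL'" >&2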
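
An exact string comparison fails when the sender reorders columns or adds extra ones. Where only the presence of certain columns matters, a looser check may be enough. The sketch below assumes a CSV with an unquoted header row. `has_columns` is a hypothetical helper, not something the skill ships.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if every required column appears in the
# CSV header, in any order. Assumes plain, unquoted column names.
has_columns() {
  local file="$1"
  shift
  local header="" col missing=0
  # Read the first line, accepting a file with no trailing newline.
  IFS= read -r header < "$file" || [ -n "$header" ] || return 2
  header="${header%$'\r'}"        # tolerate CRLF line endings
  for col in "$@"; do
    case ",$header," in
      *",$col,"*) ;;               # column found
      *) echo "Missing column: $col" >&2; missing=1 ;;
    esac
  done
  return "$missing"
}

# Possible usage:
# has_columns "$DATASET_FILE" date amount customer_id && echo "Schema validated"
```

Wrapping the header in commas before matching avoids false positives where one column name is a substring of another.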

Workflow Example

#!/bin/bash
# Dataset exchange

PEER="agent-b"

publish_dataset() {
  local file="$1"
  local name="${2:-$(basename "$file" .csv)}"
  # Count data rows only: skip the header line before counting
  local rows=$(( $(tail -n +2 "$file" | wc -l) ))

  pilotctl --json publish "$PEER" datasets \
    --data "{\"type\":\"dataset_available\",\"name\":\"$name\",\"format\":\"csv\",\"rows\":$rows}"
}

request_dataset() {
  local name="$1"
  local publisher="$2"

  pilotctl --json send-message "$publisher" \
    --data "{\"type\":\"dataset_request\",\"name\":\"$name\",\"preferred_format\":\"csv\"}"

  sleep 2
  pilotctl --json received
}

publish_dataset "data.csv" "my-dataset"
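
The fixed `sleep 2` in `request_dataset` assumes the reply arrives within two seconds. A bounded retry loop is one alternative. This is a sketch only: `retry` is a hypothetical helper, and because the source does not show what `pilotctl --json received` prints, the check that decides whether a reply has arrived is left to the caller.

```shell
#!/bin/bash
# Hypothetical helper: run a command up to N times, pausing between attempts,
# and stop as soon as it succeeds.
retry() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    # Pause between attempts, but not after the last one.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Possible usage, where reply_ready is a check you would write yourself
# around `pilotctl --json received`:
# retry 5 2 reply_ready || echo "No reply received" >&2
```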

Dependencies

Requires pilot-protocol, pilotctl, jq, and optionally python3 for format conversion.
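
A script can check for the required commands before doing any work. The sketch below uses a hypothetical `missing_deps` helper. It only checks command names on `PATH`, and since the source does not say whether pilot-protocol provides its own executable, the usage line checks `pilotctl` and `jq` only.

```shell
#!/bin/bash
# Hypothetical helper: report which of the given commands are not on PATH.
missing_deps() {
  local cmd
  local -a missing=()
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || missing+=("$cmd")
  done
  if [ "${#missing[@]}" -gt 0 ]; then
    echo "Missing: ${missing[*]}" >&2
    return 1
  fi
}

# Possible usage at the top of a workflow script:
# missing_deps pilotctl jq || exit 1
```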

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
