pilot-dataset
Structured dataset exchange with schema negotiation, format conversion, and provenance tracking.
Commands
Publish Dataset Availability
pilotctl --json publish "$PEER" datasets --data '{"type":"dataset_available","name":"sales_data","format":"csv","rows":1000}'
Request Dataset
pilotctl --json send-message "$DEST" --data '{"type":"dataset_request","name":"sales_data","preferred_format":"json"}'
Send Dataset with Metadata
pilotctl --json send-message "$DEST" --data '{"type":"dataset_metadata","name":"sales_data","schema":{"columns":["date","amount"]}}'
pilotctl --json send-file "$DEST" "$DATASET_FILE"
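The metadata message above hard-codes the column list. A minimal sketch of deriving it from the file itself, so the announced schema cannot drift from the data, might look like the following. The helper name `csv_header_json` is hypothetical, and the sketch assumes a simple comma-separated header whose column names contain no quotes, backslashes, or embedded commas.

```shell
# Hypothetical helper: print the CSV header row as a JSON array of column names.
# Assumes plain column names (no quoted fields, no embedded commas or quotes).
csv_header_json() {
  awk -F',' 'NR == 1 {
    sub(/\r$/, "")                 # tolerate CRLF line endings
    out = ""; sep = ""
    for (i = 1; i <= NF; i++) {
      out = out sep "\"" $i "\""   # wrap each column name in double quotes
      sep = ","
    }
    print "[" out "]"
    exit                           # only the first line is needed
  }' "$1"
}

# Possible usage with the metadata message shown above:
#   columns=$(csv_header_json "$DATASET_FILE")
#   pilotctl --json send-message "$DEST" \
#     --data "{\"type\":\"dataset_metadata\",\"name\":\"sales_data\",\"schema\":{\"columns\":$columns}}"
```

For headers with quoting or special characters, a real CSV parser (for example via the optional python3 dependency) would be a safer choice than this awk sketch.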
Validate Schema
EXPECTED="date,amount"
ACTUAL=$(head -n 1 "$DATASET_FILE" | tr -d '\r')
if [ "$ACTUAL" = "$EXPECTED" ]; then echo "Schema validated"
else echo "Schema mismatch: expected '$EXPECTED', got '$ACTUAL'" >&2; fi
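The exact-match check above fails if the sender reorders columns or adds extra ones. Where only the presence of certain columns matters, an order-insensitive check may be more useful. The sketch below is one way to do that; `missing_columns` is a hypothetical helper name, and it assumes a simple comma-separated header without quoted fields.

```shell
# Hypothetical helper: print each required column that is absent from the
# CSV header, one per line. Prints nothing when all columns are present.
# Usage: missing_columns FILE COLUMN...
missing_columns() {
  local file="$1" header want
  shift
  if [ ! -r "$file" ]; then
    echo "Cannot read dataset file: $file" >&2
    return 2
  fi
  # First line only, with any CR from CRLF line endings removed
  header=$(head -n 1 "$file" | tr -d '\r')
  for want in "$@"; do
    # Surround both sides with commas so only whole column names match
    case ",$header," in
      *",$want,"*) ;;
      *) echo "$want" ;;
    esac
  done
}

# Possible usage:
#   missing=$(missing_columns "$DATASET_FILE" date amount)
#   [ -z "$missing" ] && echo "Schema validated" || echo "Missing columns: $missing" >&2
```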
Workflow Example
#!/bin/bash
# Dataset exchange
PEER="agent-b"
publish_dataset() {
  local file="$1"
  local name="${2:-$(basename "$file" .csv)}"
  local rows
  # Count data rows only, excluding the header line
  rows=$(awk 'END { print (NR > 1 ? NR - 1 : 0) }' "$file")
  pilotctl --json publish "$PEER" datasets \
    --data "{\"type\":\"dataset_available\",\"name\":\"$name\",\"format\":\"csv\",\"rows\":$rows}"
}

request_dataset() {
  local name="$1"
  local publisher="$2"
  pilotctl --json send-message "$publisher" \
    --data "{\"type\":\"dataset_request\",\"name\":\"$name\",\"preferred_format\":\"csv\"}"
  # Give the publisher a moment to respond, then check what has been received
  sleep 2
  pilotctl --json received
}
publish_dataset "data.csv" "my-dataset"
Dependencies
Requires pilot-protocol, pilotctl, jq, and optionally python3 for format conversion.
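A script can verify the command-line dependencies before attempting an exchange. The following is a minimal sketch; `missing_deps` is a hypothetical helper name, and it can only check for executables on `PATH` (such as `pilotctl` and `jq`), not whether pilot-protocol itself is running.

```shell
# Hypothetical helper: print the name of every command that is not on PATH.
# Usage: missing_deps COMMAND...
missing_deps() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# Possible usage at the top of a script:
#   missing=$(missing_deps pilotctl jq)
#   if [ -n "$missing" ]; then
#     echo "Missing required tools: $missing" >&2
#     exit 1
#   fi
```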