ride-receipts-llm

Build, refresh, export, and query a local SQLite ride-history database from Gmail ride receipt emails (Uber, Bolt, Yandex Go, Lyft) using LLM extraction from full email HTML. Use when asked to ingest receipts, rebuild/update `rides.sqlite`, investigate ride totals/routes, or extend provider coverage. Requires `gog` Gmail CLI plus authenticated Google account access; processes sensitive receipt data and sends raw email HTML to the active LLM.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "ride-receipts-llm" with this command: npx skills add tuleyko/ride-receipts-llm

ride-receipts-llm

Run a reproducible 3-stage pipeline:

  1. initialize/validate SQLite schema (fixed; do not edit)
  2. fetch full receipt emails into JSONL
  3. extract structured rides with LLM (one-shot + repair)
  4. upsert into SQLite

Prerequisites and safety

  • Require gog CLI installed and authenticated for the selected Gmail account.
  • Prefer configured account: skills.entries.ride-receipts-llm.config.gmailAccount; if missing, ask user for account explicitly.
  • Ask for date scope before fetch: all-time, after YYYY-MM-DD, or between dates.
  • Treat receipt content as sensitive financial/location data.
  • Before extraction, explicitly confirm user is okay sending raw email HTML to the active LLM.
  • Extraction uses raw text_html from emails; do not claim local-only parsing.
  • Never hallucinate fields; keep unknown values null.

Paths

  • Schema (do not modify): skills/ride-receipts-llm/references/schema_rides.sql
  • Emails JSONL: ./data/ride_emails.jsonl
  • Extracted rides JSONL: ./data/rides_extracted.jsonl
  • SQLite DB: ./data/rides.sqlite

0) Initialize DB

python3 skills/ride-receipts-llm/scripts/init_db.py \
  --db ./data/rides.sqlite \
  --schema skills/ride-receipts-llm/references/schema_rides.sql

1) Fetch Gmail receipts → JSONL

python3 skills/ride-receipts-llm/scripts/fetch_emails_jsonl.py \
  --account <gmail-account> \
  --after YYYY-MM-DD \
  --before YYYY-MM-DD \
  --max-per-provider 5000 \
  --out ./data/ride_emails.jsonl
  • Omit --after / --before when not needed.
  • Output rows include provider metadata, snippet, and raw text_html.

2) LLM extraction contract

Read ./data/ride_emails.jsonl; write one JSON object per line to ./data/rides_extracted.jsonl.

Per email:

  1. Run one-shot extraction for all fields.
  2. Quality-gate: amount,currency,pickup,dropoff,payment_method,distance_text,duration_text,start_time_text,end_time_text.
  3. If any are missing, run repair pass(es) for missing fields only.
  4. Merge additively; never replace existing non-null values with null.

Schema (one line per ride):

{
  "provider": "Uber|Bolt|Yandex|Lyft",
  "source": {"gmail_message_id": "...", "email_date": "YYYY-MM-DD HH:MM", "subject": "..."},
  "ride": {
    "start_time_text": "...",
    "end_time_text": "...",
    "total_text": "...",
    "currency": "EUR|PLN|USD|BYN|RUB|UAH|null",
    "amount": 12.34,
    "pickup": "...",
    "dropoff": "...",
    "pickup_city": "...",
    "pickup_country": "...",
    "dropoff_city": "...",
    "dropoff_country": "...",
    "payment_method": "...",
    "driver": "...",
    "distance_text": "...",
    "duration_text": "...",
    "notes": "..."
  }
}

Rules:

  • Use text_html as primary source; fallback to snippet only if text_html is empty.
  • Keep addresses/time strings verbatim.
  • Keep amount numeric; if only textual total exists, set amount: null and preserve text in total_text.

3) Insert extracted rides → SQLite

python3 skills/ride-receipts-llm/scripts/insert_rides_sqlite_jsonl.py \
  --db ./data/rides.sqlite \
  --schema skills/ride-receipts-llm/references/schema_rides.sql \
  --rides-jsonl ./data/rides_extracted.jsonl

Schema is idempotent via UNIQUE(provider, gmail_message_id) ON CONFLICT REPLACE.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

Fast Douyin Publish

抖音视频自动发布助手。一键上传视频到抖音,支持自动文案生成和标签优化。

Registry SourceRecently Updated
General

Skills Finder

Intelligent skill matcher that searches multiple skill marketplaces (ClawHub & Skills.sh) in real-time. Supports ANY language for user input, multi-step skil...

Registry SourceRecently Updated
General

Claw Self Improving Plus

Turn raw mistakes, corrections, discoveries, and repeated decisions into structured learnings and promotion candidates. Use when the user wants a conservativ...

Registry SourceRecently Updated