dont-be-greedy

When a user uploads or references a data file (CSV, JSON, XLSX, TXT, LOG) or any file larger than 100KB, immediately estimate token cost using scripts/estimate_size.py. If >30k tokens, chunk the file and summarize each chunk. If smaller, run quick inspection. Return a safe preview and summary without asking the user what to do.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "dont-be-greedy" with this command: npx skills add elliotjlt/claude-skill-potions/elliotjlt-claude-skill-potions-dont-be-greedy

Don't Be Greedy

<purpose> Prevents context overflow by enforcing size-aware data loading. Large files can exceed context windows and crash agent workflows. This skill measures files before loading, chunks oversized data, and returns compact summaries with safe previews so downstream processing can continue without context exhaustion. </purpose>

Instructions

Step 1: Estimate Token Cost

Before loading ANY data file:

python scripts/estimate_size.py "<file_path>"

This returns byte count and estimated token count.
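
The contents of scripts/estimate_size.py are not shown on this page. As a rough sketch only, the figures in the examples further down (2048 bytes → 512 tokens) are consistent with a heuristic of about 4 bytes per token. The function name `estimate_tokens` and the constant `BYTES_PER_TOKEN` are assumptions for illustration, not the script's actual implementation:

```python
import os

BYTES_PER_TOKEN = 4  # assumed heuristic; real tokenizers vary with content


def estimate_tokens(path: str) -> tuple[int, int]:
    """Return (byte_count, estimated_tokens) without reading the file body."""
    size = os.path.getsize(path)  # file metadata only, so nothing enters context
    return size, size // BYTES_PER_TOKEN
```

Because only file metadata is consulted, the check costs the same for a 2KB config and a 50MB CSV.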

Step 2: Apply Strategy Based on Size

| Estimated Tokens | Action |
|------------------|--------|
| < 10,000 | Run quick inspection, load directly |
| 10,000 - 30,000 | Run quick inspection, consider filtering |
| > 30,000 | Chunk and summarize before loading |
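
The thresholds above can be expressed as a small dispatch function. This is a minimal sketch; `choose_strategy` and the returned labels are hypothetical names, not part of the skill's scripts:

```python
SMALL_LIMIT = 10_000   # below this: inspect, then load directly
LARGE_LIMIT = 30_000   # above this: chunk and summarize


def choose_strategy(estimated_tokens: int) -> str:
    """Map an estimated token count to one of the three actions."""
    if estimated_tokens < SMALL_LIMIT:
        return "inspect-and-load"
    if estimated_tokens <= LARGE_LIMIT:
        return "inspect-and-filter"
    return "chunk-and-summarize"
```

For instance, `choose_strategy(128_000)` returns `"chunk-and-summarize"`, and `choose_strategy(512)` returns `"inspect-and-load"`.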

Step 3: Execute Appropriate Workflow

<strategy name="small-file"> For files under 10k tokens:
python scripts/quick_inspect.py "<file_path>"

Return stats and load file directly. </strategy>

<strategy name="medium-file"> For files between 10k and 30k tokens:
python scripts/quick_inspect.py "<file_path>"

Return stats, then consider filtering to the relevant rows or fields before loading. </strategy>

<strategy name="large-file"> For files over 30k tokens:
python scripts/chunker.py "<file_path>"
python scripts/summarize.py "<chunk_file>"

Return overall summary + per-chunk summaries + safe preview of first rows. </strategy>
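
How scripts/chunker.py splits a file is not documented here. One plausible approach, sketched below, streams the file line by line and starts a new chunk whenever an estimated token budget would be exceeded. The name `iter_chunks`, the 2,000-token default, and the 4-bytes-per-token ratio are all assumptions for illustration:

```python
from typing import Iterator

BYTES_PER_TOKEN = 4  # assumed heuristic, not taken from the skill's scripts


def iter_chunks(path: str, max_tokens: int = 2_000) -> Iterator[list[str]]:
    """Yield lists of whole lines, each list sized to roughly max_tokens."""
    budget = max_tokens * BYTES_PER_TOKEN
    chunk: list[str] = []
    used = 0
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:  # streams; the whole file is never held in memory
            line_bytes = len(line.encode("utf-8"))
            if chunk and used + line_bytes > budget:
                yield chunk
                chunk, used = [], 0
            # A single line larger than the budget still becomes its own chunk.
            chunk.append(line)
            used += line_bytes
    if chunk:
        yield chunk
```

Splitting on line boundaries keeps CSV rows and log entries intact, at the cost of chunks that are only approximately equal in size.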

Step 4: Return Structured Output

Always provide:

  • Overall summary (1-3 paragraphs)
  • Safe preview (first N rows/lines)
  • Recommendation for next steps
  • Chunk information if file was split
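
A safe preview can be produced without loading the full file. The sketch below reads only the first few lines and truncates each one; `safe_preview` and its defaults are hypothetical and are not taken from the skill's scripts:

```python
from itertools import islice


def safe_preview(path: str, max_lines: int = 10, max_chars: int = 200) -> list[str]:
    """Return the first max_lines lines, each cut to max_chars characters."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # islice stops after max_lines, so the rest of the file is never read.
        # Caveat: one extremely long line is still read in full before truncation.
        return [line.rstrip("\n")[:max_chars] for line in islice(handle, max_lines)]
```

Truncating each line guards against wide rows (for example, a CSV with embedded blobs) inflating an otherwise short preview.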

NEVER

  • Load files without running estimate_size.py first
  • Use cat on unknown or large files
  • Ask "What would you like me to do with this file?"
  • Wait for user direction before acting on file uploads
  • Load raw data exceeding 30k tokens into context

ALWAYS

  • Run size estimation before any file operation
  • Chunk files over 30k tokens automatically
  • Provide a safe preview even for large files
  • Act immediately when a data file is detected
  • Be thorough in first response with summary + preview + recommendation

Examples

Example 1: User uploads large CSV

Input: User says "Analyze this sales data" and uploads a 50MB CSV file

Workflow:

  1. Run scripts/estimate_size.py sales.csv → Output: bytes=52428800 (50.0MB) tokens=13107200
  2. Far above the 30k-token threshold. Run scripts/chunker.py sales.csv → Creates 6500+ chunks
  3. Run scripts/summarize.py on representative chunks
  4. Return:
    • Overall summary of data structure and content
    • Safe preview showing first 10 rows
    • Recommendation: "Data contains 1M rows of sales transactions. I've chunked it for processing. Want me to analyze specific columns or date ranges?"

Example 2: User references small JSON config

Input: User asks "Check my config.json for issues"

Workflow:

  1. Run scripts/estimate_size.py config.json → Output: bytes=2048 (2.0KB) tokens=512
  2. Under 10k tokens. Run scripts/quick_inspect.py config.json
  3. Load file directly and analyze
  4. Return: Full analysis with any issues found

Example 3: User uploads medium log file

Input: User uploads a 500KB application.log

Workflow:

  1. Run scripts/estimate_size.py application.log → Output: bytes=512000 (500.0KB) tokens=128000
  2. Over 30k tokens. Run scripts/chunker.py application.log
  3. Summarize chunks focusing on errors and warnings
  4. Return:
    • Summary of log timespan and key events
    • Count of errors, warnings, info messages
    • Safe preview of recent entries
    • Recommendation for focused analysis
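
The error, warning, and info counts in this example could be gathered in a single streaming pass. This sketch assumes each log line contains a plain-text level keyword (ERROR, WARNING or WARN, INFO); real log formats differ, and `count_levels` is a hypothetical helper rather than part of the skill:

```python
import re
from collections import Counter

# Assumed level keywords; adjust to the actual log format.
LEVEL_PATTERN = re.compile(r"\b(ERROR|WARNING|WARN|INFO)\b")


def count_levels(path: str) -> Counter:
    """Count log lines per level, reading one line at a time."""
    counts: Counter = Counter()
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            match = LEVEL_PATTERN.search(line)
            if match:
                level = match.group(1)
                # Fold the short form into WARNING so both spellings share a count.
                counts["WARNING" if level == "WARN" else level] += 1
    return counts
```

Only the counter is kept in memory, so the pass works on the original file before any chunking.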

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
