Pandas

Analyze, transform, and clean DataFrames with efficient patterns for filtering, grouping, merging, and pivoting.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "Pandas" with this command: npx skills add ivangdavila/pandas

Setup

On first use, create ~/pandas/ and read setup.md for initialization. User preferences are stored in ~/pandas/memory.md — users can view or edit this file anytime.

When to Use

User needs to work with tabular data in Python. Agent handles DataFrame operations, data cleaning, aggregations, merges, pivots, and exports.

Architecture

Memory lives in ~/pandas/. See memory-template.md for structure.

~/pandas/
├── memory.md     # User preferences and common patterns
└── snippets/     # Saved code patterns (optional)

Quick Reference

Topic | File
Setup process | setup.md
Memory template | memory-template.md

Core Rules

1. Use Vectorized Operations

  • NEVER iterate with for loops over DataFrame rows
  • Use .apply() only when vectorized alternatives don't exist
  • Prefer df['col'].str.method() over apply(lambda x: x.method())
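As a rough illustration of the rule above, the sketch below computes a derived column and cleans a string column without any row loop. The column names and sample values are made up for the example; the row-wise apply() version is included only for comparison.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "name": ["  alice ", "BOB", "Carol  "],
    "price": [10.0, 20.0, 30.0],
    "qty": [1, 2, 3],
})

# Vectorized arithmetic: one expression over whole columns, no row loop.
df["total"] = df["price"] * df["qty"]

# Vectorized string handling via the .str accessor.
df["name_clean"] = df["name"].str.strip().str.title()

# Equivalent but slower row-wise version, shown for comparison only.
total_apply = df.apply(lambda row: row["price"] * row["qty"], axis=1)
```

Both approaches give the same totals here; the vectorized form is usually much faster on large frames because the work happens in compiled code rather than in a Python-level loop.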

2. Chain Methods for Readability

# Good: method chaining
result = (df
    .query('age > 30')
    .groupby('city')
    .agg({'salary': 'mean'})
    .reset_index())

# Bad: intermediate variables everywhere
filtered = df[df['age'] > 30]
grouped = filtered.groupby('city')
result = grouped.agg({'salary': 'mean'}).reset_index()

3. Handle Missing Data Explicitly

  • Always check df.isna().sum() before analysis
  • Choose strategy: dropna(), fillna(), or interpolation
  • Document WHY missing values exist before removing them
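One possible way to apply these steps is sketched below. The data, the column names, and the choice of median imputation are assumptions for the example, not a recommendation for every dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing city and one missing salary.
df = pd.DataFrame({
    "city": ["NYC", "LA", None, "NYC"],
    "salary": [100.0, np.nan, 80.0, 120.0],
})

# Step 1: count missing values per column before doing anything else.
missing_counts = df.isna().sum()

# Step 2: pick a strategy per column and make it explicit.
cleaned = (
    df
    .dropna(subset=["city"])  # rows without a city cannot be grouped by city
    .assign(salary=lambda d: d["salary"].fillna(d["salary"].median()))
)
```

Note that the median is computed after the rows without a city are dropped, so the order of the two steps affects the imputed value.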

4. Use Categorical for Repeated Strings

# Memory savings for columns with few unique values
df['status'] = df['status'].astype('category')
df['country'] = df['country'].astype('category')
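To see whether the conversion is worth it for a given column, one option is to compare memory_usage(deep=True) before and after. The series below is a made-up example with three distinct values repeated many times; actual savings depend on the column's cardinality and the pandas version.

```python
import pandas as pd

# Hypothetical column: 3 distinct values repeated 10,000 times each.
status = pd.Series(["active", "inactive", "pending"] * 10_000)

# deep=True includes the memory held by the string values themselves.
as_strings = status.memory_usage(deep=True)
as_category = status.astype("category").memory_usage(deep=True)

print(f"strings: {as_strings:,} bytes, category: {as_category:,} bytes")
```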

5. Merge with Validation

# Always specify how and validate
result = pd.merge(
    df1, df2,
    on='id',
    how='left',
    validate='m:1'  # Many-to-one: raises MergeError if 'id' is duplicated in df2
)
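The following sketch shows what validate does in practice, using two small made-up frames (orders and users are hypothetical names). It also uses indicator=True, which is not part of the rule above but is a convenient way to spot rows that found no match.

```python
import pandas as pd

# Hypothetical frames: many orders per user, one row per user.
orders = pd.DataFrame({"id": [1, 2, 2, 3], "amount": [10, 20, 30, 40]})
users = pd.DataFrame({"id": [1, 2], "name": ["Ann", "Bo"]})

# Many-to-one holds: each id appears at most once in `users`.
merged = pd.merge(
    orders, users, on="id", how="left", validate="m:1", indicator=True
)

# Rows with no match on the right side are flagged "left_only".
unmatched = merged[merged["_merge"] == "left_only"]

# A duplicated key on the right side violates m:1 and raises MergeError.
users_dup = pd.concat([users, users.iloc[[0]]], ignore_index=True)
try:
    pd.merge(orders, users_dup, on="id", how="left", validate="m:1")
    raised = False
except pd.errors.MergeError:
    raised = True
```

Without validate, the duplicated key would silently multiply the matching rows on the left instead of failing.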

6. Prefer query() for Complex Filters

# Readable
df.query('age > 30 and city == "NYC" and salary < 100000')

# Hard to read
df[(df['age'] > 30) & (df['city'] == 'NYC') & (df['salary'] < 100000)]
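When thresholds live in Python variables, query() can reference them with the @ prefix. The sketch below, on made-up data, checks that the query form and the boolean-mask form select the same rows; min_age and max_salary are hypothetical names.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "age": [25, 35, 45, 52],
    "city": ["NYC", "NYC", "LA", "NYC"],
    "salary": [50_000, 90_000, 80_000, 150_000],
})

min_age = 30
max_salary = 100_000

# @name refers to a Python variable from the calling scope.
via_query = df.query('age > @min_age and city == "NYC" and salary < @max_salary')

# The same filter written as a boolean mask.
via_mask = df[
    (df["age"] > min_age) & (df["city"] == "NYC") & (df["salary"] < max_salary)
]
```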

7. Set Index When Appropriate

# Faster lookups, cleaner merges
df = df.set_index('user_id')
user_data = df.loc[12345]  # Hash-based lookup; roughly O(1) when the index is unique
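Label lookups are only fast and unambiguous when the index is unique, so it can help to enforce that at the point where the index is set. A minimal sketch, assuming a made-up user_id/plan frame:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "user_id": [12345, 67890, 13579],
    "plan": ["pro", "free", "pro"],
})

# verify_integrity=True raises ValueError if user_id contains duplicates.
indexed = df.set_index("user_id", verify_integrity=True)

row = indexed.loc[12345]           # a single row, returned as a Series
plan = indexed.at[67890, "plan"]   # a single scalar value
```

With a non-unique index, .loc[label] returns every matching row as a DataFrame rather than a single Series, which is a common source of downstream surprises.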

Common Traps

  • SettingWithCopyWarning → Use .loc[] for assignment: df.loc[mask, 'col'] = value
  • Slow loops → Replace iterrows() with vectorized ops; fall back to apply() only when no vectorized form exists
  • Memory explosion → Use dtype in read_csv(): pd.read_csv(f, dtype={'id': 'int32'})
  • Silent data loss → Check shape before/after merge: print(f"Before: {len(df1)}, After: {len(result)}")
  • Index confusion → Use reset_index() after groupby() to get clean DataFrame
  • Chained indexing → df['a']['b'] = value may modify a temporary copy and fail silently; use a single call: df.loc['b', 'a'] = value
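Several of the traps above can be seen in one short sketch. The CSV text, column names, and the labels frame are made up for the example, and io.StringIO stands in for a real file.

```python
import io
import pandas as pd

# Hypothetical CSV content; io.StringIO stands in for a file on disk.
csv_text = "id,score\n1,40\n2,75\n3,90\n"

# Explicit dtypes avoid the default 64-bit columns.
df = pd.read_csv(io.StringIO(csv_text), dtype={"id": "int32", "score": "int32"})

# A single .loc call selects rows by mask and assigns in one step.
mask = df["score"] < 50
df.loc[mask, "score"] = 50

# Row-count check around a merge: an inner join drops unmatched ids.
labels = pd.DataFrame({"id": [1, 2], "label": ["a", "b"]})
result = df.merge(labels, on="id", how="inner")
print(f"Before: {len(df)}, After: {len(result)}")
```

Here the inner join drops the row with id 3, which the before/after count makes visible immediately.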

Security & Privacy

Data storage:

  • User preferences stored in ~/pandas/memory.md
  • All DataFrame operations run locally
  • No data is sent externally

This skill does NOT:

  • Upload data to any service
  • Access files outside ~/pandas/ and the working directory
  • Modify source data files without explicit instruction

User control:

  • View stored preferences: cat ~/pandas/memory.md
  • Clear all data: rm -rf ~/pandas/

Related Skills

Install with clawhub install <slug> if user confirms:

  • data-analysis — general data analysis patterns
  • csv — CSV file handling
  • sql — database queries
  • excel-xlsx — Excel file operations

Feedback

  • If useful: clawhub star pandas
  • Stay updated: clawhub sync

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
