clean-data-xls

Clean messy data in the active sheet or a specified range.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "clean-data-xls" with this command: npx skills add anthropics/financial-services-plugins/anthropics-financial-services-plugins-clean-data-xls

Clean Data

Clean messy data in the active sheet or a specified range.

Environment

  • If running inside Excel (Office Add-in / Office JS): Use Office JS directly (Excel.run(async (context) => {...}) ). Read via range.values , write helper-column formulas via range.formulas = [["=TRIM(A2)"]] . The in-place vs helper-column decision still applies.

  • If operating on a standalone .xlsx file: Use Python/openpyxl.

Workflow

Step 1: Scope

  • If a range is given (e.g. A1:F200 ), use it

  • Otherwise use the full used range of the active sheet

  • Profile each column: detect its dominant type (text / number / date) and identify outliers

Step 2: Detect issues

Issue What to look for

Whitespace leading/trailing spaces, double spaces

Casing inconsistent casing in categorical columns (usa / USA / Usa )

Number-as-text numeric values stored as text; stray $ , , , % in number cells

Dates mixed formats in the same column (3/8/26 , 2026-03-08 , March 8 2026 )

Duplicates exact-duplicate rows and near-duplicates (case/whitespace differences)

Blanks empty cells in otherwise-populated columns

Mixed types a column that's 98% numbers but has 3 text entries

Encoding mojibake (é , ’ ), non-printing characters

Errors #REF! , #N/A , #VALUE! , #DIV/0!

Step 3: Propose fixes

Show a summary table before changing anything:

Column Issue Count Proposed Fix

Step 4: Apply

  • Prefer formulas over hardcoded cleaned values — where the cleaned output can be expressed as a formula (e.g. =TRIM(A2) , =VALUE(SUBSTITUTE(B2,"$","")) , =UPPER(C2) , =DATEVALUE(D2) ), write the formula in an adjacent helper column rather than computing the result in Python and overwriting the original. This keeps the transformation transparent and auditable.

  • Only overwrite in place with computed values when the user explicitly asks for it, or when no sensible formula equivalent exists (e.g. encoding/mojibake repair)

  • For destructive operations (removing duplicates, filling blanks, overwriting originals), confirm with the user first

  • After each category of fix (whitespace → casing → number conversion → dates → dedup), show the user a sample of what changed and get confirmation before moving to the next category

  • Report a before/after summary of what changed

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

ppt-template-creator

No summary provided by upstream source.

Repository SourceNeeds Review
General

macro-rates-monitor

No summary provided by upstream source.

Repository SourceNeeds Review
General

dcf-model

No summary provided by upstream source.

Repository SourceNeeds Review