Text

Transform, format, and process text with patterns for writing, data cleaning, localization, citations, and copywriting.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "Text" with this command: npx skills add ivangdavila/text

Quick Reference

Task                                      Load
Creative writing (voice, dialogue, POV)   writing.md
Data processing (CSV, regex, encoding)    data.md
Academic/citations (APA, MLA, Chicago)    academic.md
Marketing copy (headlines, CTA, email)    copy.md
Translation/localization                  localization.md

Universal Text Rules

Encoding

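  • Always verify encoding first: file -bi document.txt (Linux; on macOS the MIME flag is uppercase: file -bI)
  • Normalize line endings: tr -d '\r' < input.txt > output.txt
  • Remove BOM if present: sed -i '1s/^\xEF\xBB\xBF//' document.txt (GNU sed; the \x escapes and bare -i are GNU extensions)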
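The BOM and line-ending steps above can be combined into one portable filter. This is a minimal sketch: the function name strip_bom_and_cr is hypothetical, and it assumes an awk that accepts octal escapes in regular expressions (gawk and mawk do).

```shell
# Hypothetical helper: strip a leading UTF-8 BOM and every carriage return.
# Reads stdin and writes stdout, so the original file is left untouched.
strip_bom_and_cr() {
  # LC_ALL=C makes awk treat the input as raw bytes.
  LC_ALL=C awk '
    NR == 1 { sub(/^\357\273\277/, "") }   # EF BB BF, written in octal
    { gsub(/\r/, ""); print }
  '
}
```

Usage: strip_bom_and_cr < input.txt > output.txt. Like tr -d '\r', this removes carriage returns anywhere in a line, not only at the end.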

Whitespace

  • Collapse runs of whitespace: sed -E 's/[[:space:]]+/ /g' (-E works in both GNU and BSD sed; \+ is GNU-only)
  • Trim leading/trailing: sed 's/^[[:space:]]*//;s/[[:space:]]*$//'
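Both whitespace rules can run in a single pass if the collapse happens first, because the trim then only has to remove one space at each end. A small sketch, with squeeze_ws as a hypothetical name:

```shell
# Hypothetical helper: collapse whitespace runs, then trim each line.
squeeze_ws() {
  sed -E -e 's/[[:space:]]+/ /g' -e 's/^ //' -e 's/ $//'
}
```

Usage: squeeze_ws < input.txt. Tabs are treated as whitespace here, so this is not suitable for tab-delimited data.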

Common Traps

  • Smart quotes (“ ”) break parsers → normalize to "
  • Em/en dashes (— –) break ASCII-only parsers → normalize to -
  • Zero-width chars invisible but break comparisons → strip them
  • String length ≠ byte length in UTF-8 ("café" = 4 chars, 5 bytes)
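The first three traps can be handled byte-wise with sed, which avoids depending on the current locale. The sketch below is one possible approach, not the skill's own script: normalize_punct is a hypothetical name, and the set of zero-width characters (U+200B, U+200C, U+200D) is an assumption about what "zero-width chars" covers.

```shell
# Hypothetical helper: straighten curly double quotes, replace en/em dashes
# with "-", and drop three common zero-width characters.
normalize_punct() {
  # Build each character from its UTF-8 bytes (octal) so the script is pure ASCII.
  ldq=$(printf '\342\200\234')   # U+201C left double quote
  rdq=$(printf '\342\200\235')   # U+201D right double quote
  en=$(printf '\342\200\223')    # U+2013 en dash
  em=$(printf '\342\200\224')    # U+2014 em dash
  zwsp=$(printf '\342\200\213')  # U+200B zero-width space
  zwnj=$(printf '\342\200\214')  # U+200C zero-width non-joiner
  zwj=$(printf '\342\200\215')   # U+200D zero-width joiner
  LC_ALL=C sed \
    -e "s/$ldq/\"/g" -e "s/$rdq/\"/g" \
    -e "s/$en/-/g"   -e "s/$em/-/g" \
    -e "s/$zwsp//g"  -e "s/$zwnj//g" -e "s/$zwj//g"
}
```

Usage: normalize_punct < input.txt > output.txt. Stripping the joiner characters can change how some scripts and emoji sequences render, so this is best limited to text that is meant to be compared or parsed as plain ASCII-like data.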

Format Detection

# Detect encoding (Linux: -i; macOS: -I)
file -i document.txt

# Detect line endings
cat -A document.txt | head -1
# ^M at end = Windows (CRLF)
# No ^M = Unix (LF)

# Detect delimiter (CSV/TSV)
head -1 file | tr -cd ',;\t|' | wc -c
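The one-liner above gives the total number of candidate characters but not which one dominates. A possible extension, sketched here with the hypothetical name detect_delimiter, tallies each candidate on the first line and reports the most frequent one.

```shell
# Hypothetical helper: guess the delimiter from the first line of a file.
# Prints one of: comma, semicolon, tab, pipe, none.
detect_delimiter() {
  head -n 1 "$1" | awk '
    {
      # Tally every character on the header line.
      for (i = 1; i <= length($0); i++) seen[substr($0, i, 1)]++
      cand[1] = ",";  name[1] = "comma"
      cand[2] = ";";  name[2] = "semicolon"
      cand[3] = "\t"; name[3] = "tab"
      cand[4] = "|";  name[4] = "pipe"
      best = 0; max = 0
      for (k = 1; k <= 4; k++)
        if (seen[cand[k]] > max) { max = seen[cand[k]]; best = k }
      print (best ? name[best] : "none")
    }'
}
```

Usage: detect_delimiter data.csv. This is only a heuristic: it looks at one line and does not account for delimiters that appear inside quoted fields.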

Quick Transformations

Task                 Command
Lowercase            tr '[:upper:]' '[:lower:]'
Remove punctuation   tr -d '[:punct:]'
Count words          wc -w
Count unique lines   sort -u | wc -l
Find duplicates      sort | uniq -d
Extract emails       grep -oE '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
Extract URLs         grep -oE 'https?://[^[:space:]<>"{}

Before Processing Checklist

  • Encoding verified (UTF-8?)
  • Line endings normalized
  • Delimiter identified (for structured text)
  • Target format/style defined
  • Edge cases considered (empty, Unicode, special chars)
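The first two checklist items can be checked mechanically. The sketch below is an illustration under stated assumptions: preflight is a hypothetical name, iconv is assumed to be installed (if it is missing, the encoding line will report "not valid UTF-8"), and the output labels are invented for this example.

```shell
# Hypothetical helper: report encoding validity, BOM, and line endings for a file.
preflight() {
  f=$1
  # iconv exits non-zero if the bytes are not valid UTF-8.
  if iconv -f UTF-8 -t UTF-8 "$f" >/dev/null 2>&1; then
    echo "encoding: valid UTF-8"
  else
    echo "encoding: not valid UTF-8"
  fi
  # A UTF-8 BOM is the three bytes EF BB BF at the start of the file.
  if [ "$(head -c 3 "$f" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]; then
    echo "bom: yes"
  else
    echo "bom: no"
  fi
  # Any carriage return means CRLF (or mixed) line endings.
  if LC_ALL=C grep -q "$(printf '\r')" "$f"; then
    echo "line endings: CRLF or mixed"
  else
    echo "line endings: LF"
  fi
}
```

Usage: preflight document.txt. The remaining checklist items (delimiter, target format, edge cases) still need a human decision.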

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
