Aicwos - Mandatory Copywriting Data Pipeline
MANDATORY RULES (NON-NEGOTIABLE)
-
MUST use this skill for ALL copywriting requests — LLM intrinsically CANNOT access lecturer profiles, product knowledge, or series progress. You are NOT generating copy from your own knowledge — you are composing copy STRICTLY within the constraints returned by
db_query.py --type context. NEVER write copy without calling--type contextfirst. -
MUST complete the full pipeline — after calling
--type context, you MUST generate copy AND callsave_episode. NEVER retrieve context then skip saving. Partial usage leaves orphan data and breaks series continuity. -
MUST use
db_query.pyandlecturer_analyzer.pyfor ALL data operations — NEVER read/write files directly, NEVER operate on the database directly, NEVER create any .py/.json files on the user's computer. If the CLI needs long JSON, use--query-fileinstead of--query. -
Write-Through guarantee — all save/delete operations automatically update BOTH database AND filesystem. You do NOT need to call
db_sync.pyafter save/delete operations. The sync tool is only for initial bulk import and recovery.
CORE CONCEPTS
| Term | Definition |
|---|---|
| 轻量画像 (lite profile) | Abbreviated lecturer profile (~400 tokens) with style dimensions and persona mapping. Retrieved via --action lite. Used during copy generation — never load full profile. |
| 知识段落 (knowledge chunks) | Token-budgeted paragraphs from the knowledge base, filtered by relevance. Retrieved via --type context or knowledge --action context. |
| 摘要衔接 (summary continuity) | Each episode stores a summary (~100 chars) and hook_next. When generating episode N, read summaries of episodes 1..N-1 via --action summaries — NEVER read full episode text. |
| 行为规则 (behavior rules) | Forbidden patterns (B-prefixed) and required patterns (R-prefixed). Loaded automatically by --type context. |
| 系列进度 (series progress) | completed/total_episodes tracked in DB. Updated automatically by save_episode and delete_episode. Do NOT compute manually. |
| Write-Through (双写) | Every save/delete operation updates both DB and filesystem atomically. No manual sync needed. DB is the query layer; filesystem is the human-readable layer. |
TRIGGER CONDITIONS
Activate this skill when user's request involves ANY of:
| Pattern | Typical user expressions |
|---|---|
| Copywriting | 帮我写口播/写文案/写个脚本/写篇稿件/帮我写个短视频 |
| Lecturer | 学讲师风格/分析讲师/学习XX的语气/帮我学习讲师 |
| Series | 写个N集系列/继续写XX系列/生成系列口播/连续口播 |
| Revision | 改一下第X集/重新写第X集/删掉第X集/换一下第X集 |
| Knowledge | 知识库里有什么/搜索知识/添加知识/删除知识 |
| Management | 看看讲师/系列进度/讲师列表/看看画像 |
NOT for: non-口播 writing (emails, reports, essays, docs), general Q&A without copywriting intent.
Anti-Patterns (COMMON ERRORS — DO NOT)
| Wrong | Why it fails |
|---|---|
Call --type context → write copy → skip save_episode | Orphan data, series progress lost |
Call --type context → cherry-pick context → ignore full constraints | Style drift, knowledge gaps |
Write copy → save as .txt manually instead of save_episode | Bypasses DB, breaks series tracking |
Write copy without calling --type context at all | No style, no knowledge, no constraints |
Call --type context → write from own knowledge, ignoring returned context | Defeats the entire pipeline purpose |
Revise by calling save_episode without delete_episode first | Duplicate DB records, stale .txt file persists |
| Delete episode by removing .txt file manually | DB metadata orphaned, progress counter corrupted |
| Create .py or .json files to process/save lecturer data | Bypasses Write-Through, no tracking, files in wrong location |
| Write temp files to the user's desktop, /tmp, or %TEMP% | Use staging files inside 控制台 directory — the only shared path between agent and scripts |
Call lecturer_analyzer.py → skip db_query.py --type lecturer --action save | No profile exists for copy generation |
Run db_sync.py after every save/delete operation | Unnecessary — Write-Through handles both layers automatically |
If you called --type context, you MUST complete through save_episode. There are NO valid shortcuts.
If you called lecturer_analyzer.py, you MUST complete through db_query.py --type lecturer --action save. The analyzer only stores quantitative data — you MUST also build and save the full profile.
Pre-flight Check (MANDATORY before ANY copywriting)
Before generating any copy text, verify all 3 conditions:
-
db_query.py --type contexthas been executed this session → if NO, execute it NOW - The returned context (lecturer profile + knowledge chunks + samples + rules) is visible → if NO, re-execute
- You will compose STRICTLY within the returned constraints → if planning to use own knowledge, STOP
Only after all 3 checks pass, proceed to write.
NATURAL LANGUAGE → CLI MAPPING
Users do NOT type CLI commands. You MUST translate natural language to CLI operations. --data-dir always points to the 控制台 directory.
| User says | You execute |
|---|---|
| Copywriting | |
| 帮我写个口播 / 写篇文案 | 1. db_query.py --type context → 2. Generate copy → 3. save_plan → 4. save_episode |
| 写个N集系列口播 / 写系列 | 1. db_query.py --type context → 2. save_plan → 3. Generate each episode + save_episode |
| 继续写XX系列 | get_plan + summaries → Generate next episode → save_episode |
| 改一下第X集 / 重新写第X集 | delete_episode --id2 X → Regenerate → save_episode --id2 X |
| 删掉第X集 | delete_episode --id <series> --id2 X |
| XX系列写到哪了 | progress --id <name> |
| XX系列计划是什么 | get_plan --id <name> |
| Lecturer | |
| 学一下讲师风格 / 帮我学习XX | Follow LECTURER WORKFLOW below (3-step pipeline) |
| 我有哪些讲师 | db_query.py --type lecturer --action list |
| XX老师什么风格 | db_query.py --type lecturer --action lite --id <name> |
| 修改XX的画像 | --action get → Modify → --action save (use --query-file for large JSON) |
| 给XX加新文案 / 增量学习 | lecturer_analyzer.py --lecturer <name> --merge --stdin --save-sample (pipe text via stdin; or write to temp file + --input) |
| 删除XX讲师 | db_query.py --type lecturer --action delete --id <name> |
| XX和YY有什么区别 | db_query.py --type lecturer --action compare --id XX --id2 YY |
| 看看XX的样本 | db_query.py --type lecturer --action samples --id <name> |
| 导出讲师 | lecturer_transfer.py --action export --lecturer <name> --data-dir <dir> --output <dir> |
| 导入讲师 | lecturer_transfer.py --action import --source <dir> --data-dir <dir> (default: merge + auto sync DB) |
| 导入讲师(覆盖) | lecturer_transfer.py --action import --source <dir> --data-dir <dir> --overwrite |
| Knowledge Base | |
| 知识库里有什么 | db_query.py --type fs --action knowledge --data-dir <dir> |
| 搜索XX知识 | db_query.py --type knowledge --action search --query "XX" |
| 获取XX上下文 | db_query.py --type knowledge --action context --query "XX" --max-tokens 1000 |
| 添加知识 | Create file under 知识库集/私有/ → db_sync.py --direction to-db |
| 删除知识 | db_query.py --type knowledge --action delete --id <doc_id> |
| 同步知识库 | knowledge_sync.py --data-dir <dir> |
| Behavior Rules | |
| 看看行为规则 | db_query.py --type behavior --action list |
| 禁止用XX | db_query.py --type behavior --action add --id B0XX --query '{"id":"B0XX",...}' |
| 必须用XX | db_query.py --type behavior --action add --id R0XX --query '{"id":"R0XX",...}' |
| System | |
| 初始化 / 第一次用 | Follow "First-Time Setup" below |
| 看看控制台目录 | db_query.py --type fs --action tree --data-dir <dir> |
| 看看讲师列表 | db_query.py --type fs --action lecturers --data-dir <dir> |
LECTURER WORKFLOW
When user provides lecturer sample text (pasted or file), follow this 3-step pipeline. NEVER create .py/.json files — use the CLI only. All save operations use Write-Through (DB + filesystem automatically).
Step 1: Quantitative Analysis (script call)
For long text (user pastes a full article — the typical scenario):
- Write sample text to
<控制台dir>/讲师列表/<name>/样本/_staging.txtusing your native file-write tool (NOT shell echo, NOT the desktop, NOT /tmp):<控制台dir>/讲师列表/<name>/样本/_staging.txt - Run analyzer with
--inputand--save-sample:python scripts/lecturer_analyzer.py --input "<控制台dir>/讲师列表/<name>/样本/_staging.txt" --lecturer <name> --data-dir <控制台dir> --save-sample--save-samplerenames_staging.txt→sample_YYYYMMDD_HHMMSS.txtand registers in DB (Write-Through). No duplication.
For file input (user provides a file path):
python scripts/lecturer_analyzer.py --input <file_path> --lecturer <name> --data-dir <控制台dir> --save-sample
For short text (<500 chars), use --text:
python scripts/lecturer_analyzer.py --lecturer <name> --text "<sample>" --data-dir <控制台dir> --save-sample
For --stdin (Linux only, where pipe works reliably):
echo "<text>" | python scripts/lecturer_analyzer.py --lecturer <name> --stdin --data-dir <控制台dir> --save-sample
The analyzer returns quantitative data, merge_status, and sample_saved. If "new", proceed to Step 2.
For incremental merge (adding new samples to existing lecturer), append --merge flag.
Step 2: Build Full Profile (agent task)
The analyzer only produces quantitative metrics. You MUST construct the full profile by combining:
- Quantitative data from Step 1
- Your interpretation of qualitative traits (tone, persona, core identity, etc.)
Key fields (full format: references/profile-format.md):
lecturer_name, qualitative.persona_mapping, qualitative.style_dimensions, quantitative, sample_texts
Step 3: Save Profile (script call) — Write-Through
Save automatically writes to BOTH DB and 讲师列表/{name}/profile.json. No need to call db_sync.py.
For large profiles (typical — quantitative data is verbose), use --query-file:
- Write profile JSON to staging file in 控制台 using your native file-write tool:
<控制台dir>/讲师列表/<name>/_profile_staging.json - Run save with
--query-file(staging file auto-deleted after successful save):python scripts/db_query.py --type lecturer --action save \ --id <name> --query-file "<控制台dir>/讲师列表/<name>/_profile_staging.json" --data-dir <控制台dir>
For small profiles (<2000 chars JSON), use --query:
python scripts/db_query.py --type lecturer --action save \
--id <name> --query '<profile JSON>' --data-dir <控制台dir>
Verify (script call)
python scripts/db_query.py --type lecturer --action lite --id <name> --data-dir <控制台dir>
STANDARD COPYWRITING WORKFLOW
Every copywriting request MUST follow these steps. NO step may be skipped. All save operations use Write-Through.
Single Episode
-
Get context
(script call)— MANDATORY, never skip:python scripts/db_query.py --type context --action context \ --lecturer <lecturer> --query <topic> --data-dir <控制台dir> -
Generate copy
(agent task)— compose STRICTLY within the returned constraints -
Save plan
(script call)— Write-Through (DB only, plan has no user-facing file):python scripts/db_query.py --type series --action save_plan \ --id <topic> --lecturer <lecturer> \ --query '{"title":"<topic>","episodes":[{"num":1,"title":"<title>","topic":"<topic>"}]}' \ --data-dir <控制台dir> -
Save copy
(script call)— Write-Through (DB + .txt file):# Short content: --query python scripts/db_query.py --type series --action save_episode \ --id <topic> --id2 1 \ --query '{"title":"<title>","content":"<body>","summary":"<100-char summary>","hook_next":""}' \ --data-dir <控制台dir> # Long content (typical): write JSON to staging file in 控制台, then --query-file python scripts/db_query.py --type series --action save_episode \ --id <topic> --id2 1 --query-file "<控制台dir>/讲师列表/<lecturer>/系列文案/<topic>/_episode_staging.json" --data-dir <控制台dir>
Series (N episodes)
Same as Single Episode steps 1-2, then save plan, then loop steps 3-4 for each episode. Each episode needs its own save_episode call.
Revision (modify existing episode)
When user says "改一下第X集" / "重新写第X集" / "换一下第X集":
- Get context
(script call)— MUST re-fetch - Read current episode
(script call):python scripts/db_query.py --type series --action content --id <series> --id2 X --data-dir <控制台dir> - Delete old episode
(script call)— MUST delete before saving replacement:python scripts/db_query.py --type series --action delete_episode --id <series> --id2 X --data-dir <控制台dir> - Regenerate copy
(agent task) - Save revised episode
(script call)
CRITICAL: delete_episode MUST be called before save_episode for revisions. Files are saved to 控制台/讲师列表/{lecturer}/系列文案/{series}/E0X_{title}.txt — path enforced by script.
FIRST-TIME SETUP
When the user uses this skill for the first time or no database exists, guide through these steps sequentially.
-
Initialize 控制台 — Ask user for storage location (e.g. "桌面"), then run:
python scripts/db_init.py --setup --parent-dir "<user location>"MUST use
--parent-dir, NOT--data-dir. Script auto-creates "控制台" folder. The 控制台 folder name is FIXED — NEVER rename it. -
Configure cloud sync — Ask for knowledge base cloud URL (COS bucket), write to workspace.json. Then run:
python scripts/knowledge_sync.py --data-dir "<控制台dir>"No credentials needed. Sync reads
manifest.txt(file paths only, one per line) from COS, then HEAD-checks each file'sLast-Modified. Cloud maintainer only needs to updatemanifest.txtwhen adding new files — no timestamps to maintain. -
Download semantic model (optional) — Ask if user wants to download (~98MB), then sync data:
python scripts/db_init.py --download-model --data-dir "<控制台dir>" python scripts/db_sync.py --direction to-db --data-dir "<控制台dir>"
WORKSPACE STRUCTURE
控制台/
├── workspace.json # Workspace config (includes sync URL)
├── 讲师列表/{lecturer}/
│ ├── profile.json # Lecturer profile (auto-maintained by Write-Through)
│ ├── 样本/ # Sample scripts (auto-maintained by Write-Through)
│ └── 系列文案/ # Series copy (auto-maintained by Write-Through)
│ └── {series}/
│ ├── E01_春季养肝.txt
│ └── ...
├── 知识库集/
│ ├── 公共/ ← Cloud synced, read-only
│ └── 私有/ ← Local, user-editable
└── 回收站/
├── {lecturer}/ ← Deleted lecturer directories (with .meta.json for recovery)
└── ...
- Write-Through: all save/delete operations update both DB and filesystem automatically
- DB is the query layer; filesystem is the human-readable layer
- Users can still edit .txt files directly, then
db_sync.py --direction to-dbto re-index
EXAMPLES
Example 1: Learn lecturer style then write copy
- User: "这是讲师B的5篇口播 [paste copy]"
- Steps: Write sample to temp file →
lecturer_analyzer.py --input <temp> --save-sample(script call)→ build profile(agent task)→lecturer --action save --query-file(script call, Write-Through: DB+profile.json)→ user says "用讲师B风格写个养生口播" →--type context(script call)→ generate(agent task)→save_plan+save_episode(script call, Write-Through: DB+.txt)
Example 2: Knowledge-driven copy
- User: "参考产品A的背书写个口播"
- Steps:
knowledge --action context --query "产品A 背书"(script call)→ weave into copy(agent task)→save_episode(script call)
Example 3: Series continuity
- User: "继续写养生系列第6集"
- Steps:
series --action get_plan+summaries(script call)→ generate episode 6(agent task)→save_episode --id2 6(script call)
Example 4: Revision
- User: "养生系列第2集改一下,开头太长了"
- Steps:
--type context(script call)→content --id2 2(script call)→delete_episode --id2 2(script call)→ shorten opening(agent task)→save_episode --id2 2(script call)
RESOURCE INDEX
- Lecturer workflow: references/workflow-lecturer.md
- Knowledge workflow: references/workflow-knowledge.md
- Copywriting workflow: references/workflow-copywriting.md
- Profile format: references/profile-format.md (read when generating/validating profiles)
- Script styles: references/script-styles.md (read when confirming style parameters)
- Model config: assets/model_config.json
- Init script: scripts/db_init.py —
--setup --parent-dircreates 控制台;--download-model --data-dirdownloads model - Query script: scripts/db_query.py — unified CLI;
--query-filefor large JSON; Write-Through on all save/delete;--action add_samplefor sample files - Analyzer: scripts/lecturer_analyzer.py — style analysis + incremental merge;
--save-sampleauto-saves input text (Write-Through);--inputfor file-based long text;--stdinfor Linux pipe - Sync script: scripts/db_sync.py — bidirectional file↔DB sync (recovery/initial import only)
- Transfer: scripts/lecturer_transfer.py — cross-user lecturer migration; default auto sync (Write-Through);
--overwrite/--no-syncoptions - Knowledge sync: scripts/knowledge_sync.py — cloud sync via manifest.txt + HEAD (no credentials);
--dry-run/--force;--generateoutputs manifest.txt from local public layer (for cloud maintainer to upload)
NOTES
- First-time setup MUST use
db_init.py --setup --parent-dir <location>, NOT--data-dir - Semantic model is optional; auto-degrades to FTS5-only when absent
- Write-Through: all save/delete operations update both DB and filesystem automatically. Do NOT call
db_sync.pyafter normal operations. knowledge_sync.pyusesmanifest.txt+ HEAD (no credentials needed). Cloud maintainer putsmanifest.txton COS (one file path per line, no timestamps). Script HEAD-checks manifest.txt'sLast-Modifiedfirst — unchanged = skip entirely.--generateoutputs manifest.txt from local public layer for cloud maintainer to upload:python scripts/knowledge_sync.py --data-dir <dir> --generate > manifest.txt--query-file <path>reads JSON from file — use for profiles or episode content >2000 chars. Write staging files to 控制台 paths (e.g.讲师列表/{name}/_profile_staging.json), NOT /tmp or %TEMP%. Files ending in_staging.jsonare auto-deleted after successful save.--save-sampleonlecturer_analyzer.pyauto-saves input text to讲师列表/{name}/样本/(Write-Through: DB+file). If input file is already in the samples dir, it renames instead of duplicating.db_sync.pyis only for: initial bulk import (--direction to-db) or disaster recovery (--direction to-files)- REMINDER: Before writing ANY copy, you MUST have called
db_query.py --type contextthis session. If you haven't, stop and execute it first.