---
name: zero-loss-methodology
description: "Zero-Loss Research & Planning Methodology v2.1 — a self-configuring, executable algorithm for hallucination-free, traceable AI-assisted research and planning. Use this skill whenever performing multi-document research, gap analysis, deliverable generation, translation, document consolidation, or verification tasks. Also triggers on: research planning, document review, critical analysis, gap filling, source validation, document fusion, translation verification, compliance artifacts, traceability matrix, or any task requiring zero content loss and full source traceability. This skill should be used for ANY research or planning task involving multiple source documents, even if the user doesn't explicitly mention 'methodology'."
---
# Zero-Loss Research & Planning Methodology v2.1
A self-configuring, domain-agnostic executable algorithm for AI-assisted research and planning that guarantees zero content loss, zero hallucination, and full traceability.
## Core Guarantees
1. **Zero Content Loss**: No source content is dropped, summarized, or silently omitted. Word count of output ≥ word count of source content.
2. **Zero Hallucination**: Every fact is either (a) directly from a source document, (b) from a verified external source with citation, or (c) explicitly marked as an assumption.
3. **Full Traceability**: Every deliverable traces back to a user request, a gap in the analysis, or a decision point. Every fact traces to a source. The full decision chain is logged.
4. **Reproducibility**: Build scripts archived alongside deliverables; any output can be regenerated. Given the same inputs and algorithm, another AI agent should produce substantially equivalent output.
## When to Apply This Methodology
Apply the **full P00–P10 pipeline** when:
- Working with 3+ source documents
- Deliverables will be used for decision-making, funding, or compliance
- Translation or multilingual content is involved
- Document consolidation/fusion is needed
Apply a **lightweight subset** (P00 + P01 + P02 + P05 + P06) when:
- Working with 1–2 source documents
- Quick turnaround needed
- Lower-stakes deliverables
- Note: P00 is always included — even lightweight runs need domain detection and scaffold
## The 11 Processes
### P00: Project Bootstrap (Self-Configuration)
**INPUT**: User request R, this methodology document M
**OUTPUT**: Project scaffold S with domain profile D
ALGORITHM:
  # STEP 1: DOMAIN DETECTION
  1. Analyze user request R to extract:
     a. Domain (e.g., healthcare, fintech, legal, SaaS, manufacturing)
     b. Project type (research, planning, audit, compliance, product dev)
     c. Output language(s) (for deliverables and sources)
     d. Regulatory context (jurisdiction, applicable standards)
  2. Build Domain Profile D:
     D = {domain, project_name, review_dimensions, regulatory_standards,
          source_languages, output_language, authority_sources}

  # STEP 2: REVIEW DIMENSIONS (auto-selected per domain)
  3. Select review dimensions based on D.domain:
     Healthcare/MedTech: [Clinical, Regulatory, Technical, Security, Business, Data Privacy]
     FinTech:            [Regulatory, Security, Technical, Business, Compliance, Financial]
     SaaS:               [Technical, Business, Security, Scalability, Compliance, Financial]
     Legal:              [Regulatory, Compliance, Risk, Contractual, Jurisdictional, Financial]
     Manufacturing:      [Technical, Safety, Regulatory, Supply Chain, Quality, Financial]
     General:            [Technical, Business, Regulatory, Strategic, Financial, Operational]

  # STEP 3: DIRECTORY SCAFFOLD
  4. Create project directory structure:
     {project_name}/
     ├── Deliverables/            ← P05/P09 outputs go here
     ├── ProcessArtifacts/        ← compliance artifacts go here
     │   ├── Source-Inventory     (created empty, filled by P01)
     │   ├── Traceability-Matrix  (created empty, filled live during P05)
     │   ├── Source-Registry      (created empty, filled live during P05)
     │   ├── Validated-Claims     (created empty, filled by P03)
     │   ├── Spot-Check-Guide     (created at P10, for human reviewer)
     │   ├── Verification-Report  (created by P08, if translation used)
     │   ├── Process-History      (created live, updated each phase)
     │   ├── Manifest             (created at P10)
     │   └── BuildScripts/        (archived at P10)
     └── Sources/                 ← user-provided input files

  # STEP 4: DOMAIN-SPECIFIC L1 AUTHORITY SOURCES
  5. Pre-populate D.authority_sources based on domain:
     Healthcare: [WHO, EMA, FDA, MDR, GDPR, national health authority]
     FinTech:    [ECB, SEC, FCA, PSD2, MiFID II, national financial regulator]
     SaaS:       [GDPR, SOC2, ISO 27001, NIST, CCPA, national data authority]
     Legal:      [National legislation, bar association, court rulings, treaties]
     These pre-populated sources become the MINIMUM L1 checklist for P03.

  6. Present D and scaffold to user for confirmation.
[GATE G0] User confirms domain profile and structure? Y → proceed to P01.
          N → adjust domain profile per user feedback.
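The domain-detection step above can be sketched as simple keyword scoring. This is a minimal illustration, not the normative detection logic: the keyword lists, the `DomainProfile` shape, and the fallback rule are all assumptions for demonstration.

```python
from dataclasses import dataclass, field

# Review dimensions per domain (from STEP 2); "general" is the fallback.
DIMENSIONS = {
    "healthcare": ["Clinical", "Regulatory", "Technical", "Security", "Business", "Data Privacy"],
    "fintech":    ["Regulatory", "Security", "Technical", "Business", "Compliance", "Financial"],
    "saas":       ["Technical", "Business", "Security", "Scalability", "Compliance", "Financial"],
    "general":    ["Technical", "Business", "Regulatory", "Strategic", "Financial", "Operational"],
}

# Illustrative trigger words only; a real detector would use richer signals.
KEYWORDS = {
    "healthcare": ["patient", "clinical", "medical", "hospital", "mdr"],
    "fintech":    ["payment", "banking", "psd2", "trading", "mifid"],
    "saas":       ["subscription", "tenant", "saas", "multi-cloud"],
}

@dataclass
class DomainProfile:
    domain: str
    review_dimensions: list = field(default_factory=list)

def detect_domain(request: str) -> DomainProfile:
    """Score each domain by keyword hits in the request; fall back to 'general'."""
    text = request.lower()
    scores = {d: sum(kw in text for kw in kws) for d, kws in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    domain = best if scores[best] > 0 else "general"
    return DomainProfile(domain=domain, review_dimensions=DIMENSIONS[domain])
```

Whatever detector is used, the result still goes through GATE G0 — user confirmation, not keyword scoring, is what makes the profile authoritative.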
### P01: Source Ingestion
**INPUT**: Set of files F = {f1, f2, ..., fn}
**OUTPUT**: Source Inventory Table I
FOR each file f in F:
  1. Detect format (docx, pdf, xlsx, csv, md, txt, image)
  2. Detect language (auto-detect, confirm with user if <90% confidence)
  3. Extract raw text content
  4. Count: words, paragraphs, tables, images, headings
  5. Extract: title, date, author (if present)
  6. Extract: key topics (top 5 by frequency)
  7. Store in I: {filename, format, language, word_count,
     structure_counts, metadata, topics}
END FOR
8. Cross-reference: identify overlapping topics between documents.
   Build overlap matrix O:
   O[i][j] = {shared_topics: [...], contradiction_count: N,
              overlap_pct: %, primary_owner: doc_with_most_content}
   Use O to detect redundancy and assign section ownership during P05.
9. Flag: contradictions in metadata (dates, versions, authors)
10. Present I to user for confirmation
[GATE G1] User confirms inventory is complete? Y → proceed. N → add missing files.
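Step 8's overlap matrix can be sketched from the per-document topic lists gathered in step 6. The Jaccard-style percentage is an assumed metric — the methodology only requires shared topics, an overlap figure, and a primary owner:

```python
def build_overlap_matrix(inventory):
    """inventory: {filename: {"topics": [...], "word_count": int}}.
    Returns O keyed by document pair, as described in step 8."""
    names = list(inventory)
    O = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            ta, tb = set(inventory[a]["topics"]), set(inventory[b]["topics"])
            shared, union = ta & tb, ta | tb
            # Primary owner = document with the most content on the pair.
            owner = max((a, b), key=lambda n: inventory[n]["word_count"])
            O[(a, b)] = {
                "shared_topics": sorted(shared),
                "overlap_pct": round(100 * len(shared) / len(union), 1) if union else 0.0,
                "primary_owner": owner,
            }
    return O
```

Contradiction counting is omitted here; it needs claim-level comparison (P02), not topic sets.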
### P02: Critical Review
**INPUT**: Source Inventory I, Domain context D
**OUTPUT**: Gap List G with priority scores
1. Define review dimensions based on domain (from P00 D.review_dimensions):
   [Technical, Business, Regulatory, Strategic, Financial, Operational]
2. FOR each document d in I:
     FOR each dimension dim:
       a. Check: Does d contain content for dim?
       b. If yes: Assess completeness (0-100%) and accuracy
       c. If yes: Flag unverified claims → queue for P03
       d. If no: Record as gap
     END FOR
   END FOR
3. Cross-document consistency check:
   a. Compare overlapping topics for contradictions
   b. Compare numerical data (costs, timelines, headcount)
   c. Flag discrepancies with magnitude (e.g., "4x cost difference")
4. External benchmark check:
   a. Identify applicable standards (ISO, GDPR, industry-specific)
   b. Check each document against applicable standards
   c. Flag missing compliance areas
5. Compile G: {gap_id, source_doc, dimension, description,
   severity(CRITICAL/HIGH/MEDIUM/LOW), evidence}
6. Present G to user
[GATE G2] User confirms gap list? Y → proceed to P04. N → revise.
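Step 3b-c — flagging a numeric discrepancy with its magnitude — can be sketched as follows. The `{doc: value}` input shape, the 1.5x threshold, and the severity mapping are illustrative assumptions:

```python
def flag_numeric_discrepancy(metric: str, values: dict, threshold: float = 1.5):
    """values: {doc_name: number}. Returns a gap dict, or None if consistent."""
    nonzero = {d: v for d, v in values.items() if v}
    if len(nonzero) < 2:
        return None                      # nothing to cross-check
    lo_doc = min(nonzero, key=nonzero.get)
    hi_doc = max(nonzero, key=nonzero.get)
    ratio = nonzero[hi_doc] / nonzero[lo_doc]
    if ratio < threshold:
        return None
    return {
        "dimension": "Financial",
        "severity": "HIGH" if ratio >= 2 else "MEDIUM",
        "description": f"{metric}: {ratio:.1f}x difference between {lo_doc} and {hi_doc}",
    }
```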
### P03: Source Validation
**INPUT**: Claims queue CQ from P02
**OUTPUT**: Validated Claims Table (VCT)
Authority hierarchy for verification:
- **L1**: Official/government source (legislation, standards body)
- **L2**: Peer-reviewed/industry report (journal, analyst report)
- **L3**: Vendor documentation (official docs, release notes)
- **L4**: News/blog (reputable outlet, dated)
- **L5**: Unverifiable (no external source found)
FOR each claim c in CQ:
  1. Classify type: [Statistic, Date/Deadline, Technical Spec, Market Data,
     Legal Requirement, Cost Estimate, Performance Claim]
  2. Identify authoritative source hierarchy (L1–L5)
  3. Search for verification (minimum 2 independent sources for L3+)
  4. Compare source claim vs found data
  5. Assign status:
     +--------------+--------------------------------------------------+
     | VERIFIED     | Claim matches ≥2 authoritative sources           |
     | CORRECTED    | Claim was wrong; replacement data provided       |
     | OUTDATED     | Claim was true but data has since changed        |
     | UNVERIFIABLE | No authoritative source found; marked in output  |
     | REFUTED      | Claim contradicted by authoritative source(s)    |
     +--------------+--------------------------------------------------+
  6. Record in VCT: {claim, source_doc, claim_type, status,
     verification_source, verification_date, notes}
END FOR

CORRECTED and REFUTED require: {original_claim, correction,
correction_source, deliverable_updated, update_description}

CRITICAL RULE: REFUTED and CORRECTED claims MUST be addressed
in deliverables. Never silently drop a refuted claim.
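Step 5's status assignment can be sketched as a decision function over the evidence gathered in steps 3-4. The evidence-tuple shape is an assumption, and this sketch routes "contradicted with a replacement available" to CORRECTED — distinguishing CORRECTED from REFUTED in practice needs human or source-level judgment:

```python
def assign_status(claim_value, evidence):
    """evidence: list of (authority_level 1-5, observed_value, is_superseded).
    Returns one of the P03 status values."""
    found = [e for e in evidence if e[0] <= 4]       # L5 = no usable source
    if not found:
        return "UNVERIFIABLE"
    matching = [e for e in found if e[1] == claim_value]
    if len(matching) >= 2:
        return "VERIFIED"                            # >=2 independent matches
    if any(superseded for _, _, superseded in found):
        return "OUTDATED"                            # was true, data moved on
    return "CORRECTED"                               # wrong; replacement in `found`
```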
### P04: Priority Triage
**INPUT**: Gap List G, Project context
**OUTPUT**: Prioritized Work Packages WP = {P0, P1, P2, P3}
Priority tiers:
  P0 (Immediate): Blocking launch or causing legal/safety risk
  P1 (Short-term): Required within 1-2 sprints for viability
  P2 (Medium-term): Required before MVP but can be parallelized
  P3 (Pre-launch): Required before go-live but not for core dev
 
FOR each gap g in G:
  a. Assess: deadline pressure (is there a hard date?)
  b. Assess: dependency (does other work block on this?)
  c. Assess: severity (what happens if ignored?)
  d. Assign tier: P0/P1/P2/P3
Group gaps into deliverables, define format, estimate effort.
[GATE G3] User approves work packages? Y → proceed. N → re-triage.
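The tier assignment in steps a-d can be sketched as a priority cascade. The four boolean inputs are illustrative simplifications of the deadline/dependency/severity assessments:

```python
def assign_tier(blocking_or_legal: bool, hard_deadline_soon: bool,
                blocks_other_work: bool, needed_before_mvp: bool) -> str:
    """Map the P04 assessments onto a tier; first matching rule wins."""
    if blocking_or_legal:
        return "P0"   # blocking launch or legal/safety risk
    if hard_deadline_soon or blocks_other_work:
        return "P1"   # needed within 1-2 sprints
    if needed_before_mvp:
        return "P2"   # before MVP, parallelizable
    return "P3"       # pre-launch only
```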
### P05: Deliverable Generation
**INPUT**: Work package spec WPi, VCT, source documents
**OUTPUT**: Professional document(s)
FOR each deliverable:
  Phase A — Research:
  1. Extract all relevant source content
  2. Identify knowledge gaps requiring external research
  3. Conduct external research, add sources to Source-Registry
  4. Validate all external facts via P03 source hierarchy
  Phase B — Structure:
  Define outline, map source content to sections (traceability),
  update Traceability-Matrix live
  Phase C — Write:
  Write each section with source annotations,
  ensure numerical consistency, mark assumptions explicitly
  Phase D — Format:
  Apply consistent formatting, generate output file, validate
END FOR
**ANTI-HALLUCINATION RULES** (these are absolute):
- Never invent statistics. If data not found → "Data not available."
- Never invent URLs. Only link to URLs you have visited and confirmed.
- Never invent timelines. Base on source data or industry benchmarks.
- If unsure → mark with [NEEDS VERIFICATION] for user review.
### P06: Plan-vs-Content Verification
**INPUT**: Plan document, Deliverable set
**OUTPUT**: Gap Report
1. Parse plan into checklist items
2. FOR each item: locate in deliverable, assess coverage:
   FULL | PARTIAL | MISSING | DIVERGED
3. Calculate coverage: count(FULL) / count(ALL) * 100
4. If coverage < 100% → hand the gap report to P07
5. If coverage = 100% → mark VERIFIED
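Steps 2-5 reduce to a small scoring function. The `{item: assessment}` input shape is an assumption; the coverage formula is exactly the one in step 3:

```python
def verify_plan(assessments: dict) -> dict:
    """assessments: {plan_item: 'FULL'|'PARTIAL'|'MISSING'|'DIVERGED'}.
    Only FULL counts toward coverage; everything else becomes a gap."""
    full = sum(1 for s in assessments.values() if s == "FULL")
    coverage = 100.0 * full / len(assessments) if assessments else 0.0
    gaps = [item for item, s in assessments.items() if s != "FULL"]
    return {"coverage_pct": round(coverage, 1),
            "status": "VERIFIED" if coverage == 100.0 else "GAPS",
            "gaps": gaps}
```

Note the quantitative output: per IR-07, "VERIFIED" is only ever claimed alongside a measured coverage figure.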
### P07: Multi-Pass Gap Closure
**INPUT**: Gap Report from P06
**OUTPUT**: Zero-gap deliverable set
pass_count = 0, MAX_PASSES = 5
WHILE gaps remain:
  pass_count += 1
  IF pass_count > MAX_PASSES: ESCALATE to user
  FOR each gap:
    Diagnose why missed, fix using P05 rules, insert at correct location
  END FOR
  Re-run P06
END WHILE
### P08: Translation & Localization
**INPUT**: Source document (language A)
**OUTPUT**: Translated document (language B) + Verification Report
Phase A — Structure Extraction:
  Parse into structural units, extract all numbers/dates, proper nouns/URLs
Phase B — Translation:
  Translate preserving: all numbers unchanged, proper nouns/URLs unchanged,
  table structure, technical terminology in original where standard
Phase C — 4-Check Verification:
  Check 1 (Structural Parity): paragraph/table/heading counts match
  Check 2 (Numeric Exact Match): all numbers present and unchanged
  Check 3 (Proper Noun/URL Preservation): all preserved unchanged
  Check 4 (Semantic Mapping): every source paragraph has translation
IF any check fails: fix and re-run Phase C
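Check 2 (Numeric Exact Match) can be sketched by extracting every number from source and translation and comparing them as multisets, so a dropped, duplicated, or altered figure is caught even when sentence order changes. The regex is a deliberate simplification — real documents may need locale-aware number parsing (decimal commas, digit grouping):

```python
import re
from collections import Counter

NUM_RE = re.compile(r"\d+(?:[.,]\d+)*")  # 12, 3.5, 40,000 — naive on purpose

def numeric_parity(source: str, translation: str) -> dict:
    """P08 Check 2: every number must appear unchanged, same multiplicity."""
    src = Counter(NUM_RE.findall(source))
    dst = Counter(NUM_RE.findall(translation))
    missing, added = src - dst, dst - src
    return {"pass": not missing and not added,
            "missing": sorted(missing.elements()),
            "added": sorted(added.elements())}
```

Check 3 (proper nouns/URLs) follows the same multiset pattern with a different extractor.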
### P09: Document Fusion (Zero-Loss Merge)
**INPUT**: Ordered list of .docx files
**OUTPUT**: Single merged .docx
CRITICAL RULE: NEVER regenerate content. Use XML-level merge only.
 
Why: Content regeneration (reading + rewriting) ALWAYS loses content.
Observed in practice: 33-85% content loss from agent-based fusion.
XML merge copies raw nodes: mathematically impossible to lose content.
THIS IS THE SINGLE MOST IMPORTANT RULE IN THIS METHODOLOGY.
 
Use docxcompose (Python) or equivalent XML-level merge tool.
After merge: verify word count ≥ 99% of source total.
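A merge plus the word-count guard might look like the sketch below. The `docxcompose`/`python-docx` calls follow those libraries' documented `Composer(...).append(...).save(...)` pattern, but treat the exact usage as an assumption to confirm against their docs; the imports sit inside the function so the guard stays usable without the third-party packages installed.

```python
def merge_docx(paths, out_path):
    """XML-level merge: raw document nodes are appended, never regenerated."""
    from docx import Document                  # python-docx
    from docxcompose.composer import Composer  # docxcompose
    composer = Composer(Document(paths[0]))
    for p in paths[1:]:
        composer.append(Document(p))
    composer.save(out_path)

def merge_is_lossless(source_word_counts, merged_word_count, min_ratio=0.99):
    """Post-merge guard from P09: merged output keeps >= 99% of source words."""
    total = sum(source_word_counts)
    return total > 0 and merged_word_count / total >= min_ratio
```

A failing guard is treated as a Content Regeneration Loss error (see Error Taxonomy) and blocks GATE G6.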
### P10: Final Consolidation
**INPUT**: All deliverables, verification reports, build scripts
**OUTPUT**: Organized 3-directory structure with manifest and compliance artifacts
ALGORITHM:
  1. Create 3-directory output structure (matching P00 scaffold):
     {project_name}/
     ├── Deliverables/     — project outputs (docs, spreadsheets, reports)
     ├── ProcessArtifacts/ — compliance artifacts and process records
     │   └── BuildScripts/ — archived build/generation scripts
     └── Sources/          — original user-provided input files

  2. Move verified deliverables to Deliverables/
  3. Copy original source files to Sources/ (preserve originals)
  4. Generate compliance artifacts in ProcessArtifacts/:
     a. SOURCE-INVENTORY — P01 output: catalogued source documents
     b. TRACEABILITY-MATRIX — every deliverable traced to origin
     c. SOURCE-REGISTRY — all external sources classified L1–L5
     d. VALIDATED-CLAIMS — all claims assessed with status
     e. SPOT-CHECK-GUIDE — 10 verifiable facts for human reviewer
     f. VERIFICATION-REPORT — translation 4-check results (if applicable)
     g. PROCESS-HISTORY — chronological record of all phases and decisions
     h. MANIFEST — full file inventory with sizes and status
  5. Archive build scripts in ProcessArtifacts/BuildScripts/
     Include: README with execution order, dependencies, expected outputs
  6. Verify all files open correctly
  7. Generate MANIFEST with:
     {filename, directory, size_kb, word_count, status, completeness_check}
  8. Present manifest to user for final approval
## Master Flow
START
│
├→ P00: Project Bootstrap (auto-detect domain, create scaffold)
│   │
│   [GATE G0: User confirms domain profile + structure]
│   │
├→ P01: Source Ingestion [GATE G1: inventory complete?]
├→ P02: Critical Review
│   ├→ P03: Source Validation (parallel, for flagged claims)
│   [GATE G2: gap list confirmed?]
├→ P04: Priority Triage [GATE G3: work packages approved?]
├→ FOR each tier (P0→P1→P2→P3):
│   ├→ P05: Deliverable Generation [GATE G4: plan approved?]
│   ├→ P06: Verification
│   ├→ P07: Gap Closure (loops back to P06)
│   [GATE G5: tier complete?]
│   END FOR
├→ P08: Translation (if multilingual)
├→ P09: Document Fusion (if consolidation needed)
│   [GATE G6: fusion verified?]
├→ P10: Final Consolidation
END
## Master Executable Algorithm
FUNCTION execute_research_and_planning(user_request, source_files):
  """
  Master algorithm for hallucination-free research and planning.
  Self-configuring: auto-detects domain and creates project scaffold.
  Returns: Set of verified, traceable deliverables.
  """

  # PHASE 0: BOOTSTRAP (self-configuration)
  domain_profile = P00_project_bootstrap(user_request, THIS_METHODOLOGY)
  GATE(G0, user, "Are the domain profile and project structure correct?")

  # PHASE 1: UNDERSTAND
  inventory = P01_source_ingestion(source_files)
  GATE(G1, user, "Is the source inventory complete?")

  # PHASE 2: ANALYZE
  gap_list = P02_critical_review(inventory)
  validated_claims = P03_source_validation(gap_list.flagged_claims)
  GATE(G2, user, "Do you confirm this gap list?")

  # PHASE 3: PLAN
  work_packages = P04_priority_triage(gap_list, validated_claims)
  GATE(G3, user, "Do you approve these work packages?")

  # PHASE 4: EXECUTE + VERIFY (per tier)
  all_deliverables = []
  FOR tier in [P0, P1, P2, P3]:
    IF tier not in work_packages: CONTINUE
    plan = generate_execution_plan(work_packages[tier])
    GATE(G4, user, "Do you approve this plan for {tier}?")
    deliverables = P05_deliverable_generation(plan, validated_claims)
    gap_report = P06_verification(plan, deliverables)
    IF gap_report.has_gaps:
      deliverables = P07_gap_closure(gap_report, deliverables)
    END IF
    all_deliverables.extend(deliverables)
    GATE(G5, user, "{tier} complete. Proceed to next tier?")
  END FOR

  # PHASE 5: CONSOLIDATE
  IF user_requests_translation:
    translated = P08_translation(source_docs_needing_translation)
    all_deliverables.extend(translated)
  END IF
  IF user_requests_fusion:
    fused_docs = P09_document_fusion(all_deliverables)
    GATE(G6, user, "Fusion verified at {coverage}%. Approve?")
  END IF

  output = P10_final_consolidation(all_deliverables)
  RETURN output
END FUNCTION
 
 
FUNCTION GATE(gate_id, user, question):
  """Mandatory checkpoint. NEVER skip or auto-approve."""
  present_current_state_to_user()
  response = ask_user(question)
  IF response == APPROVED:
    log(gate_id, "PASSED", timestamp)
    RETURN
  ELSE:
    feedback = get_user_feedback()
    apply_feedback(feedback)
    GATE(gate_id, user, question)  # Retry
  END IF
END FUNCTION
 
 
FUNCTION assert_sourced(content, source_registry):
  """Run on every paragraph before including in a deliverable."""
  FOR each fact f in content:
    IF f.source NOT IN source_registry:
      IF f.type == "number" OR f.type == "statistic":
        REJECT(f, "Unsourced numerical claim")
      ELSE IF f.type == "url":
        REJECT(f, "Unvisited URL")
      ELSE:
        MARK(f, "[NEEDS VERIFICATION]")
      END IF
    END IF
  END FOR
END FUNCTION
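A runnable sketch of the assert_sourced checkpoint, assuming facts are dicts and the registry is a set of source IDs (both illustrative shapes):

```python
def assert_sourced(facts, source_registry):
    """facts: list of {"text", "type", "source"}. Returns (kept, rejected).
    Unsourced numbers/URLs are rejected; other unsourced facts are marked."""
    kept, rejected = [], []
    for f in facts:
        if f.get("source") in source_registry:
            kept.append(f)
        elif f["type"] in ("number", "statistic"):
            rejected.append((f, "Unsourced numerical claim"))
        elif f["type"] == "url":
            rejected.append((f, "Unvisited URL"))
        else:
            kept.append({**f, "text": f["text"] + " [NEEDS VERIFICATION]"})
    return kept, rejected
```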
## Decision Gates (7 total: G0–G6)
Every transition that changes scope, format, or priority requires explicit user approval. The AI agent NEVER proceeds past a gate without user confirmation. This is not optional — it is a safety mechanism against scope drift and silent errors.
| Gate | Location | Question |
|---|---|---|
| G0 | After P00 | Are the domain profile and project structure correct? |
| G1 | After P01 | Is the source inventory complete? |
| G2 | After P02/P03 | Do you confirm this gap list? |
| G3 | After P04 | Do you approve these work packages? |
| G4 | Before each tier's P05 | Do you approve this plan for {tier}? |
| G5 | After each tier's P06/P07 | {tier} complete. Proceed to next tier? |
| G6 | After P09 | Fusion verified at {coverage}%. Approve? |
## Invariant Rules (Never Violate)
| ID | Rule | Rationale |
|---|---|---|
| IR-01 | Never invent data | If a statistic, URL, date, or claim cannot be verified, mark it explicitly rather than guessing. |
| IR-02 | Never regenerate to merge | Document fusion must use XML-level merge. Content regeneration always loses content. |
| IR-03 | Never skip verification | Every deliverable must pass Plan-vs-Content verification before being marked complete. |
| IR-04 | Never proceed past a gate without user approval | Decision gates exist to prevent wasted work. Respect them. |
| IR-05 | Every number must trace to a source | Financial figures, timelines, headcounts, percentages — all must cite their origin. |
| IR-06 | Assumptions must be explicit | If no source exists, mark as ASSUMPTION with rationale. Never present assumptions as facts. |
| IR-07 | Verification requires quantitative evidence | Word counts, coverage percentages, check pass/fail — never claim verification without measurement. |
| IR-08 | Error logs are permanent | Every error encountered must be logged in Process History. Never silently fix and forget. |
## Error Taxonomy
| Error Type | Example | Detection | Prescribed Response |
|---|---|---|---|
| Content Regeneration Loss | Agent rewriting source docs during fusion → 33-85% content loss | Word count < 99% of source | NEVER regenerate content. Always use XML-level merge tools. Hard rule, no exceptions. |
| Hallucinated Data | Inventing statistics, URLs, or timelines not in sources | Source traceability check | Mark all unverified data with [NEEDS VERIFICATION]. Never present uncertain data as fact. |
| Cross-Reference Breakage | xlsx formula pointing to wrong row after insert | Formula audit after structural change | After modifying spreadsheet structure: verify ALL formulas, not just changed cells. |
| API Mismatch | Using wrong parameter name for library function | Runtime error | Check library documentation before using APIs. Do not assume parameter names. |
| Translation Drift | Numbers, URLs, or proper nouns altered during translation | 4-check verification (P08) | Run all 4 checks. Treat any Check 2/3 failure as blocking. |
| Scope Creep | Adding unrequested sections to a deliverable | Plan-vs-Content verification (P06) | Only produce what was requested. Flag additional ideas for user decision. |
| Silent Verification | Claiming "verified" without running actual checks | Require quantitative evidence | Every verification must produce measurable evidence. "Looks good" is not verification. |
| Batch Processing Failure | Agent crash mid-batch, partial output | File count check after batch | Process sequentially when stability matters. Parallelize only when items are independent. |
### Error Escalation Protocol
When any error from the taxonomy above is detected, apply this 3-step escalation:
ESCALATION PROTOCOL:
  1. RETRY with same approach (max 2 attempts)
  — Fix the specific issue, re-run the same process
  — If same error recurs: escalate to step 2
  2. SWITCH APPROACH
  — Use alternative method (e.g., different library, manual steps)
  — Document: {original_approach, failure_reason, new_approach}
  — If alternative also fails: escalate to step 3
  3. ASK USER
  — Present: error description, what was tried, options available
  — User decides: skip, manual intervention, or accept partial result
  — Log decision in Process History
 
  NEVER: silently skip a failed step or mark it as complete.
  ALWAYS: log every escalation in Process History with timestamp.
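The 3-step escalation can be sketched as a wrapper around the two approaches. `primary`, `alternative`, and `ask_user` are caller-supplied placeholders, and appending to a history list stands in for the Process History log — all illustrative assumptions:

```python
def run_with_escalation(primary, alternative, ask_user, history, max_retries=2):
    """Step 1: retry primary (max 2). Step 2: switch approach.
    Step 3: ask the user. Every escalation is appended to `history`."""
    for attempt in range(1, max_retries + 1):
        try:
            return primary()
        except Exception as exc:
            history.append(("RETRY", attempt, repr(exc)))
    try:
        history.append(("SWITCH", primary.__name__, alternative.__name__))
        return alternative()
    except Exception as exc:
        history.append(("ASK_USER", repr(exc)))
        return ask_user()  # user decides: skip, manual fix, or accept partial
```

Note that no branch returns silently without a history entry, which is what IR-08 requires.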
## Traceability Matrix Template
Every project must maintain a live traceability matrix (in ProcessArtifacts/).
| Deliverable | Triggered By | Source Docs | External Sources | Verified By | Status |
|---|---|---|---|---|---|
| [filename] | [gap_id or user request] | [list of source docs used] | [list of URLs/references] | [P06 pass count, coverage %] | [VERIFIED / PENDING] |
## Source Registry Template
| ID | Source | Type | Authority Level | Access Date | Used In |
|---|---|---|---|---|---|
| S001 | [URL or document name] | [Official/Journal/Vendor/News] | [L1–L5] | [date accessed] | [deliverable filenames] |
## Process History Format
Every project must maintain a live Process History document (in ProcessArtifacts/). Updated in real-time, not retroactively.
PROCESS HISTORY REQUIRED SECTIONS:
  1. Phase Log (one entry per methodology phase executed):
     {phase_number, phase_name, started, completed, status,
      inputs_used, outputs_produced, errors_encountered}
  2. Decision Log (one entry per significant decision):
     {decision_id, phase, description, options_considered,
      chosen_option, rationale, user_approved: Y/N}
  3. Error Log (one entry per error, maps to Error Taxonomy):
     {error_id, phase, error_type, description, escalation_steps,
      resolution, time_cost}
  4. File Timeline (one entry per file created/modified):
     {filename, phase_created, phases_modified, current_status}
 
LIVE-UPDATE RULE: Process History MUST be updated at the END of
every phase, not retroactively. If a phase fails or is restarted,
both the failure and restart are logged (never overwrite history).
This artifact is referenced by IR-08 (error logs are permanent).
## Spot-Check Guide Format
Generated at P10 for human reviewer. Minimum 10 items.
Format per item:
  {item_id, document, section, claim_or_fact,
   verification_method, expected_result, actual_result, PASS/FAIL}
 
Scoring threshold:
  ≥8/10 PASS = acceptable quality
  <8/10 PASS = re-verify ALL deliverables before acceptance
## Scaling Rules
| Project Size | Source Documents | Deliverables | Verification Depth | Gate Strictness |
|---|---|---|---|---|
| Small (1–3 deliverables) | 1–5 sources | 1–3 outputs | Single-pass P06 sufficient | G1, G4 mandatory. Others optional. |
| Medium (4–12 deliverables) | 5–15 sources | 4–12 outputs | Multi-pass P06/P07 required | All gates mandatory. |
| Large (13+ deliverables) | 15+ sources | 13+ outputs | Multi-pass + independent re-verification | All gates + intermediate user checkpoints. |