insight-engine

Logs/metrics → Python statistics → LLM interpretation → Notion reports. Use when: generating daily/weekly/monthly operational insights from AI system logs, producing data-driven Notion reports from Langfuse traces and gateway logs, setting up a cron-based insight pipeline, building a citation-enforcing analyst that refuses to make claims without specific data. Pattern: collect raw data → compute stats in Python → feed structured packet to LLM → write to Notion.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install: npx skills add nissan/insight-engine



Data-driven insights from operational logs: collect → stats → LLM interpretation → Notion.

Architecture

collect (Python stats only)
  ├── Langfuse OTEL traces/scores/observations
  ├── OpenClaw/gateway logs
  ├── Git activity
  └── Control plane scores
↓
build_*_data_packet()  ← all stats computed in Python before LLM call
↓
call_claude(system_prompt, structured_json)  ← LLM interprets, doesn't compute
↓
write_*_reflection() → Notion

See references/architecture.md for full design rationale.
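The flow above can be sketched end to end. Everything below is a hypothetical simplification: the function names, record fields, and stubbed LLM call are assumptions for illustration, not the actual engine.py API.

```python
import statistics

def collect_traces():
    # Stand-in for the Langfuse/gateway collectors: returns raw records only.
    return [{"latency_ms": v} for v in (120, 95, 210, 130, 99)]

def build_daily_data_packet(records):
    # All numbers are computed here, in Python, before any LLM call.
    latencies = [r["latency_ms"] for r in records]
    return {
        "trace_count": len(records),
        "latency_p50_ms": statistics.median(latencies),
        "latency_max_ms": max(latencies),
    }

def call_claude(system_prompt, packet):
    # The LLM only interprets the pre-computed packet (stubbed out here).
    return f"Analysed {packet['trace_count']} traces; p50 {packet['latency_p50_ms']} ms."

packet = build_daily_data_packet(collect_traces())
report = call_claude("Cite a specific number for every claim.", packet)
# write_daily_reflection(report)  # Notion write, omitted in this sketch
```

The point of the split is that the LLM never sees raw logs, only a small JSON packet whose numbers were computed deterministically.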

Quick start

# Install deps
pip install anthropic requests pyyaml

# Configure
cp scripts/config/analyst.yaml.example config/analyst.yaml
# Edit config/analyst.yaml — set langfuse URL, notion IDs, model choices

# Dry run (local Ollama, no Notion write)
python3 scripts/src/engine.py --mode daily --dry-run

# Print data packet + prompt to stdout (for agent consumption, no API calls)
python3 scripts/src/engine.py --mode daily --data-only

# Live run
python3 scripts/src/engine.py --mode daily
python3 scripts/src/engine.py --mode weekly
python3 scripts/src/engine.py --mode monthly

Required env vars

ANTHROPIC_API_KEY=sk-ant-...    # Anthropic API key
NOTION_API_KEY=secret_...       # Notion integration token
LANGFUSE_BASE_URL=http://localhost:3100   # Langfuse server URL
LANGFUSE_PUBLIC_KEY=pk-lf-...   # Langfuse public key
LANGFUSE_SECRET_KEY=sk-lf-...   # Langfuse secret key
NOTION_ROOT_PAGE_ID=<uuid>      # Root Notion page for reports
NOTION_DAILY_DB_ID=<uuid>       # Notion database for daily entries

Or configure in config/analyst.yaml.
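A minimal sketch of env-var-first, YAML-fallback resolution. The precedence order and YAML key names are assumptions based on the variables listed above:

```python
import os

def resolve_setting(env_var, yaml_config, yaml_key, default=None):
    # Environment variables win; config/analyst.yaml values are the fallback.
    return os.environ.get(env_var) or yaml_config.get(yaml_key, default)

# Pretend this dict was parsed from config/analyst.yaml with pyyaml.
yaml_config = {"langfuse_base_url": "http://localhost:3100"}

os.environ.pop("LANGFUSE_BASE_URL", None)
url = resolve_setting("LANGFUSE_BASE_URL", yaml_config, "langfuse_base_url")

os.environ["LANGFUSE_BASE_URL"] = "http://langfuse.internal:3100"
override = resolve_setting("LANGFUSE_BASE_URL", yaml_config, "langfuse_base_url")
```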

Key design principles

  1. Stats before LLM — Python computes all numbers. The LLM interprets, doesn't aggregate.
  2. Citation-enforcing prompts — System prompts require every claim to cite a specific number.
  3. No hallucinated trends — fewer than 7 data points → report "insufficient data (n=X)".
  4. Dry-run mode — Uses local Ollama (free) to preview output; skips the Notion write.
  5. Data-only mode — Outputs the full data packet + prompts for agent/subagent use.
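Principle 3 can be enforced mechanically before the prompt is even built. A sketch, where the threshold of 7 comes from the rule above and the function name is hypothetical:

```python
MIN_POINTS = 7  # below this, refuse to name a trend

def describe_trend(values):
    # Return a citable trend string, or an explicit insufficient-data marker.
    n = len(values)
    if n < MIN_POINTS:
        return f"insufficient data (n={n})"
    delta = values[-1] - values[0]
    direction = "up" if delta > 0 else "down" if delta < 0 else "flat"
    return f"{direction} ({values[0]} -> {values[-1]}, n={n})"
```

Feeding the LLM a pre-labelled "insufficient data (n=X)" string makes the citation rule trivial to satisfy: the model quotes the marker instead of inventing a trend.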

Cron setup (LaunchAgent example)

<!-- ~/Library/LaunchAgents/com.yourname.insight-engine-daily.plist -->
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key><string>com.yourname.insight-engine-daily</string>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/bin/python3</string>
    <string>/path/to/insight-engine/scripts/src/engine.py</string>
    <string>--mode</string><string>daily</string>
  </array>
  <key>StartCalendarInterval</key>
  <dict>
    <key>Hour</key><integer>23</integer>
    <key>Minute</key><integer>0</integer>
  </dict>
</dict>
</plist>

Load it with launchctl load ~/Library/LaunchAgents/com.yourname.insight-engine-daily.plist.

Extending to new data sources

Add a collector in scripts/src/collectors/:

  1. Create my_source.py with a fetch_*() function returning a plain dict
  2. Import and call it in build_daily_data_packet() in engine.py
  3. Reference the new key in prompts/daily_analyst.md under "Data sources"
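A new collector following the three steps above might look like this; the file name, function name, and packet key are illustrative, not part of the shipped code:

```python
# scripts/src/collectors/my_source.py (hypothetical example)

def fetch_gateway_errors(log_lines):
    # Step 1: return a plain dict of pre-computed stats, not raw log blobs.
    errors = [line for line in log_lines if " ERROR " in line]
    return {
        "error_count": len(errors),
        "total_lines": len(log_lines),
    }

# Step 2 (in engine.py's build_daily_data_packet):
#     packet["gateway_errors"] = fetch_gateway_errors(read_gateway_log())
# Step 3: document the "gateway_errors" key in prompts/daily_analyst.md
```

Keeping collectors as pure functions over raw inputs makes them trivially testable and keeps all aggregation on the Python side of the stats-before-LLM boundary.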

See also

  • references/architecture.md — full design rationale and layer descriptions
  • scripts/prompts/daily_analyst.md — system prompt with citation rules
  • scripts/config/analyst.yaml.example — config template
