Crab Catch — Research Skill Orchestration

Prerequisites

CLAWBOT must run these commands at session start (not just read them):

which agent-browser || npm install -g agent-browser
agent-browser install

Data Sources

CLAWBOT is the research driver — responsible for analysis, judgment, iteration, and report generation. The APIs, scripts, and tools below are data-fetching methods only.

API Base URL: https://crab-skill.opsat.io

Signature Authentication

All API requests except /api/health require Crab signature headers.

1. Run `node skills/scripts/crab-sign.js` once at session start to get the headers JSON. (The first run auto-generates credentials; a cached signature is reused while it is still valid, i.e. within 24h.)
2. Store the output and attach these four headers to all subsequent API requests: `X-Crab-Timestamp`, `X-Crab-Signature`, `X-Crab-Key`, `X-Crab-Address`.
3. Re-run with `--refresh` only if the API returns `auth_expired`.
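
As a rough illustration of the signing steps, the sketch below parses the signer output and attaches the four headers to a request. It assumes the script prints a single flat JSON object keyed by header name and that endpoints are called with GET and query parameters; neither detail is stated here, and the helper names and the `username` parameter are hypothetical.

```python
import json
import subprocess
import urllib.request
from typing import Optional
from urllib.parse import urlencode

API_BASE = "https://crab-skill.opsat.io"
REQUIRED_HEADERS = ("X-Crab-Timestamp", "X-Crab-Signature", "X-Crab-Key", "X-Crab-Address")


def parse_sign_output(raw: str) -> dict:
    """Keep only the four Crab headers from the signer's stdout.

    Assumption: the output is one flat JSON object keyed by header name.
    """
    data = json.loads(raw)
    missing = [name for name in REQUIRED_HEADERS if name not in data]
    if missing:
        raise ValueError(f"signer output is missing headers: {missing}")
    return {name: str(data[name]) for name in REQUIRED_HEADERS}


def load_headers(refresh: bool = False) -> dict:
    """Run the signing script; pass refresh=True only after auth_expired."""
    cmd = ["node", "skills/scripts/crab-sign.js"]
    if refresh:
        cmd.append("--refresh")
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_sign_output(result.stdout)


def build_request(path: str, headers: dict, params: Optional[dict] = None) -> urllib.request.Request:
    """Build a GET request; /api/health is the only unsigned endpoint."""
    url = API_BASE + path
    if params:
        url += "?" + urlencode(params)
    signed = {} if path == "/api/health" else headers
    return urllib.request.Request(url, headers=signed)
```

A session would call `load_headers()` once, reuse the returned dict for every `build_request(...)`, and call `load_headers(refresh=True)` only when a response reports `auth_expired`.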

Twitter & Social Data (see `twitter-analysis/SKILL.md` for full params)

| Category | Key endpoints | Purpose |
|----------|---------------|---------|
| Profile | `/api/twitter/user`, `tweets`, `replies` | Basic info, content, interactions |
| Risk signals | `/api/twitter/deleted-tweets`, `follower-events` | Removed content, follow/unfollow patterns |
| Reply threads | `/api/readx/tweet-detail-conversation-v2` | Primary comment source (fast, raw data) |
| Quote tweets | `/api/readx/tweet-quotes` | KOL commentary, community opinions with context |
| Engagement data | `/api/readx/tweet-detail-v2` | Views/source, used to detect bot inflation |
| Deleted content | `/api/readx/tweet-results-by-ids` | Batch fetch deleted tweet snapshots |
| Long-form | `/api/readx/tweet-article` | Technical analyses, roadmaps published as articles |
| Relationships | `/api/readx/following-light`, `friendships-show` | Inner circle, team relationship verification |
| Credibility | `/api/twitter/kol-followers`, `/api/readx/user-verified-followers` | Which credible accounts follow them (`verified-followers` needs `user_id`, not username) |
| Search | `/api/twitter/search`, `/api/readx/search2` | Risk signals, disputes, community discussions |

GitHub Code (see `github-analysis/SKILL.md`)

Local script skills/scripts/github_analyze.js — no external API. convertToMarkdown(url, options) or analyzeRepository(url, options).

On-chain Data (see `onchain-audit/SKILL.md`)

Binance API — address + chainName (uppercase: BSC/ETHEREUM/BASE/SOLANA):

| Endpoint | Description |
|----------|-------------|
| `/api/onchain/audit` | Contract audit (dual-source) |
| `/api/onchain/token-info` | Token metadata and market dynamics |
| `/api/onchain/wallet` | Wallet positions (BSC/BASE/SOLANA only) |
| `/api/onchain/token-search` | Token search (requires `keyword`) |

Bitget API — chain + contract (lowercase: bnb/eth/base/sol):

| Endpoint | Description |
|----------|-------------|
| `/api/onchain-2/token-info` | Token details |
| `/api/onchain-2/token-price` | Token price |
| `/api/onchain-2/tx-info` | Transaction statistics |
| `/api/onchain-2/liquidity` | Liquidity pool info |
| `/api/onchain-2/security-audit` | Security audit |
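
Because the two sources name chains differently, a small translation layer helps when cross-verifying the same token. The sketch below is only an assumption-laden illustration: the pairing BSC/bnb, ETHEREUM/eth, BASE/base, SOLANA/sol is inferred from the parallel ordering of the two lists and is not stated explicitly, and the helper names are hypothetical.

```python
# Assumed pairing, inferred from the parallel ordering of the two chain lists.
BINANCE_TO_BITGET = {"BSC": "bnb", "ETHEREUM": "eth", "BASE": "base", "SOLANA": "sol"}
BITGET_TO_BINANCE = {short: name for name, short in BINANCE_TO_BITGET.items()}


def normalize_chain(name: str) -> str:
    """Return the uppercase Binance-style name for either naming convention."""
    cleaned = name.strip()
    if cleaned.upper() in BINANCE_TO_BITGET:
        return cleaned.upper()
    if cleaned.lower() in BITGET_TO_BINANCE:
        return BITGET_TO_BINANCE[cleaned.lower()]
    raise ValueError(f"unsupported chain: {name!r}")


def binance_params(address: str, chain: str) -> dict:
    """Query parameters for the Binance-backed /api/onchain/* endpoints."""
    return {"address": address, "chainName": normalize_chain(chain)}


def bitget_params(contract: str, chain: str) -> dict:
    """Query parameters for the Bitget-backed /api/onchain-2/* endpoints."""
    return {"chain": BINANCE_TO_BITGET[normalize_chain(chain)], "contract": contract}
```

With this in place, one `(address, chain)` pair can be sent to both sources without the caller tracking which casing each expects.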

Onchain Explorer API — chain + address (see API_EXPLORER.md for full params):

| Endpoint | Chain | Description |
|----------|-------|-------------|
| `/api/explorer/contract` | ETH, BSC | Contract ABI, source code, compiler info, proxy detection |
| `/api/explorer/token-history` | ETH, BSC, SOL | Token transfer history with pagination |
| `/api/explorer/sol-address` | SOL | SOL/SPL balances + recent transfer records |

Website Content (see `agent-browser/SKILL.md`)

CLAWBOT uses agent-browser CLI to open and inspect websites.

Language Preference

Output language matches the user's input language; default Chinese (zh-CN). Raw API data (usernames, tickers, addresses, code) stays in original form.

Orchestration Flow

Callback-driven: each module's output triggers queries in other modules. Modules keep feeding each other until no new high-value leads remain.

User provides URL / Ticker / contract address + research intent
  │
  ▼
Step 1 — Parse input, initialize entity queue
  Extract: Twitter links, GitHub repos, contract addresses, tickers, chain
  Aggregator URLs → extract entities from path (see rules below)

  Initialize:
    entity_queue  = [{ entity, type, depth: 0 }]
    processed     = set()
    claims        = []    # official claims to verify later
    fund_trace    = []    # addresses to trace fund flow
    team_members  = []    # { handle, role, source }
    MAX_DEPTH     = 2
  │
  ▼
Step 2 — Multi-module collection

  While entity_queue is not empty:
    pop → skip if processed or depth > MAX_DEPTH → route by type:
      URL      → 2a Website
      Twitter  → 2b Social
      GitHub   → 2c Code
      Contract → 2d Chain
      Ticker   → 2d token-search first
    After each module: extract new entities → queue at depth+1
    (see Cross-module Callback Summary below for full routing)

  ── 2a. Website exploration ──────────────────────────────────

  **Use `agent-browser` CLI** (see agent-browser/SKILL.md for commands).
  agent-browser renders JS, captures interactive elements, and allows
  clicking through pages — essential for DApp testing and dynamic sites.
  Fallback to WebFetch only when agent-browser fails (e.g. install issue).

  Visit pages in order:
    Landing → Docs/Whitepaper → Team/About → DApp → Tokenomics → Footer

  Extract from each page:
    - Official claims → append to claims[] ("audited by X", "100M supply",
      "decentralized", "LP locked", partnerships, etc.)
    - Team names + social links → team_members[] + queue 2b
    - Contract addresses → queue 2d
    - GitHub repos → queue 2c

  DApp proactive testing (key investigation step):
    - Open DApp via agent-browser, wait for load
    - Does the UI render real data or just a mock shell?
    - Are core functions visible and interactive?
    - Check network requests: broken APIs? Suspicious external calls?
    - If DApp shows on-chain values → cross-check against 2d data
    - Screenshot as evidence

  Security check: SSL, domain age, redirects, suspicious popups.
  Fallback: blank/Cloudflare → retry with `--headed`. No website → flag as risk.

  ── 2b. Social data collection (Twitter) ─────────────────────

  Purpose: collect project claims, discover team, find community disputes.
  NOT the investigation core — feeds into 2a/2c/2d for verification.

  For project official account:
    1. /api/twitter/user + tweets + replies + deleted-tweets (parallel)
    2. Pick 1-2 high-value tweets → conversation-v2 + quotes
    3. /api/readx/following-light → identify team members from following list
       (mutual follows, bio mentions project, new account only posts about project)
       → add to team_members[], queue 2b at depth+1
    4. Risk search: search2 "{project} scam OR rug OR hack OR exploit"

  For team member accounts (depth 1+):
    1. /api/twitter/user + tweets (parallel)
    2. Only retain project-related tweets → append to claims[]
       (team member statements carry same weight as official claims)
    3. friendships-show with other known team members
       (all isolated = fake team red flag)

  ── 2c. Code analysis (GitHub) ───────────────────────────────

  github-analysis → analyzeRepository / convertToMarkdown

  Focus: claim verification + security scan
    - "Open source" → repo public? Code complete or stub?
    - "Audited" → audit report in repo? Code matches?
    - Hardcoded addresses (admin, treasury) → queue 2d + fund_trace[]
    - Suspicious patterns: obfuscation, eval(), wallet-draining code,
      backdoors, malicious dependencies, clipboard hijacking
    - Contributor identities → try resolve to Twitter → team_members[]
    - Freshness: last commit, bus factor, fork-of-fork detection

  ── 2d. On-chain analysis (investigation core) ───────────────

  Phase 1 — Token & contract basics (parallel):
    Binance: audit, token-info, wallet
    Bitget:  token-info, token-price, tx-info, liquidity, security-audit
    Cross-verify between sources.

  Phase 2 — Contract deep inspection (ETH/BSC):
    /api/explorer/contract → ABI + source code
    - Read ABI: identify owner-only functions (pause, mint, blacklist,
      upgrade, setFee, transferOwnership)
    - If proxy contract: queue implementation address (recursive 2d)
    - If source verified: scan for backdoor patterns in code
    - If NOT verified: flag as risk (cannot audit)

  Phase 3 — Fund flow tracing:
    Triggered by: fund_trace[], deployer discovery, large holder detection
    /api/explorer/token-history → trace address transaction history

    Tracing logic (recursive within depth limit):
      1. Fetch token-history for the address
      2. Identify significant transfers:
         - Large outflows to unknown wallets → trace recipient
         - Inflows from deployer → insider?
         - Flows to/from known exchanges → cash-out pattern?
         - Circular flows (A→B→C→A) → wash trading?
      3. For each significant counterparty:
         - New address → add to fund_trace[] at depth+1
         - Known exchange → note cash-out
         - Mixer/bridge → flag as risk signal
      4. Stop when: depth limit / no significant new flows

    SOL specific:
    - /api/explorer/sol-address → balance snapshot + SPL tokens
    - /api/explorer/token-history (SOL) → filter by type/source
      SWAP on Jupiter/Raydium = trading; TRANSFER = fund movement

  │
  ▼
Step 3 — Verify claims & resolve contradictions
  Goal: every official claim gets a verdict. Contradictions are the story.
  If verification needs data not yet collected → callback to Step 2.

  Process claims[] collected during Step 2:

    | Claim | Verify with | How |
    |-------|-------------|-----|
    | "Decentralized" | Explorer ABI + on-chain | pause/mint/blacklist? EOA or multisig? |
    | "Audited by X" | Website + GitHub + firm | Link valid? Code matches audited version? |
    | "Max supply N" | Explorer source code | Uncapped mint()? Owner can mint? |
    | "Locked liquidity" | On-chain LP lock | Lock verified? Duration? Amount? |
    | "Open source" | GitHub + Explorer | Public? Verified? ABI matches? |
    | "Partnerships" | Partner channels (browser) | Partner acknowledges? One-sided? |

    Priority: verify claims affecting user funds first.
    Mark each: ✅ Verified / ⚠️ Unverified / ❌ Contradicted

  For each ❌ or anomaly → dispute analysis:
    1. Project claim vs actual data (on-chain, code) → cite both
    2. Community analysis → search2 + conversation threads
    3. On-chain evidence → tx hashes, fund flow from fund_trace[]
    4. Synthesize: claim → reality → community → verdict
       🔴 → full analysis / 🟡 → summary only
  │
  ▼
Step 4 — Hypothesis-driven deep dig
  Follow high-value leads from Steps 2-3. May callback to any module.

  Key hypotheses:
    - Contract upgradable → who holds proxy admin?
    - Large holder → tokens from deployer? Insider?
    - Deleted tweets → timing vs on-chain events?
    - Deployer has other contracts → same pattern? Previous rugs?

  Team verification:
    - Identity: Twitter vs website claims vs GitHub commits
    - History: search2 "{name} founder OR CEO", wallet history
    - Red flags: account age = project age? No pre-project history?

  Any new lead → callback to Step 2 (respecting MAX_DEPTH).
  Stop when: no new leads or sufficient for judgment.

  ─── END OF DATA COLLECTION ───
  │
  ▼
Step 5 — Distill (no fetching)
  Rank by impact. Discard noise. Connect dots. Reconstruct timeline.
  │
  ▼
Step 6 — Produce report (see REPORT_TEMPLATE.md)
  Curated intelligence, NOT a data dump. Focus on:
    1. Contradictions & anomalies
    2. Claim verification results
    3. Fund flow analysis
    4. Proactive test results (DApp, website)
    5. Security findings
  Omit routine confirmations. [[N]](url) citations required.
  Language follows user input; default zh-CN.
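
The circular-flow check in Step 2d, Phase 3 (A→B→C→A) can be read as cycle detection on a directed transfer graph. The following is a minimal sketch under that reading, not the skill's actual implementation: transfers are assumed to be already reduced to `(sender, recipient)` pairs, since the response shape of `/api/explorer/token-history` is not given here, and `find_cycle` and `max_hops` are hypothetical names.

```python
def find_cycle(transfers, start, max_hops=4):
    """Return one path start -> ... -> start, or None if no cycle is found.

    max_hops bounds how many distinct addresses a cycle may contain.
    """
    graph = {}
    for sender, recipient in transfers:
        if sender != recipient:  # ignore self-transfers
            graph.setdefault(sender, set()).add(recipient)

    def dfs(node, path):
        for nxt in sorted(graph.get(node, ())):
            # Closing the loop requires at least one intermediate address.
            if nxt == start and len(path) >= 2:
                return path + [start]
            if nxt not in path and len(path) < max_hops:
                found = dfs(nxt, path + [nxt])
                if found:
                    return found
        return None

    return dfs(start, [start])


transfers = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
cycle = find_cycle(transfers, "A")  # ["A", "B", "C", "A"]
```

A detected cycle is only a lead: whether it indicates wash trading still needs the amounts, timing, and counterparties to be judged by the researcher.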

Cross-module Callback Summary

Each module feeds discoveries into other modules:

  ┌──────────┐     handles, claims      ┌──────────┐
  │  Website  │ ──────────────────────→  │  Twitter  │
  │   (2a)    │ ◀────────────────────── │   (2b)    │
  └─────┬─────┘   URLs from tweets      └─────┬─────┘
        │ contracts, repos                     │ addresses, accusations
        ▼                                      ▼
  ┌──────────┐     hardcoded addrs      ┌──────────┐
  │  GitHub   │ ──────────────────────→  │ On-chain  │
  │   (2c)    │ ◀── code vs claims ──── │   (2d)    │
  └──────────┘                          └─────┬─────┘
                                               │ recursive
                                               ▼
                                          (2d again)

| Source | Discovers | Triggers |
|--------|-----------|----------|
| Website | Twitter handles, claims, contracts, repos | → team_members[]/claims[]/2b/2c/2d |
| Twitter | URLs, addresses, accusations, team members, statements | → 2a/2d/fund_trace[]/claims[]/team_members[] |
| GitHub | Contributors, hardcoded addrs, code contradictions, trojans | → team_members[]/2d/fund_trace[]/claims[] |
| On-chain | Proxy impl, deployer contracts, large holders, data contradictions | → 2d recursive/fund_trace[]/claims[] |

Depth control: 0 = user input → 1 = discovered → 2 = max, high-value only → beyond: note only
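
The queue loop from Step 2, combined with the depth rule above, might look like the sketch below. It is illustrative only: `collect` is a hypothetical callback standing in for a module run (2a to 2d) that returns newly discovered `(entity, type)` pairs, the queue is assumed to be first-in, first-out, and the "high-value only" filter at depth 2 is a judgment call that is not modeled.

```python
from collections import deque

MAX_DEPTH = 2
ROUTES = {"url": "2a", "twitter": "2b", "github": "2c", "contract": "2d", "ticker": "2d"}


def run_collection(seed, collect):
    """Drain the entity queue, routing each entity to its module.

    seed is an (entity, type) pair at depth 0. Returns the visit order and
    the leads that fell beyond MAX_DEPTH (recorded as notes, not fetched).
    """
    queue = deque([{"entity": seed[0], "type": seed[1], "depth": 0}])
    processed = set()
    visited = []
    noted = []
    while queue:
        item = queue.popleft()
        key = (item["type"], item["entity"])
        if key in processed:
            continue
        if item["depth"] > MAX_DEPTH:
            noted.append(item)  # beyond max depth: note only
            continue
        processed.add(key)
        visited.append((ROUTES[item["type"]], item["entity"], item["depth"]))
        # Every discovery is queued one level deeper than its source.
        for entity, etype in collect(item["entity"], item["type"]):
            queue.append({"entity": entity, "type": etype, "depth": item["depth"] + 1})
    return visited, noted
```

Keeping `processed` keyed by type and entity prevents the mutual callbacks between modules from looping on the same handle or address.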

Failure Handling

| Failure type | Action |
|--------------|--------|
| Timeout / 502-504 | Retry once after 3s |
| 429 (rate limit) | Retry once after `Retry-After` or 10s |
| 401 / 403 / 400 | Do not retry; skip |
| Other errors | Do not retry; skip |

On failure: skip source, continue. Include Data Coverage note in report. Omit sections with no data; never halt for a single failure.
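
The retry table can be expressed as one small decision function. This is a sketch under two assumptions: a timeout is represented as `status=None`, and a `Retry-After` value that is not a plain number of seconds (for example an HTTP date) falls back to the 10s default. The function name is hypothetical.

```python
from typing import Optional

RETRYABLE_GATEWAY = {502, 503, 504}


def retry_delay(status: Optional[int], retry_after: Optional[str] = None, attempt: int = 0) -> Optional[float]:
    """Seconds to wait before the single allowed retry, or None to skip the source."""
    if attempt >= 1:
        return None  # only one retry is allowed
    if status is None or status in RETRYABLE_GATEWAY:
        return 3.0
    if status == 429:
        try:
            seconds = float(retry_after)
        except (TypeError, ValueError):
            return 10.0
        return seconds if seconds >= 0 else 10.0
    return None  # 400 / 401 / 403 and all other errors: do not retry
```

A caller that receives `None` skips the source, continues with the remaining modules, and records the gap for the Data Coverage note.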

Entity Extraction Rules

| Entity Type | Identification |
|-------------|----------------|
| Twitter profile | `x.com/{username}` or `twitter.com/{username}` |
| Twitter post | `x.com/{username}/status/{id}` |
| GitHub repo | `github.com/{owner}/{repo}` |
| EVM contract | `0x` + 40 hex chars |
| Solana address | base58 32–44 chars + contextual keywords (below) |
| Ticker | `$XXX` or `ticker/symbol/token: XXX` |
| Chain | URL domain / path keywords / page text |

Solana keywords (at least one must be present): solana, sol, raydium, jupiter, orca, meteora, pump.fun, moonshot, birdeye, solscan, solana.fm, spl token, program id.

If no keyword is present, flag the string as an "unresolved address".
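
One way to apply these rules is a set of regular expressions plus a keyword check for base58 strings. The patterns below are an approximation written for illustration, not the skill's own extractor: the ticker length limit, the whole-word keyword matching, and the `(type, value)` output shape are all assumptions.

```python
import re

EVM_RE = re.compile(r"\b0x[0-9a-fA-F]{40}\b")
BASE58_RE = re.compile(r"\b[1-9A-HJ-NP-Za-km-z]{32,44}\b")
TWITTER_POST_RE = re.compile(r"\b(?:x|twitter)\.com/(\w+)/status/(\d+)")
TWITTER_PROFILE_RE = re.compile(r"\b(?:x|twitter)\.com/(\w+)\b(?!/status/)")
GITHUB_RE = re.compile(r"\bgithub\.com/([\w.-]+)/([\w.-]+)")
TICKER_RE = re.compile(
    r"\$([A-Za-z][A-Za-z0-9]{1,9})\b"
    r"|\b(?:ticker|symbol|token):\s*([A-Za-z][A-Za-z0-9]{1,9})\b",
    re.IGNORECASE,
)
SOLANA_KEYWORDS = (
    "solana", "sol", "raydium", "jupiter", "orca", "meteora", "pump.fun",
    "moonshot", "birdeye", "solscan", "solana.fm", "spl token", "program id",
)


def has_solana_keyword(text: str) -> bool:
    """True if any Solana keyword appears as a whole word or phrase."""
    lowered = text.lower()
    return any(
        re.search(r"(?<![a-z0-9])" + re.escape(word) + r"(?![a-z0-9])", lowered)
        for word in SOLANA_KEYWORDS
    )


def extract_entities(text: str) -> list:
    """Return (type, value) pairs found in free text."""
    found = []
    found += [("twitter_post", m.group(2)) for m in TWITTER_POST_RE.finditer(text)]
    found += [("twitter_profile", m.group(1)) for m in TWITTER_PROFILE_RE.finditer(text)]
    found += [("github_repo", m.group(1) + "/" + m.group(2).rstrip("."))
              for m in GITHUB_RE.finditer(text)]
    found += [("evm_contract", m.group(0)) for m in EVM_RE.finditer(text)]
    found += [("ticker", (m.group(1) or m.group(2)).upper()) for m in TICKER_RE.finditer(text)]
    # A base58 string only counts as a Solana address with a contextual keyword.
    label = "solana_address" if has_solana_keyword(text) else "unresolved_address"
    found += [(label, m.group(0)) for m in BASE58_RE.finditer(text)]
    return found
```

The keyword check runs against the whole input text; a production version would probably restrict it to the text surrounding each candidate address.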

Aggregator URL Parsing

| Platform | Path | Parsed result |
|----------|------|---------------|
| clawhub.ai | `/owner/repo` | → GitHub repo (use `github-analysis`, skip browser) |
| dexscreener.com | `/chain/address` | → contract + chain |
| dextools.io | `/app/chain/pair/address` | → contract + chain |
| pump.fun | `/address` | → Solana contract |
| gmgn.ai | `/chain/address` | → contract + chain |
| birdeye.so | `/token/address` | → contract |
| defined.fi | `/chain/address` | → contract + chain |
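
The path rules in this table could be applied with standard URL parsing, roughly as sketched below. The sketch follows the table literally; where a real URL has extra path segments it assumes the address is the last segment, and it returns chain names exactly as they appear in the path. The function name and result keys are hypothetical.

```python
from typing import Optional
from urllib.parse import urlparse

CHAIN_FIRST_HOSTS = {"dexscreener.com", "gmgn.ai", "defined.fi"}


def parse_aggregator_url(url: str) -> Optional[dict]:
    """Map an aggregator URL to a repo or contract entity, or None if unknown."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    parts = [segment for segment in parsed.path.split("/") if segment]

    if host == "clawhub.ai" and len(parts) >= 2:
        return {"type": "github_repo", "owner": parts[0], "repo": parts[1]}
    if host in CHAIN_FIRST_HOSTS and len(parts) >= 2:
        return {"type": "contract", "chain": parts[0], "address": parts[-1]}
    if host == "dextools.io" and len(parts) >= 3 and parts[0] == "app":
        # Assumption: chain is the segment after /app, address is the last one.
        return {"type": "contract", "chain": parts[1], "address": parts[-1]}
    if host == "pump.fun" and parts:
        return {"type": "contract", "chain": "solana", "address": parts[-1]}
    if host == "birdeye.so" and len(parts) >= 2 and parts[0] == "token":
        # The table gives no chain for birdeye.so, so it stays unresolved here.
        return {"type": "contract", "chain": None, "address": parts[1]}
    return None
```

Results of type `github_repo` go straight to the code module, while `contract` results are queued for on-chain analysis once the chain is resolved.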

Data Display Rules

Skip any metric that returned an error or timed out — leave it out entirely.
Do not display API latency unless it was actually measured successfully.

Local Memory & Report Storage

Save report as PDF to ~/.crab-catch/reports/{project_name}_{YYYY-MM-DD}.pdf
Maintain index ~/.crab-catch/reports/index.json: { "project": "name", "date": "YYYY-MM-DD", "file": "filename.pdf", "entry": "original input" }
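
A possible shape for the index bookkeeping is sketched below. It assumes `index.json` holds a JSON array of entries in the format shown above and that the project name is sanitized before being used in the file name; both are assumptions, the helper names are hypothetical, and PDF rendering itself is out of scope.

```python
import json
import re
from datetime import date
from pathlib import Path

REPORTS_DIR = Path.home() / ".crab-catch" / "reports"


def report_filename(project: str, day: date) -> str:
    """Build {project_name}_{YYYY-MM-DD}.pdf with a file-safe project name."""
    safe = re.sub(r"[^\w.-]+", "_", project.strip())
    return f"{safe}_{day.isoformat()}.pdf"


def make_entry(project: str, entry: str, day: date) -> dict:
    """One index record; `entry` is the user's original input."""
    return {
        "project": project,
        "date": day.isoformat(),
        "file": report_filename(project, day),
        "entry": entry,
    }


def record_report(project: str, entry: str, day: date, reports_dir: Path = REPORTS_DIR) -> Path:
    """Append a record to index.json and return the path the PDF should use."""
    reports_dir.mkdir(parents=True, exist_ok=True)
    index_path = reports_dir / "index.json"
    index = json.loads(index_path.read_text(encoding="utf-8")) if index_path.exists() else []
    record = make_entry(project, entry, day)
    index.append(record)
    index_path.write_text(json.dumps(index, ensure_ascii=False, indent=2), encoding="utf-8")
    return reports_dir / record["file"]
```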

Report Output

Use REPORT_TEMPLATE.md as the report structure.

Report philosophy: curated intelligence, not data dump

The report should be concise and decision-oriented. The reader wants to know: is this project trustworthy? What are the risks? Where do the claims fall apart?

Five pillars of the report (in order of importance):

1. Contradictions & anomalies — where different sources tell different stories. This is the most valuable content. Twitter says X, website says Y, on-chain shows Z.
2. Claim verification — systematic test of every official statement. What the project claims vs what the code/chain actually shows.
3. Fund flow analysis — where the money goes. Deployer → holders → exchanges. Insider patterns, circular flows, cash-outs.
4. Proactive testing — DApp functionality, website integrity, code security. Does the product work? Is the website legit? Are there backdoors in the code?
5. Security findings — contract risks, code trojans, permission hazards. ABI dangerous functions, proxy patterns, obfuscated code.

What to omit: routine data that confirms nothing special. If a metric is normal, don't list it. If a claim checks out cleanly, a single ✅ row is enough — no paragraph. Only expand on findings that change the reader's decision.

Section constraints

Must keep — always present, fixed order:

1. Header (project name + timestamp)
2. 📌 Basic Information (flexible rows — agent adds/removes based on data, no fixed schema)
3. 🧠 Core Findings (with Executive Summary)
4. 📝 Conclusion & Verdict
5. 📂 References

Default keep — user can request to skip:

- 🛡️ Verification & Cross-Reference (Claim / Contradictions / Disputes / Gaps)
- ⚠️ Risk Warning

Data-dependent — skip if no data:

📊 Deep Dive
- 👤 Team & Key Figures
- 💻 GitHub Analysis
- ⛓️ On-chain Security
- 📈 Social Signals
- 📅 Project Timeline

Formatting rules

Citation system (mandatory, like academic papers):

- Every factual claim MUST have a [[N]](url) citation
- No source = mark as ⚠️ Unverified; do NOT state it as fact
- Sequential numbering, in order of first appearance
- Bidirectional: every [[N]] has a References entry, and every References entry is cited
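
These rules lend themselves to a mechanical check before a report is saved. The sketch below is one hedged way to do it: it assumes the References section has already been parsed into a mapping from citation number to URL (the actual layout lives in REPORT_TEMPLATE.md and is not shown here), and `check_citations` is a hypothetical helper.

```python
import re

CITATION_RE = re.compile(r"\[\[(\d+)\]\]\((\S+?)\)")


def check_citations(body: str, references: dict) -> list:
    """Return a list of problems; an empty list means the rules hold."""
    problems = []
    cited = []  # citation numbers in order of first appearance
    for match in CITATION_RE.finditer(body):
        number, url = int(match.group(1)), match.group(2)
        if number not in cited:
            cited.append(number)
        if number in references and references[number] != url:
            problems.append(f"[[{number}]] points to a different URL than its reference")
    if cited != list(range(1, len(cited) + 1)):
        problems.append("citations are not numbered sequentially by first appearance")
    for number in cited:
        if number not in references:
            problems.append(f"[[{number}]] has no References entry")
    for number in references:
        if number not in cited:
            problems.append(f"reference {number} is never cited")
    return problems
```

The check covers numbering and the two-way link only; deciding whether a sentence is a factual claim that needs a citation remains an editorial judgment.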

Other:

- Numbers: K / M / B; prices: $ prefix
- Highlight high-risk signals (honeypot, high tax, upgradable contracts)
- Data Coverage note when sources unavailable
- DYOR disclaimer
- Output language matches user input; default zh-CN
