LLM Provider Forensics
An agent-facing forensic skill for identifying which model family an LLM endpoint most likely serves.
Trigger conditions
Use this skill when asked to:
- verify whether a claimed model is genuine
- identify which family an endpoint most resembles
- distinguish focused route vs wrapped route vs aggregation pool
- compare multiple providers claiming to expose the same model
- evaluate primary/fallback/avoid decisions
- deeply audit suspicious gateways for GPT / Claude / Gemini / GLM / Qwen / Kimi / MiniMax / DeepSeek behavior
Core rule
Do not output false certainty. Produce a confidence-based operational judgment.
Coverage
Families:
- OpenAI-compatible protocol layer
- GPT / OpenAI-style
- Claude / Anthropic-style
- Gemini / Google-style
- GLM / Zhipu-style
- Qwen / Tongyi-style
- Kimi / Moonshot-style
- MiniMax-style
- DeepSeek-style
- mixed aggregation pool / compatibility gateway
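The native protocols in this list each have a distinct canonical chat endpoint, which is where protocol-family identification starts. A minimal sketch (paths taken from the public OpenAI, Anthropic, and Gemini API docs; the remaining families are usually reached through an OpenAI-compatible route, so protocol success there proves the protocol layer only, not the model family):

```python
# Canonical chat endpoints for the native protocols (from public API docs).
# GLM, Qwen, Kimi, MiniMax, and DeepSeek typically ride the OpenAI-compatible
# layer, so a working /v1/chat/completions route is not family evidence.
PROTOCOL_ENDPOINTS = {
    "openai-compatible": "/v1/chat/completions",
    "anthropic": "/v1/messages",
    "gemini": "/v1beta/models/{model}:generateContent",
}
```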
Dimensions:
- catalog topology
- protocol compatibility
- response schema shape
- repeated stability
- strict formatting control
- family fingerprinting
- long-context retention
- structured-output stress
- refusal/safety style
- randomness / variance profile
- streaming / error fingerprints
- cross-protocol consistency
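The randomness/variance dimension can be summarized from repeated samples of the same prompt. A sketch of one such summary (the thresholds you would apply to it are a judgment call, not shown here):

```python
from collections import Counter

def variance_profile(outputs: list[str]) -> dict:
    """Summarize repeated-sampling variance: how many distinct completions
    appeared and what share the most common one holds. Near-zero variance
    at temperature > 0 can hint at caching or a normalizing gateway."""
    counts = Counter(outputs)
    modal = counts.most_common(1)[0][1]
    return {"distinct": len(counts), "modal_share": modal / len(outputs)}
```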
Current implementation note:
- openai-compatible now means protocol layer only, not GPT-family proof.
- The deepest automatic suite is strongest for OpenAI-compatible / mixed-gateway providers.
- Anthropic-native and Gemini-native routes currently have solid protocol/family checks, plus native deep tests, but protocol success alone must not be read as family proof.
- Treat all family conclusions as confidence-based and inspect references before overclaiming.
Investigation workflow
- Identify likely protocol family or families.
- Probe catalog/list endpoints when available.
- Probe minimal inference endpoints for each plausible protocol family.
- Separate protocol-layer conclusion from suspected model family conclusion.
- Run repeated stability tests on the best working route.
- Run strict formatting tests.
- Run deeper advanced dimensions when the user prioritizes realism over speed.
- Inspect family fingerprint evidence and produce a confidence-based judgment.
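The step that separates the protocol-layer conclusion from the suspected-family conclusion starts from the response schema. A minimal sketch of a schema-shape classifier, using top-level field names from the public OpenAI, Anthropic, and Gemini chat response formats (heuristic only, since gateways can rewrite schemas):

```python
def classify_schema(payload: dict) -> str:
    """Guess which protocol family a chat response's JSON shape resembles.
    Field names are from the public API docs; a match is protocol-layer
    evidence only, never family proof."""
    if "choices" in payload and "object" in payload:
        return "openai-compatible"   # e.g. {"object": "chat.completion", ...}
    if payload.get("type") == "message" and "content" in payload:
        return "anthropic"           # e.g. {"type": "message", "content": [...]}
    if "candidates" in payload:
        return "gemini"              # e.g. {"candidates": [...]}
    return "unknown"
```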
References to load as needed
- Main checklist: references/forensics-checklist.md
- Advanced dimensions: references/advanced-dimensions.md
- Error/stream/variance: references/error-stream-variance.md
- Protocol specifics: references/protocol-openai.md, references/protocol-anthropic.md, references/protocol-gemini.md, references/protocol-glm.md
- Family fingerprints: references/fingerprint-*.md
- Native deep tests: references/deep-claude.md, references/deep-gemini.md
Final labels
- high-confidence-focused-or-genuine-route
- medium-confidence-likely-routed-or-wrapped
- high-confidence-multi-model-aggregation-pool
- low-confidence-or-unusable
Use high-confidence-focused-or-genuine-route sparingly. It should require:
- stable repeated success
- no strong mixed-pool signal
- coherent family fingerprint
- no obvious gateway-normalization red flags in deep tests
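The gating above can be sketched as a decision function. The boolean arguments are illustrative names for the evidence classes, not the actual signals computed by the script:

```python
def final_label(stable: bool, mixed_pool_signal: bool,
                coherent_fingerprint: bool, normalization_flags: bool) -> str:
    """Illustrative mapping from evidence to the skill's final labels.
    The genuine-route label is deliberately the hardest branch to reach."""
    if not stable:
        return "low-confidence-or-unusable"
    if mixed_pool_signal:
        return "high-confidence-multi-model-aggregation-pool"
    if coherent_fingerprint and not normalization_flags:
        return "high-confidence-focused-or-genuine-route"
    return "medium-confidence-likely-routed-or-wrapped"
```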
Agent output contract
Return sections in this order:
- Declared facts
- Availability status
- Protocol-layer findings
- Suspected model-family findings
- Stability findings
- Capability/format findings
- Advanced-dimension findings
- Final judgment
- Need-human-review items
- Recommended operational posture
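A sketch of enforcing that section order when assembling the report; the rendering style is an assumption, only the ordering comes from the contract above:

```python
REPORT_SECTIONS = [
    "Declared facts",
    "Availability status",
    "Protocol-layer findings",
    "Suspected model-family findings",
    "Stability findings",
    "Capability/format findings",
    "Advanced-dimension findings",
    "Final judgment",
    "Need-human-review items",
    "Recommended operational posture",
]

def render_report(findings: dict) -> str:
    """Emit every section in the contract's fixed order; absent sections
    get an explicit '(none)' so omissions are visible rather than silent."""
    lines = []
    for section in REPORT_SECTIONS:
        lines.append(f"{section}:")
        lines.append(findings.get(section, "(none)"))
    return "\n".join(lines)
```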
Preferred execution
python3 scripts/llm_provider_forensics.py --config /root/.openclaw/openclaw.json --providers omgteam ypemc vpsai --model gpt-5.4 --deep