debug

- Find why a run is stuck, retrying, or failing.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "debug" with this command: npx skills add openai/symphony/openai-symphony-debug

Debug

Goals

  • Find why a run is stuck, retrying, or failing.

  • Correlate Linear issue identity to a Codex session quickly.

  • Read the right logs in the right order to isolate root cause.

Log Sources

  • Primary runtime log: log/symphony.log

      • Default comes from SymphonyElixir.LogFile (log/symphony.log).

      • Includes orchestrator, agent runner, and Codex app-server lifecycle logs.

  • Rotated runtime logs: log/symphony.log*

      • Check these when the relevant run predates the current log file.

Correlation Keys

  • issue_identifier: human ticket key (example: MT-625)

  • issue_id: Linear UUID (stable internal ID)

  • session_id: Codex thread-turn pair (<thread_id>-<turn_id>)

elixir/docs/logging.md requires these fields for issue/session lifecycle logs. Use them as your join keys during debugging.
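The join keys can be pulled out of matching lines mechanically. The following is a minimal sketch, assuming each field appears as key=value and ends at a space or a semicolon (the same delimiter assumption the session_id pattern under Commands makes). log_field is a hypothetical helper name, not part of Symphony, and it uses grep -E and sed so that it also runs where rg is not installed.

```shell
# log_field KEY
# Reads log lines on stdin and prints the value of every KEY=value field.
# Assumption: fields are delimited by a space or ';'. Adjust the character
# classes below if your log format differs.
log_field() {
  # The (^|[ ;]) prefix keeps "issue_id" from matching inside a longer key
  # such as a hypothetical "parent_issue_id".
  grep -oE "(^|[ ;])$1=[^ ;]+" | sed -E "s/^[ ;]?$1=//"
}

# Example usage (commented out so the sketch has no side effects):
#   log_field session_id < log/symphony.log | sort -u
#   log_field issue_id   < log/symphony.log | sort -u
```

Because the helper reads stdin, it can sit at the end of any of the searches in this document to reduce matching lines to bare values.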

Quick Triage (Stuck Run)

  • Confirm scheduler/worker symptoms for the ticket.

  • Find recent lines for the ticket (issue_identifier first).

  • Extract session_id from matching lines.

  • Trace that session_id across start, stream, completion/failure, and stall handling logs.

  • Decide class of failure: timeout/stall, app-server startup failure, turn failure, or orchestrator retry loop.

Commands

1) Narrow by ticket key (fastest entry point)

rg -n "issue_identifier=MT-625" log/symphony.log*

2) If needed, narrow by Linear UUID

rg -n "issue_id=<linear-uuid>" log/symphony.log*

3) Pull session IDs seen for that ticket

rg "issue_identifier=MT-625" log/symphony.log* | rg -o "session_id=[^ ;]+" | sort -u

4) Trace one session end-to-end

rg -n "session_id=<thread>-<turn>" log/symphony.log*

5) Focus on stuck/retry signals

rg -n "Issue stalled|scheduling retry|turn_timeout|turn_failed|Codex session failed|Codex session ended with error" log/symphony.log*
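Steps 1, 3, and 4 can be chained so that one ticket key yields a per-session trace. This is a sketch under stated assumptions: sessions_for_ticket and trace_ticket are hypothetical helper names, the ticket and session fields are assumed to sit on the same log line, and IDs are assumed to contain no regex metacharacters. It uses grep -E for portability; on large logs the same patterns work with rg.

```shell
# sessions_for_ticket KEY FILE...
# Prints the unique session_id=... fields found on lines for one ticket.
sessions_for_ticket() {
  _key="$1"; shift
  # ([ ;]|$) stops MT-625 from also matching MT-6250.
  # -h drops filename prefixes so sort -u deduplicates across rotated files.
  grep -hE "issue_identifier=${_key}([ ;]|\$)" "$@" \
    | grep -oE 'session_id=[^ ;]+' \
    | sort -u
}

# trace_ticket KEY FILE...
# Prints a header per session, then every line carrying that session_id,
# prefixed with file name and line number.
trace_ticket() {
  _ticket="$1"; shift
  sessions_for_ticket "$_ticket" "$@" | while IFS= read -r _sid; do
    printf '== %s ==\n' "$_sid"
    grep -HnE -- "${_sid}([ ;]|\$)" "$@"
  done
}

# Example usage (commented out so the sketch has no side effects):
#   trace_ticket MT-625 log/symphony.log*
```

Note that the shell expands log/symphony.log* in lexical order, which is not necessarily chronological, so rely on the timestamps in each line rather than on output order when building a timeline.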

Investigation Flow

  • Locate the ticket slice:

      • Search by issue_identifier=<KEY>.

      • If noise is high, add issue_id=<UUID>.

  • Establish timeline:

      • Identify the first "Codex session started ... session_id=..." line.

      • Follow with "Codex session completed", "ended with error", or worker exit lines.

  • Classify the problem:

      • Stall loop: "Issue stalled ... restarting with backoff".

      • App-server startup: "Codex session failed ...".

      • Turn execution failure: turn_failed, turn_cancelled, turn_timeout, or "ended with error".

      • Worker crash: "Agent task exited ... reason=...".

  • Validate scope:

      • Check whether failures are isolated to one issue/session or repeating across multiple tickets.

  • Capture evidence:

      • Save key log lines with timestamps, issue_identifier, issue_id, and session_id.

      • Record probable root cause and the exact failing stage.
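The classification and scope checks can be roughed out as two counters over the log. This is a sketch, not Symphony tooling: classify_signals and failing_tickets are hypothetical helper names, the class labels (stall, startup, turn, worker_crash, retry) are illustrative, and the match strings are only the message fragments quoted in this document. It assumes failure lines carry issue_identifier on the same line.

```shell
# classify_signals FILE...
# Counts log lines per failure class. A line that matches several classes
# is counted once in each of them.
classify_signals() {
  awk '
    /Issue stalled/        { c["stall"]++ }
    /Codex session failed/ { c["startup"]++ }
    /turn_failed|turn_cancelled|turn_timeout|ended with error/ { c["turn"]++ }
    /Agent task exited/    { c["worker_crash"]++ }
    /scheduling retry/     { c["retry"]++ }
    END { for (k in c) print k, c[k] }
  ' "$@" | sort
}

# failing_tickets FILE...
# Lists tickets that appear on failure lines, most frequent first, to show
# whether a problem is isolated to one ticket or repeats across many.
failing_tickets() {
  grep -hE 'Issue stalled|Codex session failed|turn_failed|turn_cancelled|turn_timeout|ended with error|Agent task exited' "$@" \
    | grep -oE 'issue_identifier=[^ ;]+' \
    | sort | uniq -c | sort -rn
}

# Example usage (commented out so the sketch has no side effects):
#   classify_signals log/symphony.log*
#   failing_tickets  log/symphony.log*
```

A single ticket dominating failing_tickets points at an issue-specific cause, while many tickets sharing one class in classify_signals suggests a systemic one.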

Reading Codex Session Logs

In Symphony, Codex session diagnostics are emitted into log/symphony.log and keyed by session_id. Read them as a lifecycle:

  • "Codex session started ... session_id=..."

  • Session stream/lifecycle events for the same session_id

  • Terminal event:

      • "Codex session completed ...", or

      • "Codex session ended with error ...", or

      • "Issue stalled ... restarting with backoff"

For one specific session investigation, keep the trace narrow:

  • Capture one session_id for the ticket.

  • Build a timestamped slice for only that session:

      • rg -n "session_id=<thread>-<turn>" log/symphony.log*

  • Mark the exact failing stage:

      • Startup failure before stream events ("Codex session failed ...").

      • Turn/runtime failure after stream events (turn_* / "ended with error").

      • Stall recovery ("Issue stalled ... restarting with backoff").

  • Pair findings with issue_identifier and issue_id from nearby lines to confirm you are not mixing concurrent retries.

Always pair session findings with issue_identifier/issue_id to avoid mixing concurrent runs.
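The pairing check can be made explicit by listing which ticket identities appear alongside a given session. The following is a minimal sketch, assuming the issue fields sit on the same line as session_id (the text above only promises nearby lines, so treat an empty result as inconclusive). issues_for_session is a hypothetical helper name, and the session ID is assumed to contain no regex metacharacters.

```shell
# issues_for_session SESSION FILE...
# Prints each distinct "issue_identifier=... issue_id=..." pair found on
# lines that carry session_id=SESSION. More than one output line suggests
# that runs are being mixed.
issues_for_session() {
  _session="$1"; shift
  grep -hE -- "session_id=${_session}([ ;]|\$)" "$@" \
    | awk '{
        ident = ""; uuid = ""
        n = split($0, parts, /[ ;]+/)
        for (i = 1; i <= n; i++) {
          if (parts[i] ~ /^issue_identifier=/) ident = parts[i]
          else if (parts[i] ~ /^issue_id=/)    uuid  = parts[i]
        }
        if (ident != "" || uuid != "") print ident, uuid
      }' \
    | sort -u
}

# Example usage (commented out so the sketch has no side effects):
#   issues_for_session "<thread>-<turn>" log/symphony.log*
```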

Notes

  • Prefer rg over grep for speed on large logs.

  • Check rotated logs (log/symphony.log*) before concluding data is missing.

  • If required context fields are missing in new log statements, align with elixir/docs/logging.md conventions.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
