codex-skill

Use when the user asks to leverage codex, gpt-5, or gpt-5.1 to implement something (usually a plan or feature designed by Claude). Provides a non-interactive automation mode for hands-off task execution without approval prompts.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.


Install skill "codex-skill" with this command: npx skills add feiskyer/codex-skill

Codex Agent Skill

Operate Codex CLI as a managed coding agent — from worktree setup through PR merge.

Prerequisites

codex --version  # Verify installed
# Install: npm i -g @openai/codex  or  brew install codex
tmux -V          # tmux required for full workflow

CLI Quick Reference

| Flag | Effect |
| --- | --- |
| exec "prompt" | Non-interactive one-shot, exits when done |
| --full-auto | Alias for -s workspace-write (auto-approve file edits) |
| -s workspace-write | Read + write files in workspace |
| -s read-only | Analysis only, no modifications (default for exec) |
| -s danger-full-access | Full access including network and system |
| --dangerously-bypass-approvals-and-sandbox | Skip all prompts + no sandbox (safe in containers/VMs) |
| -m <model> | Model selection: only use when user explicitly requests a model (e.g. gpt-5.1-codex-max). Omit to use Codex default. |
| -c "model_reasoning_effort=high" | Reasoning effort: low, medium, high |
| --json | Structured JSON Lines output |
| -o <file> | Write final output to file |
| -C <dir> / --cd <dir> | Set working directory |
| --add-dir <dir> | Allow writing to additional directories |
| --skip-git-repo-check | Run in non-git directories |
| resume --last | Resume last session with new prompt |

Execution Modes

Quick Mode — Small Tasks

For trivial fixes, one-file changes, or analysis. Use exec (non-interactive):

# Via OpenClaw exec — use background=true + pty=true, NO hard timeout
# pty=true ensures codex CLI flushes output properly (no buffering issues)
# (hard timeout kills the process; instead we poll and extend)
exec(command="codex exec --full-auto 'fix the typo in README.md'",
     workdir="/path/to/project", background=true, pty=true)

# With high reasoning
exec(command="codex exec -c 'model_reasoning_effort=high' --full-auto 'fix the auth bug'",
     workdir="/path/to/project", background=true, pty=true)

Adaptive Timeout (Poll-and-Extend)

Do NOT use timeout= for codex tasks. Instead, use background execution with periodic polling. This prevents premature kills on long-running tasks:

  1. Launch with background=true (no timeout)
  2. Poll every ~5 min with process(action="poll", sessionId=<id>, timeout=300000)
  3. If process is still running → it's making progress, keep waiting
  4. If process exited → check logs, done
  5. Safety net: if no new output for 12 hours, ask user before killing

Poll loop (agent behavior, not a script):

  poll_interval = 5 min (300000 ms)
  max_silent_rounds = 144  (= 12 hours with no new output → ask user)

  repeat:
    result = process(action="poll", sessionId=<id>, timeout=300000)
    if result.completed:
      → check exit code, read logs, report result
      → break
    else:
      new_output = process(action="log", sessionId=<id>, limit=20)
      if new_output changed since last check:
        silent_rounds = 0          # still producing output, keep going
      else:
        silent_rounds += 1
      if silent_rounds >= max_silent_rounds:
        → notify user: "Codex has been silent for 12 hours, kill or keep waiting?"
        → wait for user decision

This way tasks that need 5 min or several hours both work without premature kills.
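
For Full Mode, where output goes to a log file, the same silent-round bookkeeping can be sketched in shell. This is an illustrative sketch rather than part of the skill: `next_silent_rounds` and `should_ask_user` are hypothetical helper names, and a change in log size is assumed to be a good enough stand-in for "new output".

```shell
# next_silent_rounds PREV_SIZE CUR_SIZE ROUNDS
# Prints the updated counter: 0 if the log size changed, otherwise ROUNDS + 1.
next_silent_rounds() {
  if [ "$2" -ne "$1" ]; then
    echo 0
  else
    echo $(($3 + 1))
  fi
}

# should_ask_user ROUNDS [MAX]
# Succeeds once the counter reaches MAX (default 144 = 12 hours at 5-minute polls).
should_ask_user() {
  [ "$1" -ge "${2:-144}" ]
}

# Example wiring (not run here); LOG_FILE is whatever log the task writes to:
#   prev=0; rounds=0
#   while sleep 300; do
#     cur=$(wc -c < "$LOG_FILE")
#     rounds=$(next_silent_rounds "$prev" "$cur" "$rounds")
#     prev=$cur
#     should_ask_user "$rounds" && { echo "silent for 12 hours, ask the user"; break; }
#   done
```

Keeping the counter logic in small pure functions makes the 12-hour threshold easy to check without waiting for a real task.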

Quick Mode caveats:

  • Session output is held in memory only — lost on OpenClaw restart (no disk persistence). For truly critical tasks, prefer Full Mode (tmux + log file).
  • In-memory output is capped by PI_BASH_MAX_OUTPUT_CHARS. Very verbose codex tasks may lose early output from process log. Use process log offset:0 limit:50 to check if the beginning is still available; if not, the cap was hit.
  • process is scoped per agent — you can only see sessions you started.

Full Mode — Features, Bugfixes, Refactors

For non-trivial tasks, use the full workflow below. This gives you:

  • Isolated worktree — no conflicts with other work
  • tmux session — mid-task steering without killing the agent
  • Task tracking — know what's running at all times
  • Quality gates — Definition of Done checklist
  • Smart retries — don't waste tokens on repeated failures

Full Workflow: Task → Merged PR

Step 1: Create Worktree

Isolate each task in its own worktree and branch:

TASK_ID="feat-custom-templates"
BRANCH="feat/$TASK_ID"
REPO_ROOT=$(git rev-parse --show-toplevel)
WORKTREE="/tmp/worktrees/$TASK_ID"

git worktree add -b "$BRANCH" "$WORKTREE" origin/main
cd "$WORKTREE"

# Install dependencies (adapt to your stack)
pnpm install   # or: npm install / go mod tidy / pip install -r requirements.txt
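
Because TASK_ID is reused as a directory name and later as a tmux session name, it helps to keep it to a conservative character set; tmux gives `.` and `:` special meaning in target names. A minimal sketch, assuming a hypothetical helper called `sanitize_task_id` that is not part of the skill:

```shell
# sanitize_task_id RAW
# Replaces every character outside [A-Za-z0-9_-] with "-" so the result is safe
# to use as a path component and as a tmux session name.
sanitize_task_id() {
  printf '%s' "$1" | tr -c 'A-Za-z0-9_-' '-'
}

# Example:
#   TASK_ID=$(sanitize_task_id "feat/auth.v2")   # feat-auth-v2
```

The branch name can still use the original, human-readable form; only the session and directory names need the restricted set.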

Step 2: Launch Agent in tmux

Start Codex in interactive mode (no exec) so you can steer mid-task. Important: Use tmux pipe-pane to log output — do NOT use | tee because it turns stdout into a pipe, which breaks interactive mode (codex detects !isatty(stdout) and may disable interactive features, breaking send-keys steering).

LOG_FILE="/tmp/worktrees/$TASK_ID/codex-output.log"

# 1. Create session (starts a shell — codex not launched yet)
tmux new-session -d -s "$TASK_ID" -c "$WORKTREE"

# 2. Attach logging BEFORE launching codex — prevents losing early output
#    stdbuf -oL = line-buffered writes, so tail -f shows progress in real time
#    (plain cat buffers when writing to a file, causing monitoring lag)
tmux pipe-pane -t "$TASK_ID" -o "stdbuf -oL cat >> $LOG_FILE"

# 3. Launch codex via send-keys — all output captured from the start
#    Exit code is appended to log on completion for reliable status detection
tmux send-keys -t "$TASK_ID" \
  'codex -c "model_reasoning_effort=high" \
   --dangerously-bypass-approvals-and-sandbox \
   '"'"'Your detailed prompt here.

When completely finished:
1. Commit all changes with descriptive messages
2. Push the branch: git push -u origin '"$BRANCH"'
3. Create PR: gh pr create --fill
4. Notify: openclaw system event --text "Done: '"$TASK_ID"'" --mode now'"'"' \
   ; echo "CODEX_EXIT=$?" >> '"$LOG_FILE" Enter

Why this order (session → pipe-pane → send-keys)?

  • No race condition: if you pass the command directly to tmux new-session, output produced before pipe-pane attaches is lost from the log file
  • Exit code captured: echo "CODEX_EXIT=$?" appends the exit code to the log, so you can distinguish success from crash (otherwise tmux discards it on session close)
  • Line-buffered logging: stdbuf -oL ensures tail -f $LOG_FILE works in real time

Why interactive mode (no exec)?

  • Allows mid-task steering via tmux send-keys
  • Agent can be redirected without killing and restarting
  • --dangerously-bypass-approvals-and-sandbox disables both approval prompts and the sandbox, so use it only where the environment itself provides isolation (a container or VM)

Step 3: Register Task

Track all active tasks in a JSON registry:

mkdir -p "$REPO_ROOT/.clawd"
TASKS_FILE="$REPO_ROOT/.clawd/active-tasks.json"

# Initialize if not exists
[ -f "$TASKS_FILE" ] || echo '{"tasks":[]}' > "$TASKS_FILE"

# Register
jq --arg id "$TASK_ID" --arg branch "$BRANCH" --arg wt "$WORKTREE" \
  '.tasks += [{
    "id": $id,
    "agent": "codex",
    "branch": $branch,
    "worktree": $wt,
    "tmuxSession": $id,
    "status": "running",
    "startedAt": (now|floor),
    "pr": null,
    "retries": 0,
    "checks": {}
  }]' "$TASKS_FILE" > /tmp/tasks.$$.json && mv /tmp/tasks.$$.json "$TASKS_FILE"
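
Once tasks are registered, the same file can answer "what is running right now?". A small sketch, assuming the registry layout shown above; `list_tasks_by_status` is a hypothetical helper name:

```shell
# list_tasks_by_status TASKS_FILE STATUS
# Prints one tab-separated line per matching task: id, branch, minutes since startedAt.
list_tasks_by_status() {
  jq -r --arg status "$2" '
    .tasks[]
    | select(.status == $status)
    | [.id, .branch, ((now - .startedAt) / 60 | floor)]
    | @tsv
  ' "$1"
}

# Example:
#   list_tasks_by_status "$TASKS_FILE" running
```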

Step 4: Monitor & Steer

# --- Status check ---

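# Is the tmux session still alive? The session's shell outlives codex, so this
# only shows that the session exists, not that the agent is still working.
tmux has-session -t "$TASK_ID" 2>/dev/null && echo "session alive" || echo "session gone"

# Check exit code (present once codex has exited; written by the exit-code capture in Step 2)
# Match digits only: the typed command line (CODEX_EXIT=$?) is echoed into the log too.
grep -Eo 'CODEX_EXIT=[0-9]+' "/tmp/worktrees/$TASK_ID/codex-output.log" | tail -1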

# --- Reading output ---

# Use the LOG FILE, not capture-pane, for long-running tasks.
# tmux capture-pane only holds ~2000 lines of scrollback — earlier output is silently
# dropped. The log file (via pipe-pane) retains everything.

# View recent output (clean — strips ANSI escape codes from colors/spinners)
sed 's/\x1b\[[0-9;]*[a-zA-Z]//g' "/tmp/worktrees/$TASK_ID/codex-output.log" | tail -100

# Follow output in real time (works because of stdbuf -oL in Step 2)
tail -f "/tmp/worktrees/$TASK_ID/codex-output.log"

# Search for errors (strip ANSI first for clean grep results)
sed 's/\x1b\[[0-9;]*[a-zA-Z]//g' "/tmp/worktrees/$TASK_ID/codex-output.log" \
  | grep -i "error\|fail\|panic"

# Quick glance via tmux pane (fine for short tasks, unreliable for long ones)
tmux capture-pane -t "$TASK_ID" -p -S -50

# --- Detecting stuck agents ---

# Check if codex is making file changes (no changes for a long time → may be stuck)
git -C "$WORKTREE" status --short

# Check if the same error appears repeatedly (loop detection)
sed 's/\x1b\[[0-9;]*[a-zA-Z]//g' "/tmp/worktrees/$TASK_ID/codex-output.log" \
  | grep -i "error" | sort | uniq -c | sort -rn | head -5

# --- Mid-task steering (DON'T kill — redirect!) ---

# Agent going the wrong direction?
tmux send-keys -t "$TASK_ID" "Stop. Focus on the API layer first, not the UI." Enter

# Agent missing context?
tmux send-keys -t "$TASK_ID" "The schema is in src/types/template.ts. Use that." Enter

# Agent's context window filling up?
tmux send-keys -t "$TASK_ID" "Focus only on these 3 files: api.ts, handler.ts, types.ts" Enter

# Agent needs test guidance?
tmux send-keys -t "$TASK_ID" "Run 'npm test -- --grep auth' to verify your changes." Enter

Monitoring cadence: Check every 5–10 minutes, not every 30 seconds. Agents need time to work.
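
The status and log-reading commands above can be wrapped in two small helpers. This is a sketch under stated assumptions: `strip_ansi` and `task_state` are hypothetical names, and the log is assumed to contain the CODEX_EXIT marker written in Step 2. The stripper builds the ESC byte with printf, since `\x1b` inside a sed pattern is a GNU extension.

```shell
# strip_ansi: filter stdin, removing ANSI CSI escape sequences (colors, cursor moves).
strip_ansi() {
  esc=$(printf '\033')
  sed "s/${esc}\[[0-9;]*[a-zA-Z]//g"
}

# task_state LOG_FILE
# Prints "running" until a numeric CODEX_EXIT marker appears in the log,
# then "succeeded" (exit 0) or "failed (N)".
task_state() {
  # Digits only, so the echoed command text "CODEX_EXIT=$?" is not a false match.
  marker=$(grep -Eo 'CODEX_EXIT=[0-9]+' "$1" 2>/dev/null | tail -1)
  case "${marker#CODEX_EXIT=}" in
    '') echo "running" ;;
    0)  echo "succeeded" ;;
    *)  echo "failed (${marker#CODEX_EXIT=})" ;;
  esac
}

# Example:
#   task_state "/tmp/worktrees/$TASK_ID/codex-output.log"
#   strip_ansi < "/tmp/worktrees/$TASK_ID/codex-output.log" | tail -100
```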

Step 5: Definition of Done

A PR is NOT ready for review until all checks pass:

✅ PR created              → gh pr list --head "$BRANCH"
✅ No merge conflicts       → gh pr view $PR_NUM --json mergeable -q '.mergeable'
✅ CI passing               → gh pr checks $PR_NUM
✅ AI code review passed    → at least one cross-model review (see Step 6)
✅ UI screenshots included  → (if applicable) screenshot in PR description

Quick inline check:

PR_NUM=$(gh pr list --head "$BRANCH" --json number -q '.[0].number')
echo "PR: #$PR_NUM"
gh pr checks "$PR_NUM"
gh pr view "$PR_NUM" --json mergeable -q '.mergeable'
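
To turn the checklist into a single yes/no gate, the individual results can be combined in one function. A minimal sketch: `dod_ready` is a hypothetical helper, and it assumes `gh pr view --json mergeable` reports MERGEABLE for a conflict-free PR and that `gh pr checks` exits non-zero while checks are failing or pending.

```shell
# dod_ready MERGEABLE CHECKS_EXIT REVIEW_PASSED
# Succeeds only when the PR is mergeable, CI exited 0, and the review flag is "true".
dod_ready() {
  [ "$1" = "MERGEABLE" ] && [ "$2" -eq 0 ] && [ "$3" = "true" ]
}

# Example wiring (not run here):
#   mergeable=$(gh pr view "$PR_NUM" --json mergeable -q '.mergeable')
#   gh pr checks "$PR_NUM" >/dev/null 2>&1; checks=$?
#   reviewed=$(jq -r --arg id "$TASK_ID" \
#     '.tasks[] | select(.id == $id) | .checks.codeReviewPassed // false' "$TASKS_FILE")
#   dod_ready "$mergeable" "$checks" "$reviewed" && echo "ready for review"
```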

Step 6: Multi-Model Code Review

Review with a different model than the one that wrote the code. Different models catch different issues:

DIFF=$(gh pr diff "$PR_NUM")

# Pick one option; both capture the review in REVIEW_OUTPUT so it can be posted to the PR.

# Option A: Claude reviews Codex's code (best for security & overengineering checks)
REVIEW_OUTPUT=$(echo "$DIFF" | claude -p \
  --append-system-prompt "You are a senior code reviewer. Be concise, flag only real issues." \
  "Review this PR diff. Focus on: bugs, edge cases, missing error handling,
   race conditions, security issues. Cite file and line numbers.
   Output: list of issues with severity (critical/warning/info).")

# Option B: Different Codex model reviews with analysis focus
REVIEW_OUTPUT=$(echo "$DIFF" | codex exec -s read-only \
  "Review this PR diff for logic errors, performance issues, and missing tests.")

Post review results to PR:

gh pr comment "$PR_NUM" --body "## AI Code Review

$REVIEW_OUTPUT"

Update task registry:

jq --arg id "$TASK_ID" \
  '(.tasks[] | select(.id == $id)).checks.codeReviewPassed = true' \
  "$TASKS_FILE" > /tmp/tasks.$$.json && mv /tmp/tasks.$$.json "$TASKS_FILE"

Step 7: Notify

If you included the notify command in the agent prompt (Step 2), the agent self-notifies on completion.

Otherwise, notify after DoD passes:

openclaw system event --text "✅ PR #$PR_NUM ready for review: $TASK_ID — all checks passed" --mode now

Update task status:

jq --arg id "$TASK_ID" --argjson pr "$PR_NUM" \
  '(.tasks[] | select(.id == $id)) |= (.status = "done" | .pr = $pr | .completedAt = (now|floor))' \
  "$TASKS_FILE" > /tmp/tasks.$$.json && mv /tmp/tasks.$$.json "$TASKS_FILE"

Step 8: Cleanup

After PR is merged:

git worktree remove "$WORKTREE" 2>/dev/null
git branch -d "$BRANCH" 2>/dev/null

# Remove from registry
jq --arg id "$TASK_ID" '.tasks = [.tasks[] | select(.id != $id)]' \
  "$TASKS_FILE" > /tmp/tasks.$$.json && mv /tmp/tasks.$$.json "$TASKS_FILE"

Smart Retry Strategy

When an agent fails, analyze the failure and adapt the prompt — don't just re-run blindly.

| Failure Type | Symptom | Retry Strategy |
| --- | --- | --- |
| Context overflow | Agent loops, produces garbage, or stops mid-task | Narrow scope: "Focus only on files X, Y, Z" |
| Wrong direction | Agent implements something unrelated to intent | Correct intent: "Stop. Customer wanted X, not Y. Spec: ..." |
| Missing info | Agent makes wrong assumptions about architecture | Add context: "Auth uses JWT, see src/auth/jwt.ts" |
| CI failure | Tests, lint, or typecheck fail after PR | Attach CI log: "Fix these test failures: ..." |
| Build failure | Dependencies missing or incompatible | Pre-install deps before retry |

Max 3 retries. After that, escalate to human.

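# Load the retry count from the registry so the cap survives across invocations
RETRY=$(jq -r --arg id "$TASK_ID" '.tasks[] | select(.id == $id) | .retries // 0' "$TASKS_FILE")
RETRY=$((RETRY + 1))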
if [ "$RETRY" -gt 3 ]; then
  openclaw system event --text "🚨 BLOCKED: $TASK_ID failed after 3 retries — needs human help" --mode now
  jq --arg id "$TASK_ID" '(.tasks[] | select(.id == $id)).status = "blocked"' \
    "$TASKS_FILE" > /tmp/tasks.$$.json && mv /tmp/tasks.$$.json "$TASKS_FILE"
  exit 1
fi

# Capture what went wrong — strip ANSI codes for clean error text
LOG_FILE="/tmp/worktrees/$TASK_ID/codex-output.log"
if [ -f "$LOG_FILE" ]; then
  FAILURE_LOG=$(sed 's/\x1b\[[0-9;]*[a-zA-Z]//g' "$LOG_FILE" | tail -500)
else
  FAILURE_LOG=$(tmux capture-pane -t "$TASK_ID" -p -S -200)
fi
CI_LOG=$(gh pr checks "$PR_NUM" 2>/dev/null || echo "no PR yet")
tmux kill-session -t "$TASK_ID" 2>/dev/null

# Mark retry boundary in log (so retries don't blend together)
echo "=== RETRY $RETRY — $(date -Iseconds) ===" >> "$LOG_FILE"

# Respawn: session first, pipe-pane second, send-keys third (same pattern as Step 2)
tmux new-session -d -s "$TASK_ID" -c "$WORKTREE"
tmux pipe-pane -t "$TASK_ID" -o "stdbuf -oL cat >> $LOG_FILE"
tmux send-keys -t "$TASK_ID" \
  'codex -c "model_reasoning_effort=high" \
   --dangerously-bypass-approvals-and-sandbox \
   '"'"'Previous attempt failed. Error output:
'"$FAILURE_LOG"'

CI status: '"$CI_LOG"'

Fix the issues above and complete the original task.
[...your enriched instructions here...]

When done: commit, push, gh pr create --fill, then run:
openclaw system event --text "Done: '"$TASK_ID"' (retry '"$RETRY"')" --mode now'"'"' \
   ; echo "CODEX_EXIT=$?" >> '"$LOG_FILE" Enter

# Update registry
jq --arg id "$TASK_ID" --argjson r "$RETRY" \
  '(.tasks[] | select(.id == $id)) |= (.retries = $r | .status = "running")' \
  "$TASKS_FILE" > /tmp/tasks.$$.json && mv /tmp/tasks.$$.json "$TASKS_FILE"

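Captured logs often contain quotes and other shell metacharacters, which makes them fragile to splice into a send-keys string. One way to sidestep that, sketched here as an assumption rather than the skill's own method, is to write the retry prompt to a file and let the shell inside tmux read it back. `write_retry_prompt` and the prompt file path are hypothetical.

```shell
# write_retry_prompt PROMPT_FILE FAILURE_LOG CI_LOG
# Writes the retry prompt verbatim; printf %s never re-interprets the log text.
write_retry_prompt() {
  {
    printf 'Previous attempt failed. Error output:\n%s\n\n' "$2"
    printf 'CI status: %s\n\n' "$3"
    printf 'Fix the issues above and complete the original task.\n'
  } > "$1"
}

# Example wiring (not run here; the path is illustrative and must not contain quotes):
#   PROMPT_FILE="/tmp/$TASK_ID.retry-prompt.txt"
#   write_retry_prompt "$PROMPT_FILE" "$FAILURE_LOG" "$CI_LOG"
#   tmux send-keys -t "$TASK_ID" \
#     "codex --dangerously-bypass-approvals-and-sandbox \"\$(cat '$PROMPT_FILE')\"; echo \"CODEX_EXIT=\$?\" >> '$LOG_FILE'" Enter
```

The command substitution is expanded by the shell running inside the tmux session, so the log text reaches codex as a single argument without being parsed as shell syntax.
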
Parallel Execution

Run multiple agents simultaneously on different tasks:

# Helper: launch codex in tmux with proper logging (session → pipe-pane → send-keys)
launch_codex() {
  local TASK="$1" WORKDIR="$2" PROMPT="$3"
  local LOG="$WORKDIR/codex-output.log"
  tmux new-session -d -s "$TASK" -c "$WORKDIR"
  tmux pipe-pane -t "$TASK" -o "stdbuf -oL cat >> $LOG"
  # PROMPT is wrapped in single quotes below, so it must not contain single quotes itself
  tmux send-keys -t "$TASK" \
    "pnpm install && codex --dangerously-bypass-approvals-and-sandbox '$PROMPT'; echo \"CODEX_EXIT=\$?\" >> $LOG" Enter
}

# Task 1: Feature
git worktree add -b feat/auth /tmp/worktrees/feat-auth origin/main
launch_codex feat-auth /tmp/worktrees/feat-auth "Implement JWT auth..."

# Task 2: Bugfix
git worktree add -b fix/payments /tmp/worktrees/fix-payments origin/main
launch_codex fix-payments /tmp/worktrees/fix-payments "Fix payment webhook..."

# Dashboard: check all agents (use log files, not capture-pane, for reliable output)
tmux ls
for s in $(tmux ls -F '#{session_name}' 2>/dev/null); do
  LOG="/tmp/worktrees/$s/codex-output.log"
  echo "=== $s ==="
  # The session's shell outlives codex, so detect completion via the numeric
  # CODEX_EXIT marker rather than tmux has-session.
  EXIT=$(grep -Eo 'CODEX_EXIT=[0-9]+' "$LOG" 2>/dev/null | tail -1)
  if [ -n "$EXIT" ]; then
    echo "(exited) $EXIT"
  elif [ -f "$LOG" ]; then
    sed 's/\x1b\[[0-9;]*[a-zA-Z]//g' "$LOG" | tail -5
  else
    echo "(no log yet)"
  fi
done

Codex-Specific Features

Reasoning Effort

Control how much the model "thinks" before acting:

# High: for complex logic, multi-file refactors
codex exec -c "model_reasoning_effort=high" --full-auto "refactor auth module"

# Medium: balanced (default)
codex exec --full-auto "add input validation"

# Low: for trivial/mechanical changes
codex exec -c "model_reasoning_effort=low" --full-auto "rename all instances of foo to bar"

Sandbox Modes

| Mode | Use Case |
| --- | --- |
| read-only | Code review, analysis, documentation |
| workspace-write / --full-auto | Feature implementation, bug fixes, refactors |
| danger-full-access | Installing dependencies, network access needed |
| --dangerously-bypass-approvals-and-sandbox | Full auto in containers (recommended for tmux workflow) |
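
The table can be read as a lookup from task kind to flag. A small sketch of that lookup; `sandbox_flags` and the task-kind labels are hypothetical names chosen for illustration, while the flags are the ones listed above.

```shell
# sandbox_flags KIND
# Prints the sandbox flag suggested by the table for a given kind of task.
sandbox_flags() {
  case "$1" in
    review|analysis|docs)      echo "-s read-only" ;;
    implement|bugfix|refactor) echo "--full-auto" ;;
    deps|network)              echo "-s danger-full-access" ;;
    *) echo "unknown task kind: $1" >&2; return 1 ;;
  esac
}

# Example (flags left unquoted on purpose so they split into separate words):
#   codex exec $(sandbox_flags review) "summarize the auth module"
```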

JSON Output

# Structured output for programmatic processing
codex exec --full-auto --json "implement and test the feature"

# Save to file
codex exec --full-auto -o results.txt "run analysis"

Resume Session

# Resume last session with a follow-up task
codex exec resume --last "now add tests for the feature you just built"

Best Practices

Prompt Quality

  • Include file paths: "The entry point is src/index.ts, config in src/config/"
  • Include schemas/types: Paste relevant type definitions into the prompt
  • Include test commands: "Verify with: npm test -- --grep auth"
  • Include commit convention: "Use conventional commits: feat:, fix:, chore:"
  • Include error logs: When retrying, always attach the failure output
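
The checklist above can be folded into a reusable prompt template. A minimal sketch, assuming a hypothetical `build_prompt` helper; the section labels are illustrative, not a format Codex requires.

```shell
# build_prompt TASK ENTRY_POINT TEST_CMD
# Prints a prompt that carries the file path, test command, and commit convention.
build_prompt() {
  cat <<EOF
Task: $1

Context:
- Entry point: $2
- Verify with: $3
- Use conventional commits: feat:, fix:, chore:
EOF
}

# Example:
#   build_prompt "Add rate limiting to POST /api/users" "src/index.ts" "npm test -- --grep auth"
```

Relevant type definitions or failure logs can be appended to the output before it is passed to codex.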

Scope Management

  • One task per agent — don't ask for "refactor everything"
  • Pre-install dependencies before launching the agent
  • Be specific: "Add rate limiting to POST /api/users" not "improve the API"
  • Use high reasoning effort for complex tasks, low for mechanical ones

When to Interrupt (Ask Human)

  • Destructive operations (drop tables, force push main)
  • Security decisions (expose credentials, change auth)
  • Ambiguous requirements with significant trade-offs
  • All other decisions: proceed autonomously

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
