claw-sergeant

Train autonomous OpenClaw AI agents through LLM-guided curriculum design and multi-turn dialogue evaluation. Use this skill whenever the user wants to train, improve, or evaluate an OpenClaw agent's capabilities, design a training curriculum for an AI agent, run a training session with iterative feedback loops, or test an agent's readiness across specific skill areas. Also use when the user mentions "ClawSergeant", "agent training", "openclaw training", or wants to strengthen an AI agent's performance in areas like programming, writing, analysis, or communication.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

To install, copy the command below and send it to your AI assistant:

npx skills add myismyname/clawsergeant

ClawSergeant: Boosting OpenClaw Agents from AI Feedback

ClawSergeant trains OpenClaw agents through a structured, LLM-driven pipeline. A Trainer LLM designs curriculum, generates training tasks, and adapts its teaching dynamically based on the agent's responses. A separate Evaluator LLM objectively scores each response, creating a feedback loop that drives iterative improvement.

Architecture Overview

User Intent ──────────────────────→ LLM (Curriculum Designer)
                                          ↓
                                   Curriculum JSON (stages, tasks, criteria)
                                          ↓
Training Session Loop:
    Trainer LLM → crafts message → openclaw CLI → Claw Agent → reply
                                                      ↓
                                          Evaluator LLM → score + feedback
                                                      ↓
                              record to .claw_sergeant_accumulated_lessons/ ←──┘
                                          ↓
                                  (if failed) → Trainer LLM retries with feedback
                                          ↓
                                  (if stage passed) → stage summary for memory consolidation
                                          ↓
                    [Curriculum Pattern] → record to .claw_sergeant_accumulated_lessons/

Training Pipeline

Phase 1: Curriculum Design

The user's training intent is passed directly as input. The LLM generates a multi-stage curriculum as structured JSON based on this intent. The user reviews and approves the curriculum before training begins.

Each curriculum contains:

  • Title and overview of the training program
  • Target persona describing the ideal agent after training
  • 3–5 stages, each with:
    • Name, description, and learning objectives
    • 2–4 training tasks with scenario descriptions and expected behaviors
    • Evaluation criteria with passing standards
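A generated curriculum might look like the sketch below. Only `title`, `overview`, `target_persona`, and the `task_id` format are confirmed by the results schema later in this document; the remaining field names are illustrative assumptions.

```json
{
  "title": "Rigorous Programming Assistant",
  "overview": "...",
  "target_persona": "...",
  "stages": [
    {
      "name": "Code Review Fundamentals",
      "description": "...",
      "objectives": ["..."],
      "tasks": [
        {
          "task_id": "1.1",
          "scenario": "...",
          "expected_behaviors": ["..."]
        }
      ],
      "evaluation_criteria": ["..."]
    }
  ]
}
```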

Phase 2: Training Execution

For each stage and task, the system runs a dialogue loop:

  1. Trainer LLM generates a task message tailored to the agent (it never sees hardcoded prompts — everything is dynamically composed)
  2. Message is sent to the Claw Agent via openclaw agent CLI
  3. Agent's reply is captured and fed back to the Trainer's conversation context
  4. Evaluator LLM scores the reply (1–10) and reports strengths, weaknesses, and improvement suggestions
  5. If the task is not passed and retries remain, the Trainer generates a follow-up message incorporating the evaluation feedback
  6. After a stage passes, the agent receives a summary prompt to internalize lessons learned
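The per-task portion of this loop (steps 1–5) can be sketched as follows. The trainer, agent, and evaluator are injected as callables so the control flow is visible without a live LLM; the names and the passing bar are illustrative, not the actual ClawSergeant API.

```python
from dataclasses import dataclass

@dataclass
class Evaluation:
    score: int      # 1-10, as reported by the Evaluator LLM
    feedback: str   # improvement suggestions fed back to the Trainer

    @property
    def passed(self) -> bool:
        return self.score >= 7  # assumed passing bar for illustration

def run_task(trainer, agent, evaluator, task: str, max_attempts: int = 2) -> Evaluation:
    """Run one training task with retry-on-feedback, mirroring steps 1-5."""
    feedback = None
    evaluation = None
    for _ in range(max_attempts):
        message = trainer(task, feedback)    # step 1: compose task message
        reply = agent(message)               # steps 2-3: send, capture reply
        evaluation = evaluator(task, reply)  # step 4: score the reply
        if evaluation.passed:
            return evaluation
        feedback = evaluation.feedback       # step 5: retry with feedback
    return evaluation
```

With stub callables, a passing task returns after the first attempt, while a failing one exhausts `max_attempts` and returns the last evaluation for logging.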

Environment Setup

Create a .env file in the project root with:

LLM_API_KEY=<your-api-key>          # Required: API key for the LLM
LLM_BASE_URL=https://api.openai.com/v1  # Optional: OpenAI-compatible endpoint
LLM_MODEL=gpt-4o                    # Optional: model identifier
CLAW_RECIPIENT=+15555550123         # Required: target agent's address
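A minimal sketch of reading these variables at startup (the real project loads `.env` via python-dotenv first; the `load_settings` function name and dict shape are assumptions):

```python
import os

def load_settings() -> dict:
    """Read ClawSergeant settings from the environment, applying the
    documented defaults for the optional variables."""
    api_key = os.getenv("LLM_API_KEY")
    if not api_key:
        raise RuntimeError("LLM_API_KEY is required - set it in .env")
    return {
        "api_key": api_key,
        "base_url": os.getenv("LLM_BASE_URL", "https://api.openai.com/v1"),
        "model": os.getenv("LLM_MODEL", "gpt-4o"),
        "recipient": os.environ["CLAW_RECIPIENT"],  # required: raises KeyError if unset
    }
```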

Running the Training

Full Training Session

python main.py "An efficient, rigorous programming assistant"

The training intent is passed as a command-line argument. ClawSergeant designs a curriculum, presents it for approval, and runs the training session automatically. Results are saved to training_results.json.

Phase-by-Phase Testing

Use test_phases.py to verify each component independently before running a full session:

python test_phases.py 1    # Verify LLM API connectivity
python test_phases.py 2    # Test curriculum generation
python test_phases.py 3    # Test Claw agent communication
python test_phases.py 4    # Run a single-task training round
python test_phases.py all  # Run all phases sequentially

Always start with phase 1 to confirm the LLM connection works, then progress through subsequent phases.

Configuration

All training parameters are centralized in config.py:

| Parameter | Default | Purpose |
|---|---|---|
| STAGE_COUNT_MIN / MAX | 3 / 5 | Number of training stages |
| TASKS_PER_STAGE_MIN / MAX | 2 / 4 | Tasks per stage |
| CURRICULUM_TEMPERATURE | 0.4 | LLM temperature for curriculum design |
| TRAINER_TEMPERATURE | 0.7 | LLM temperature for training messages |
| EVALUATOR_TEMPERATURE | 0.2 | LLM temperature for evaluation (low = strict) |
| MAX_ATTEMPTS_PER_TASK | 2 | Retries per task before moving on |
| STAGE_PASS_THRESHOLD | 0.6 | Fraction of tasks needed to pass a stage |

Adjust STAGE_PASS_THRESHOLD higher (e.g., 0.8) for stricter training, or lower temperatures for more deterministic evaluations.
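Based on the table above, config.py plausibly amounts to a set of module-level constants (the grouping shown here is an assumption; the names and defaults come from the table):

```python
# Curriculum shape
STAGE_COUNT_MIN, STAGE_COUNT_MAX = 3, 5
TASKS_PER_STAGE_MIN, TASKS_PER_STAGE_MAX = 2, 4

# LLM temperatures per role
CURRICULUM_TEMPERATURE = 0.4  # fairly deterministic curriculum structure
TRAINER_TEMPERATURE = 0.7     # more varied training messages
EVALUATOR_TEMPERATURE = 0.2   # strict, repeatable scoring

# Loop control
MAX_ATTEMPTS_PER_TASK = 2
STAGE_PASS_THRESHOLD = 0.6    # e.g. 3 of 4 tasks (0.75) clears a 0.6 threshold
```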

Key Components

| File | Role |
|---|---|
| main.py | Entry point — orchestrates curriculum design → approval → training execution |
| trainer.py | Training session controller — manages dialogue loop and captures per-task/stage learnings |
| curriculum.py | Curriculum data model and LLM-based generation |
| claw_agent.py | Wraps openclaw agent CLI for agent communication |
| llm_handler.py | Async LLM client with conversation history management |
| learning_logger.py | Structured experience logger — records training insights and writes to OpenClaw MEMORY.md |
| config.py | Centralized training parameters |
| test_phases.py | Step-by-step pipeline verification |

Training Results

After a session completes, training_results.json contains:

{
  "curriculum": {
    "title": "...",
    "overview": "...",
    "target_persona": "...",
    "stages_total": 4,
    "stages_passed": 3
  },
  "stage_reports": [
    {
      "stage_id": 1,
      "stage_name": "...",
      "passed": true,
      "overall_feedback": "...",
      "tasks": [
        {
          "task_id": "1.1",
          "passed": true,
          "score": 8,
          "strengths": ["..."],
          "weaknesses": ["..."],
          "feedback": "..."
        }
      ]
    }
  ]
}

Experience Recording

Training experiences are automatically recorded throughout the session. Every task evaluation, stage result, and infrastructure error is logged to .claw_sergeant_accumulated_lessons/ as structured markdown entries for future reference.

After the session completes, a summary is written to ~/.openclaw/workspace/MEMORY.md containing the training timestamp, curriculum details, stage pass/fail results, and a pointer to the full logs. This allows the Claw agent to reference its training history in future sessions. If the OpenClaw workspace is not found, this step is silently skipped.

Troubleshooting

  • LLM connection fails: Run python test_phases.py 1 to verify API key and endpoint. Check LLM_BASE_URL points to a valid OpenAI-compatible API.
  • Claw agent timeout: The default timeout is 120 seconds. If the agent is slow to respond, check network connectivity and the openclaw CLI installation.
  • Curriculum has no stages: The LLM may have returned malformed JSON. Try lowering CURRICULUM_TEMPERATURE or switching to a more capable model.
  • All tasks fail: Review evaluation criteria — they may be too strict. Lower STAGE_PASS_THRESHOLD or increase MAX_ATTEMPTS_PER_TASK in config.py.

Dependencies

  • Python 3.11+
  • httpx — async HTTP client for LLM API calls
  • loguru — structured logging
  • python-dotenv — environment variable management
  • openclaw CLI — must be installed and accessible in PATH

