ml-ops

Deep MLOps workflow—reproducible training, experiment tracking, packaging, deployment, monitoring (drift, performance), governance, and rollback for ML. Use when shipping models to production or hardening ML pipelines.


Install skill "ml-ops" with this command: npx skills add clawkk/ml-ops

MLOps (Deep Workflow)

MLOps connects research velocity to production reliability: version data, code, and artifacts together; monitor behavior after deploy.

When to Offer This Workflow

Trigger conditions:

  • First production model; batch or online serving
  • Drift, bias, or latency SLO misses
  • Compliance needs for lineage and explainability

Initial offer:

Use six stages: (1) problem & risk class, (2) data & reproducibility, (3) training & evaluation, (4) packaging & deployment, (5) monitoring & feedback, (6) governance & rollback. Confirm batch vs. real-time serving and the regulatory tier up front.


Stage 1: Problem & Risk Class

Goal: Align the ML effort to decision risk (credit or health decisions carry far more risk than recommendations).

Exit condition: Offline and online success metrics defined.
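
One way to make this exit condition concrete is a small, versioned metrics spec committed alongside the code. Every name and threshold below is an illustrative example, not a required schema:

```python
# Illustrative success-metric spec; all fields and values are hypothetical.
# Committing it to version control turns Stage 1's exit condition into an artifact.
METRICS_SPEC = {
    "risk_class": "high",  # e.g., credit or health decisions, per Stage 1
    "offline": {
        "auroc_min": 0.82,            # must hold on the held-out test split
        "calibration_ece_max": 0.05,  # expected calibration error ceiling
    },
    "online": {
        "kpi_delta_vs_control_max": 0.02,  # business guardrail vs. control
        "p95_latency_ms_max": 150,         # serving SLO
    },
}
```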


Stage 2: Data & Reproducibility

Goal: Snapshot training data; deterministic pipelines; PII handling.

Practices

  • Feature stores are optional but valuable for training-serving consistency
  • Keep secrets out of notebooks; run training as orchestrated jobs

Exit condition: A given run ID reproduces the artifact hash within agreed bounds.
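
A minimal sketch of that check, assuming file-based data snapshots: hash the snapshot, pin the seed, and log both under a run ID so a rerun can be compared against the recorded artifact hash. Standard library only; adapt to your tracker (MLflow, W&B, etc.):

```python
import hashlib
import json
import random
import time

def sha256_of(path: str) -> str:
    """Content hash of a data snapshot; serves as its immutable ID."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def record_run(data_path: str, seed: int, log_path: str = "runs.jsonl") -> dict:
    """Pin everything a rerun needs: data hash, seed, timestamp."""
    random.seed(seed)  # seed numpy/torch here too if you use them
    run = {
        "run_id": f"run-{int(time.time())}",
        "data_sha256": sha256_of(data_path),
        "seed": seed,
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(run) + "\n")
    return run
```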


Stage 3: Training & Evaluation

Goal: Split train/validation/test without leakage; take special care with time-series splits (a leakage-safe split sketch follows the practices below).

Practices

  • Model card with limits and metrics
  • Fairness slices where policy requires
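
For the time-series caveat in the goal above, scikit-learn's TimeSeriesSplit yields folds in which validation data always postdates training data. A sketch with toy data (model fitting elided):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Toy time-ordered data: rows must already be sorted by event time.
X = np.arange(100).reshape(-1, 1)
y = (X.ravel() % 7 == 0).astype(int)

# Each fold trains only on the past and validates on the future,
# avoiding the look-ahead leakage a random shuffle would introduce.
for train_idx, val_idx in TimeSeriesSplit(n_splits=4).split(X):
    assert train_idx.max() < val_idx.min()  # strictly past -> future
    # model.fit(X[train_idx], y[train_idx]); score on X[val_idx], y[val_idx]
```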

Stage 4: Packaging & Deployment

Goal: Immutable artifacts; canary or shadow before full cutover.

Practices

  • Pin the model and its preprocessing code version together (see the manifest sketch below)

Exit condition: Rollback to previous artifact id documented.
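
A sketch of the pinning practice and the documented rollback, assuming a git repo and a model registry; registry.promote is a hypothetical call, so substitute your registry's actual API:

```python
import json
import subprocess

def write_manifest(model_sha256: str, path: str = "model_manifest.json") -> dict:
    """Pin the model artifact and its preprocessing code to one immutable ID."""
    manifest = {
        "artifact_id": model_sha256[:12],
        "model_sha256": model_sha256,
        # Preprocessing travels with the model: same commit, or it doesn't ship.
        "code_git_sha": subprocess.check_output(
            ["git", "rev-parse", "HEAD"], text=True
        ).strip(),
    }
    with open(path, "w") as f:
        json.dump(manifest, f, indent=2)
    return manifest

def rollback(previous_artifact_id: str) -> None:
    """Repoint serving at the prior pinned artifact."""
    # registry.promote(previous_artifact_id, stage="production")  # hypothetical
    print(f"rolling back to {previous_artifact_id}")
```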


Stage 5: Monitoring & Feedback

Goal: Track data drift, concept drift, and latency; tie business KPIs to model decisions.

Practices

  • Human review queue for low-confidence predictions when needed
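
For data drift, one self-contained option is the Population Stability Index per feature, computed against a training-time reference sample. A NumPy sketch; the 0.1/0.25 thresholds are a common rule of thumb, not a standard:

```python
import numpy as np

def psi(reference: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between training-time and live feature values.
    Rule of thumb: < 0.1 stable, 0.1-0.25 watch, > 0.25 alert."""
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1))
    current = np.clip(current, edges[0], edges[-1])  # keep live values in range
    ref_pct = np.histogram(reference, edges)[0] / len(reference)
    cur_pct = np.histogram(current, edges)[0] / len(current)
    ref_pct = np.clip(ref_pct, 1e-6, None)  # avoid log(0) on empty bins
    cur_pct = np.clip(cur_pct, 1e-6, None)
    return float(np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct)))

rng = np.random.default_rng(0)
ref = rng.normal(0.0, 1.0, 10_000)   # stand-in for the training distribution
live = rng.normal(0.5, 1.0, 10_000)  # shifted mean: should land well above 0.1
print(round(psi(ref, live), 3))
```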

Stage 6: Governance & Rollback

Goal: Require approvals for retrain and deploy; keep an audit trail; A/B test big changes.
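
A minimal audit-trail sketch for the approval gate. File-based JSON lines for illustration only; a real deployment would write to a durable, access-controlled store:

```python
import json
import time

AUDIT_LOG = "audit.jsonl"  # append-only decision record

def request_deploy(artifact_id: str, approver: str, approved: bool) -> bool:
    """Gate a deploy on a recorded human decision; log it either way."""
    event = {
        "ts": time.time(),
        "action": "deploy",
        "artifact_id": artifact_id,
        "approver": approver,
        "approved": approved,
    }
    with open(AUDIT_LOG, "a") as f:
        f.write(json.dumps(event) + "\n")
    return approved  # caller proceeds only on True

if request_deploy("abc123def456", approver="ml-lead", approved=True):
    pass  # trigger the deployment pipeline here
```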


Final Review Checklist

  • Offline metrics aligned with business risk
  • Data and code reproducibility
  • Packaged artifacts with versioning and rollback
  • Online monitoring and drift strategy
  • Governance and approval path

Tips for Effective Guidance

  • Training-serving skew is a top source of bugs; feature parity tests help (see the sketch after this list).
  • Offline accuracy ≠ online business outcome.
  • Fairness needs explicit slices, not one headline number.
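
A parity test runs the training-time and serving-time feature code on the same raw records and asserts they agree. offline_feature and serving_feature below are placeholders; in a real test, import both from their actual modules:

```python
import math

# Placeholders for the two real implementations of the same feature.
def offline_feature(raw: dict) -> float:
    return math.log1p(raw["amount"])

def serving_feature(raw: dict) -> float:
    return math.log1p(raw["amount"])  # must match offline_feature exactly

def test_feature_parity():
    # Include boundary and large values, where skew usually hides.
    for raw in [{"amount": 0.0}, {"amount": 12.5}, {"amount": 1e6}]:
        assert abs(offline_feature(raw) - serving_feature(raw)) < 1e-9

test_feature_parity()
```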

Handling Deviations

  • LLM-heavy products: lean on eval harnesses and prompt versioning (see llm-evaluation).
  • Tiny teams: start with artifact registry + dashboards before a full feature store.

