ml-engineer

Expert ML system builder covering the complete ML lifecycle.

Safety Notice

This listing is imported from the skills.sh public index metadata. Review the upstream SKILL.md and repository scripts before running anything.

Install skill "ml-engineer" with this command: npx skills add anton-abyzov/specweave/anton-abyzov-specweave-ml-engineer

ML Engineer

⚠️ Chunking Rule

Large ML pipelines often run to 1000+ lines of code. Generate ONE stage per response, in this order:

1. Data/EDA → 2. Features → 3. Training → 4. Evaluation → 5. Deployment

Core Capabilities

Feature Engineering

  • Feature extraction, selection, and transformation

  • Feature importance analysis (permutation, SHAP)

  • Feature store integration patterns

  • Automated feature generation
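As a rough illustration of the permutation-based importance analysis listed above, the following sketch uses scikit-learn's `permutation_importance` on held-out data. The synthetic dataset, the random forest, and all parameter values are assumptions chosen for the example, not part of the skill itself.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data (assumption for illustration): with shuffle=False the first
# 3 columns are informative and the remaining 5 are pure noise.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=3, n_redundant=0,
    shuffle=False, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Shuffle one column at a time on held-out data and record the score drop.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)

# Feature indices sorted from most to least important.
ranking = result.importances_mean.argsort()[::-1]
for idx in ranking:
    mean = result.importances_mean[idx]
    std = result.importances_std[idx]
    print(f"feature_{idx}: {mean:.3f} ± {std:.3f}")
```

Computing the importances on the test split rather than the training split keeps the ranking from rewarding features the model has merely memorized.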

Model Training

  • Baseline comparison (always start with a baseline!)

  • Cross-validation (k-fold, stratified, time-based)

  • Hyperparameter tuning (grid search, random search, Bayesian optimization)

  • AutoML integration (TPOT, Auto-sklearn, H2O)
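The difference between the stratified and time-based cross-validation schemes mentioned above can be sketched with scikit-learn's splitters. The toy arrays below are assumptions for illustration only: 20 samples in time order with 25% positive labels.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, TimeSeriesSplit

X = np.arange(20).reshape(-1, 1)   # 20 samples, assumed to be in time order
y = np.array([0] * 15 + [1] * 5)   # imbalanced labels (illustrative)

# Stratified k-fold: every test fold keeps the overall class ratio.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
positives_per_fold = [int(y[test_idx].sum()) for _, test_idx in skf.split(X, y)]
print("positives in each stratified test fold:", positives_per_fold)

# Time-based split: training indices always precede the test indices,
# so the model is never evaluated on data older than what it trained on.
tss = TimeSeriesSplit(n_splits=4)
time_folds = [
    (int(train_idx.max()), int(test_idx.min()), int(test_idx.max()))
    for train_idx, test_idx in tss.split(X)
]
for last_train, first_test, last_test in time_folds:
    print(f"train ends at {last_train}, test covers {first_test}..{last_test}")
```

Plain k-fold would be acceptable for balanced, independent samples; the stratified and time-based variants matter when classes are rare or when rows are ordered in time.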

Model Evaluation

  • Classification: accuracy, precision, recall, F1, AUC-ROC

  • Regression: RMSE, MAE, R², MAPE

  • Ranking: NDCG, MAP, MRR

  • Custom business metrics
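The ranking metrics are less commonly available out of the box than the classification ones, so here is a minimal pure-Python sketch of NDCG and MRR. It assumes each query is represented as a list of relevance grades in ranked order, and it uses the linear-gain form of DCG; some implementations use an exponential gain (2^rel − 1) instead. The function names are illustrative.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items (linear gain)."""
    return sum(
        rel / math.log2(position + 2)  # position 0 gets discount log2(2) = 1
        for position, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def reciprocal_rank(relevances):
    """1 / rank of the first relevant item, or 0.0 if none is relevant."""
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries):
    """Average reciprocal rank across a list of ranked result lists."""
    return sum(reciprocal_rank(q) for q in queries) / len(queries)


print(ndcg_at_k([3, 2, 1], k=3))               # ideal ordering
print(mean_reciprocal_rank([[1, 0], [0, 1]]))  # hits at rank 1 and rank 2
```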

Explainability

  • SHAP values for feature importance

  • LIME for local explanations

  • Partial dependence plots

  • Model-agnostic interpretability
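To make the idea behind partial dependence concrete, the sketch below computes it by hand: fix one feature to each grid value, predict for every row, and average. This is a hand-rolled illustration that only needs a `predict` function, which is what makes it model-agnostic; the helper name `partial_dependence_1d`, the synthetic data, and the random forest are assumptions for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic regression data (assumption): only feature 0 drives the target.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(400, 3))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=400)

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)


def partial_dependence_1d(predict, X, feature, grid):
    """Average prediction when `feature` is forced to each value in `grid`."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # overwrite the feature for every row
        averages.append(predict(X_mod).mean())
    return np.array(averages)


grid = np.linspace(-1, 1, 9)
pd_feature0 = partial_dependence_1d(model.predict, X, feature=0, grid=grid)
pd_feature2 = partial_dependence_1d(model.predict, X, feature=2, grid=grid)

print("feature 0:", np.round(pd_feature0, 2))  # should rise with the grid
print("feature 2:", np.round(pd_feature2, 2))  # should stay roughly flat
```

Plotting these arrays against `grid` gives a partial dependence plot; a steep curve signals a feature the model relies on, a flat one a feature it largely ignores.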

Best Practices

1. Always establish a baseline first

# train_baseline is a placeholder for your own helper, not a library function
baseline = train_baseline(strategies=["random", "popularity", "rule-based"])

The new model must beat the baseline by a significant margin.
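One possible way to realize a baseline comparison for a classification task is scikit-learn's `DummyClassifier`. In this sketch the "stratified" strategy stands in for a random baseline and "most_frequent" for a popularity-style baseline; that mapping, the synthetic dataset, and the logistic regression candidate are all assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (assumption for illustration).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=4,
    n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

candidates = {
    "most_frequent": DummyClassifier(strategy="most_frequent"),
    "stratified_random": DummyClassifier(strategy="stratified", random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

# Fit every candidate and score it with the same metric on the same split.
scores = {}
for name, clf in candidates.items():
    clf.fit(X_train, y_train)
    scores[name] = f1_score(
        y_test, clf.predict(X_test), average="macro", zero_division=0
    )

for name, score in scores.items():
    print(f"{name}: macro-F1 = {score:.3f}")
```

If the real model's score is not clearly above both dummy scores, that is usually a sign of a data or feature problem rather than a tuning problem.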

2. Use proper cross-validation

from sklearn.model_selection import cross_val_score

cv_scores = cross_val_score(model, X, y, cv=5, scoring='f1_macro')
print(f"CV Score: {cv_scores.mean():.3f} ± {cv_scores.std():.3f}")
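A fuller sketch of the same idea, under stated assumptions: preprocessing is wrapped in a `Pipeline` so the scaler is refit inside each fold (avoiding leakage from the validation data), and the same splitter is then reused for a random hyperparameter search. The dataset, the model, and the search range for `C` are illustrative choices.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (
    RandomizedSearchCV, StratifiedKFold, cross_val_score,
)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption for illustration).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=4, random_state=0
)

# Scaling lives inside the pipeline, so each fold fits it on training data only.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

cv_scores = cross_val_score(pipeline, X, y, cv=cv, scoring="f1_macro")
print(f"CV Score: {cv_scores.mean():.3f} ± {cv_scores.std():.3f}")

# Random search over the regularization strength, scored with the same folds.
search = RandomizedSearchCV(
    pipeline,
    param_distributions={"clf__C": loguniform(1e-3, 1e2)},
    n_iter=10, cv=cv, scoring="f1_macro", random_state=0,
)
search.fit(X, y)
print("best C:", search.best_params_["clf__C"])
print(f"best CV score: {search.best_score_:.3f}")
```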

3. Track everything

import mlflow

mlflow.log_params(model.get_params())
mlflow.log_metrics({"accuracy": acc, "f1": f1})
mlflow.log_artifact("model.pkl")

4. Add explainability

import shap

explainer = shap.TreeExplainer(model)  # TreeExplainer expects a tree-based model
shap_values = explainer.shap_values(X_test)

Framework Support

  • scikit-learn and compatible libraries: RandomForest, plus XGBoost and LightGBM (separate packages that expose scikit-learn-style estimators)

  • PyTorch: Neural networks, custom architectures

  • TensorFlow/Keras: Deep learning models

  • AutoML: TPOT, Auto-sklearn, H2O AutoML

When to Use

  • Building ML features end-to-end

  • Feature engineering and selection

  • Model training and evaluation

  • Hyperparameter optimization

  • Model explainability requirements

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
