modal-finetuning

Use this skill for Modal GPU fine-tuning and post-training: LLM SFT, LoRA, QLoRA, diffusion LoRA, YOLO-style vision training, and GRPO. Trigger when the user wants to adapt model weights, resume or checkpoint training, choose GPUs, stage datasets, or persist training artifacts on Modal. Do not use it for inference serving, detached job orchestration, sandbox lifecycle, or from-scratch pretraining.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.


Install skill "modal-finetuning" with this command: npx skills add .

Modal Fine-Tuning

Quick Start

  1. Verify the actual local Modal environment before writing code:
     modal --version
     python -c "import modal,sys; print(modal.__version__); print(sys.executable)"
     modal profile current
     • Do not assume the default Python interpreter matches the environment behind the modal CLI.
  2. Classify the training goal before writing code:
     • Adapt a text model with supervised fine-tuning, LoRA, or QLoRA.
     • Train a diffusion LoRA for image generation or style adaptation.
     • Fine-tune a YOLO-style vision model on labeled images.
     • Run GRPO as RL-style post-training.
  3. Read references/training-playbook.md, then exactly one primary workflow reference.
  4. Ground every implementation in the actual task: base model, dataset location and format, GPU type and count, checkpoint destination, secrets, and what should happen after training finishes.

Choose the Workflow

  • Use the LLM path for supervised fine-tuning, LoRA, or QLoRA on a single node. Prefer Unsloth or another PEFT-style workflow unless the user explicitly needs full-parameter updates. Read references/llm-sft-and-lora.md.
  • Use the diffusion path when the goal is image-generation adaptation, style transfer, or a small concept or brand LoRA. Keep checkpoints and sample outputs easy to inspect later. Read references/vision-and-diffusion-finetuning.md.
  • Use the YOLO path when the task is labeled vision training for detection or related downstream evaluation. Keep dataset staging and exported weights separate from later serving. Read references/vision-and-diffusion-finetuning.md.
  • Use GRPO only for reinforcement-learning-style post-training. Prefer TRL for the simpler single-node path, and use verl when the task needs a more explicit rollout or trainer split, or more advanced vLLM-backed RL plumbing. Read references/rl-post-training.md.

Default Rules

  • Prefer PEFT methods such as LoRA or QLoRA before full fine-tuning unless the user explicitly needs weight updates across the whole model.
  • Start with a cheap smoke test on a tiny ungated model and tiny dataset before a long or expensive run. Use it to validate image builds, trainer API compatibility, dataset formatting, checkpoint paths, and one saved sample artifact.
  • Persist datasets, pretrained weights, checkpoints, merged adapters, and sample outputs in Modal Volumes. Do not rely on ephemeral container disk for anything that must survive retries or later inspection.
  • Keep Hugging Face, Weights & Biases, Roboflow, and similar credentials in Modal Secrets.
  • Design long runs so they can resume from checkpoints. Add retries only when resume behavior is correct.
  • Set timeout= intentionally for long training jobs and keep one stateful container per training run by default. Prefer single_use_containers=True when retries should start from a fresh container.
  • Use @app.local_entrypoint or a plain local launcher to expose hyperparameters and dataset switches as CLI arguments instead of hard-coding every experiment.
  • Keep the first version single-node unless the user explicitly asks for clusters. Modal multi-node training is a separate advanced path and is currently a beta workflow.
  • Store final artifacts in a layout that makes handoff obvious: base model cache, dataset cache, checkpoint tree, and final exported weights or adapters.
  • Separate training outputs from later serving. Hand the exported model or adapter off to a serving workflow instead of mixing training and HTTP serving in one artifact.
  • If the task is really about serving the tuned model behind HTTP, OpenAI-compatible APIs, vLLM, or SGLang, stop and use modal-llm-serving.
  • If the task is really about detached job orchestration, queueing, .map, .spawn, or @modal.batched, stop and use modal-batch-processing.
  • If the task is really about sandbox lifecycle, tunnels, Sandbox.exec(...), or secure runtime code execution infrastructure, stop and use modal-sandbox.

Validate

  • Run npx skills add . --list after editing the package metadata or skill descriptions.
  • Keep evals/evals.json and evals/trigger-evals.json aligned with the actual workflow boundaries of the skill.


Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

  • modal-sandbox (General). No summary provided by upstream source. Repository Source · Needs Review.
  • modal-llm-serving (General). No summary provided by upstream source. Repository Source · Needs Review.
  • modal-batch-processing (General). No summary provided by upstream source. Repository Source · Needs Review.