lora-finetune

LoRA fine-tuning pipeline for Stable Diffusion on Apple Silicon — dataset prep, training, and evaluation with LLM-as-judge scoring. Use when fine-tuning image generation models for consistent style, custom characters, or domain-specific visuals. Requires Python with torch and diffusers.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "lora-finetune" with this command: npx skills add nissan/lora-finetune

LoRA Fine-Tuning (Apple Silicon)

Train custom LoRA adapters for Stable Diffusion 1.5 on Mac hardware. Tested on M4 24GB — produces 3.1MB weight files in ~15 minutes at 500 steps.

Hardware Requirements

Config       Model           Resolution  VRAM
M4 24GB      SD 1.5          512×512     ✅ Works
M4 24GB      SDXL            512×512     ⚠️ Tight, may OOM
M4 24GB      FLUX.1-schnell  Any         ❌ OOMs
M4 Pro 48GB  SDXL            1024×1024   ✅ Estimated

Training Pipeline

  1. Prepare dataset: 15-25 images in consistent style, 512×512, with text captions
  2. Train LoRA: 500 steps, learning rate 1e-4, rank 4
  3. Evaluate: Generate test images, compare base vs LoRA vs reference (Gemini/DALL-E)
  4. Score: LLM-as-judge rates each on style consistency, quality, prompt adherence
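
Step 1's pairing convention (each PNG sitting next to a same-named .txt caption, as shown under Quick Start) can be sanity-checked with a short stdlib script. `check_dataset` is a hypothetical helper sketched here, not one of the skill's scripts:

```python
from pathlib import Path

def check_dataset(data_dir):
    """Report caption coverage for a LoRA training folder.

    Expects image_NNN.png files each paired with an image_NNN.txt
    caption. Returns (paired, missing) lists of image file names.
    """
    images = sorted(Path(data_dir).glob("*.png"))
    paired, missing = [], []
    for img in images:
        # Caption lives next to the image with the same stem.
        target = paired if img.with_suffix(".txt").exists() else missing
        target.append(img.name)
    return paired, missing
```

Running it before training catches the common failure mode of an image with no caption, which would otherwise train on an empty prompt.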

Quick Start

# Prepare training images in a folder
ls training_data/
# image_001.png  image_001.txt  image_002.png  image_002.txt ...

# Train (see scripts/train_lora.py for full options)
python3 scripts/train_lora.py \
  --data_dir ./training_data \
  --output_dir ./lora_weights \
  --steps 500 \
  --lr 1e-4 \
  --rank 4
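
The `--rank 4` flag explains why the weight files are so small: each adapted weight W (d_out×d_in) stays frozen while two low-rank factors A (r×d_in) and B (d_out×r) are trained, adding only r·(d_in + d_out) parameters per layer. A back-of-envelope sketch, where the layer inventory is illustrative (the real SD 1.5 module set is larger, which is how the full adapter reaches the ~3.1 MB noted above; stored size also depends on the save dtype):

```python
def lora_params(d_out, d_in, rank=4):
    # LoRA trains A (rank x d_in) and B (d_out x rank); W is frozen.
    return rank * (d_in + d_out)

# Hypothetical attention-projection shapes echoing SD 1.5's
# 320/640/1280 channel widths -- NOT the exact UNet module list.
layers = [(320, 320)] * 4 + [(640, 640)] * 4 + [(1280, 1280)] * 8
total = sum(lora_params(o, i) for o, i in layers)
size_mb = total * 2 / 1e6  # float16 storage, 2 bytes per parameter
```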

Evaluation with LLM-as-Judge

# Compare base model vs LoRA vs commercial (Gemini/DALL-E)
# Pixtral Large scores each image 1-10 on:
# - Style consistency with training data
# - Image quality and coherence
# - Prompt adherence

# Our results: Base 6.8 → LoRA 9.0 → Gemini 9.5
# Lesson: Gemini wins without training, but LoRA closes the gap significantly
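
One practical detail of LLM-as-judge scoring is parsing the judge's free-text reply into numbers. A minimal sketch, assuming the judge (Pixtral Large here) is prompted to answer in `criterion: score` lines; `parse_judge_scores` and the short criterion keys are assumptions of this example, not the skill's actual format:

```python
import re

def parse_judge_scores(reply):
    """Extract 1-10 scores for the three criteria from a judge reply."""
    criteria = ("style consistency", "quality", "prompt adherence")
    scores = {}
    for name in criteria:
        # Match e.g. "Style consistency: 9" or "quality = 8.5".
        m = re.search(rf"{name}\s*[:=]\s*(\d+(?:\.\d+)?)", reply, re.IGNORECASE)
        if m:
            scores[name] = float(m.group(1))
    return scores
```

Constraining the judge to a fixed answer format (or JSON) makes this parsing step far more reliable than scoring free prose.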

Key Lessons

  • float32 required on MPS — float16 silently produces NaN on Apple Silicon for SD pipelines
  • mflux is faster than PyTorch MPS for FLUX (~105s vs ~90min) but doesn't support LoRA training
  • SD 1.5 is the ceiling for 24GB — FLUX LoRA OOMs even with gradient checkpointing
  • 15-25 images is the sweet spot — fewer undertrain the adapter; more doesn't help proportionally
  • Gemini (Imagen 4.0) beats fine-tuned SD 1.5 with zero training — use commercial APIs for production, LoRA for experimentation and offline use
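
The float32 lesson can be encoded as a tiny device-to-dtype guard; `sd_dtype_for` is a hypothetical helper, and the commented diffusers usage is a sketch of the typical loading call, not this skill's training script:

```python
def sd_dtype_for(device: str) -> str:
    # float16 silently produces NaN on MPS for SD pipelines (see lesson
    # above), so force float32 there; CUDA can keep float16.
    return "float32" if device == "mps" else "float16"

# Typical use with diffusers (not executed here):
#   import torch
#   from diffusers import StableDiffusionPipeline
#   pipe = StableDiffusionPipeline.from_pretrained(
#       "runwayml/stable-diffusion-v1-5",
#       torch_dtype=torch.float32,  # NOT float16 on Apple Silicon
#   ).to("mps")
```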

Files

  • scripts/train_lora.py — Training script with Apple Silicon MPS support
  • scripts/compare_models.py — LLM-as-judge evaluation comparing base vs LoRA vs reference

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

Hippo Video

Hippo Video integration. Manage Persons, Organizations, Deals, Leads, Activities, Notes and more. Use when the user wants to interact with Hippo Video data.

Registry Source · Recently Updated
General

Binance Funding Rate Monitor

Binance funding-rate arbitrage monitoring tool: view account, positions, and P&L statistics (SkillPay paid version)

Registry Source · Recently Updated
General

apix

Use `apix` to search, browse, and execute API endpoints from local markdown vaults. Use this skill to discover REST API endpoints, inspect request/response s...

Registry Source · Recently Updated