dspy-simba-optimizer

This skill should be used when the user asks to "optimize with SIMBA", "optimize agents with custom feedback", mentions "SIMBA optimizer", "mini-batch optimization", "stochastic prompt optimization", "lightweight optimizer", or needs an alternative to MIPROv2/GEPA for programs with rich feedback signals.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "dspy-simba-optimizer" with this command: npx skills add omidzamani/dspy-skills/omidzamani-dspy-skills-dspy-simba-optimizer

DSPy SIMBA Optimizer

Goal

Optimize DSPy programs with SIMBA (Stochastic Introspective Mini-Batch Ascent), an iterative mini-batch optimizer that exploits rich feedback signals from custom metrics.

When to Use

  • Need lighter-weight alternative to GEPA
  • Have custom feedback metrics (not just accuracy)
  • Agentic tasks with rich failure signals
  • Budget-conscious optimization (fewer eval calls)
  • Programs where few-shot examples aren't critical

Inputs

| Input | Type | Description |
|---|---|---|
| program | dspy.Module | Program to optimize |
| trainset | list[dspy.Example] | Training examples |
| metric | callable | Returns float or dspy.Prediction(score=..., feedback=...) |
| max_steps | int | Number of optimization steps |
| bsize | int | Mini-batch size |

Outputs

| Output | Type | Description |
|---|---|---|
| optimized_program | dspy.Module | SIMBA-optimized program |

Workflow

Phase 1: Understand SIMBA

SIMBA (Stochastic Introspective Mini-Batch Ascent):

  • Iterative prompt optimization with mini-batch sampling
  • Identifies challenging examples with high output variability
  • Generates self-reflective rules or adds successful demonstrations
  • Lighter than GEPA (no reflection LM)
  • More flexible than Bootstrap (uses feedback)
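The loop behind those bullets can be sketched in a few lines. This is a toy illustration, not the actual dspy internals: `program` is any callable, `propose` stands in for dspy's rule/demo generation, and `metric(example, prediction)` returns a score.

```python
import random

def simba_sketch(program, trainset, metric, propose, max_steps=8, bsize=4, num_candidates=3):
    """Toy illustration of stochastic mini-batch ascent (not the dspy implementation)."""
    best = program
    for _ in range(max_steps):
        # Sample a mini-batch of training examples
        batch = random.sample(trainset, min(bsize, len(trainset)))
        # Propose candidate variants of the current best program
        candidates = [best] + [propose(best) for _ in range(num_candidates)]
        # Keep whichever candidate scores highest on this mini-batch
        scored = [(sum(metric(ex, c(ex)) for ex in batch), c) for c in candidates]
        best = max(scored, key=lambda pair: pair[0])[1]
    return best
```

Because the current best is always among the candidates, the mini-batch score never regresses within a step, though batch sampling still adds variance across steps.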

Comparison:

  • MIPROv2: Best accuracy, lots of data
  • GEPA: Agentic systems, expensive
  • SIMBA: Custom feedback, budget-friendly
  • Bootstrap: Simplest, demo-based

Phase 2: Basic SIMBA Optimization

import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# Program to optimize
class QAPipeline(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        return self.generate(question=question)

# Metric (can return a float or dspy.Prediction(score=..., feedback=...))
def qa_metric(example, pred, trace=None):
    correct = example.answer.lower() in pred.answer.lower()
    return 1.0 if correct else 0.0

# SIMBA optimizer
optimizer = dspy.SIMBA(
    metric=qa_metric,
    max_steps=10,  # Optimization iterations
    bsize=5  # Mini-batch size
)

program = QAPipeline()
compiled = optimizer.compile(program, trainset=trainset)
compiled.save("qa_simba.json")

Phase 3: SIMBA with Feedback Signals

SIMBA works best with rich feedback:

import dspy

def detailed_metric(example, pred, trace=None):
    """Metric with feedback signal."""
    expected = example.answer.lower()
    actual = pred.answer.lower()

    if expected == actual:
        return dspy.Prediction(score=1.0, feedback="Perfect match")
    elif expected in actual:
        return dspy.Prediction(score=0.7, feedback=f"Contains answer but verbose: '{actual}'")
    else:
        overlap = len(set(expected.split()) & set(actual.split()))
        if overlap > 0:
            return dspy.Prediction(score=0.3, feedback=f"Partial overlap: {overlap} words")
        return dspy.Prediction(score=0.0, feedback=f"No match. Expected '{expected}'")

optimizer = dspy.SIMBA(
    metric=detailed_metric,
    max_steps=20,  # Optimization iterations
    bsize=8  # Mini-batch size
)

compiled = optimizer.compile(program, trainset=trainset)
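Before spending optimization budget, it is worth sanity-checking the metric on hand-written cases. The helper below is an illustrative pattern (not part of dspy); `SimpleNamespace` stands in for example and prediction objects, and a float-only stand-in mirrors the tiers of `detailed_metric`:

```python
from types import SimpleNamespace

def check_metric(metric, expected_answer, cases):
    """Assert that a metric assigns the intended score to each canned answer."""
    example = SimpleNamespace(answer=expected_answer)
    for answer_text, want in cases:
        result = metric(example, SimpleNamespace(answer=answer_text))
        score = getattr(result, "score", result)  # works for floats and Prediction-like objects
        assert score == want, f"{answer_text!r}: got {score}, expected {want}"

# Float-only stand-in with the same tiers as detailed_metric above
def tiered_metric(example, pred, trace=None):
    expected, actual = example.answer.lower(), pred.answer.lower()
    if expected == actual:
        return 1.0
    if expected in actual:
        return 0.7
    return 0.0

check_metric(
    tiered_metric,
    "Paris",
    [("paris", 1.0), ("The answer is Paris.", 0.7), ("London", 0.0)],
)
```

The same helper works unchanged on `detailed_metric`, since `getattr(result, "score", result)` handles `dspy.Prediction` return values.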

Phase 4: Production Agent Optimization

import dspy
import logging

logger = logging.getLogger(__name__)

# Define tools as functions
def search(query: str) -> str:
    """Search knowledge base for relevant information."""
    retriever = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
    results = retriever(query, k=3)
    return "\n".join([r['text'] for r in results])

def calculate(expr: str) -> str:
    """Evaluate Python expressions safely."""
    try:
        with dspy.PythonInterpreter() as interp:
            return str(interp.execute(expr))
    except Exception as e:
        return f"Error: {e}"

class ResearchAgent(dspy.Module):
    def __init__(self):
        super().__init__()
        self.agent = dspy.ReAct(
            "question -> answer",
            tools=[search, calculate]
        )

    def forward(self, question):
        return self.agent(question=question)

def agent_metric(example, pred, trace=None):
    """Rich metric for agent optimization."""
    expected = example.answer.lower().strip()
    actual = pred.answer.lower().strip() if pred.answer else ""

    # Exact match
    if expected == actual:
        return dspy.Prediction(score=1.0, feedback="Correct answer")

    # Partial match
    if expected in actual:
        return dspy.Prediction(score=0.7, feedback="Answer contains expected result")

    # Check key terms
    expected_terms = set(expected.split())
    actual_terms = set(actual.split())
    overlap = len(expected_terms & actual_terms)

    if overlap >= len(expected_terms) * 0.5:
        return dspy.Prediction(score=0.5, feedback="At least 50% term overlap")

    return dspy.Prediction(score=0.0, feedback=f"Incorrect: expected '{example.answer}'")

def optimize_agent(trainset, devset):
    """Full SIMBA optimization pipeline."""
    dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

    agent = ResearchAgent()

    # Baseline evaluation
    eval_metric = lambda ex, pred, trace=None: agent_metric(ex, pred, trace).score
    evaluator = dspy.Evaluate(devset=devset, metric=eval_metric, num_threads=4)
    baseline = evaluator(agent)
    logger.info(f"Baseline: {baseline}")

    # SIMBA optimization
    optimizer = dspy.SIMBA(
        metric=agent_metric,
        max_steps=25,  # Optimization iterations
        bsize=6  # Mini-batch size
    )

    compiled = optimizer.compile(agent, trainset=trainset)

    # Evaluate optimized
    optimized = evaluator(compiled)
    logger.info(f"SIMBA optimized: {optimized}")

    compiled.save("research_agent_simba.json")
    return compiled

Configuration

optimizer = dspy.SIMBA(
    metric=metric_fn,
    max_steps=20,                          # Optimization iterations (default: 8)
    bsize=32,                              # Mini-batch size (default: 32)
    num_candidates=6,                      # Candidates per iteration (default: 6)
    max_demos=4,                           # Max demos per predictor (default: 4)
    temperature_for_sampling=0.2,          # Sampling temperature (default: 0.2)
    temperature_for_candidates=0.2         # Candidate selection temperature (default: 0.2)
)
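A rough budget estimate helps before launching a run: each step scores on the order of `num_candidates` candidate programs on a mini-batch of `bsize` examples, so total program executions grow roughly as the product below. This is a back-of-envelope bound, not an exact accounting of dspy internals:

```python
def estimated_program_calls(max_steps: int, bsize: int, num_candidates: int) -> int:
    """Back-of-envelope estimate of program executions during a SIMBA compile."""
    return max_steps * bsize * num_candidates

# With the configuration shown above:
print(estimated_program_calls(20, 32, 6))  # 3840
```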

Best Practices

  1. Use feedback signals - SIMBA benefits from dspy.Prediction(score=..., feedback=...) objects
  2. Balance parameters - Adjust bsize (default 32) and max_steps (default 8) based on dataset size
  3. Patience - SIMBA is slower than Bootstrap, faster than GEPA
  4. Custom metrics - Best for scenarios with nuanced scoring (not binary)
  5. Tune temperatures - Lower temperatures (0.1-0.3) for exploitation, higher (0.5-1.0) for exploration

Limitations

  • Newer optimizer, less battle-tested than MIPROv2
  • Requires thoughtful metric design (garbage in, garbage out)
  • Not as thorough as GEPA for agent optimization
  • Mini-batch sampling adds variance to results
  • No automatic prompt reflection like GEPA
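One hedged mitigation for the sampling variance noted above is to compile several times under different seeds and keep whichever result scores best on a held-out devset. A generic sketch, where `compile_fn` and `evaluate_fn` are placeholders for an `optimizer.compile(...)` call and a `dspy.Evaluate` run:

```python
import random

def best_of_n(compile_fn, evaluate_fn, n=3, base_seed=0):
    """Compile n times under different seeds; keep the best-scoring result."""
    results = []
    for i in range(n):
        random.seed(base_seed + i)   # vary the mini-batch sampling
        candidate = compile_fn()
        results.append((evaluate_fn(candidate), candidate))
    return max(results, key=lambda pair: pair[0])[1]
```

This multiplies cost by `n`, so it is only worthwhile when devset scores vary materially between runs.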


Related Skills

Related by shared tags or category signals.

  • dspy-evaluation-suite
  • dspy-rag-pipeline
  • dspy-custom-module-design
  • dspy-debugging-observability