
/ar:setup — Create New Experiment


Set up a new autoresearch experiment with all required configuration.

Usage

```
/ar:setup                       # Interactive mode
/ar:setup engineering api-speed src/api.py "pytest bench.py" p50_ms lower
/ar:setup --list                # Show existing experiments
/ar:setup --list-evaluators     # Show available evaluators
```

What It Does

If arguments are provided

Pass them directly to the setup script:

```
python {skill_path}/scripts/setup_experiment.py \
  --domain {domain} --name {name} \
  --target {target} --eval "{eval_cmd}" \
  --metric {metric} --direction {direction} \
  [--evaluator {evaluator}] [--scope {scope}]
```
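As a sketch of the flag interface described above, the argument handling might look like the following. This is a hypothetical reconstruction for illustration; the real `setup_experiment.py` defines its own parser upstream, and only the flag names come from this document.

```python
import argparse

def build_parser():
    # Hypothetical parser mirroring the flags documented above;
    # the upstream script is authoritative.
    p = argparse.ArgumentParser(prog="setup_experiment.py")
    p.add_argument("--domain", required=True)
    p.add_argument("--name", required=True)
    p.add_argument("--target", required=True)
    p.add_argument("--eval", dest="eval_cmd", required=True)
    p.add_argument("--metric", required=True)
    p.add_argument("--direction", choices=["lower", "higher"], required=True)
    p.add_argument("--evaluator")  # optional: built-in evaluator name
    p.add_argument("--scope", choices=["project", "user"], default="project")
    return p

# Parse the example invocation from the Usage section.
args = build_parser().parse_args([
    "--domain", "engineering", "--name", "api-speed",
    "--target", "src/api.py", "--eval", "pytest bench.py",
    "--metric", "p50_ms", "--direction", "lower",
])
```

Note that `--direction` and `--scope` are constrained to the values the interactive flow offers, so a typo fails fast instead of producing a misconfigured experiment.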

If no arguments (interactive mode)

Collect each parameter one at a time:

  • Domain — Ask: "What domain? (engineering, marketing, content, prompts, custom)"

  • Name — Ask: "Experiment name? (e.g., api-speed, blog-titles)"

  • Target file — Ask: "Which file to optimize?" Verify it exists.

  • Eval command — Ask: "How to measure it? (e.g., pytest bench.py, python evaluate.py)"

  • Metric — Ask: "What metric does the eval output? (e.g., p50_ms, ctr_score)"

  • Direction — Ask: "Is lower or higher better?"

  • Evaluator (optional) — Show built-in evaluators. Ask: "Use a built-in evaluator, or your own?"

  • Scope — Ask: "Store in project (.autoresearch/) or user (~/.autoresearch/)?"

Then run setup_experiment.py with the collected parameters.
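The interactive flow above can be sketched as a question list with minimal validation. The prompt texts come from this document; the function names (`validate`, `collect`) and the dict-based answer passing are assumptions for illustration, since in a real session the assistant asks each question in turn.

```python
from pathlib import Path

# One (key, question) pair per parameter, in the order listed above.
PROMPTS = [
    ("domain", "What domain? (engineering, marketing, content, prompts, custom)"),
    ("name", "Experiment name? (e.g., api-speed, blog-titles)"),
    ("target", "Which file to optimize?"),
    ("eval_cmd", "How to measure it? (e.g., pytest bench.py, python evaluate.py)"),
    ("metric", "What metric does the eval output? (e.g., p50_ms, ctr_score)"),
    ("direction", "Is lower or higher better?"),
    ("evaluator", "Use a built-in evaluator, or your own?"),
    ("scope", "Store in project (.autoresearch/) or user (~/.autoresearch/)?"),
]

def validate(key, value):
    # Minimal checks mirroring the instructions above.
    if key == "target" and not Path(value).exists():
        raise ValueError(f"target file not found: {value}")
    if key == "direction" and value not in ("lower", "higher"):
        raise ValueError("direction must be 'lower' or 'higher'")
    return value

def collect(answers):
    # `answers` maps each key to the user's reply; a real session
    # gathers these one question at a time.
    return {key: validate(key, answers[key]) for key, _q in PROMPTS}
```

The collected dict then maps directly onto the `setup_experiment.py` flags shown earlier.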

Listing

Show existing experiments

```
python {skill_path}/scripts/setup_experiment.py --list
```

Show available evaluators

```
python {skill_path}/scripts/setup_experiment.py --list-evaluators
```

Built-in Evaluators

| Name | Metric | Use Case |
| --- | --- | --- |
| benchmark_speed | p50_ms (lower) | Function/API execution time |
| benchmark_size | size_bytes (lower) | File, bundle, Docker image size |
| test_pass_rate | pass_rate (higher) | Test suite pass percentage |
| build_speed | build_seconds (lower) | Build/compile/Docker build time |
| memory_usage | peak_mb (lower) | Peak memory during execution |
| llm_judge_content | ctr_score (higher) | Headlines, titles, descriptions |
| llm_judge_prompt | quality_score (higher) | System prompts, agent instructions |
| llm_judge_copy | engagement_score (higher) | Social posts, ad copy, emails |
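To illustrate what a `benchmark_speed`-style measurement produces, here is a minimal sketch that times a callable and reports the median (p50) wall-clock time in milliseconds. This is an assumption about the shape of such an evaluator, not the upstream implementation; the function name `p50_ms` and the `runs` parameter are invented for this example.

```python
import statistics
import time

def p50_ms(fn, runs=21):
    # Run `fn` repeatedly and return the median wall-clock
    # duration in milliseconds (the p50_ms metric above).
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)

print(f"p50_ms: {p50_ms(lambda: sum(range(10_000))):.3f}")
```

The median is preferred over the mean for this metric because it is robust to one-off scheduler or GC pauses during a run.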

After Setup

Report to the user:

  • Experiment path and branch name

  • Whether the eval command worked and the baseline metric

  • Suggest: "Run /ar:run {domain}/{name} to start iterating, or /ar:loop {domain}/{name} for autonomous mode."
