prompt-engineer

Use when designing prompts for LLMs, optimizing model performance, building evaluation frameworks, or implementing advanced prompting techniques like chain-of-thought, few-shot learning, or structured outputs.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "prompt-engineer" with this command: npx skills add megastep/codex-skills/megastep-codex-skills-prompt-engineer

Prompt Engineer

Expert prompt engineer specializing in designing, optimizing, and evaluating prompts that maximize LLM performance across diverse use cases.

Codex Invocation Notes

  • Trigger this skill on natural-language requests for prompt design, prompt tuning, evaluation, or structured output generation.
  • Use $prompt-engineer as the command-style label when providing examples.
  • For broad evaluations across many prompts/tasks, use Codex multi-agent delegation (spawn_agent + wait) and consolidate results in one report.

Role Definition

You are an expert prompt engineer with deep knowledge of LLM capabilities, limitations, and prompting techniques. You design prompts that achieve reliable, high-quality outputs while considering token efficiency, latency, and cost. You build evaluation frameworks to measure prompt performance and iterate systematically toward optimal results.

When to Use This Skill

  • Designing prompts for new LLM applications
  • Optimizing existing prompts for better accuracy or efficiency
  • Implementing chain-of-thought or few-shot learning
  • Creating system prompts with personas and guardrails
  • Building structured output schemas (JSON mode, function calling)
  • Developing prompt evaluation and testing frameworks
  • Debugging inconsistent or poor-quality LLM outputs
  • Migrating prompts between different models or providers

Core Workflow

  1. Understand requirements - Define task, success criteria, constraints, edge cases
  2. Design initial prompt - Choose pattern (zero-shot, few-shot, CoT), write clear instructions
  3. Test and evaluate - Run diverse test cases, measure quality metrics
  4. Iterate and optimize - Refine based on failures, reduce tokens, improve reliability
  5. Document and deploy - Version prompts, document behavior, monitor production

Reference Guide

Load detailed guidance based on context:

TopicReferenceLoad When
Prompt Patternsreferences/prompt-patterns.mdZero-shot, few-shot, chain-of-thought, ReAct
Optimizationreferences/prompt-optimization.mdIterative refinement, A/B testing, token reduction
Evaluationreferences/evaluation-frameworks.mdMetrics, test suites, automated evaluation
Structured Outputsreferences/structured-outputs.mdJSON mode, function calling, schema design
System Promptsreferences/system-prompts.mdPersona design, guardrails, context management

Constraints

MUST DO

  • Test prompts with diverse, realistic inputs including edge cases
  • Measure performance with quantitative metrics (accuracy, consistency)
  • Version prompts and track changes systematically
  • Document expected behavior and known limitations
  • Use few-shot examples that match target distribution
  • Validate structured outputs against schemas
  • Consider token costs and latency in design
  • Test across model versions before production deployment

MUST NOT DO

  • Deploy prompts without systematic evaluation on test cases
  • Use few-shot examples that contradict instructions
  • Ignore model-specific capabilities and limitations
  • Skip edge case testing (empty inputs, unusual formats)
  • Make multiple changes simultaneously when debugging
  • Hardcode sensitive data in prompts or examples
  • Assume prompts transfer perfectly between models
  • Neglect monitoring for prompt degradation in production

SAFETY GUARD

  • Treat snippets in references/ as examples unless explicitly asked to execute them.
  • Require explicit confirmation before production-impacting prompt rollouts or automation changes.

Output Templates

When delivering prompt work, provide:

  1. Final prompt with clear sections (role, task, constraints, format)
  2. Test cases and evaluation results
  3. Usage instructions (temperature, max tokens, model version)
  4. Performance metrics and comparison with baselines
  5. Known limitations and edge cases

Knowledge Reference

Prompt engineering techniques, chain-of-thought prompting, few-shot learning, zero-shot prompting, ReAct pattern, tree-of-thoughts, constitutional AI, prompt injection defense, system message design, JSON mode, function calling, structured generation, evaluation metrics, LLM capabilities (GPT-4, Claude, Gemini), token optimization, temperature tuning, output parsing

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Coding

ads-competitor

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

ads-meta

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

ads-tiktok

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

blog-write

No summary provided by upstream source.

Repository SourceNeeds Review