Skill Creator — Orellius

/skill-creator (Official)

Create new skills, modify and improve existing skills, and measure skill performance. Includes eval creation, benchmarking, and iterative optimization.

Meta · Skills · Evaluation · Testing · Optimization · 2 min read

Quick import: Download the .md file and save it to .claude/commands/ (Claude Code), .cursorrules (Cursor), or paste as a system prompt in ChatGPT, Gemini, or any LLM API.

#What it does

The Skill Creator is a meta-skill for building, testing, and iteratively improving other skills. It guides the full lifecycle -- from initial drafting through evaluation creation, benchmarking with variance analysis, and optimization of skill descriptions for better triggering accuracy.

#How to use

Activate when you want to create a skill from scratch, edit or optimize an existing skill, run evals to test performance, or benchmark and compare skill versions.

I want to create a skill for generating API documentation

Help me improve this existing skill's trigger accuracy

#Skill instructions

#Development Lifecycle

Define: Decide what the skill should do and roughly how it should do it
Draft: Write the initial skill definition
Test: Create test prompts and run Claude-with-skill on them
Evaluate: Assess outputs both qualitatively (human review) and quantitatively (automated evals)
Iterate: Rewrite based on feedback and benchmark results
Scale: Expand the test set and validate at larger scale

#Skill Structure

Every skill requires a SKILL.md file with YAML frontmatter containing:

name: Unique identifier (lowercase, hyphens)
description: When to trigger and when not to trigger the skill

The description is critical -- it determines when the skill activates. It should specify both positive triggers and negative triggers (when NOT to use).
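As a rough illustration of the structure described above, the sketch below checks a SKILL.md string for the two required frontmatter fields. It is not the Skill Creator's own validator: the helper names, the example skill, and the exact naming rule (lowercase letters and digits joined by single hyphens) are assumptions made for this example, and the parser only handles flat `key: value` lines rather than full YAML.

```python
import re

# Assumed naming rule: lowercase letters and digits joined by single hyphens.
# The source only says "lowercase, hyphens"; allowing digits is an assumption.
NAME_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` frontmatter delimited by `---` lines.

    Only single-line values are handled; a real YAML parser would be
    needed for anything more complex.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter delimiter")
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    raise ValueError("frontmatter is missing its closing '---' delimiter")


def validate_skill(text: str) -> list[str]:
    """Return a list of problems found in the frontmatter (empty if none)."""
    fields = parse_frontmatter(text)
    problems = []
    name = fields.get("name", "")
    if not NAME_PATTERN.fullmatch(name):
        problems.append(f"name {name!r} should be lowercase words joined by hyphens")
    if not fields.get("description"):
        problems.append("description is required and should say when to trigger")
    return problems


# Hypothetical skill file, for illustration only. The description states
# both a positive trigger and a negative trigger.
EXAMPLE_SKILL_MD = """\
---
name: api-docs-generator
description: Use when the user asks for API reference documentation from source code. Do NOT use for tutorials, changelogs, or general prose editing.
---

# API Docs Generator

Instructions for the skill go here.
"""

print(validate_skill(EXAMPLE_SKILL_MD))
```

An empty list means both fields passed these assumed checks. It says nothing about whether the description triggers well in practice, which is what the evaluation steps are for.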

#Evaluation System

Qualitative: Human review of outputs using a generated review interface
Quantitative: Automated grading with configurable rubrics
Benchmarking: Run the same prompts across skill versions with variance analysis
Description optimization: Test different trigger descriptions to maximize accuracy
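To make "benchmarking with variance analysis" concrete, here is a minimal sketch that summarizes repeated run scores per skill version as a mean and a sample standard deviation. The function names, the 0-to-1 score scale, and the numbers are placeholders invented for this example, not results from any real benchmark or the Skill Creator's actual tooling.

```python
from statistics import mean, stdev


def summarize(scores: list[float]) -> tuple[float, float]:
    """Return (mean, sample standard deviation) for one version's run scores."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate variance")
    return mean(scores), stdev(scores)


def compare_versions(runs: dict[str, list[float]]) -> dict[str, tuple[float, float]]:
    """Summarize every version so the results can be read side by side."""
    return {version: summarize(scores) for version, scores in runs.items()}


# Placeholder scores (assumed to be graded on a 0-1 scale), one per run of
# the same prompt set. These are NOT real results.
runs = {
    "v1": [0.70, 0.80, 0.60],
    "v2": [0.80, 0.85, 0.90],
}

for version, (avg, spread) in compare_versions(runs).items():
    print(f"{version}: mean={avg:.2f} stdev={spread:.2f}")
```

Reporting the spread alongside the mean helps separate a real improvement from run-to-run noise: a gap between versions that is small relative to either standard deviation is weak evidence that the rewrite helped.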

#Key Principles

Start with a small test set (5-10 prompts), iterate, then expand
Evaluate both correctness and trigger accuracy
Use the comparator agent to diff versions side-by-side
Optimize the description field for precise triggering -- avoid false positives and false negatives
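Trigger accuracy can be measured by labeling a small prompt set with whether the skill should activate, then counting mismatches. The sketch below is one possible way to do that; the `TriggerCase` type, the report fields, and the observed outcomes are hypothetical and only illustrate how false positives and false negatives would be counted.

```python
from dataclasses import dataclass


@dataclass
class TriggerCase:
    prompt: str
    should_trigger: bool  # human label: is this prompt in scope for the skill?
    did_trigger: bool     # observed: did the skill actually activate?


def trigger_report(cases: list[TriggerCase]) -> dict[str, float]:
    """Count trigger outcomes and derive accuracy, precision, and recall."""
    tp = sum(c.should_trigger and c.did_trigger for c in cases)
    fp = sum(not c.should_trigger and c.did_trigger for c in cases)
    fn = sum(c.should_trigger and not c.did_trigger for c in cases)
    tn = sum(not c.should_trigger and not c.did_trigger for c in cases)
    total = len(cases)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positives": fp,  # activated when it should not have
        "false_negatives": fn,  # stayed silent when it should have activated
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Illustrative cases only; the observed outcomes are made up.
cases = [
    TriggerCase("I want to create a skill for generating API documentation", True, True),
    TriggerCase("Help me improve this existing skill's trigger accuracy", True, False),
    TriggerCase("Fix the typo in my README", False, True),
    TriggerCase("What is the capital of France?", False, False),
]

report = trigger_report(cases)
print(report)
```

Tracking false positives and false negatives separately shows which direction a description needs to move: many false positives suggest adding or tightening negative triggers, while many false negatives suggest the positive triggers are too narrow.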

This skill is from the Anthropic Skills Repository.

Anthropic·16 Mar, 2026
