# Claude Skill Optimizer

<div align="center">

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![GEPA](https://img.shields.io/badge/GEPA-Genetic_Pareto-purple.svg)](https://github.com/rartzi/gepa)
[![Claude Code](https://img.shields.io/badge/Claude-Code-orange.svg)](https://claude.ai)
[![GitHub Issues](https://img.shields.io/github/issues/rartzi/ClaudeSkills-Optimizer-GEFA)](https://github.com/rartzi/ClaudeSkills-Optimizer-GEFA/issues)

**Automatically optimize any Claude skill using GEPA reflective text evolution**

[Getting Started](#getting-started) •
[Examples](#examples) •
[Documentation](#documentation) •
[Contributing](#contributing) •
[Report Issue](https://github.com/rartzi/ClaudeSkills-Optimizer-GEFA/issues/new/choose)

</div>

---

## Overview

Claude Skill Optimizer is a framework that **automatically improves Claude skills** through iterative optimization:

1. **Execute** skills on diverse test cases via Claude Code
2. **Capture** real execution traces (reasoning, code, errors)
3. **Reflect** on failures to propose targeted improvements
4. **Evolve** skill instructions through multi-objective optimization
5. **Select** Pareto-optimal candidates that balance quality dimensions

Built on [GEPA (Genetic-Pareto)](https://github.com/rartzi/gepa), which reports **+10% on AIME math** and **67%→93% on the MATH benchmark** through reflective prompt evolution.

```
┌─────────────────────────────────────────────────────────────────────┐
│                         GEPA Optimization Loop                      │
│                                                                     │
│   ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐  │
│   │  Skill   │────▶│  Claude  │────▶│ Evaluate │────▶│ Reflect  │  │
│   │  v1.0    │     │   Code   │     │ Outputs  │     │ & Mutate │  │
│   └──────────┘     └──────────┘     └──────────┘     └──────────┘  │
│        ▲                                                    │      │
│        └────────────────────────────────────────────────────┘      │
│                          Pareto Selection                          │
│                                                                     │
│   Output: Optimized Skill v2.0 (+15-25% quality improvement)       │
└─────────────────────────────────────────────────────────────────────┘
```
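Step 5 keeps only the candidates that no other candidate beats across the board. A minimal sketch of that selection, assuming each candidate is a dict of metric scores where higher is better. The function names and version labels are illustrative and are not part of this repository's API:

```python
from typing import Dict, List

Scores = Dict[str, float]


def dominates(a: Scores, b: Scores) -> bool:
    """True if `a` is at least as good as `b` on every metric and strictly better on one."""
    return all(a[m] >= b[m] for m in b) and any(a[m] > b[m] for m in b)


def pareto_front(candidates: Dict[str, Scores]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other, scores)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


# Hypothetical skill variants scored on two objectives (higher is better).
candidates = {
    "v1.0": {"quality": 0.70, "reliability": 0.80},
    "v1.1": {"quality": 0.85, "reliability": 0.75},
    "v1.2": {"quality": 0.65, "reliability": 0.78},  # dominated by v1.0
}
print(pareto_front(candidates))  # ['v1.0', 'v1.1']
```

Neither `v1.0` nor `v1.1` is better on both objectives, so both survive; `v1.2` is dropped because `v1.0` beats it on each one.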

## Features

| Feature | Description |
|---------|-------------|
| **Works with Any Skill** | PPTX, DOCX, XLSX, code generation, or custom skills |
| **Multi-Objective Optimization** | Balance quality, speed, reliability simultaneously |
| **Real Execution Traces** | Captures Claude's actual reasoning and decisions |
| **Auto Test Generation** | Generate test cases from skill description |
| **Extensible Metrics** | Built-in evaluators + custom metric support |
| **Template Workflows** | Optimize brand compliance, template usage, etc. |
| **CLI & Library** | Use from command line or integrate in Python |
| **Meta-Skill** | Install as a skill so Claude can optimize skills |

## Meta-Skill: Let Claude Optimize Skills

The key feature: **install this as a Claude skill** so Claude can optimize other skills.

```bash
# Install the meta-skill
./install-skill.sh
```

**Example prompts Claude can handle once the meta-skill is installed:**

| Prompt | What Happens |
|--------|--------------|
| "Optimize my PPTX skill" | Analyzes, tests, and improves the skill |
| "Why does my docx skill fail on tables?" | Runs diagnostics, finds root cause |
| "Make my skill handle edge cases better" | Generates edge case tests, fixes failures |
| "Improve brand compliance in presentations" | Adds brand validation, color checking |
| "Test my custom skill thoroughly" | Generates comprehensive test suite |

The meta-skill ([`SKILL.md`](skill/SKILL.md)) contains complete instructions for Claude to:
1. Load any skill from standard locations
2. Generate appropriate test cases
3. Run the GEPA optimization loop
4. Apply targeted improvements
5. Validate the optimized skill

---

## Getting Started

### Prerequisites

- Python 3.9 or higher
- [Claude Code CLI](https://claude.ai/code) installed and configured
- [uv](https://docs.astral.sh/uv/) (recommended) or pip

### Installation

**With uv (recommended - no residual environments):**

```bash
# Clone the repository
git clone https://github.com/rartzi/ClaudeSkills-Optimizer-GEFA.git
cd ClaudeSkills-Optimizer-GEFA/claude-skill-optimizer

# Run directly with uv (inline dependencies, no venv created)
uv run python -m src.core.skill_optimizer --help

# Or run example scripts directly
uv run examples/pptx-optimization/template-workflow/run_optimization.py --mock
```

**With pip (installs into your active Python environment):**

```bash
# Clone the repository
git clone https://github.com/rartzi/ClaudeSkills-Optimizer-GEFA.git
cd ClaudeSkills-Optimizer-GEFA/claude-skill-optimizer

# Install in development mode
pip install -e .
```

> **Note:** We recommend [uv](https://docs.astral.sh/uv/) for a cleaner experience. Install with:
> ```bash
> curl -LsSf https://astral.sh/uv/install.sh | sh
> ```

### Install as a Claude Skill (Meta-Skill)

This repository includes a **meta-skill** that Claude can use to optimize other skills:

```bash
# Install the skill-optimizer skill to Claude Code
chmod +x install-skill.sh
./install-skill.sh

# Now you can ask Claude Code:
# "Help me optimize my PPTX skill"
# "Analyze why my docx skill fails on edge cases"
# "Improve my custom skill's error handling"
```

The meta-skill will be installed to `~/.claude/skills/skill-optimizer/`.

### Quick Start

```bash
# 1. Initialize optimization for a skill
skill-optimizer init \
    --skill-path ~/.claude/skills/my-skill \
    --output-dir ./my-optimization

# 2. Generate test cases automatically
skill-test-gen \
    --skill-path ~/.claude/skills/my-skill \
    --output ./my-optimization/test_cases.yaml \
    --count 20

# 3. Run optimization
skill-optimizer optimize \
    --config ./my-optimization/config.yaml \
    --max-iterations 10

# 4. Apply the optimized skill
cp ./my-optimization/optimized_skill/SKILL.md \
   ~/.claude/skills/my-skill/SKILL.md
```
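Step 4 overwrites the installed `SKILL.md`. If you would rather keep the previous version for comparison or rollback, a small helper can copy it aside first. This is only a sketch: `apply_optimized_skill` and the `.bak` suffix are hypothetical and are not part of the `skill-optimizer` CLI.

```python
import shutil
from pathlib import Path
from typing import Optional


def apply_optimized_skill(optimized: Path, installed: Path) -> Optional[Path]:
    """Copy `optimized` over `installed`, keeping the old file as `<name>.bak`.

    Returns the backup path, or None if there was nothing to back up.
    """
    backup = None
    if installed.exists():
        # Preserve the current skill next to the original, e.g. SKILL.md.bak
        backup = installed.with_name(installed.name + ".bak")
        shutil.copy2(installed, backup)
    installed.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(optimized, installed)
    return backup
```

Call it with the same paths as step 4, for example `apply_optimized_skill(Path("./my-optimization/optimized_skill/SKILL.md"), Path.home() / ".claude/skills/my-skill/SKILL.md")`.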

### Supported Skill Locations

| Environment | Path |
|-------------|------|
| Claude Code (User) | `~/.claude/skills/{skill-name}/` |
| Claude Code (Project) | `./.claude/skills/{skill-name}/` |
| Claude.ai (Public) | `/mnt/skills/public/{skill-name}/` |
| Claude.ai (User) | `/mnt/skills/user/{skill-name}/` |
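
To show how a loader might use this table, here is a sketch that checks the locations in order and returns the first directory containing a `SKILL.md`. The search order (project before user, Claude Code before Claude.ai) and the `find_skill` name are assumptions for illustration; `src/core/skill_loader.py` may resolve paths differently.

```python
from pathlib import Path
from typing import Iterable, List, Optional


def default_search_roots() -> List[Path]:
    """Skill roots from the table above, in an assumed priority order."""
    return [
        Path.cwd() / ".claude" / "skills",   # Claude Code (Project)
        Path.home() / ".claude" / "skills",  # Claude Code (User)
        Path("/mnt/skills/user"),            # Claude.ai (User)
        Path("/mnt/skills/public"),          # Claude.ai (Public)
    ]


def find_skill(name: str, roots: Optional[Iterable[Path]] = None) -> Optional[Path]:
    """Return the first `<root>/<name>/` directory that contains SKILL.md, else None."""
    for root in roots if roots is not None else default_search_roots():
        candidate = root / name
        if (candidate / "SKILL.md").is_file():
            return candidate
    return None
```

Passing an explicit `roots` list makes the lookup easy to test against temporary directories.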

---

## Examples

The repository includes ready-to-run examples demonstrating skill optimization for various use cases.

### PPTX Skill Optimization

Comprehensive examples for optimizing PowerPoint generation:

| Example | Description | Directory |
|---------|-------------|-----------|
| [**Template Workflow**](examples/pptx-optimization/template-workflow/) | Optimize working with corporate templates | [`examples/pptx-optimization/template-workflow/`](examples/pptx-optimization/template-workflow/) |
| [**Brand Compliance**](examples/pptx-optimization/brand-compliance/) | Ensure brand colors, fonts, logo placement | [`examples/pptx-optimization/brand-compliance/`](examples/pptx-optimization/brand-compliance/) |
| [**Image Embedding**](examples/pptx-optimization/image-embedding/) | Optimize image placement and sizing | [`examples/pptx-optimization/image-embedding/`](examples/pptx-optimization/image-embedding/) |
| [**Charts & Tables**](examples/pptx-optimization/charts-tables/) | Improve data visualization quality | [`examples/pptx-optimization/charts-tables/`](examples/pptx-optimization/charts-tables/) |

```bash
# Run a specific example (with uv - no setup needed)
uv run examples/pptx-optimization/template-workflow/run_optimization.py --mock

# Or with pip after installation
cd examples/pptx-optimization/brand-compliance
python run_optimization.py
```

### Custom Skills

Use the custom skill template as a starting point for optimizing your own skills:

| Example | Description | Directory |
|---------|-------------|-----------|
| [**Custom Skill Template**](examples/custom-skill/) | Template for optimizing any custom skill | [`examples/custom-skill/`](examples/custom-skill/) |

```bash
# Optimize a custom skill (with uv)
uv run examples/custom-skill/run_optimization.py --skill-path /path/to/your/skill

# Or with pip after installation
cd examples/custom-skill
python run_optimization.py --skill-path /path/to/your/skill
```

---

## Metrics & Evaluation

### Built-in Evaluators

| Evaluator | Type | Description |
|-----------|------|-------------|
| `BinaryEvaluator` | binary | Success/failure |
| `FileExistsEvaluator` | file_exists | Expected files created |
| `FileValidityEvaluator` | file_validity | Files are valid (not corrupt) |
| `ContentMatchEvaluator` | content_match | Expected content present |
| `ErrorRateEvaluator` | error_rate | Based on error count |
| `EfficiencyEvaluator` | efficiency | Execution time/tokens |
| `BrandComplianceEvaluator` | brand | Color/font/style compliance |
| `VisualQualityEvaluator` | visual | Layout, overflow, contrast |

### Custom Evaluators

```python
from src.evaluators.base import BaseEvaluator, EvaluationResult

class MyCustomEvaluator(BaseEvaluator):
    def evaluate(self, trace, test_case=None):
        score = self._calculate_score(trace)
        return EvaluationResult(
            metric_name=self.name,
            score=score,
            details={"custom": "data"},
            issues=[]
        )
```
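
The snippet above calls `self._calculate_score`, which you supply. As a self-contained illustration, the sketch below defines minimal stand-ins for `BaseEvaluator` and `EvaluationResult` (the real classes in `src/evaluators/base.py` may have different fields and constructors) and scores a trace by the share of steps that finished without an error. The trace shape, a dict with a `steps` list, is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class EvaluationResult:  # stand-in for src.evaluators.base.EvaluationResult
    metric_name: str
    score: float
    details: Dict[str, Any] = field(default_factory=dict)
    issues: List[str] = field(default_factory=list)


class BaseEvaluator:  # stand-in for src.evaluators.base.BaseEvaluator
    def __init__(self, name: str):
        self.name = name


class StepErrorEvaluator(BaseEvaluator):
    """Score = fraction of trace steps that completed without an error."""

    def _calculate_score(self, trace: Dict[str, Any]) -> float:
        steps = trace.get("steps", [])
        if not steps:
            return 0.0  # an empty trace counts as a failure
        ok = sum(1 for step in steps if not step.get("error"))
        return ok / len(steps)

    def evaluate(self, trace: Dict[str, Any], test_case: Optional[dict] = None) -> EvaluationResult:
        steps = trace.get("steps", [])
        return EvaluationResult(
            metric_name=self.name,
            score=self._calculate_score(trace),
            details={"steps": len(steps)},
            issues=[s["error"] for s in steps if s.get("error")],
        )


# Hypothetical trace: four steps, one of which failed.
trace = {"steps": [{"error": None}, {"error": "font not found"}, {"error": None}, {"error": None}]}
result = StepErrorEvaluator("step_errors").evaluate(trace)
print(result.score, result.issues)  # 0.75 ['font not found']
```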

### Expected Results

Based on GEPA research and testing:

| Metric | Typical Improvement |
|--------|---------------------|
| Task Completion | +10-15% |
| Output Quality | +15-25% |
| Error Rate | 30-50% reduction |
| Edge Case Handling | +20-40% |
| Brand Compliance | +25-35% |

---

## Architecture

```
claude-skill-optimizer/
├── SKILL.md                     # Meta-skill (root copy)
├── skill/                       # Installable skill package
│   └── SKILL.md                 # Meta-skill for Claude Code
├── install-skill.sh             # Install meta-skill to ~/.claude/skills/
├── src/
│   ├── core/
│   │   ├── skill_loader.py      # Load skills from any path
│   │   ├── claude_executor.py   # Execute via Claude Code CLI
│   │   └── skill_optimizer.py   # Main GEPA optimization loop
│   ├── evaluators/
│   │   ├── base.py              # Base evaluator classes
│   │   ├── visual_evaluators.py # Visual quality checks
│   │   └── brand_evaluators.py  # Brand compliance checks
│   ├── generators/
│   │   └── test_generator.py    # Auto-generate test cases
│   └── adapters/
│       └── gepa_adapter.py      # GEPA library integration
├── examples/                    # Ready-to-run examples
│   ├── pptx-optimization/       # PPTX skill examples
│   └── custom-skill/            # Template for custom skills
├── tests/                       # Test suite
├── docs/                        # Documentation
└── assets/                      # Sample templates
```

## Configuration

### config.yaml

```yaml
skill:
  path: ~/.claude/skills/pptx
  name: pptx
  components:
    - SKILL.md
    - html2pptx.md
    - css.md

optimization:
  max_iterations: 15
  max_evaluations: 150
  population_size: 5

evaluation:
  metrics:
    - name: task_completion
      weight: 0.25
      type: binary
    - name: visual_quality
      weight: 0.30
      type: visual
    - name: brand_compliance
      weight: 0.25
      type: brand
    - name: efficiency
      weight: 0.20
      type: computed

claude:
  model: claude-sonnet-4-20250514
  timeout: 300
```
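
The `weight` fields suggest a weighted aggregate across metrics. Below is a sketch of how such a score could be computed, assuming the weights are meant to sum to 1.0 and each metric score lies between 0 and 1. Whether the optimizer combines scores this way or uses them only for Pareto selection is not stated here, and `weighted_score` is a hypothetical name.

```python
import math
from typing import Dict

# Weights copied from the config.yaml example above.
WEIGHTS = {
    "task_completion": 0.25,
    "visual_quality": 0.30,
    "brand_compliance": 0.25,
    "efficiency": 0.20,
}


def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-metric scores (0..1) into one number using the configured weights."""
    total = sum(weights.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"metric weights sum to {total}, expected 1.0")
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical per-metric results for one candidate skill.
scores = {"task_completion": 1.0, "visual_quality": 0.8, "brand_compliance": 0.6, "efficiency": 0.5}
print(round(weighted_score(scores, WEIGHTS), 2))  # 0.74
```

Validating the weight sum up front catches a common config mistake: editing one weight without rebalancing the others.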

---

## Documentation

- [Getting Started Guide](docs/guides/getting-started.md)
- [Configuration Reference](docs/guides/configuration.md)
- [Writing Test Cases](docs/guides/test-cases.md)
- [Custom Evaluators](docs/guides/custom-evaluators.md)
- [API Reference](docs/api/README.md)

---

## Contributing

We welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

```bash
# Development setup (with uv - recommended)
git clone https://github.com/rartzi/ClaudeSkills-Optimizer-GEFA.git
cd ClaudeSkills-Optimizer-GEFA/claude-skill-optimizer

# Run tests (uv handles dependencies automatically)
uv run pytest tests/

# Run linting
uv run ruff check src/

# Or with pip (installs into the active environment)
pip install -e ".[dev]"
pytest tests/
```

### Reporting Issues

Found a bug or have a feature request? Please open an issue:

- [Report a Bug](https://github.com/rartzi/ClaudeSkills-Optimizer-GEFA/issues/new?template=bug_report.md)
- [Request a Feature](https://github.com/rartzi/ClaudeSkills-Optimizer-GEFA/issues/new?template=feature_request.md)

---

## Citation

```bibtex
@software{claude_skill_optimizer,
  title = {Claude Skill Optimizer: GEPA-based Skill Evolution},
  author = {Artzi, Ronen},
  year = {2025},
  url = {https://github.com/rartzi/ClaudeSkills-Optimizer-GEFA}
}
```

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- [GEPA](https://github.com/rartzi/gepa) - The underlying optimization algorithm
- [Anthropic](https://anthropic.com) - Claude and Claude Code
- [DSPy](https://dspy.ai) - Inspiration for programmatic prompting

---

<div align="center">

**[Report Bug](https://github.com/rartzi/ClaudeSkills-Optimizer-GEFA/issues/new?template=bug_report.md)** · **[Request Feature](https://github.com/rartzi/ClaudeSkills-Optimizer-GEFA/issues/new?template=feature_request.md)** · **[Discussions](https://github.com/rartzi/ClaudeSkills-Optimizer-GEFA/discussions)**

</div>
