# Custom Skill Optimization Template

Use this template to optimize any Claude skill.

## 🚀 Quick Start

```bash
# 1. Copy this folder (run from its parent directory)
cp -r custom-skill my-skill-optimization

# 2. Edit config.yaml with your skill path
vim my-skill-optimization/config.yaml

# 3. Create test cases
vim my-skill-optimization/test_cases.yaml

# 4. Run optimization
cd my-skill-optimization
python run_optimization.py
```

## 📁 Files to Customize

### config.yaml

```yaml
skill:
  path: ~/.claude/skills/YOUR-SKILL-NAME  # <- Change this
  name: your-skill-name
  components:
    - SKILL.md
    # Add any reference files your skill uses

optimization:
  max_iterations: 10
  max_evaluations: 100
  
  # What parts of the skill to optimize
  optimize:
    - instructions
    - examples
    - workflows

evaluation:
  metrics:
    # Add metrics relevant to your skill
    - name: task_completion
      weight: 0.30
      type: binary
      
    - name: output_quality
      weight: 0.40
      type: llm_judge
      
    - name: error_rate
      weight: 0.20
      type: computed
      
    - name: efficiency
      weight: 0.10
      type: computed
```
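The four weights in the example sum to 1.0. The sketch below shows one way to check that and to combine per-metric scores, assuming the optimizer aggregates them as a plain weighted sum with higher scores being better; the template does not confirm either point. `validate_weights` and `weighted_score` are hypothetical helpers, not part of the optimizer.

```python
import math

# Metric weights copied from the example config.yaml.
METRICS = [
    {"name": "task_completion", "weight": 0.30},
    {"name": "output_quality", "weight": 0.40},
    {"name": "error_rate", "weight": 0.20},
    {"name": "efficiency", "weight": 0.10},
]


def validate_weights(metrics, tolerance=1e-9):
    """Raise ValueError unless the weights sum to 1.0 (hypothetical check)."""
    total = sum(m["weight"] for m in metrics)
    if not math.isclose(total, 1.0, abs_tol=tolerance):
        raise ValueError(f"metric weights sum to {total:.2f}, expected 1.00")
    return total


def weighted_score(metrics, scores):
    """Combine per-metric scores in [0, 1] into a single value.

    Assumes a plain weighted sum; the real optimizer may aggregate differently.
    """
    return sum(m["weight"] * scores[m["name"]] for m in metrics)


validate_weights(METRICS)
example_scores = {
    "task_completion": 1.0,
    "output_quality": 0.8,
    "error_rate": 0.5,
    "efficiency": 1.0,
}
print(round(weighted_score(METRICS, example_scores), 2))  # prints 0.82
```

If you add or remove a metric, rebalancing the remaining weights keeps the combined score on the same 0 to 1 scale under this assumption.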

### test_cases.yaml

Create test cases specific to your skill:

```yaml
test_cases:
  # Basic functionality
  - id: basic_001
    description: "Basic skill usage"
    prompt: |
      Your test prompt here...
    expected_outputs:
      - type: file
        pattern: "*.output"
    quality_criteria:
      - "Specific criterion 1"
      - "Specific criterion 2"
    tags: [basic]
    complexity: simple

  # Edge cases
  - id: edge_001
    description: "Edge case handling"
    prompt: |
      Edge case prompt...
    tags: [edge_case]
    complexity: edge_case
```
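Before a long run it can help to sanity-check the file. The sketch below works on test cases as they might look after parsing `test_cases.yaml` (for example with PyYAML's `yaml.safe_load`). The set of required fields is an assumption based on what both sample cases share, and `find_problems` is a hypothetical helper rather than something the optimizer provides.

```python
# Parsed form of the two sample test cases from the template.
TEST_CASES = [
    {
        "id": "basic_001",
        "description": "Basic skill usage",
        "prompt": "Your test prompt here...\n",
        "tags": ["basic"],
        "complexity": "simple",
    },
    {
        "id": "edge_001",
        "description": "Edge case handling",
        "prompt": "Edge case prompt...\n",
        "tags": ["edge_case"],
        "complexity": "edge_case",
    },
]

# Assumed minimum; the optimizer's real schema may require more or less.
REQUIRED_FIELDS = ("id", "description", "prompt")


def find_problems(test_cases):
    """Return readable problem descriptions; an empty list means all checks passed."""
    problems = []
    seen_ids = set()
    for index, case in enumerate(test_cases):
        case_id = case.get("id", f"<case #{index}>")
        # Flag fields that are missing or empty.
        for field in REQUIRED_FIELDS:
            if not case.get(field):
                problems.append(f"{case_id}: missing '{field}'")
        # Flag ids that were already used by an earlier case.
        if case_id in seen_ids:
            problems.append(f"{case_id}: duplicate id")
        seen_ids.add(case_id)
    return problems


print(find_problems(TEST_CASES))  # prints []
```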

## 🔧 Custom Evaluators

If your skill needs specific evaluation logic, create a custom evaluator:

```python
# evaluators/my_evaluator.py
from src.evaluators.base import BaseEvaluator, EvaluationResult

class MySkillEvaluator(BaseEvaluator):
    """Custom evaluator for my skill."""
    
    def evaluate(self, trace, test_case=None):
        # Your evaluation logic
        score = self._calculate_score(trace, test_case)
        
        return EvaluationResult(
            metric_name=self.name,
            score=score,
            details={"custom": "data"},
            issues=self._find_issues(trace)
        )
    
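    def _calculate_score(self, trace, test_case):
        # Binary placeholder: 1.0 on success, 0.0 otherwise.
        # Replace with your own scoring logic.
        return 1.0 if trace.success else 0.0

    def _find_issues(self, trace):
        # Report the errors recorded on the trace as issues
        return trace.errors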
```

Then reference it in config.yaml by the evaluator's dotted import path:

```yaml
evaluation:
  metrics:
    - name: my_custom_metric
      weight: 0.30
      type: custom
      evaluator: evaluators.my_evaluator.MySkillEvaluator
```
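A dotted path like the `evaluator` value is typically resolved with `importlib`. The sketch below shows that general technique; whether the optimizer loads evaluators this way is an assumption, and `load_class` is a hypothetical name. In either case, the `evaluators` directory has to be importable from wherever the optimizer runs.

```python
import importlib


def load_class(dotted_path):
    """Import and return the class named by a 'package.module.ClassName' string."""
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected 'module.ClassName', got {dotted_path!r}")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# Demonstrated with a standard-library class so the snippet runs anywhere.
# With the config shown here, the argument would be
# "evaluators.my_evaluator.MySkillEvaluator".
ordered_dict_cls = load_class("collections.OrderedDict")
print(ordered_dict_cls.__name__)  # prints OrderedDict
```

An `ImportError` or `AttributeError` from this kind of loader usually points to a typo in the path or to running from the wrong directory.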

## 📊 Test Case Categories

Organize your test cases by category:

| Category | Purpose | Suggested count |
|----------|---------|-----------------|
| Basic | Core functionality | 5-10 |
| Complex | Multi-step tasks | 5-10 |
| Edge Cases | Boundary conditions | 5-10 |
| Integration | Combined features | 3-5 |
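
To see how a test suite compares with these ranges, you can count cases per tag. This is a minimal sketch: the `basic` and `edge_case` tags come from the template, while `complex` and `integration` are assumed tag names for the other two categories, and `coverage_report` is a hypothetical helper.

```python
from collections import Counter

# Suggested (min, max) counts per category, taken from the table above.
# The "complex" and "integration" tag names are assumptions.
SUGGESTED = {
    "basic": (5, 10),
    "complex": (5, 10),
    "edge_case": (5, 10),
    "integration": (3, 5),
}


def coverage_report(test_cases):
    """Map each category to (count, status) based on the cases' tags."""
    tag_counts = Counter(tag for case in test_cases for tag in case.get("tags", []))
    report = {}
    for category, (low, high) in SUGGESTED.items():
        count = tag_counts[category]
        if count < low:
            status = "too few"
        elif count > high:
            status = "above suggested range"
        else:
            status = "ok"
        report[category] = (count, status)
    return report


# Six basic cases and two edge cases, reduced to their tags for brevity.
sample = [{"tags": ["basic"]}] * 6 + [{"tags": ["edge_case"]}] * 2
for category, (count, status) in coverage_report(sample).items():
    print(f"{category}: {count} ({status})")
```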

## 🎯 Optimization Tips

1. **Start Small**: Begin with 5-10 test cases
2. **Include Failures**: Add tests for known failure modes
3. **Diverse Complexity**: Mix simple and complex tasks
4. **Specific Criteria**: Define clear quality expectations
5. **Iterate**: Review the results of each run and add tests for any new failures

## 📈 Expected Timeline

| Phase | Duration | Goal |
|-------|----------|------|
| Setup | 1 hour | Config + 10 test cases |
| Initial Run | 2-4 hours | Baseline + first optimization |
| Refinement | 2-4 hours | Add tests, re-optimize |
| Validation | 1-2 hours | Test optimized skill |

## 🔗 Resources

- [Configuration Guide](../../docs/guides/configuration.md)
- [Test Case Guide](../../docs/guides/test-cases.md)
- [Custom Evaluators](../../docs/guides/custom-evaluators.md)
