playbook/outfitter-agents/plugins/outfitter/skills/skills-dev/references/best-practices.md

# Agent Skills Best Practices

Community-sourced patterns, techniques, and pitfalls from practitioners and official documentation.

## Table of Contents

- [Progressive Disclosure Architecture](#progressive-disclosure-architecture)
- [Skill Composition Patterns](#skill-composition-patterns)
- [Description Optimization](#description-optimization)
- [Common Pitfalls](#common-pitfalls)
- [Testing Strategies](#testing-strategies)
- [Advanced Techniques](#advanced-techniques)
- [Security Considerations](#security-considerations)
- [Organization-Wide Patterns](#organization-wide-patterns)

## Progressive Disclosure Architecture

**Three-tier information model**: Discovery → Activation → Execution

### Discovery Layer (~50 tokens)

YAML frontmatter that helps agents find the right skill without loading full content.

```yaml
---
name: pdf-processing
description: Extracts text and tables from PDF files, fills forms, and merges documents. Use when working with PDF files or when the user mentions PDFs, forms, or document extraction.
---
```

**Keys to effective discovery**:
- Include WHAT the skill does AND WHEN to use it
- Use third-person voice
- Include specific trigger terms users might mention
- Keep under 100 tokens

### Activation Layer (~2-5K tokens)

Core SKILL.md instructions loaded when skill is invoked.

**Structure**:

```markdown
# Skill Name

<when_to_use>
Clear criteria for when this skill applies
</when_to_use>

<workflow>
Step-by-step process (numbered or structured)
</workflow>

<rules>
- ALWAYS: Mandatory behaviors
- NEVER: Prohibited actions
- PREFER: Recommended approaches
</rules>

<references>
Links to deep-dive docs in references/ subdirectory
</references>
```

**Keys to effective activation**:
- **Assume intelligence**: Claude doesn't need basic concepts explained
- **Be directive, not comprehensive**: Focus on what makes THIS approach different
- **Keep under 500 lines**: Move details to references/
- **Use examples sparingly**: Only for non-obvious patterns

### Execution Layer (dynamic)

Deep-dive content loaded on-demand from references/ subdirectory.

**Pattern from practitioners**:

```
skill-name/
├── SKILL.md                    # Core workflow (500 lines max)
├── references/
│   ├── configuration.md        # Detailed config options
│   ├── error-handling.md       # Edge cases and recovery
│   ├── advanced-patterns.md    # Expert techniques
│   └── examples.md             # Worked examples
└── scripts/                    # Helper utilities
```

**Why this works** (source: Juan C Olamendy, skillmatic-ai):
- Prevents context rot from loading irrelevant information
- Allows targeted follow-up ("show me the advanced patterns")
- Keeps initial load fast and focused
- Scales to complex domains without overwhelming context

## Skill Composition Patterns

### Skills Invoking Skills

**Pattern**: Reference other skills in instructions rather than duplicating methodology.

```markdown
## Error Investigation

Load the **outfitter:debugging** skill using the Skill tool to investigate
this authentication failure systematically.

Pass these parameters to the debugging workflow:
- Error context: [collected error details]
- Hypothesis: Token validation timing issue
```

**Why this works**:
- Reuses established methodologies
- Maintains single source of truth
- Allows skills to evolve independently
- Reduces duplication across skill library

**Anti-pattern**: Embedding another skill's instructions inline.

### Subagent Architecture

For orchestrating specialized work with context isolation, see [claude-code.md](./claude-code.md#master-clone-architecture) for Claude Code-specific patterns.

### Skill + External Service Integration

Skills can integrate with external services (APIs, MCP servers) by separating concerns:
- **External service**: Handles authentication, rate limiting, data access
- **Skill**: Handles business logic, formatting, workflows

This separation enables reuse across similar domains.

## Description Optimization

**Goal**: Help Claude discover your skill without loading it.

### Include Both WHAT and WHEN

❌ **Vague**: "Processes PDFs"
✅ **Specific**: "Extracts text and tables from PDF files, fills forms, and merges documents. Use when working with PDF files or when the user mentions PDFs, forms, or document extraction."

### Use Third-Person Voice

❌ "Use me when you need to debug"
✅ "Debugs issues using systematic root cause analysis. Use when encountering errors, unexpected behavior, or test failures."

### Include Trigger Terms

Think about what users actually say:

```yaml
description: Creates weekly team status reports with wins, challenges, and priorities.
  Use when the user asks for a team update, standup report, weekly summary, or status
  email. Keywords: standup, weekly update, team report, status.
```

### Be Specific About Scope

❌ "Helps with testing"
✅ "Implements test-driven development using Red-Green-Refactor cycles. Use when implementing new features with tests first, refactoring with test coverage, or reproducing bugs as failing tests."

**Source**: Official Anthropic best practices emphasize specificity prevents Claude from loading irrelevant skills.

## Common Pitfalls

### 1. Making SKILL.md Too Verbose

**Symptom**: 1000+ line SKILL.md files with exhaustive explanations.

**Why it's a problem**:
- Wastes context window on every invocation
- Buries key directives in noise
- Assumes Claude needs basic concepts explained

**Fix**:
- Keep SKILL.md under 500 lines
- Move deep dives to references/
- Trust Claude's base knowledge
- Focus on WHAT makes THIS approach unique

**Example** (source: Anthropic best practices):

❌ **Verbose**:

```markdown
## What is Test-Driven Development?

Test-Driven Development (TDD) is a software development methodology where you write
tests before writing the actual code. This approach was popularized by Kent Beck
and has become a cornerstone of modern software engineering practices...

[500 lines of TDD philosophy]
```

✅ **Concise**:

```markdown
## TDD Workflow

1. **Red**: Write a failing test for the next small piece of functionality
2. **Green**: Write minimal code to make the test pass
3. **Refactor**: Improve code while keeping tests green

ALWAYS write the test first. NEVER skip the refactor step.
```

### 2. Negative-Only Constraints

**Symptom**: Instructions full of "NEVER do X" without alternatives.

❌ **Problem**:

```markdown
- NEVER use any types
- NEVER skip error handling
- NEVER commit without tests
```

**Why it's a problem**: Tells Claude what NOT to do but not what TO do.

✅ **Fix**: Pair constraints with positive alternatives:

```markdown
- ALWAYS use strict types; NEVER use `any`
- ALWAYS handle errors with Result types; NEVER let exceptions propagate silently
- ALWAYS run tests before committing; NEVER push untested code
```

### 3. Deeply Nested File References

**Symptom**: Skills referencing files that reference other files 3+ levels deep.

**Why it's a problem**:
- Context explosion
- Circular references
- Hard to maintain

**Fix** (source: skillmatic-ai research):
- Keep references ONE level deep
- Use table of contents in long reference files
- Let Claude request additional detail if needed

❌ **Deep nesting**:

```
SKILL.md → references/patterns.md → references/examples/auth.md → references/examples/auth/jwt.md
```

✅ **Flat structure**:

```
SKILL.md → references/auth-patterns.md (with ToC for JWT, OAuth, etc.)
```

### 4. Not Treating Skills Like Code

**Symptom**: Skills maintained as loose documents without version control, testing, or reviews.

**Why it's a problem**:
- Skills drift from reality
- Breaking changes go unnoticed
- No way to roll back problematic versions

**Fix** (source: blog.sshh.io, Nate's newsletter):
- **Version control**: Skills in git repos with semantic versioning
- **Testing**: Build evaluations to validate skill behavior
- **Reviews**: Treat skill PRs like code reviews
- **Changelog**: Document what changed and why

**Pattern from practitioners**:

```markdown
---
name: api-integration
version: 2.1.0
changelog: |
  2.1.0 - Added retry logic for rate limiting
  2.0.0 - Switched to streaming responses (breaking)
  1.5.0 - Added webhook verification
---
```

### 5. Over-Relying on Auto-Compaction

**Symptom**: Never manually clearing context, letting auto-compaction handle everything.

**Why it's a problem** (source: blog.sshh.io practitioner experience):
- Important context gets compressed or dropped
- Skill instructions get summarized incorrectly
- Debugging becomes harder when full skill isn't visible

**Fix**: Manual context management strategy:
1. Start complex tasks with `/clear` for clean slate
2. Use `/catchup` with explicit context about what skills are active
3. Let auto-compaction handle routine continuations
4. Force reload skills after compaction if behavior seems off

**When to manually clear**:
- Starting new major feature
- Switching between unrelated tasks
- After hitting context limits on complex debugging
- When skill behavior seems inconsistent

### 6. Unclear Skill Boundaries

**Symptom**: Skill tries to do too many unrelated things.

**Example**: "code-helper" that does linting, testing, documentation, deployment, and debugging.

**Why it's a problem**:
- Hard to discover (description too generic)
- Loads unnecessary context
- Becomes maintenance nightmare

**Fix**: **One skill, one job**.

✅ **Well-scoped skills**:
- `linting-workflow`: Code quality checks and fixes
- `tdd`: TDD methodology
- `api-documentation`: API reference generation
- `deployment-automation`: Deploy and rollback workflows
- `debugging`: Root cause investigation

**Exception**: Orchestrator skills that explicitly load other skills (like `feature-development` that loads TDD → documentation → deployment in sequence).

### 7. No Usage Examples

**Symptom**: Skill has abstract instructions but no concrete examples.

**Why it's a problem**: Claude may misinterpret intent without seeing desired output.

**Fix**: Include 1-2 examples in references/examples.md

**Pattern**:

```markdown
# Examples

## Example 1: Simple Case

**Input**: User asks to add login endpoint

**Workflow**:
1. Load TDD skill
2. Write failing test for /login POST
3. Implement minimal auth logic
4. Refactor to service layer

**Output**: [Show actual test code + implementation]

## Example 2: Edge Case

**Input**: User asks to add login with OAuth and JWT and refresh tokens

**Workflow**:
1. Load pathfinding skill to break down requirements
2. Load TDD skill for each component separately
3. OAuth integration → JWT generation → Refresh logic
4. Each gets its own test cycle

**Output**: [Show breakdown and test structure]
```

**Source**: Official Anthropic best practices recommend examples for non-obvious patterns.

## Testing Strategies

### Eval-Driven Development

**Pattern**: Build evaluations BEFORE extensive documentation (source: Nate's newsletter).

**Workflow**:
1. Create minimal skill version
2. Build test suite with target inputs/outputs
3. Iterate skill until evals pass consistently
4. THEN write comprehensive docs

**Why this works**:
- Prevents documenting the wrong approach
- Faster iteration cycles
- Forces clarity about success criteria
- Builds regression test suite automatically

**Implementation** (from Nate's debugging toolkit):

```typescript
// skill-testing-framework pattern
interface SkillEval {
  name: string;
  input: string;
  expectedBehavior: string[];
  forbiddenBehavior: string[];
  targetModels: ('haiku' | 'sonnet' | 'opus')[];
}

const tddSkillEvals: SkillEval[] = [
  {
    name: "basic-tdd-workflow",
    input: "Add a login endpoint",
    expectedBehavior: [
      "Writes test first",
      "Test fails initially (red phase)",
      "Implements minimal solution",
      "Test passes (green phase)",
      "Refactors with tests passing"
    ],
    forbiddenBehavior: [
      "Writes implementation before test",
      "Skips refactor step",
      "Makes test pass by modifying test"
    ],
    targetModels: ['haiku', 'sonnet', 'opus']
  }
];
```

### Multi-Model Testing

**Pattern**: Test skills with all target models.

**Why**: Haiku, Sonnet, and Opus interpret instructions differently:
- **Haiku**: Needs more explicit instructions, less inference
- **Sonnet**: Balanced reasoning, good for most workflows
- **Opus**: Handles complex context, better with ambiguity

**Testing strategy** (source: Anthropic best practices):

| Aspect | Haiku Test | Sonnet Test | Opus Test |
|--------|------------|-------------|-----------|
| Clarity | Do instructions work with minimal reasoning? | Do instructions balance brevity and clarity? | Do instructions leverage advanced reasoning? |
| Context | Works with small context? | Handles moderate references? | Manages large cross-references? |
| Edge cases | Explicit handling? | Reasonable inference? | Sophisticated judgment? |

**Fix pattern**: If Haiku fails but Sonnet passes, instructions likely assume too much inference.

### Real-World Usage Testing

**Pattern**: Test skills with actual users/agents in production-like scenarios.

**Anti-pattern**: Only testing with constructed examples.

**Strategy** (from practitioner experience):
1. **Dogfooding**: Use your own skills for real work
2. **Iteration tracking**: Log when skills are loaded but not followed
3. **Confusion signals**: Detect when Claude asks for clarification (skill might be unclear)
4. **Outcome validation**: Did the skill achieve its intended result?

**Metrics to track**:
- Skill load frequency (is it discoverable?)
- Completion rate (do workflows finish?)
- User satisfaction (did it solve the problem?)
- Iteration count (how many tries to get it right?)

**From blog.sshh.io**: "Built 10 debugging tools after watching 100 people hit the same problems in their first week."

### Systematic Evaluation Framework

**Components** (source: Nate's newsletter, skillmatic-ai research):

1. **skill-debugging-assistant**: Identifies where skills fail
2. **skill-security-analyzer**: Checks for security risks in skill code
3. **skill-gap-analyzer**: Finds missing skills in your library
4. **skill-performance-profiler**: Tracks context usage and latency
5. **prompt-optimization-analyzer**: Improves skill descriptions for discovery
6. **skill-testing-framework**: Automated test runner for skills

**Pattern**: Build tools to test tools.

## Advanced Techniques

### Hook-Based Validation

For platform-specific hook implementation patterns, see [claude-code.md](./claude-code.md#hook-based-validation).

**General principle**: Use hooks to enforce constraints at decision points—prevent destructive operations, enforce testing requirements, validate configuration before deployment.

### Organization-Wide Skill Libraries

**Pattern**: Centralized skill repository as institutional knowledge (source: Juan C Olamendy, Medium).

**Structure**:

```
company-skills/
├── engineering/
│   ├── deployment-workflow/
│   ├── incident-response/
│   └── architecture-review/
├── product/
│   ├── user-story-creation/
│   └── feature-planning/
└── business/
    ├── team-standup/
    └── quarterly-planning/
```

**Benefits**:
- Codifies company processes
- Onboarding material becomes executable
- Process improvements propagate automatically
- Consistency across teams

**Implementation** (from practitioners):
1. **Central registry**: Marketplace or internal skill server
2. **Contribution guidelines**: Templates for creating company skills
3. **Review process**: Skills reviewed like code before publishing
4. **Version management**: Semantic versioning for breaking changes
5. **Deprecation policy**: How to sunset old patterns

**Pattern from blog.sshh.io**:

```markdown
# Company Skill Manifest

## Deployment
- `deployment-staging`: Deploy to staging with rollback plan
- `deployment-production`: Production deploy with checklist
- `deployment-rollback`: Emergency rollback procedures

## Code Review
- `pr-review-backend`: Backend code review checklist
- `pr-review-frontend`: Frontend code review standards
- `security-review`: Security-focused code review

## Documentation
- `api-documentation`: OpenAPI spec generation
- `readme-maintenance`: README updates for features
```

**Anti-pattern**: Every team building their own version of the same workflows.

### Progressive Skill Disclosure in Practice

**Advanced pattern**: Table of contents in reference files for targeted loading.

**Example** (source: skillmatic-ai architecture):

```markdown
# API Integration Patterns

## Table of Contents

- [REST Basics](#rest-basics) - Standard CRUD operations
- [GraphQL](#graphql) - Query and mutation patterns
- [Webhooks](#webhooks) - Event-driven integrations
- [Rate Limiting](#rate-limiting) - Backoff and retry
- [Authentication](#authentication) - OAuth, JWT, API keys
- [Error Handling](#error-handling) - Retry logic and fallbacks

## REST Basics

[Focused content on REST]

## GraphQL

[Focused content on GraphQL]
```

**Usage**: Skill says "See references/api-patterns.md#rate-limiting for retry logic" rather than loading entire file.

**Why it works**:
- Claude can navigate to specific section
- Preserves context for other tasks
- User can request more depth if needed

### Skills as Living Documentation

**Pattern**: Skills replace static documentation that goes stale.

**Traditional docs**: "Here's how to deploy" (written once, outdated quickly)
**Skill**: Executes deployment with current best practices

**Benefits** (source: Juan C Olamendy):
- **Always current**: If process changes, skill changes
- **Executable**: Not just instructions but enforcement
- **Testable**: Verify the process actually works
- **Discoverable**: Claude can find relevant process

**Example transformation**:

❌ **Static doc** (docs/deployment.md):

```markdown
# Deployment Process

1. Run tests
2. Update version number
3. Build production bundle
4. Upload to S3
5. Clear CDN cache
6. Notify team in Slack

[This gets outdated when we switch to Vercel]
```

✅ **Skill** (skills/deployment/SKILL.md):

```markdown
---
name: deployment-production
description: Deploys to production with safety checks
---

# Production Deployment

1. Verify all tests pass: `bun test`
2. Run build: `bun run build`
3. Deploy to Vercel: `vercel --prod`
4. Verify deployment: Check /api/health
5. Notify team: Use Slack MCP to post to #deployments

ALWAYS wait for health check before considering deploy complete.
```

**When process changes**: Update skill, test it, deploy new version. Documentation stays current.

### Skill Chaining for Complex Workflows

**Pattern**: Master skill orchestrates sequence of specialized skills.

**Example** (source: practitioner patterns):

```markdown
---
name: feature-development
description: End-to-end feature development workflow
---

# Feature Development Workflow

## Stage 1: Planning
Load **pathfinding** skill to clarify requirements and architecture.

## Stage 2: Implementation
Load **tdd** skill to implement with tests.

## Stage 3: Documentation
Load **api-documentation** skill to generate API docs.

## Stage 4: Review
Load **code-review** skill to validate implementation.

## Stage 5: Deployment
Load **deployment-staging** skill to deploy for testing.

Each stage must complete successfully before proceeding to next.
```

**Advantage**: Each specialized skill can evolve independently. Feature-development orchestrates but doesn't duplicate.

**Related pattern - Conditional chaining**:

```markdown
## Error Recovery

If tests fail in Stage 2:
  Load **debugging** skill to investigate
  Return to Stage 2 after fixes

If code review finds issues in Stage 4:
  Return to Stage 2 for fixes
  Re-run Stage 3 to update docs
  Re-run Stage 4 to re-review
```

## Security Considerations

**Critical warning** (source: Sid Bharath tutorial, security research): Skills can execute arbitrary code and access files. Only use skills from trusted sources.

### Risks

1. **Code execution**: Skills can include scripts that run on your machine
2. **File access**: Skills can read/write files in project
3. **Network access**: Skills can make HTTP requests
4. **Credential access**: Skills can access environment variables, config files
5. **Social engineering**: Malicious skills disguised as helpful tools

### Protection Strategies

**1. Source verification**:
- Only install skills from trusted authors
- Review skill code before using
- Check community reputation and reviews
- Verify skill matches description (no hidden behavior)

**2. Code review checklist** (from security research):

```markdown
## Skill Security Review

- [ ] Review all scripts in scripts/ directory
- [ ] Check for file system access patterns
- [ ] Verify network requests are legitimate
- [ ] Confirm no credential harvesting
- [ ] Check for obfuscated code
- [ ] Validate external dependencies
- [ ] Test in isolated environment first
```

**3. Sandbox testing**:
- Test new skills in isolated project first
- Use throwaway credentials for initial testing
- Monitor file system and network activity
- Check for unexpected side effects

**4. Minimal permissions**:

```yaml
# Proposed security metadata (from research)
permissions:
  file_read: ['src/**', 'docs/**']
  file_write: ['docs/**']
  network: ['https://api.company.com']
  environment: []
```

**5. Audit logging**:
Track what skills do in production:
- What files were accessed?
- What commands were executed?
- What network requests were made?

**From security papers**: "Skills are code execution with conversational interface. Treat them with same security rigor as any code dependency."

## Organization-Wide Patterns

### Skill as Institutional Knowledge

**Pattern**: Replace tribal knowledge with executable skills (source: Juan C Olamendy).

**Traditional problem**:
- "How do we deploy?" → Ask Sarah, she knows
- "What's the PR review process?" → Different on every team
- "How do we handle incidents?" → Check the wiki (outdated)

**Skill solution**:
- **deployment-production skill**: Encodes Sarah's knowledge
- **pr-review skill**: Standardizes review process
- **incident-response skill**: Current playbook, always up to date

**Implementation strategy**:

1. **Identify critical workflows**: What knowledge is locked in people's heads?
2. **Interview experts**: How do they actually do the work?
3. **Create skills**: Encode process as executable workflow
4. **Test with novices**: Can someone unfamiliar complete the task?
5. **Iterate**: Refine based on real usage
6. **Deprecate docs**: Point to skills instead of wikis

**Example from blog.sshh.io**:

```markdown
---
name: internal-deploy
description: Company deployment process with all safety checks
---

# Internal Deployment Workflow

## Pre-Deploy Checklist
1. Verify Jira ticket is in "Ready for Deploy" status
2. Confirm tests pass in CI: `check-ci-status`
3. Get approval in #deploy-requests Slack channel

## Deploy
1. Run staging deploy: `npm run deploy:staging`
2. Verify staging health: `curl https://staging.company.com/health`
3. Run smoke tests: `npm run smoke-test:staging`
4. Deploy to production: `npm run deploy:prod`
5. Monitor for 5 minutes: Watch Datadog dashboard

## Post-Deploy
1. Verify production health: `curl https://company.com/health`
2. Post to #deployments: "Deployed [feature] to prod"
3. Update Jira ticket to "Deployed"

NEVER skip smoke tests. ALWAYS monitor after deploy.
```

**Benefit**: New team members can deploy safely on day one.

### Contribution Guidelines

**Pattern**: Treat skills like open source contributions.

**Template** (from ComposioHQ awesome-claude-skills):

```markdown
# Contributing Skills

## Before Submitting

1. **Test thoroughly**: Run skill with Haiku, Sonnet, and Opus
2. **Follow structure**: Use provided skill template
3. **Document clearly**: Include description, when to use, examples
4. **Security review**: No malicious code or credential access
5. **License**: MIT or Apache 2.0

## Skill Requirements

- [ ] Descriptive name (kebab-case)
- [ ] Clear description with trigger terms
- [ ] SKILL.md under 500 lines
- [ ] References in references/ subdirectory
- [ ] At least one example in examples/
- [ ] Testing results documented
- [ ] README.md with usage instructions

## Review Process

1. Submit PR with skill in skills/your-skill-name/
2. Maintainers review for quality and security
3. Address feedback
4. Approved skills merged and published
```

### Versioning Strategy

**Pattern**: Semantic versioning for skills (from practitioners).

**Format**: MAJOR.MINOR.PATCH

```yaml
---
name: api-integration
version: 2.1.0
---
```

**Versioning rules**:
- **MAJOR**: Breaking changes (workflow steps changed, different inputs required)
- **MINOR**: New features (additional optional steps, new references added)
- **PATCH**: Bug fixes (typos, clarifications, small improvements)

**Breaking change example**:

```markdown
# Version 1.x: Required user to provide API key
---
name: api-client
version: 1.5.0
description: Make API calls with provided credentials
---

# Version 2.x: Uses MCP server for authentication (breaking)
---
name: api-client
version: 2.0.0
description: Make API calls using Linear MCP server
---
```

**Migration guide pattern**:

```markdown
# Migration Guide: 1.x → 2.0

## Breaking Changes

- No longer accepts `api_key` parameter
- Now requires Linear MCP server configured
- Response format changed from JSON to structured objects

## Migration Steps

1. Install Linear MCP server: `/mcp install linear`
2. Update skill invocations to remove `api_key`
3. Update code expecting JSON to handle structured objects
```

## Summary: Hierarchy of Best Practices

### Essential (Do These Always)

1. **Progressive disclosure**: Keep SKILL.md under 500 lines, use references/
2. **Clear descriptions**: Include what AND when, with trigger terms
3. **Assume intelligence**: Claude doesn't need basics explained
4. **Test with real usage**: Dogfood your own skills
5. **Version control**: Track changes, review like code

### Important (Do These Usually)

6. **Multi-model testing**: Verify Haiku, Sonnet, Opus behavior
7. **Positive constraints**: Say what TO do, not just what NOT to do
8. **Examples for non-obvious**: Show expected behavior
9. **Composition over duplication**: Reference other skills
10. **Security review**: Audit code execution and file access

### Advanced (Do These for Scale)

11. **Eval-driven development**: Build tests before extensive docs
12. **Hook-based enforcement**: Use PreToolUse for quality gates
13. **Organization-wide libraries**: Centralized skill registry
14. **Semantic versioning**: Track breaking changes
15. **Skills as living docs**: Replace static documentation

### Expert (Do These for Excellence)

16. **Systematic evaluation framework**: Build tools to test tools
17. **Master-Clone architecture**: Optimize context usage
18. **Conditional skill chaining**: Orchestrate complex workflows
19. **Audit logging**: Track skill execution in production
20. **Community contribution**: Share patterns, learn from others

## Sources

Research synthesized from:

- **Official Documentation**: Anthropic Claude Agent Skills Best Practices
- **Community Repositories**: ComposioHQ/awesome-claude-skills, skillmatic-ai/awesome-agent-skills
- **Practitioner Blogs**: blog.sshh.io (Claude Code at scale), Juan C Olamendy (Medium), Sid Bharath
- **Research**: Security considerations from academic papers, progressive disclosure architecture
- **Tooling**: Nate's Newsletter (debugging toolkit), evaluation frameworks

Last updated: 2026-01-10