# Claude Code Skills Tests

Automated tests for superpowers skills using the Claude Code CLI.

## Overview

This test suite verifies that skills are loaded correctly and that Claude follows them as expected. Tests invoke Claude Code in headless mode (`claude -p`) and verify the resulting behavior.

## Requirements

- Claude Code CLI installed and in PATH (`claude --version` should work)
- Local superpowers plugin installed (see the main README for installation)

## Running Tests

### Run all fast tests (recommended):

```bash
./run-skill-tests.sh
```

### Run integration tests (slow, 10-30 minutes):

```bash
./run-skill-tests.sh --integration
```

### Run a specific test:

```bash
./run-skill-tests.sh --test test-subagent-driven-development.sh
```

### Run with verbose output:

```bash
./run-skill-tests.sh --verbose
```

### Set a custom timeout:

```bash
./run-skill-tests.sh --timeout 1800  # 30 minutes for integration tests
```

## Test Structure

### test-helpers.sh

Common functions for skill tests:

- `run_claude "prompt" [timeout]` - Run Claude with a prompt
- `assert_contains output pattern name` - Verify a pattern exists
- `assert_not_contains output pattern name` - Verify a pattern is absent
- `assert_count output pattern count name` - Verify an exact count
- `assert_order output pattern_a pattern_b name` - Verify ordering
- `create_test_project` - Create a temp test directory
- `create_test_plan project_dir` - Create a sample plan file

### Test Files

Each test file:

1. Sources `test-helpers.sh`
2. Runs Claude Code with specific prompts
3. Verifies the expected behavior using assertions
4. Returns 0 on success, non-zero on failure

## Example Test

```bash
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
source "$SCRIPT_DIR/test-helpers.sh"

echo "=== Test: My Skill ==="

# Ask Claude about the skill
output=$(run_claude "What does the my-skill skill do?" 30)

# Verify response
assert_contains "$output" "expected behavior" "Skill describes behavior"

echo "=== All tests passed ==="
```

## Current Tests

### Fast Tests (run by default)

#### test-subagent-driven-development.sh

Tests skill content and requirements (~2 minutes):

- Skill loading and accessibility
- Workflow ordering (spec compliance before code quality)
- Self-review requirements documented
- Plan reading efficiency documented
- Spec compliance reviewer skepticism documented
- Review loops documented
- Task context provision documented

### Integration Tests (use `--integration` flag)

#### test-subagent-driven-development-integration.sh

Full workflow execution test (~10-30 minutes):

- Creates a real test project with Node.js setup
- Creates an implementation plan with 2 tasks
- Executes the plan using subagent-driven-development
- Verifies actual behaviors:
  - Plan read once at start (not per task)
  - Full task text provided in subagent prompts
  - Subagents perform self-review before reporting
  - Spec compliance review happens before code quality review
  - Spec reviewer reads code independently
  - Working implementation is produced
  - Tests pass
  - Proper git commits created

**What it tests:**

- The workflow actually works end-to-end
- Our improvements are actually applied
- Subagents follow the skill correctly
- Final code is functional and tested

## Adding New Tests

1. Create a new test file: `test-<skill-name>.sh`
2. Source `test-helpers.sh`
3. Write tests using `run_claude` and the assertions (see the skeleton after this list)
4. Add the test to the list in `run-skill-tests.sh`
5. Make it executable: `chmod +x test-<skill-name>.sh`
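The *Example Test* above covers the simplest case: one `run_claude` call checked with `assert_contains`. For step 3, a skeleton that also exercises the negative and ordering assertions might look like the sketch below. It is illustrative only: the skill name `my-skill` and every asserted pattern are placeholders to be replaced with behavior the skill under test actually documents.

```bash
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
source "$SCRIPT_DIR/test-helpers.sh"

echo "=== Test: My Skill (ordering and negatives) ==="

# Short timeout: this prompt only asks Claude to describe the skill
output=$(run_claude "Describe the steps of the my-skill skill, in order" 60)

# Placeholder patterns -- replace them with phrases the skill really uses
assert_contains "$output" "write a failing test" "Skill mentions writing a test first"
assert_not_contains "$output" "skip the review" "Skill never suggests skipping review"
assert_order "$output" "write a failing test" "implement" "Test-writing comes before implementation"

echo "=== All tests passed ==="
```

Keeping each prompt narrow like this (describe, don't execute) is what keeps a fast test inside the default timeout; full execution belongs in the integration tests.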
## Timeout Considerations

- Default timeout: 5 minutes per test
- Claude Code may take time to respond
- Adjust with `--timeout` if needed
- Keep tests focused to avoid long runs

## Debugging Failed Tests

With `--verbose`, you'll see the full Claude output:

```bash
./run-skill-tests.sh --verbose --test test-subagent-driven-development.sh
```

Without `--verbose`, only failures show output.

## CI/CD Integration

To run in CI:

```bash
# Run with an explicit timeout for CI environments
./run-skill-tests.sh --timeout 900

# Exit code 0 = success, non-zero = failure
```

## Notes

- Fast tests verify skill *instructions*, not full execution
- Full workflow tests are slow, so they run only with `--integration`
- Focus on verifying key skill requirements
- Tests should be deterministic
- Avoid testing implementation details