13 KiB
Capability Registry
This registry defines the available capabilities for the harness system. Instead of rigid phases, agents select capabilities based on what the task needs.
This is an advanced capability reference, not the default core harness file list. References to
trace outputs or runtime verification reports describe executor/runtime behavior and should not
cause ecl-harness-engineer to create harness/trace by default.
Design Philosophy
Capability-Driven vs Phase-Driven:
- Old model: "Execute Phase 1, then Phase 2, then Phase 3"
- New model: "Achieve goal by selecting and composing capabilities"
Capabilities are:
- Declarative: Define what they provide, not how they work
- Composable: Can be combined in different ways for different tasks
- Dependency-aware: Declare what they need to run
Capability Graph
┌──────────────────┐
│ discover_project │
│ (entry point) │
└────────┬─────────┘
│
┌──────────────┼──────────────┐
│ │ │
▼ ▼ ▼
┌─────────────┐ ┌──────────────┐ ┌────────────────┐
│ analyze │ │ classify │ │ audit_harness │
│ architecture│ │ complexity │ │ │
└──────┬──────┘ └──────┬───────┘ └────────┬───────┘
│ │ │
▼ ▼ │
┌─────────────┐ ┌──────────────┐ │
│ create │ │ plan │ │
│ linters │ │ task │ │
└─────────────┘ └──────┬───────┘ │
│ │
▼ │
┌──────────────┐ │
│ execute │ │
│ phase │◄──────────┤
└──────┬───────┘ │
│ │
┌─────────────┼─────────────┐ │
│ │ │ │
▼ ▼ ▼ │
┌───────────┐ ┌───────────┐ ┌──────────┐│
│ validate │ │ verify │ │ verify ││
│ static │ │ runtime │ │functional││
└───────────┘ └───────────┘ └──────────┘│
│
┌──────────────────┘
▼
┌──────────────┐
│ improve │
│ harness │
└──────────────┘
Capability Definitions
discover_project
Entry point capability. Discovers project structure, tech stack, and files.
capability: discover_project
provides:
- project_type # go, typescript, python, java, rust, generic
- tech_stack # {language, framework, package_manager, build_tool}
- file_structure # key directories and entry points
- dependencies # external packages
- adapter # resolved language adapter
requires: [] # No dependencies - this is the entry point
implementation:
script: detect_adapter.py
output: harness/.analysis/project.json
Example output:
{
"project_type": "go",
"tech_stack": {
"language": "go",
"framework": "chi",
"package_manager": "go",
"build_tool": "make"
},
"file_structure": {
"source_dirs": ["cmd/", "internal/"],
"entry_points": ["cmd/server/main.go", "cmd/cli/main.go"],
"config_files": ["go.mod", "Makefile"]
},
"adapter": { ... }
}
analyze_architecture
Analyzes code architecture: imports, layers, interfaces, code paths.
capability: analyze_architecture
provides:
- layer_map # package → layer mapping
- import_graph # who imports whom
- circular_deps # problematic cycles
- key_interfaces # important abstractions
- code_paths # traced execution flows
requires:
- project_type
- file_structure
implementation:
agent: agents/analyzer.md
output: harness/.analysis/architecture.json
classify_complexity
Classifies task complexity to determine execution strategy.
capability: classify_complexity
provides:
- complexity_level # simple, medium, complex
- execution_mode # direct, subagent, worktree
- estimated_phases # how many execution phases expected
requires:
- project_type
classification_rules:
simple:
criteria:
- Single file change OR doc-only change
- No architectural impact
- Clear, isolated scope
execution_mode: direct
verification: static_only
medium:
criteria:
- Multi-file change (2-5 files)
- Clear scope, well-defined deliverable
- May affect multiple layers
execution_mode: subagent
verification: static + runtime + functional
complex:
criteria:
- Major refactoring OR 6+ files
- Cross-cutting concerns
- Architectural changes
execution_mode: worktree
verification: full pyramid
create_documentation
Creates or updates harness documentation.
capability: create_documentation
provides:
- agents_md # AGENTS.md navigation map
- architecture_md # docs/ARCHITECTURE.md
- development_md # docs/DEVELOPMENT.md
- design_docs # docs/design-docs/*.md
requires:
- layer_map
- key_interfaces
- code_paths
implementation:
agent: agents/creator-docs.md
output_dir: docs/
create_linters
Creates architectural linter scripts.
capability: create_linters
provides:
- lint_deps_script # scripts/lint-deps.{ext}
- lint_quality_script # scripts/lint-quality.{ext}
- layer_map_code # Embedded layer definitions
requires:
- layer_map
- tech_stack
implementation:
agent: agents/creator-linters.md
templates: references/linter-templates.md
create_harness_config
Creates harness configuration files.
capability: create_harness_config
provides:
- environment_json # harness/config/environment.json
- setup_scripts # harness/scripts/*.sh
- makefile_targets # Makefile additions
- ci_config # .github/workflows/ci.yml
requires:
- tech_stack
- dependencies
implementation:
agent: agents/creator-config.md
plan_task
Creates execution plan for a development task.
capability: plan_task
provides:
- execution_plan # docs/exec-plans/active/*.md
- phases # Numbered implementation phases
- file_targets # Files to create/modify per phase
requires:
- complexity_level
- layer_map # For architectural context
- key_interfaces # For understanding dependencies
implementation:
coordinator: true # Main agent does this, not subagent
output: docs/exec-plans/active/{date}-{slug}.md
execute_phase
Executes one phase of a development task.
capability: execute_phase
provides:
- code_changes # Files modified
- files_created # New files
- validation_result # Did changes pass basic checks?
requires:
- execution_plan
- phase_number
implementation:
agent: agents/templates/executor-core.md + mixins
output: JSON result with status, files, lessons
validate_static
Runs static validation: build, lint, test.
capability: validate_static
provides:
- build_result # Compilation passed?
- lint_result # Lint checks passed?
- test_result # Tests passed?
- validation_report # Full report
requires:
- tech_stack
implementation:
script: validate.py
output: harness/trace/validation-report.json
verify_runtime
Runs runtime smoke verification.
capability: verify_runtime
provides:
- server_health # Server starts and responds?
- endpoint_tests # API endpoints work?
- cli_tests # CLI commands work?
- verification_report # Full report
requires:
- runtime_verify_plan # Generated by executor/runtime from environment.json + task context
- code_changes # What was changed (for task-specific tests)
implementation:
script: verify.py
output: harness/trace/verify-report.json
verify_functional
Runs deep functional verification with realistic scenarios.
capability: verify_functional
provides:
- scenario_results # Did scenarios pass?
- side_effect_checks # Were side effects correct?
- verification_report # Full report with evidence
requires:
- verify_runtime # Smoke tests must pass first
- code_changes
implementation:
agent: agents/composed/verifier.md
output: harness/trace/verification-report.json
required_for:
- medium tasks
- complex tasks
audit_harness
Audits existing harness quality.
capability: audit_harness
provides:
- harness_score # 0-100 quality score
- dimension_scores # Per-dimension breakdown
- gaps # Missing or weak areas
- improvement_plan # Prioritized fixes
requires:
- project_type
implementation:
agent: agents/auditor.md
output: harness/.analysis/audit.json
improve_harness
Improves harness based on feedback.
capability: improve_harness
provides:
- updated_linters # Improved lint rules
- updated_docs # Fixed/expanded documentation
- updated_config # Better verification config
requires:
- audit_harness
- OR failure_patterns # From harness_critic.py
implementation:
skill: ecl-harness-engineer (with Improve mode)
auto_evolve_harness
Improves harness from accumulated archived change evidence.
capability: auto_evolve_harness
provides:
- evolution_proposal # Evidence-backed harness deltas
- updated_docs # ECL/templates/development guidance when justified
- updated_linters # Only when repeated evidence supports a mechanical rule
- results_log # harness/evolution/results.tsv keep/revert row
requires:
- harness/evolution/pending.md
- harness/changes/INDEX.json
implementation:
skill: ecl-harness-engineer (Auto-Evolve mode)
guardrails:
- create dedicated active change
- read candidate archive summaries before deeper files
- keep only if audit score improves and validation passes
- do not create eval/trace/memory/checkpoints/metrics by default
Selection Logic
The agent selects capabilities based on the task:
New Project Bootstrap
discover_project
→ analyze_architecture
→ create_documentation + create_linters + create_harness_config
→ validate_static
Simple Task (e.g., fix typo)
discover_project
→ classify_complexity (→ simple)
→ execute_phase (direct, no subagent)
→ validate_static
→ done
Medium Task (e.g., add API endpoint)
discover_project
→ classify_complexity (→ medium)
→ analyze_architecture (if not cached)
→ plan_task
→ [execute_phase loop]
→ validate_static
→ verify_runtime
→ verify_functional (MANDATORY)
→ done
Complex Task (e.g., refactor auth system)
discover_project
→ classify_complexity (→ complex)
→ analyze_architecture
→ plan_task (detailed, multi-phase)
→ EnterWorktree (isolation)
→ [execute_phase loop with checkpoints]
→ validate_static
→ verify_runtime
→ verify_functional (MANDATORY, multiple scenarios)
→ audit_harness (post-task)
→ done
Harness Audit/Improvement
discover_project
→ audit_harness
→ improve_harness (if gaps found)
→ validate_static
Caching
Capabilities can cache their output to avoid re-computation:
| Capability | Cache Location | TTL |
|---|---|---|
| discover_project | harness/.analysis/project.json | Until project files change |
| analyze_architecture | harness/.analysis/architecture.json | Until source files change |
| audit_harness | harness/.analysis/audit.json | 24 hours |
Cache invalidation: Compare file mtimes against cache mtime.