Audit Templates

Templates for auditing and improving existing harness infrastructure.

Advanced profile note: eval and observability sections in this reference apply only when the project explicitly enables advanced agent-platform capabilities. Core ECL harness audits should not fail or lose score just because harness/eval, harness/trace, harness/memory, harness/checkpoints, or harness/metrics are absent.

Audit Checklist

Documentation Audit (25%)

| Item | Check | Score |
|------|-------|-------|
| AGENTS.md exists | `test -f AGENTS.md` | 0/10 |
| AGENTS.md is ~100 lines (not monolithic) | `wc -l AGENTS.md` should be 80-120 | 0/10 |
| docs/ARCHITECTURE.md exists | `test -f docs/ARCHITECTURE.md` | 0/10 |
| Architecture matches reality | Compare layer hierarchy to `go list ./...` | 0/20 |
| docs/DEVELOPMENT.md exists | `test -f docs/DEVELOPMENT.md` | 0/10 |
| Build commands in DEVELOPMENT.md work | Run them and check | 0/10 |
| docs/QUALITY.md exists | `test -f docs/QUALITY.md` | 0/10 |
| Design docs cover major components | Check docs/design-docs/ | 0/10 |
| Reference docs are complete | Check docs/references/ | 0/10 |

Total: /100 → Scale to 25%

Linter Audit (20%)

| Item | Check | Score |
|------|-------|-------|
| scripts/lint-deps.go exists | `test -f scripts/lint-deps.go` | 0/15 |
| Layer map covers all packages | Compare to `go list ./...` | 0/20 |
| Introducing violation fails lint | Add bad import, run lint | 0/15 |
| scripts/lint-quality.go exists | `test -f scripts/lint-quality.go` | 0/15 |
| Quality rules match QUALITY.md | Compare documented rules to linter | 0/10 |
| Makefile has lint-arch target | `grep lint-arch Makefile` | 0/10 |
| `make lint-arch` passes | Run it | 0/15 |

Total: /100 → Scale to 20%

Observability Audit (15%)

| Item | Check | Score |
|------|-------|-------|
| harness/trace/ exists | `test -d harness/trace` | 0/25 |
| Trace format covers all tool types | Check ToolTrace struct | 0/25 |
| harness/selftest/ exists | `test -d harness/selftest` | 0/25 |
| Observability hook registered | Check hook wiring | 0/25 |

Total: /100 → Scale to 15%

Eval Audit (20%)

| Item | Check | Score |
|------|-------|-------|
| harness/eval/framework.go exists | `test -f harness/eval/framework.go` | 0/10 |
| harness/eval/runner.go exists | `test -f harness/eval/runner.go` | 0/10 |
| harness/eval/scorer.go exists | `test -f harness/eval/scorer.go` | 0/10 |
| harness/eval/reporter.go exists | `test -f harness/eval/reporter.go` | 0/10 |
| file_ops/ has 5+ tasks | Count JSON files | 0/10 |
| code_gen/ has 5+ tasks | Count JSON files | 0/10 |
| debugging/ has 5+ tasks | Count JSON files | 0/10 |
| refactoring/ has 5+ tasks | Count JSON files | 0/10 |
| Tasks cover new features | Manual review | 0/10 |
| All tasks still work | Run evals | 0/10 |

Total: /100 → Scale to 20%

Quality Automation Audit (10%)

| Item | Check | Score |
|------|-------|-------|
| harness/quality/score.go exists | `test -f harness/quality/score.go` | 0/25 |
| Quality score calculation works | Run it | 0/25 |
| harness/cleanup/tasks.go exists | `test -f harness/cleanup/tasks.go` | 0/25 |
| Cleanup tasks find real issues | Run dry-run | 0/25 |

Total: /100 → Scale to 10%

Integration Audit (10%)

| Item | Check | Score |
|------|-------|-------|
| `go build ./...` passes | Run it | 0/40 |
| `make lint-arch` passes | Run it | 0/30 |
| CI runs harness checks | Check CI config | 0/30 |

Total: /100 → Scale to 10%

Scoring Rubric

How to Score Each Item

- Binary items (exists / doesn't exist): 0 or full points
- Quality items (matches reality): partial credit based on accuracy
  - 100%: Exact match
  - 75%: Minor discrepancies (1-2 items)
  - 50%: Moderate discrepancies (3-5 items)
  - 25%: Major discrepancies but structure is right
  - 0%: Completely wrong or missing
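The partial-credit bands above can be written as a small lookup. This is a sketch under stated assumptions: the `partialCredit` name and the `structureOK` flag are hypothetical, and any count above 5 is treated as "major discrepancies" as long as the auditor judges the overall structure to be right.

```go
package main

import "fmt"

// partialCredit maps a discrepancy count to the fraction of an item's
// points awarded. structureOK is the auditor's judgment that the overall
// structure is right; the name and flag are illustrative only.
func partialCredit(discrepancies int, structureOK bool) float64 {
	switch {
	case !structureOK:
		return 0 // completely wrong or missing
	case discrepancies == 0:
		return 1.0 // exact match
	case discrepancies <= 2:
		return 0.75 // minor discrepancies
	case discrepancies <= 5:
		return 0.50 // moderate discrepancies
	default:
		return 0.25 // major discrepancies, structure still right
	}
}

func main() {
	// A 20-point item ("Architecture matches reality") with 3 discrepancies.
	points := 20 * partialCredit(3, true)
	fmt.Printf("%.0f/20\n", points) // prints "10/20"
}
```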

Calculating Overall Score

Overall = (Doc × 0.25) + (Linter × 0.20) + (Obs × 0.15) + (Eval × 0.20) + (Quality × 0.10) + (Integration × 0.10)
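The formula can be sketched in Go as a weighted sum over section totals. One assumption here goes beyond the text: to honor the advanced profile note (core audits should not lose score when eval or observability are absent), the sketch skips disabled sections and renormalizes by the remaining weights. The `section` type and `overall` function are hypothetical names, and the sample scores are made up for illustration.

```go
package main

import "fmt"

// section holds one audit section's total and its weight.
type section struct {
	name    string
	score   float64 // section total out of 100
	weight  float64 // share of the overall score
	enabled bool    // false for advanced sections the project has not enabled
}

// overall returns the weighted score over enabled sections, divided by the
// sum of their weights. With every section enabled the weights sum to 1.0,
// so this reduces to the formula above. Renormalizing when sections are
// disabled is an assumption, not something the rubric specifies.
func overall(sections []section) float64 {
	var sum, weights float64
	for _, s := range sections {
		if !s.enabled {
			continue
		}
		sum += s.score * s.weight
		weights += s.weight
	}
	if weights == 0 {
		return 0
	}
	return sum / weights
}

func main() {
	// Illustrative scores for a core-profile project (no eval/observability).
	sections := []section{
		{"Documentation", 80, 0.25, true},
		{"Linter", 60, 0.20, true},
		{"Observability", 0, 0.15, false},
		{"Eval", 0, 0.20, false},
		{"Quality Automation", 50, 0.10, true},
		{"Integration", 100, 0.10, true},
	}
	fmt.Printf("Overall: %.1f%%\n", overall(sections))
}
```

If a project prefers a different treatment for disabled sections (for example, reporting core and advanced scores separately), only `overall` needs to change.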

Score Interpretation

| Score | Status | Action |
|-------|--------|--------|
| 0-20% | Critical | Use Create Mode: build from scratch |
| 21-40% | Poor | Major gaps: extensive improvement needed |
| 41-60% | Fair | Multiple gaps: targeted improvement |
| 61-80% | Good | Minor gaps: polish and expand |
| 81-100% | Excellent | Maintenance mode: keep current |
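The bands in the table can be looked up with a simple switch. This is a minimal sketch with a hypothetical `status` function. The table leaves fractional scores such as 20.5 unassigned, so the sketch assumes each band's upper bound is inclusive and anything above it falls into the next band.

```go
package main

import "fmt"

// status maps an overall score (0-100) to the status label from the table.
// Upper bounds are inclusive, so 20 is "Critical" and 20.5 is "Poor";
// that tie-breaking rule is an assumption.
func status(score float64) string {
	switch {
	case score <= 20:
		return "Critical"
	case score <= 40:
		return "Poor"
	case score <= 60:
		return "Fair"
	case score <= 80:
		return "Good"
	default:
		return "Excellent"
	}
}

func main() {
	for _, s := range []float64{12, 47, 72.3, 95} {
		fmt.Printf("%.1f%% -> %s\n", s, status(s))
	}
}
```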

Gap Analysis Templates

Documentation Drift Report

## Documentation Drift Analysis

### ARCHITECTURE.md Layer Hierarchy

**Documented Layers:**

[Copy from ARCHITECTURE.md]

**Actual Package Structure:**
```bash
go list ./... | grep -v vendor
```

**Discrepancies:**

| Documented | Actual | Issue |
|------------|--------|-------|
| core/types | core/types | ✓ Match |
| core/agent | core/agent | ✓ Match |
| - | core/newpkg | Missing from docs |
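
When filling in the discrepancies table, the comparison between documented and actual packages can be computed as a set difference. The sketch below is illustrative only: `drift` is a hypothetical helper, the package lists are the example rows from the table, and the "stale" direction (documented but no longer present) is an added assumption that the template does not show.

```go
package main

import (
	"fmt"
	"sort"
)

// drift compares documented package names against actual ones. It returns
// packages that exist but are missing from the docs, and documented
// packages that no longer exist (stale). Both results are sorted.
func drift(documented, actual []string) (missingFromDocs, stale []string) {
	doc := make(map[string]bool, len(documented))
	for _, p := range documented {
		doc[p] = true
	}
	act := make(map[string]bool, len(actual))
	for _, p := range actual {
		act[p] = true
		if !doc[p] {
			missingFromDocs = append(missingFromDocs, p)
		}
	}
	for _, p := range documented {
		if !act[p] {
			stale = append(stale, p)
		}
	}
	sort.Strings(missingFromDocs)
	sort.Strings(stale)
	return missingFromDocs, stale
}

func main() {
	documented := []string{"core/types", "core/agent"}
	actual := []string{"core/types", "core/agent", "core/newpkg"}

	missing, stale := drift(documented, actual)
	for _, p := range missing {
		fmt.Printf("| - | %s | Missing from docs |\n", p)
	}
	for _, p := range stale {
		fmt.Printf("| %s | - | Documented but not found |\n", p)
	}
}
```

In practice the `actual` slice would come from the output of `go list ./...`, one package per line.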

### Tool Catalog

**Documented Tools:** [count]
**Actual Tools:** [count]

**Missing from docs:**

- ToolA (added in commit abc123)
- ToolB (added in commit def456)

### Error Codes

**Documented Codes:** [count]
**Actual Codes:** [count]