---
name: ecl-harness-engineer
description: "Create or audit ECL Agent Harness infrastructure: AGENTS.md, change tracking, repository guidance, lint checks, CI gates, and agent handoff docs."
category: development
risk: safe
source: community
source_repo: qinghui316/ecl-harness-engineer
source_type: community
date_added: "2026-06-13"
author: qinghui316
tags: [codex, agent-harness, ecl, workflow, ci]
tools: [codex, claude, cursor, gemini, antigravity]
license: MIT
license_source: "https://github.com/qinghui316/ecl-harness-engineer/blob/main/LICENSE"
---

# ECL Harness Engineer
Design and create Harness Engineering infrastructure so AI agents can work reliably in a codebase.

> **Core Philosophy**: "Intelligence without infrastructure is just a demo." The Agent Harness is the Operating System — the LLM is just the CPU. The repository becomes the single source of truth — if an agent can't see it in context, it doesn't exist.

## When to Use This Skill

- Use when a repository needs AI-agent collaboration infrastructure such as `AGENTS.md`, `docs/ECL.md`, `docs/STATUS.md`, harness change tracking, or mechanical validation gates.
- Use when auditing an existing Agent Harness for missing ECL lifecycle docs, change templates, lint checks, environment contracts, or CI integration.
- Use when converting repeated agent workflow failures into repository-local documentation, tests, lint rules, or lightweight auto-evolution checks.
- Do not use for ordinary business feature implementation unless the requested work is specifically about creating or improving the repository harness.

## Limitations

- This skill creates or audits harness infrastructure; it does not replace product requirements, implementation planning, code review, or release approval for the target project.
- The generated ECL docs, linters, scripts, and CI examples must be adapted to the repository's actual stack, security model, and existing contributor workflow before enforcement.
- Auto-evolve recommendations are guidance only. Apply harness changes through normal review, validation, and rollback discipline instead of accepting them as autonomous policy changes.

## Unified Workflow

This skill follows a single unified workflow regardless of project state (empty, existing code, or existing harness). The core idea: **detect the gap between current state and target state, then fill it**.

Default to a **core ECL harness**. Core includes lightweight auto-evolve threshold checking:
closed changes are counted, a pending evolution note is generated when the threshold is reached,
and Codex applies harness improvements only through evidence, validation, scoring, and rollback.
Advanced agent-platform capabilities such as eval datasets, execution traces, durable state,
checkpoints, long-term memory, and metrics remain optional profiles, created only when the user
explicitly asks for agent evaluation, observability, resumable execution, or long-term memory.

This skill improves the target repository's agent harness. It does **not** implement ordinary
business features, replace the coding agent's plan mode, or create a separate requirements product.
Plan mode is useful for live discussion; ECL artifacts are the repository record that later agents,
linters, CI, and archive history can inspect.

1. **Quick Detection + Intent Confirmation** — what exists, what already passes, and what the user wants.
2. **Analysis** — architecture, harness state, environment, and project identity.
3. **Intake Review + Delta Synthesis** — classify small vs structured work, support requirement-first
   and plan-first inputs, and compute exactly what to create or update.
4. **Creation/Update** — docs, status handoff, linters, ECL/change scripts, environment config, and CI.
5. **Verification + Handoff** — run checks, attribute failures, update STATUS.md, trigger auto-evolve checks, and summarize results.

---

## Phase 1: Quick Detection + Intent Confirmation

**Goal**: In under 5 minutes, understand project state and user intent.

### 1.1 Project State Detection

Run this quick scan:

```bash
# Count files
file_count=$(find . -type f ! -path './.git/*' ! -path './node_modules/*' ! -path './vendor/*' 2>/dev/null | wc -l)
code_files=$(find . -type f \( -name "*.go" -o -name "*.ts" -o -name "*.js" -o -name "*.py" -o -name "*.rs" -o -name "*.java" \) ! -path './.git/*' ! -path './node_modules/*' ! -path './vendor/*' 2>/dev/null | wc -l)

# Check harness components
has_agents_md=$(test -f AGENTS.md && echo "yes" || echo "no")
has_architecture=$(test -f docs/ARCHITECTURE.md && echo "yes" || echo "no")
has_linters=$(ls scripts/lint-* 2>/dev/null | wc -l)
has_harness_dir=$(test -d harness && echo "yes" || echo "no")
has_ecl_doc=$(test -f docs/ECL.md && echo "yes" || echo "no")
has_changes_dir=$(test -d harness/changes && echo "yes" || echo "no")
has_change_templates=$(test -d harness/templates/change && echo "yes" || echo "no")
has_change_script=$(ls scripts/harness-change.* 2>/dev/null | wc -l)
has_evolve_script=$(ls scripts/harness-evolve.* 2>/dev/null | wc -l)
has_ecl_lint=$(ls scripts/lint-ecl.* 2>/dev/null | wc -l)
has_encoding_lint=$(ls scripts/lint-encoding.* 2>/dev/null | wc -l)
has_makefile=$(test -f Makefile && echo "yes" || echo "no")
has_package_json=$(test -f package.json && echo "yes" || echo "no")

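# Detect tech stack (first match wins; mixed repos fall back to the generic adapter)
if test -f go.mod; then TECH="Go"
elif test -f package.json; then TECH="TypeScript/Node.js"
elif test -f requirements.txt || test -f pyproject.toml; then TECH="Python"
elif test -f Cargo.toml; then TECH="Rust"
elif test -f pom.xml || test -f build.gradle || test -f build.gradle.kts; then TECH="Java"
else TECH="Unknown"
fi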
```

### 1.2 Classify Project State

Based on detection:

| State | Criteria | Action |
|-------|----------|--------|
| **Empty** | file_count < 5 AND code_files = 0 | Guide user through project choices first |
| **Code Only** | code_files > 0 AND has_agents_md = "no" | Full analysis + core harness creation |
| **Partial Harness** | has_agents_md = "yes" AND (has_linters = 0 OR has_harness_dir = "no") | Gap analysis + fill gaps |
| **Harness Present** | Core harness components exist | Audit + improvement suggestions |

Also classify ECL readiness:

| ECL State | Criteria | Action |
|-----------|----------|--------|
| **ECL Missing** | has_ecl_doc = "no" OR has_changes_dir = "no" | Create ECL docs, change templates, and scripts |
| **ECL Partial** | ECL doc exists but scripts/templates missing | Fill ECL automation gaps |
| **ECL Ready** | docs/ECL.md, harness/changes, templates, harness-change, harness-evolve, lint-ecl, lint-encoding exist | Audit index freshness and workflow quality |
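
The two tables above can be read as a pair of small pure functions. The following is a minimal sketch, assuming the Phase 1.1 scan values have already been converted to integers and booleans. The function names, the `ECL_COMPONENTS` keys, and the `"Unclassified"` fallback are illustrative assumptions, not part of the skill.

```python
# Hypothetical keys mirroring the has_* variables from the Phase 1.1 scan.
ECL_COMPONENTS = (
    "ecl_doc", "changes_dir", "change_templates", "change_script",
    "evolve_script", "ecl_lint", "encoding_lint",
)


def classify_project_state(file_count, code_files, has_agents_md,
                           linter_count, has_harness_dir):
    """Map scan results onto the project-state table, checking rows in order."""
    if file_count < 5 and code_files == 0:
        return "Empty"
    if code_files > 0 and not has_agents_md:
        return "Code Only"
    if has_agents_md and (linter_count == 0 or not has_harness_dir):
        return "Partial Harness"
    if has_agents_md:
        return "Harness Present"
    # Assumption: the table has no row for this case (for example a docs-only repo).
    return "Unclassified"


def classify_ecl_state(present):
    """Map a dict of component name -> bool onto the ECL readiness table."""
    if not present.get("ecl_doc") or not present.get("changes_dir"):
        return "ECL Missing"
    if all(present.get(name) for name in ECL_COMPONENTS):
        return "ECL Ready"
    return "ECL Partial"
```

Keeping the classification free of file-system access makes it easy to unit-test separately from the shell scan.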

### 1.3 Baseline Verification Snapshot

For existing projects, capture a best-effort baseline before creating or updating harness files.
The baseline is for attribution only: it distinguishes pre-existing project failures from
failures introduced by harness work. It must not be used to weaken default CI.

Run only commands that already exist in the project:

| Ecosystem | Baseline commands |
|-----------|-------------------|
| TypeScript/Node.js | package scripts such as `lint`, `typecheck`, `test`, `build`; include nested package build scripts when detected |
| Go | `go test ./...`, `go build ./...`, existing `make lint` or `make test` |
| Python | existing test/lint scripts, `python -m compileall .` |

Record each command as `pass`, `fail`, or `missing`, with the short failure reason. If a command
fails before harness creation, report it later as **pre-existing project debt**, not as harness
failure. Default CI remains strict and should still include normal business gates unless the user
explicitly asks for a temporary staged rollout.
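
One way to capture the baseline is a small runner that records each command's status and a short reason. This is a sketch under stated assumptions: `run_baseline` and its result shape are hypothetical, and `missing` here only means the executable is not on `PATH`. Detecting a missing package script or Makefile target would need stack-specific checks from the adapter.

```python
import shutil
import subprocess
import sys


def run_baseline(commands):
    """Run each command (name -> argv list) and record pass, fail, or missing."""
    results = {}
    for name, argv in commands.items():
        if shutil.which(argv[0]) is None:
            results[name] = {"status": "missing", "reason": f"{argv[0]} not found"}
            continue
        proc = subprocess.run(argv, capture_output=True, text=True)
        if proc.returncode == 0:
            results[name] = {"status": "pass", "reason": ""}
            continue
        # Keep only the last output line as the short failure reason.
        lines = (proc.stderr or proc.stdout).strip().splitlines()
        reason = lines[-1] if lines else f"exit code {proc.returncode}"
        results[name] = {"status": "fail", "reason": reason}
    return results


if __name__ == "__main__":
    # Harmless demo command; a real baseline would list the project's own gates.
    print(run_baseline({"python-version": [sys.executable, "--version"]}))
```

Storing the result before any harness file is written gives Phase 5 something concrete to compare against.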

### 1.4 Intent Confirmation

Before planning changes, classify requested scope:

| Scope | Default? | Includes |
|-------|----------|----------|
| **Core harness** | Yes | AGENTS.md, docs/ECL.md, docs/STATUS.md, project docs, ECL changes, lightweight auto-evolve, linters, environment contract, CI |
| **Advanced harness** | No | Core harness plus explicitly requested eval, trace, state, checkpoints, memory, or metrics |
| **Documentation only** | No | AGENTS.md and docs without linters, scripts, or CI |

When a user-confirmation tool is available, confirm scope. In Codex, use `request_user_input`.
On other platforms, use the equivalent user-choice tool. If no such tool is available, use the
detected context and record assumptions.

```json
{
  "question": "What's your priority for this harness setup?",
  "header": "Scope",
  "multiSelect": false,
  "options": [
    {
      "label": "Core harness (Recommended)",
      "description": "Project-first AGENTS.md, ECL changes, STATUS handoff, auto-evolve threshold checks, linters, environment contract, and strict CI"
    },
    {
      "label": "Advanced harness",
      "description": "Core harness plus explicitly requested eval, trace, memory, checkpoint, or metrics infrastructure"
    },
    {
      "label": "Documentation only",
      "description": "AGENTS.md and project docs only; skip linters, scripts, and CI for now"
    }
  ]
}
```

**If Empty project**, also ask for basics:

```json
{
  "question": "What tech stack for this project?",
  "header": "Tech Stack",
  "multiSelect": false,
  "options": [
    {"label": "Go", "description": "CLI tools, high-performance services, system programming"},
    {"label": "TypeScript/Node.js", "description": "Web APIs, full-stack apps, rapid prototyping"},
    {"label": "Python", "description": "Data processing, ML/AI, scripting"}
  ]
}
```

If no user-confirmation tool is available, use detected values and document assumptions:

```markdown
## Auto-Detected Context

| Field | Value | Confidence | Evidence |
|-------|-------|------------|----------|
| Tech Stack | {TECH} | High | Found {config file} |
| Project State | {state} | High | {criteria matched} |
| Scope | Core harness | Default | No user preference specified |

Proceeding with these assumptions. Tell me if any need adjustment.
```

### 1.5 ECL Work Intake Rules

When generating ECL guidance for a target project, keep the process small enough to use:

| Intake type | Criteria | Required ECL handling |
|-------------|----------|-----------------------|
| **Small Change** | Local, low-risk edits such as copy, comments, style-only tweaks, or single-file bug fixes with no interface, data, permission, architecture, or release impact | Active change optional; still record the verification command in the final response or existing task notes |
| **Structured Change** | Cross-file/module behavior, APIs, data model, permissions, architecture, validation chain, unclear requirements, or work likely to exceed 20 minutes | Use active change files and require intake/spec/plan review before implementation |

Decision tree:

1. If an active change already exists, keep using it; do not create a second active context.
2. If the change is copy, comments, README text, formatting, or an obviously local single-file fix
   with no runtime, API, data, permission, architecture, or validation-chain impact, treat it as
   Small Change.
3. If the change touches APIs, data, permissions, architecture, multiple modules, release/runtime
   behavior, or unclear requirements, treat it as Structured Change.
4. If impact is unclear, do read-only investigation first. If uncertainty remains after inspection,
   ask one high-impact question or upgrade to Structured Change; do not assume Small Change.
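
The decision tree can be sketched as a function over a few flags. The `ChangeRequest` fields and the return labels are hypothetical; a real agent would derive the flags from repository inspection rather than receive them ready-made.

```python
from dataclasses import dataclass


@dataclass
class ChangeRequest:
    # All fields are illustrative assumptions, not part of the skill.
    active_change_exists: bool = False
    touches_structured_area: bool = False  # APIs, data, permissions, architecture, modules, release
    local_low_risk: bool = False           # copy, comments, README text, formatting, local fix
    impact_unclear: bool = False           # still unclear after read-only investigation


def classify_intake(req):
    """Apply the four decision-tree rules in order."""
    if req.active_change_exists:
        return "Use existing active change"
    if req.touches_structured_area:
        return "Structured Change"
    if req.impact_unclear:
        # Rule 4: ask one high-impact question or upgrade; never assume Small Change.
        return "Structured Change"
    if req.local_low_risk:
        return "Small Change"
    # Nothing positively marks the change as small, so take the safer path.
    return "Structured Change"
```

Note that the structured check runs before the small-change check, so a single-file edit that also touches an API is never treated as small.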

For structured changes, support both common entry points:

- **Requirement-first input**: extract target users/scenarios, evidence, success criteria,
  acceptance criteria, non-goals, constraints, assumptions, and risks into `spec.md`.
- **Plan-first input**: treat the user's plan as a draft, split WHAT/WHY into `spec.md` and HOW into
  `plan.md`, then ask only about high-impact gaps that affect implementation direction or acceptance.
  If the plan is complete and does not conflict with repository evidence, do not repeat a full
  interview. If it conflicts with code, docs, commands, or existing harness constraints, record the
  conflict and return to Intake Review.

Questions are allowed and expected, but must be bounded: ask at most three high-impact questions per
round. Low-risk unknowns become assumptions; high-impact unknowns become
`[NEEDS CLARIFICATION: ...]` and block implementation until resolved.

For complex structured changes, use a lightweight iteration loop rather than treating the first
spec as final:

```text
Draft Spec -> Draft Plan -> Review Gaps -> Revise Spec/Plan -> Gate -> Tasks
```

Default to at most two loops. If key gaps remain, continue up to five loops; after that, record a
blocker instead of implementing from guesses. `plan.md` must include any planning-discovered spec
gaps, because plans often expose missing acceptance, boundary, permission, data, or validation
requirements.
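
The loop budget can be expressed as a bounded iteration. In this sketch, `find_key_gaps` is a hypothetical callback that returns the key gaps still open after a given round, and the status strings are illustrative.

```python
DEFAULT_LOOPS = 2  # normal budget
MAX_LOOPS = 5      # hard ceiling while key gaps remain


def iterate_spec_plan(find_key_gaps):
    """Repeat draft/review rounds until no key gaps remain or the ceiling is hit."""
    gaps = []
    for round_number in range(1, MAX_LOOPS + 1):
        gaps = find_key_gaps(round_number)
        if not gaps:
            return {
                "status": "gate-ready",
                "rounds": round_number,
                "extended": round_number > DEFAULT_LOOPS,
                "gaps": [],
            }
    # Budget exhausted: record a blocker instead of implementing from guesses.
    return {"status": "blocked", "rounds": MAX_LOOPS, "extended": True, "gaps": gaps}
```

The `extended` flag makes it visible in the change record when a spec needed more than the default two loops.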

---

## Phase 2: Analysis

**Goal**: Deeply understand codebase architecture, harness state, and environment requirements.

### 2.1 Execution Mode

Use subagents only when the user authorized delegation and the environment supports it. Otherwise, execute the same responsibilities inline.

If using subagents, assign:

- Code architecture analysis: follow `agents/analyzer.md`; output `harness/.analysis/architecture.json`.
- Harness state audit: follow `agents/auditor.md`; output `harness/.analysis/audit.json`.
- Environment analysis: follow `references/environment-detection-guide.md`; output `harness/.analysis/environment.json`.

If working inline, produce the same three analysis artifacts or equivalent in-memory summaries before Phase 3.

### 2.2 Project Identity Extraction

For existing projects, extract target-project meaning before writing docs:
- One-sentence project identity: what it does and for whom.
- Core workflow or domain model: user/system flow, key entities, API resources, jobs, or commands.
- Primary source entrypoints and where common changes belong.

Use `README.md`, manifests, entrypoints, routes/controllers, schemas/models, and key source
directories. Harness files are not sufficient evidence for project identity.

### 2.3 Adapter Selection

After detecting the tech stack, load the matching adapter before creating linters, scripts, CI,
or environment config. Adapter guidance overrides generic templates for language-specific details.

| Detected stack | Required adapter |
|----------------|------------------|
| TypeScript/Node.js | `references/adapters/typescript.md` |
| Go | `references/adapters/go.md` |
| Python | `references/adapters/python.md` |
| Rust | `references/adapters/rust.md` |
| Java | `references/adapters/java.md` |
| Unknown/mixed | `references/adapters/generic.md` plus any detected language adapters |

For TypeScript/Node.js projects, prefer Node/TS-native outputs: `scripts/lint-deps.mjs` or
equivalent, `scripts/lint-quality.mjs`, npm/package-manager scripts, and Node/TS GitHub Actions.
Do not adapt Go linter or Makefile-only patterns to TypeScript unless the project is actually Go
or already uses Makefile as the primary command surface.

### 2.4 Command Surface Selection

Before creating ECL scripts, select the target project's command surface. Do not assume
PowerShell is the only Windows option. This selection is normally automatic; do not ask the user to
choose a script format unless project evidence conflicts or the user has already expressed a hard
constraint.

Priority:

1. Existing project entrypoints: package-manager scripts, Makefile targets, README commands,
   or CI shell conventions.
2. Explicit user/project constraints. If the project rejects `.ps1`, do not generate PowerShell
   as the only harness entrypoint.
3. Bash profile when allowed. For Windows projects that accept Bash, generate `.sh` scripts and
   document the prerequisite: Git Bash, WSL, MSYS2, or a CI Linux runner.
4. PowerShell profile when the project accepts Windows-native PowerShell. Keep it compatible with
   Windows PowerShell 5.1 and PowerShell 7.
5. Node or Python profiles when those runtimes are already first-class project dependencies.

Default when evidence is sparse: for TypeScript/Node projects choose Node/package-manager scripts;
for Windows projects that allow Bash choose Bash profile and document Git Bash/WSL/MSYS2; otherwise
choose the adapter's native lightweight scripting profile.

All profiles must implement the same ECL invariants and command set. `harness-change`,
`harness-evolve`, `lint-ecl`, and `lint-encoding` may be implemented as `.ps1`, `.sh`, `.mjs`,
or `.py`, but docs, CI, Makefile/package scripts, and verification commands must use the chosen
entrypoint consistently.
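
The priority order above amounts to building an ordered candidate list and taking the first profile that is not rejected. The following sketch assumes profiles are identified by script extension; the parameter names are hypothetical and the evidence gathering is left out.

```python
def select_command_surface(existing_profile=None, rejected=(), bash_allowed=False,
                           powershell_accepted=False, first_class_runtimes=(),
                           adapter_default="sh"):
    """Return the first acceptable profile ("sh", "ps1", "mjs", or "py")."""
    banned = set(rejected)              # explicit user/project constraints
    candidates = []
    if existing_profile:                # profile implied by existing entrypoints
        candidates.append(existing_profile)
    if bash_allowed:
        candidates.append("sh")
    if powershell_accepted:
        candidates.append("ps1")
    candidates.extend(first_class_runtimes)  # for example "mjs" or "py"
    candidates.append(adapter_default)       # sparse-evidence fallback
    for profile in candidates:
        if profile not in banned:
            return profile
    raise ValueError("every candidate profile is rejected; ask the user")
```

For example, a project that rejects `.ps1` but already depends on Python would resolve to `py` even if PowerShell is installed.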

### 2.5 Wait for Analysis Completion

When subagents are running, wait for their final reports. While waiting, you can:
- Review any existing documentation
- Prepare templates for Phase 4

### 2.6 For Empty Projects

Skip Phase 2 analysis agents. Instead:
- Use templates from `references/greenfield-templates.md`
- Base decisions on user's tech stack choice
- Design a standard 3-layer architecture

---

## Phase 3: Delta Synthesis

**Goal**: Merge analysis results and compute exactly what needs to be created/updated.

### 3.1 Read Analysis Results

```bash
cat harness/.analysis/architecture.json
cat harness/.analysis/audit.json
cat harness/.analysis/environment.json
```

### 3.2 Compute Delta

Create a delta list:

```markdown
## Delta: What Needs to Be Done

### Core To Create (doesn't exist)
- [ ] AGENTS.md
- [ ] docs/ECL.md
- [ ] docs/STATUS.md
- [ ] docs/ARCHITECTURE.md
- [ ] scripts/lint-deps.{ext}
- [ ] scripts/harness-change.{ps1|sh|mjs|py}
- [ ] scripts/harness-evolve.{ps1|sh|mjs|py}
- [ ] scripts/lint-ecl.{ps1|sh|mjs|py}
- [ ] scripts/lint-encoding.{ps1|sh|mjs|py}
- [ ] harness/changes/{active,parking,archive}
- [ ] harness/templates/change/
- [ ] harness/config/environment.json
- [ ] harness/evolution/{state.json,results.tsv,proposals/} (`pending.md` is generated later only when the archive threshold is reached)

### Optional Advanced (only if explicitly requested)
- [ ] harness/eval/ — agent evaluation datasets and runner inputs
- [ ] harness/trace/ — execution traces for agent runs
- [ ] harness/state/ — executor runtime state
- [ ] harness/checkpoints/ — resumable execution checkpoints
- [ ] harness/memory/ — long-term agent memory experiments
- [ ] harness/metrics/ — execution and quality metrics

### To Update (exists but has gaps)
- [ ] docs/DEVELOPMENT.md — missing build commands
- [ ] scripts/lint-quality.py — missing 3 packages in layer map

### Already Good (no changes needed)
- [x] Makefile — has all required targets
- [x] .github/workflows/ci.yml — properly configured
```

### 3.3 Confirm with User (if confirmation tool is available)

For significant changes:

```json
{
  "question": "I've analyzed the codebase. Ready to proceed with these changes?",
  "header": "Confirm",
  "multiSelect": false,
  "options": [
    {"label": "Yes, proceed with all", "description": "Create/update all identified items"},
    {"label": "Show me the details first", "description": "I'll explain what each change involves"},
    {"label": "Only critical items", "description": "Just P0/P1 items, skip P2/P3 for now"}
  ]
}
```

---

## Phase 4: Creation/Update

**Goal**: Create or update all harness files from the delta.

### 4.1 Execution Mode

Use subagents only when authorized and available. Otherwise, perform the same work inline. Keep write scopes disjoint if using parallel workers.

Creation responsibilities:

- Documentation: follow `agents/creator-docs.md`; create/update AGENTS.md, docs/ECL.md, docs/STATUS.md, docs/ARCHITECTURE.md, docs/DEVELOPMENT.md, and design docs. AGENTS.md is the target project's entry map, not a harness creation record. Keep the first screen project-first, but preserve ECL/current-change priority in context loading: `AGENTS.md` -> `docs/ECL.md` -> active change if present -> auto-evolve pending if present -> otherwise `docs/STATUS.md` -> task-specific project docs.
- Linters: follow `agents/creator-linters.md`; create/update dependency, quality, ECL, and encoding checks.
- Config and scripts: follow `agents/creator-config.md`; create/update environment contract, harness scripts, changes directories/templates, lightweight evolution state, harness-change, harness-evolve, Makefile targets, and CI. Create advanced directories only when the confirmed scope requires them.

ECL change templates must include `summary.md`, `spec.md`, `plan.md`, `tasks.md`, and
`reviews/review.md`. `spec.md` captures WHAT/WHY, `plan.md` captures HOW and planning-discovered
spec gaps, and `tasks.md` is generated only after the spec/plan gate is ready enough for
implementation. Do not require old archived changes to contain `plan.md`; compatibility applies to
history.
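
A template completeness check is simple enough to sketch directly. The function name is hypothetical; the file list comes from the paragraph above, and the check is meant for `harness/templates/change/`, not for archived history.

```python
import tempfile
from pathlib import Path

REQUIRED_TEMPLATE_FILES = (
    "summary.md", "spec.md", "plan.md", "tasks.md", "reviews/review.md",
)


def missing_template_files(template_dir):
    """Return the required template files that are absent, in listed order."""
    root = Path(template_dir)
    return [name for name in REQUIRED_TEMPLATE_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Demo against an empty directory: every required file is reported.
    with tempfile.TemporaryDirectory() as tmp:
        print(missing_template_files(tmp))
```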

Important: do not create static verification config such as `harness/config/verify.json`. Verification plans are generated at runtime by the executor from `environment.json` and the task context.

Strict CI rule: default CI must include normal business quality gates (`lint`, `typecheck`, `test`,
`build`, and backend/package-specific equivalents when available) plus harness checks. Do not remove
or skip business gates because the baseline is red. If the baseline was already red, explain that CI
will be red until the pre-existing project issues are fixed. Generate staged or relaxed CI only when
the user explicitly asks for it.

Command surface rule: create ECL scripts for the selected profile, not a hardcoded shell. If Bash is
selected on Windows, document Git Bash, WSL, MSYS2, or CI Linux shell requirements in the generated
environment/development docs. If PowerShell is selected, detect whether `pwsh` is available; if not,
use `powershell -NoProfile -ExecutionPolicy Bypass`. PowerShell templates must be compatible with
Windows PowerShell 5.1: avoid ambiguous overloads such as `TrimStart(".\")`, and avoid non-ASCII
mojibake marker string literals in `.ps1`; represent markers by Unicode codepoint or another
PS5-safe construction.

### 4.2 For Empty Projects: Also Create Business Code Plan

For empty projects, add one more agent:

```
Agent("create-exec-plan", prompt="""
Create execution plan for business code (harness-executor will implement this):

Tech stack: {TECH}
Project type: {from user choice}
Architecture: 3-layer (Types → Core → Entry Points)

Create: docs/exec-plans/active/bootstrap-code.md

Contents:
- Full source code for initial project structure
- main.go/index.ts/main.py entry point
- Basic types and core logic
- Test files

This is for harness-executor to implement — not ecl-harness-engineer's responsibility.
""")
```

### 4.3 Wait for Creation Completion

If subagents were used, wait for their completion reports and collect any issues they encountered. If working inline, note issues as you go.

---

## Phase 5: Verification + Handoff

**Goal**: Ensure everything works, then hand off or present results.

### 5.1 Run Verification

```bash
# 0. Compare against the baseline snapshot
# Re-run the same existing lint/typecheck/test/build commands captured in Phase 1.

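# 1. Harness checks pass
# Run only the entrypoint this project uses. Chaining alternatives with || would
# retry a different tool after a real failure and can mask it.
make verify-harness        # or: npm run lint:harness, or {generated_harness_lint_command}

# 2. Architecture linters pass
make lint-arch             # or: npm run lint:arch

# 3. Business build/test gates run (pick the command for the detected stack)
go build ./...             # or: npm run build, or python -m compileall .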

# 4. AGENTS.md size check
wc -l AGENTS.md  # Should be 80-120 lines

# 4b. AGENTS.md content gate
# Confirm it explains project identity, core workflow/domain model, source entrypoints,
# task-based verification, active-change-before-STATUS loading, and contains no
# ECL Harness Engineer internal boundary language.

# 5. All expected files exist
test -f AGENTS.md && echo "✓ AGENTS.md"
test -f docs/ARCHITECTURE.md && echo "✓ ARCHITECTURE.md"
test -f docs/ECL.md && echo "✓ ECL.md"
test -f docs/STATUS.md && echo "✓ STATUS.md"
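# Globs go through ls because `test -f` errors when a pattern matches several files.
ls scripts/lint-deps* >/dev/null 2>&1 && echo "✓ lint-deps"
ls scripts/harness-change.* >/dev/null 2>&1 && echo "✓ harness-change"
ls scripts/lint-ecl.* >/dev/null 2>&1 && echo "✓ lint-ecl"
ls scripts/lint-encoding.* >/dev/null 2>&1 && echo "✓ lint-encoding"
ls scripts/harness-evolve.* >/dev/null 2>&1 && echo "✓ harness-evolve"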
test -d harness/ && echo "✓ harness/"
test -d harness/changes && echo "✓ harness/changes"
test -f harness/evolution/state.json && echo "✓ evolution state"

# 6. Design docs exist (not just index)
find docs/design-docs -name "*.md" ! -name "index.md" | wc -l
```

Classify every verification result:

| Classification | Meaning |
|----------------|---------|
| Harness pass | Harness-created checks/files/scripts work |
| Pre-existing project failure | The same command failed in the Phase 1 baseline |
| New regression | The command passed in Phase 1 and fails after harness creation |
| Not available | The command/script does not exist in this project |
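
Attribution is a comparison of the Phase 1 status with the post-harness status for the same command. A minimal sketch follows; the `"Harness failure"` and `"Pass"` labels are assumptions added for cases the table does not list.

```python
def classify_result(baseline, after, harness_check=False):
    """Classify one command from its baseline and post-harness status.

    Statuses are "pass", "fail", or "missing", as recorded in Phase 1.
    """
    if after == "missing":
        return "Not available"
    if harness_check:
        # Harness-created checks have no baseline to compare against.
        return "Harness pass" if after == "pass" else "Harness failure"
    if after == "fail" and baseline == "fail":
        return "Pre-existing project failure"
    if after == "fail" and baseline == "pass":
        return "New regression"
    if after == "pass":
        return "Pass"
    # For example, a command that fails now but was missing from the baseline.
    return "Unclassified"
```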

AGENTS.md content gate:
- A new agent can tell what the project does within 30 seconds.
- The core product/system workflow or domain model is visible.
- Main source entrypoints and task-to-directory mapping are visible.
- Verification guidance maps to task type.
- Context loading reads `docs/ECL.md` first, then active change when present.
- If no active change exists and `harness/evolution/pending.md` exists, read it before
  `docs/STATUS.md`, mention it as pending maintenance, and ask whether to handle it now unless the
  user already prioritized the current task. Reading or asking does not start auto-evolve and must
  not block ordinary user work.
- If no active change exists and no pending evolution exists, context loading reads `docs/STATUS.md` before task-specific project docs.
- For structured work, `docs/ECL.md` explains Small Change vs Structured Change, bounded Intake
  Review, plan-first input handling, and the spec/plan review gate.
- Archive history is loaded selectively through `docs/STATUS.md` paths or `harness/changes/INDEX.json`, starting with historical `summary.md` only.
- No skill-internal boundary leaks, such as sections or sentences that describe this skill's own scope limits as target-project rules.
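
The context-loading rules in this gate can be sketched as a function that returns the read order for a repository. It assumes each active change is a subdirectory of `harness/changes/active/`, which is an illustrative layout; task-specific project docs would follow whatever the function returns.

```python
import tempfile
from pathlib import Path


def context_load_order(repo_root):
    """Return repository-relative paths in the order an agent should read them."""
    root = Path(repo_root)
    order = ["AGENTS.md", "docs/ECL.md"]
    active_dir = root / "harness/changes/active"
    active = []
    if active_dir.is_dir():
        active = sorted(p for p in active_dir.iterdir() if p.is_dir())
    if active:
        # An active change is the authority; STATUS.md is skipped while it exists.
        order.append(active[0].relative_to(root).as_posix())
        return order
    if (root / "harness/evolution/pending.md").is_file():
        # Reading pending.md is context only and does not start auto-evolve.
        order.append("harness/evolution/pending.md")
    order.append("docs/STATUS.md")
    return order


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(context_load_order(tmp))
```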

### 5.2 STATUS.md Handoff Update

When a target project uses ECL changes, maintain `docs/STATUS.md` as a lightweight handoff file.
It is not the authority while an active change exists, but it becomes the default recent-history
entry point after the active change is closed.

Close-change handoff protocol:

1. Before running `harness-change close`, read the active change `summary.md`, `spec.md`,
   `plan.md`, `tasks.md`, and relevant `reviews/`; update `docs/STATUS.md` with completed work,
   verification results, residual risks, and the next recommended resume point.
2. Run the close command so the active change moves to `harness/changes/archive/...` and
   `harness/changes/INDEX.json` is rebuilt.
3. After close, update `docs/STATUS.md` again with the final archive path, normally pointing to
   the archived `summary.md`.
4. Run the harness lint command (`npm run lint:harness`, `make verify-harness`, or the generated
   ECL lint command) to confirm STATUS, ECL structure, and INDEX state are consistent.

Hooks and CI may validate `docs/STATUS.md`, but must not auto-write it or move changes.

### 5.3 Auto-Evolve Check

Core harnesses include lightweight auto-evolve by default. The script layer only detects when
enough new archive evidence exists and writes `harness/evolution/pending.md`; Codex performs the
semantic improvement pass.

Trigger model: `harness-change close` and `reindex` run `harness-evolve check`; `new` only reminds
when pending exists. Hooks and CI may warn, but must not modify docs, scripts, STATUS, or changes.
Generated scripts do not call subagents. They only count archive evidence and create pending
context. When no active change exists and Codex notices pending maintenance, it should ask the user
whether to handle it now unless the user already prioritized the current task. Asking does not start
pending evolution.
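
The script-layer behavior described above (count archive evidence, write `pending.md`, touch nothing else) can be sketched as follows. The `state.json` fields `processed_archives` and `threshold`, the default threshold of 5, and the assumption that every archived change directory contains a `summary.md` are all hypothetical; the real schema lives in the generated templates.

```python
import json
import tempfile
from pathlib import Path


def evolve_check(repo_root, default_threshold=5):
    """Write harness/evolution/pending.md once enough new archives exist."""
    root = Path(repo_root)
    archive_dir = root / "harness/changes/archive"
    state_path = root / "harness/evolution/state.json"
    pending_path = root / "harness/evolution/pending.md"

    archived = []
    if archive_dir.is_dir():
        archived = sorted(
            p.parent.relative_to(root).as_posix()
            for p in archive_dir.rglob("summary.md")
        )
    state = {}
    if state_path.is_file():
        state = json.loads(state_path.read_text(encoding="utf-8"))
    processed = set(state.get("processed_archives", []))    # hypothetical field
    threshold = int(state.get("threshold", default_threshold))  # hypothetical field

    candidates = [path for path in archived if path not in processed]
    if len(candidates) < threshold:
        return False
    if not pending_path.exists():
        pending_path.parent.mkdir(parents=True, exist_ok=True)
        body = ["# Pending Harness Evolution", "", "## Candidate Archives", ""]
        body += [f"- {path}" for path in candidates]
        pending_path.write_text("\n".join(body) + "\n", encoding="utf-8")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(evolve_check(tmp))
```

The function never edits docs, scripts, STATUS, or changes, which keeps it safe to call from `close` and `reindex`.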

`harness/evolution/pending.md` is a maintenance reminder, not a hard lock. Reading it for context
does not start pending evolution. Pending evolution starts only when Codex creates or uses an
`auto-evolve-harness-*` change, writes an evolution proposal/result, or edits Harness files based
on the pending evidence. Once started, finish with a proposal, one `harness/evolution/results.tsv`
row, and `harness-evolve mark-complete`; otherwise park the change or close it as blocked, not completed.

Apply only the smallest evidence-backed delta that passes review. Without an independent scorer,
nothing is auto-applied. User approval to handle pending implies permission to request an
independent auditor/subagent when the environment supports it. If the environment still requires
explicit authorization, ask once. If scoring is unavailable, declined, or still unauthorized after
asking, record `noop` with `eval_mode=dry_run`, keep the proposal, run `mark-complete`, and stop.
Machinery repair (`harness-evolve`, pending templates, lint) does not complete pending evolution by
itself; after repair, still evaluate candidate archives or leave the work parked/blocked.

Detailed proposal format, scoring weights, status values, and complexity budget live in
`references/ecl-harness.md`.

### 5.4 Present Summary

```markdown
## Harness Infrastructure Complete

**Project**: {project-name}
**Tech Stack**: {TECH}
**Files Created/Updated**: {count}

### Created Files
- AGENTS.md ({N} lines)
- docs/ARCHITECTURE.md
- docs/ECL.md
- docs/STATUS.md
- docs/DEVELOPMENT.md
- docs/design-docs/{component}.md
- scripts/lint-deps.{ext}
- scripts/lint-quality.{ext}
- scripts/harness-change.{ps1|sh|mjs|py}
- scripts/lint-ecl.{ps1|sh|mjs|py}
- scripts/lint-encoding.{ps1|sh|mjs|py}
- scripts/harness-evolve.{ps1|sh|mjs|py}
- harness/config/environment.json
- harness/changes/
- harness/evolution/
- harness/templates/change/
- Makefile

### Verification Results
- Harness checks: ✓
- Architecture checks: ✓
- Business gates: ✓ or pre-existing failures listed below
- AGENTS.md size: ✓ ({N} lines)

### Pre-existing Project Failures
- {List baseline-red commands and short reasons, or "None observed."}

### New Regressions Introduced By Harness
- {List commands that passed before and failed after, or "None observed."}

### Next Steps
{For empty projects: "Run harness-executor to implement business code from docs/exec-plans/active/bootstrap-code.md"}
{For existing projects: "The harness is ready. AI agents can now use AGENTS.md as their entry point."}
```

### 5.5 Automatic Handoff (for Empty Projects)

If this was an empty project with a bootstrap exec-plan, invoke harness-executor:

```
Skill(skill="harness-executor")
```

With context: "Implement the bootstrap exec-plan at docs/exec-plans/active/bootstrap-code.md"

---

## Core Principles

### 1. Repository as Single Source of Truth

Agents cannot access Slack, Google Docs, or tribal knowledge. If it's not in the repository, it doesn't exist for the agent.

### 2. AGENTS.md is a Map, Not a Manual

Keep it 80-120 lines. Link to detailed docs, don't embed them.

### 3. Enforce Invariants Mechanically

Linter errors must be agent-actionable:
```
✗ BAD: "Forbidden import in core/types/user.go"

✓ GOOD: "core/types/user.go:15 imports core/config (layer 0 → layer 2).
         Layer 0 packages must have NO internal dependencies.

         Fix options:
         1. Move config-dependent logic to a higher layer
         2. Pass the config value as a parameter
         3. Use dependency injection via an interface"
```
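
A linter can build messages in the GOOD shape from structured violation data. The sketch below is illustrative: the function name and parameters are hypothetical, and the rule sentence is a generic paraphrase rather than the wording any generated linter uses.

```python
def format_layer_violation(path, line, imported, from_layer, to_layer, fixes):
    """Build an agent-actionable message: location, rule, then numbered fixes."""
    lines = [
        f"{path}:{line} imports {imported} (layer {from_layer} → layer {to_layer}).",
        f"Layer {from_layer} code must not depend on layer {to_layer}.",
        "",
        "Fix options:",
    ]
    lines += [f"{number}. {fix}" for number, fix in enumerate(fixes, start=1)]
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_layer_violation(
        "core/types/user.go", 15, "core/config", 0, 2,
        ["Move config-dependent logic to a higher layer",
         "Pass the config value as a parameter"],
    ))
```

Keeping the location, the violated rule, and the fix options in separate fields lets the same data feed both human-readable output and machine-readable CI annotations.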

### 4. Build to Delete

Every component should be replaceable. Capabilities that required complex pipelines yesterday may be single prompts tomorrow.

### 5. Start Simple

Atomic, well-documented tools > complex agent choreography. Don't over-engineer.

### 6. Change State Is Explicit

Use a single `harness/changes/active/` task for personal development. Move paused work to `parking/` and closed work to `archive/` with the generated `scripts/harness-change.*` command. Maintain `docs/STATUS.md` as the soft handoff summary after active work is closed. Never hand-edit `harness/changes/INDEX.json`; it is a generated index rebuilt by `park`, `close`, `resume`, and `reindex`. Structured changes use `spec.md` for WHAT/WHY, `plan.md` for HOW, and `tasks.md` for executable work.

### 7. Harness Evolves From Evidence

Every few closed changes, the generated `scripts/harness-evolve.* check` command may create
`harness/evolution/pending.md`. Treat it as a maintenance reminder to improve harness rules from
real archived evidence, not as a hard blocker for unrelated user work. If you start acting on the
pending evidence, first refresh `harness/changes/INDEX.json` and use the current eligible archive
window; the Candidate Archives in an old pending file are a trigger snapshot, not the only evidence.
Then finish with proposal + results.tsv + `mark-complete`, or park/block the work.
Do not turn one-off business bugs into permanent process. Keep only changes that improve the audit
score and pass validation.

---

## Reference Files

| File | When to Read | Contents |
|------|-------------|----------|
| `references/greenfield-templates.md` | Empty projects (Phase 2) | Complete Go/TS/Python scaffolding |
| `references/documentation-templates.md` | Phase 4 doc creation | Doc templates with numbered sections |
| `references/linter-templates.md` | Phase 4 linter creation | Linter code templates per language |
| `references/ecl-harness.md` | ECL-aware harness creation | docs/ECL.md, docs/STATUS.md, change lifecycle, INDEX.json, PowerShell script templates |
| `references/darwin-eval-prompts.md` | Skill quality evaluation | Dry-run prompts for darwin-skill review |
| `references/environment-detection-guide.md` | Phase 2 env analysis | Environment ecosystem detection |
| `references/environment-config-guide.md` | Phase 4 config creation | Startup, services, env vars, user-confirmation templates |
| `references/adapters/typescript.md` | TypeScript/Node.js projects | npm scripts, Node linters, package-manager detection, CI defaults |
| `references/adapters/{go,python,rust,java,generic}.md` | Matching detected stacks | Language-specific commands and conventions |

Agent prompts for Phase 2 and Phase 4 subagents are in `agents/`.

For small projects (< 20 files) or when subagents aren't available, execute phases inline instead of spawning agents.