3.7 KiB
3.7 KiB
What problem are you trying to solve?
What does this PR change?
Is this change appropriate for the core library?
What alternatives did you consider?
Does this PR contain multiple unrelated changes?
Existing PRs
- I have reviewed all open AND closed PRs for duplicates or prior art
- Related PRs:
Environment tested
| Harness (e.g. Claude Code, Cursor) | Harness version | Model | Model version/ID |
|---|---|---|---|
Evaluation
- What was the initial prompt you (or your human partner) used to start the session that led to this change?
- How many eval sessions did you run AFTER making the change?
- How did outcomes change compared to before the change?
Rigor
- If this is a skills change: I used
superpowers:writing-skillsand completed adversarial pressure testing (paste results below) - This change was tested adversarially, not just on the happy path
- I did not modify carefully-tuned content (Red Flags table, rationalizations, "human partner" language) without extensive evals showing the change is an improvement
Human review
- A human has reviewed the COMPLETE proposed diff before submission