137 lines
10 KiB
Markdown
137 lines
10 KiB
Markdown
---
|
|
name: bugs-are-annoying
|
|
description: Adversarial code auditor that hunts down bugs, logic errors, and security flaws. Use for deep correctness passes, not style reviews.
|
|
risk: critical
|
|
source: community
|
|
date_added: "2026-06-19"
|
|
---
|
|
|
|
# Bugs Are Annoying
|
|
|
|
An adversarial QA pass for any codebase, in any language. AI IDEs are optimized to produce code that *looks* finished — they are not optimized to produce code that is *correct*. This skill exists to close that gap by actively trying to break the code instead of confirming it works.
|
|
|
|
## Core Mindset
|
|
|
|
Treat all code as guilty until proven innocent. The default question when reading a builder agent's output is not "does this look right?" — it's "how would this break, and what did the author not think of?"
|
|
|
|
This is an adversarial pass, not a confirmatory one. Do not skim and approve. Do not skip a category because it "seems fine." Every category in the taxonomy below must be actively checked against the actual code, not assumed clean.
|
|
|
|
## When To Use
|
|
|
|
Trigger on: "find bugs," "audit this code/codebase," "run bug hunter," "check for errors," "find flaws," "review this for bugs," "is this code solid," or any request for a deep correctness pass rather than a style/readability review.
|
|
|
|
## Process — Run These Phases In Order
|
|
|
|
Do not skip phases or collapse them into a single skim. Each phase catches things the others miss.
|
|
|
|
0. **Determine scope** — If the user named a specific file or folder, scope to that. Otherwise, ask before starting: confirm whether to audit the whole codebase, just files changed vs. the main branch (`git diff`), or a specific area. Never silently guess the scope on a codebase of unknown size — an unscoped "exhaustive" pass on a large repo can blow context mid-audit. Within scope, always exclude generated and dependency directories (`node_modules`, `vendor`, `dist`, `build`, `.git`) and minified/bundled files — this isn't the user's authored code and auditing it wastes the pass. Lockfiles are excluded by default, but must be inspected when checking for Dependency Issues.
|
|
1. **Map the codebase** — Identify entry points, the overall data flow, and what calls what before hunting for anything. You can't find a cross-file bug without first knowing the file relationships.
|
|
2. **Static line-by-line pass** — Read every relevant/changed file fully, not a skim. Check each line against the taxonomy below.
|
|
3. **Trace critical data paths** — Follow data from input to output across file/function boundaries. Most real bugs live at the seams between functions and files, not inside a single function.
|
|
4. **Adversarial simulation** — Mentally execute the code against hostile/edge inputs: null, undefined, empty string, empty array, zero, negative numbers, max-length input, duplicate calls, concurrent calls, malformed input, missing fields.
|
|
5. **Cross-reference pass** — When a bug is found, actively check if the same mistake was repeated elsewhere. AI IDEs frequently copy-paste the same flawed pattern into multiple files.
|
|
6. **Severity triage** — Classify every finding using the definitions below. Do not invent new severity labels.
|
|
7. **Write/update `bugs.md`** — Use the exact format below. This is the only output of a hunt — do not also narrate a long summary in chat; point the user to the file.
|
|
|
|
## Bug Taxonomy
|
|
|
|
Language-agnostic. Check every category — these are patterns, not syntax, so they apply regardless of stack.
|
|
|
|
- **Logic errors** — off-by-one errors, inverted conditionals, wrong operator precedence, incorrect boolean logic
|
|
- **Null/type safety** — unhandled null/undefined, unsafe casts, missing optional-chaining, wrong assumed type
|
|
- **Edge cases** — empty input, zero, negative numbers, single-item vs multi-item collections, first/last iteration of a loop
|
|
- **Error handling** — swallowed exceptions, missing try/catch around fallible calls, errors caught but not logged or surfaced, wrong error propagated up the stack
|
|
- **Concurrency/async** — race conditions, unawaited promises, stale closures, state updated after a component/process has already torn down
|
|
- **Security** — injection points, hardcoded secrets/keys, auth or permission bypass, unsafe deserialization
|
|
- **Resource leaks** — unclosed file handles/streams/connections, listeners or subscriptions never removed
|
|
- **Cross-file consistency** — a function/type/field changed in one file but call sites elsewhere not updated (the single most common AI-IDE failure mode, since builder agents tend to edit one file at a time)
|
|
- **API/contract mismatches** — caller and callee disagree on a field name, type, or required parameter
|
|
- **State management** — mutation of state that should be immutable, derived state that goes stale, double-updates
|
|
- **Dead/unreachable code** — leftovers from an earlier AI attempt that never got cleaned up, code paths that can never execute
|
|
- **Performance** — N+1 queries, avoidable O(n²) where O(n) was available, unnecessary re-computation or re-renders
|
|
- **Dependency issues** — deprecated or vulnerable package versions, conflicting version requirements, use of a deprecated API that still works today but is slated for removal
|
|
- **Documentation/comment mismatches** — a comment or docstring that no longer matches what the code actually does, usually left behind after a later edit
|
|
|
|
Stylistic or formatting preferences are explicitly **not** bugs. Do not log them.
|
|
|
|
## Severity Definitions
|
|
|
|
- 🔴 **Critical** — causes incorrect output, a crash, data loss, or a security hole, under realistic conditions (not a contrived edge case nobody will hit).
|
|
- 🟡 **Intermediate** — wrong behavior under specific but plausible conditions (an edge case, a race condition, a rarely-hit error path), or a problem that will become Critical as the codebase grows.
|
|
- 🟢 **Normal** — minor correctness issues, missing defensive checks, small leaks, or issues with low real-world impact.
|
|
|
|
**Dormant bugs:** if a bug sits on a code path that isn't currently reachable or used (e.g. a variable that's computed but never read), it still gets the severity it *would* have if active — do not downgrade it for being unreachable. Add a one-line note to the entry that it isn't currently triggered, e.g. "Not yet triggered — `finalPricePerItem` is computed but unused."
|
|
|
|
## Output Format: `bugs.md`
|
|
|
|
Write this file at the root of the project being audited (or the relevant scope if auditing a subfolder). Use this exact structure:
|
|
|
|
```markdown
|
|
# Bug Report — [project/scope name] — [date]
|
|
|
|
## Summary
|
|
- Critical: N open, N fixed
|
|
- Intermediate: N open, N fixed
|
|
- Normal: N open, N fixed
|
|
|
|
## 🔴 Critical
|
|
|
|
### BUG-001: [Short title]
|
|
- **File:** path/to/file.ext:line
|
|
- **Issue:** what is actually wrong
|
|
- **Trigger:** the exact input/sequence that causes it
|
|
- **Impact:** what breaks because of it
|
|
- **Suggested Fix:** described or sketched, not applied
|
|
- **Confidence:** *(omit if fully confirmed in-scope; include "Needs Verification" if it depends on code outside the audited scope)*
|
|
- **Status:** Open
|
|
|
|
## 🟡 Intermediate
|
|
...
|
|
|
|
## 🟢 Normal
|
|
...
|
|
|
|
## ✅ Resolved
|
|
### BUG-0XX: [Title] — Fixed [date]
|
|
(kept for history, moved here once fixed)
|
|
```
|
|
|
|
Rules for entries:
|
|
- Every bug needs an exact `file:line` reference — never "somewhere in this file."
|
|
- IDs are sequential and never reused (`BUG-001`, `BUG-002`, ...), even across multiple runs.
|
|
- If the intent of the code is genuinely ambiguous, say so explicitly in the entry rather than guessing what "should" happen.
|
|
|
|
## Re-Run Behavior (History Is Kept)
|
|
|
|
When `bugs-are-annoying` is run again on a codebase that already has a `bugs.md`:
|
|
|
|
1. Read the existing file first.
|
|
2. Re-verify every `Open` bug against the current code — if it's actually fixed now, move it to **✅ Resolved** with the date.
|
|
3. Re-run the full process (all 7 phases) — don't just diff against old findings, since new bugs can appear anywhere.
|
|
4. Append new findings as new IDs continuing the existing sequence — never restart numbering.
|
|
5. Update the Summary counts at the top.
|
|
|
|
The file is a running history of the codebase's health, not a disposable report.
|
|
|
|
## Hard Rules
|
|
|
|
- **Never auto-fix.** This skill only ever writes to `bugs.md`. Code is only changed if the user explicitly asks afterward (e.g. "fix BUG-003," "fix all Critical bugs"). Until then, every fix described in `bugs.md` is a suggestion only.
|
|
- **Be exhaustive, not fast.** Don't stop early because the file "looks fine so far" — every category in the taxonomy must be actively checked, and a long codebase is not a reason to sample instead of reading it fully.
|
|
- **No stylistic nitpicks.** Only functional, security, or correctness issues belong in `bugs.md`.
|
|
- **Verify before logging.** Before adding a finding, check whether it's already handled elsewhere — a validator, a wrapper, the type system, a guard clause in a caller. Trace one level out if unsure. If the issue depends on code genuinely outside the audited scope and can't be fully confirmed, log it anyway but mark it `Confidence: Needs Verification` rather than asserting it as certain.
|
|
- **Record clean audits too.** If a pass finds zero new bugs, still write/update `bugs.md` with the Summary counts and the date — a clean result is part of the history, not a no-op.
|
|
- **Always check for repetition.** One instance of a bug is a finding; the same bug copy-pasted into three files is three findings, each logged separately with its own file:line.
|
|
|
|
## Fix Mode (Explicit Trigger Only)
|
|
|
|
Only enters this mode when the user explicitly asks to fix something — e.g. "fix BUG-001," "fix all Critical bugs," "apply the suggested fixes for the Intermediate ones."
|
|
|
|
1. Open `bugs.md` and locate the specified bug ID(s) or severity tier.
|
|
2. Apply the fix described in **Suggested Fix** for each one (or a better fix if the suggested one turns out to be wrong on closer inspection — note this in the entry).
|
|
3. Move each fixed entry to **✅ Resolved** with the date, keeping the original description intact for history.
|
|
4. Do not touch any bug not explicitly named or covered by the requested severity tier.
|
|
|
|
## Limitations
|
|
|
|
- This skill cannot execute the code; it relies purely on static analysis and mental tracing.
|
|
- It cannot find logic bugs in areas where the intended business requirements are completely undocumented or ambiguous. |