--- name: find-root-causes description: This skill should be used when diagnosing failures, investigating incidents, finding root causes, or when "root cause", "diagnosis", "investigate", or "--rca" are mentioned. agent: debugger context: fork metadata: version: "2.0.0" related-skills: - debugging - codebase-recon - report-findings --- # Root Cause Analysis Delegated investigation: symptom → hypothesis → elimination → root cause → prevention. ## Steps 1. Load the `outfitter:debugging` skill for systematic investigation 2. Apply elimination techniques from this skill's references 3. Document investigation trail using RCA templates 4. Deliver root cause report with prevention recommendations - Diagnosing system failures or unexpected behavior - Investigating incidents or outages - Finding the actual cause vs surface symptoms - Preventing recurrence through understanding - Post-incident reviews requiring formal documentation NOT for: known issues with documented fixes, simple configuration errors, routine debugging (use `debugging` skill directly) This skill extends `debugging` with formal RCA practices: | Aspect | Debugging | Root Cause Analysis | |--------|-----------|---------------------| | Scope | Fix the immediate issue | Understand why it happened | | Output | Working code | RCA report + prevention | | Documentation | Investigation notes | Formal templates | | Goal | Resolution | Prevention of recurrence | Use `debugging` for day-to-day bug fixes. Use `find-root-causes` for incidents requiring formal investigation and documentation. Three core techniques for narrowing to root cause: | Technique | When to Use | Method | |-----------|-------------|--------| | **Binary Search** | Large problem space, ordered changes | Bisect the change range | | **Variable Isolation** | Multiple variables, need causation | Control all but one | | **Process of Elimination** | Finite set of possible causes | Rule out systematically | See [elimination-techniques.md](references/elimination-techniques.md) for detailed methods and examples. ## Investigation Trail Log every step for handoff and pattern recognition: ``` [TIME] STAGE: Action → Result [10:15] DISCOVERY: Gathered error logs → Found NullPointerException [10:22] HYPOTHESIS: User object not initialized [10:28] TEST: Added null check logging → Confirmed user is null ``` ## RCA Report Structure 1. **Summary** — one-sentence root cause 2. **Timeline** — events leading to incident 3. **Impact** — what was affected, duration 4. **Root Cause** — why it happened (not just what) 5. **Contributing Factors** — conditions that enabled it 6. **Prevention** — changes to prevent recurrence 7. **Detection** — how to catch it earlier next time See [documentation-templates.md](references/documentation-templates.md) for full templates. | Trap | Counter | |------|---------| | "I already looked at that" | Re-examine with fresh evidence | | "That can't be the issue" | Test anyway, let evidence decide | | "We need to fix this quickly" | Methodical investigation is faster | | Confirmation bias | Actively seek disconfirming evidence | | Correlation = causation | Test direct causal mechanism | See [pitfalls.md](references/pitfalls.md) for detailed resistance patterns and recovery. ALWAYS: - Load debugging skill for systematic investigation methodology - Use elimination techniques to narrow root cause - Document investigation trail as you go - Produce formal RCA report for incidents - Include prevention recommendations - Identify contributing factors, not just root cause NEVER: - Skip formal documentation for incidents - Stop at "what happened" without "why" - Propose fixes without understanding root cause - Omit prevention recommendations - Blame individuals (focus on systems) - [elimination-techniques.md](references/elimination-techniques.md) — binary search, variable isolation, process of elimination - [pitfalls.md](references/pitfalls.md) — cognitive biases and resistance patterns - [documentation-templates.md](references/documentation-templates.md) — investigation logs and RCA reports