Elimination Techniques
Systematic methods for narrowing problem scope.
Binary Search
Halving the problem space with each test.
When to Use
- Large problem space
- Changes have clear ordering (time, code versions, config options)
- Tests are quick relative to problem size
Process
1. Identify range: known-good state → known-bad state
2. Test midpoint: does issue exist here?
3. Narrow range: move to half containing issue
4. Repeat: until single change identified
Example: Git Bisect
# Automated binary search through commits
git bisect start
git bisect bad HEAD # Current commit is bad
git bisect good v1.2.0 # Known good version
git bisect run ./test.sh # test.sh must exit 0 for good, 1-127 (except 125) for bad
git bisect reset # Return to the original HEAD when finished
Example: Configuration
50 config options, one causes issue
Round 1: Test with first 25 options only
→ Issue present → problem in first 25
Round 2: Test with first 12 options only
→ Issue absent → problem in options 13-25
Round 3: Test with options 13-18
→ Issue present → problem in 13-18
...continue until single option found
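The rounds above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive tool: `find_culprit`, the option names, and the `issue_present` callback are hypothetical, and the sketch assumes exactly one option causes the issue and that options can be enabled independently.

```python
from typing import Callable, List, Sequence


def find_culprit(options: Sequence[str],
                 issue_present: Callable[[List[str]], bool]) -> str:
    """Return the single option that triggers the issue by repeated halving."""
    candidates = list(options)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        # Enable only the first half; if the issue appears, the culprit is there.
        if issue_present(first_half):
            candidates = first_half
        else:
            candidates = candidates[mid:]
    return candidates[0]


# Hypothetical stand-in for a real test run: "opt17" is the culprit.
options = [f"opt{i}" for i in range(1, 51)]
tests_run = []


def issue_present(enabled: List[str]) -> bool:
    tests_run.append(len(enabled))  # record how many options each round enabled
    return "opt17" in enabled


culprit = find_culprit(options, issue_present)
print(culprit, len(tests_run))
```

With 50 options the first three rounds enable 25, 12, and 6 options, matching the walkthrough above. In practice `issue_present` would apply the configuration and run the reproduction.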
Efficiency
| Problem Size | Binary Search Steps (worst case) | Linear Search Steps (worst case) |
|---|---|---|
| 10 items | ~4 | 10 |
| 100 items | ~7 | 100 |
| 1000 items | ~10 | 1000 |
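The binary search column is the base-2 logarithm of the problem size, rounded up. A small sketch to check the figures (the helper name `bisect_steps` is an assumption for illustration):

```python
def bisect_steps(n: int) -> int:
    """Worst-case number of tests to isolate one item out of n by halving.

    Equivalent to ceil(log2(n)), computed with integers to avoid float rounding.
    """
    return (n - 1).bit_length() if n > 1 else 0


for n in (10, 100, 1000):
    print(f"{n} items: {bisect_steps(n)} binary steps, {n} linear steps")
```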
Variable Isolation
Changing one thing at a time.
When to Use
- Multiple variables could be cause
- Interactions between variables possible
- Need to establish clear causation
Process
1. Baseline: measure with all defaults
2. Change X only: measure impact
3. Revert X, change Y only: measure impact
4. Repeat for each variable
5. If interactions suspected: test combinations
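The process above can be sketched as a loop that applies each change to a fresh copy of the baseline, so every trial differs from the baseline in exactly one variable. The function `isolate`, the configuration keys, and the `fake_latency_ms` model are hypothetical stand-ins for a real benchmark.

```python
from typing import Any, Callable, Dict


def isolate(baseline: Dict[str, Any],
            changes: Dict[str, Any],
            measure: Callable[[Dict[str, Any]], float]) -> Dict[str, float]:
    """Measure the effect of each change applied alone to the baseline."""
    base_value = measure(baseline)
    effects = {}
    for name, new_value in changes.items():
        trial = dict(baseline)   # fresh copy, so the previous change is reverted
        trial[name] = new_value  # change exactly one variable
        effects[name] = measure(trial) - base_value
    return effects


# Hypothetical latency model standing in for a real measurement.
def fake_latency_ms(cfg: Dict[str, Any]) -> float:
    latency = 100.0
    if cfg["cache"] == "off":
        latency += 40.0
    if cfg["rows"] > 1_000:
        latency += 25.0
    return latency


baseline = {"library": "1.0", "cache": "on", "rows": 1_000}
changes = {"library": "2.0", "cache": "off", "rows": 10_000}
effects = isolate(baseline, changes, fake_latency_ms)
print(effects)
```

A zero effect rules a variable out on its own. It does not rule out interactions, so combinations still need separate trials when interactions are suspected.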
Example: Performance Degradation
Suspects: new library version, config change, increased data volume
Test 1: Revert library only → no change → not library
Test 2: Revert config only → improvement → config contributes
Test 3: Reduce data volume → improvement → data also contributes
Test 4: Both config + data → full improvement → both factors
Root cause: Config change + data growth interaction
Common Mistakes
- Changing multiple variables at once
- Not reverting between tests
- Assuming first positive result is complete answer
- Not testing combinations when interactions possible
Process of Elimination
Systematically ruling out possibilities.
When to Use
- Finite set of possible causes
- Can definitively rule things out
- Structured environment
Process
Start with: All possible causes
For each possibility:
- Design test to rule out
- Execute test
- If ruled out: remove from list
- If not ruled out: keep on list
Continue until: single possibility remains
Documentation Format
Possible causes:
✗ Component A — ruled out: reproduced without A present
✗ Component B — ruled out: tested in isolation, worked
✗ External factor — ruled out: reproduced in clean environment
○ Component C — not yet tested
✓ Component D — confirmed: removing D fixes issue
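The elimination loop and the log format above can be sketched together. This is an illustration under stated assumptions: `eliminate` is a hypothetical helper, and each candidate's test is a placeholder lambda that returns `True` when the candidate has been ruled out.

```python
from typing import Callable, Dict, List


def eliminate(candidates: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run each candidate's test and return the names that were not ruled out."""
    remaining = []
    for name, ruled_out in candidates.items():
        if ruled_out():
            print(f"✗ {name}: ruled out")
        else:
            print(f"○ {name}: still possible")
            remaining.append(name)
    return remaining


# Hypothetical tests; real ones would reproduce the issue without the component.
candidates = {
    "Component A": lambda: True,
    "Component B": lambda: True,
    "External factor": lambda: True,
    "Component D": lambda: False,
}
remaining = eliminate(candidates)
print("Remaining:", remaining)
```

A single remaining candidate is still only a suspect until a positive test confirms it, such as removing it and seeing the issue disappear.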
Example: Integration Failure
System: API → Queue → Worker → Database
Test 1: Call API directly, bypass queue
→ Issue persists → not queue-related
Test 2: Worker processes test message
→ Success → worker + database OK
Test 3: Examine API-to-queue handoff
→ Found: message format incorrect
Root cause: API serialization bug
Divide and Conquer
Breaking complex system into testable segments.
When to Use
- Complex multi-component systems
- Don't know which area to focus on
- Want to parallelize investigation
Process
1. Map system components
2. Identify boundaries between components
3. Test at each boundary: is data correct here?
4. Find boundary where data becomes incorrect
5. Focus investigation on that component
Example: Data Pipeline
Source → Ingestion → Transform → Validation → Storage → API
Check at each stage:
- After Ingestion: data correct ✓
- After Transform: data correct ✓
- After Validation: data INCORRECT ✗
Data is correct going into Validation and incorrect coming out, so the fault is introduced in the Validation stage.
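A boundary check like the one above can be sketched as a loop that runs each stage and validates its output. The stage functions, the sample data, and the "no record may be lost" check are all hypothetical. The Validation stage is deliberately buggy here.

```python
from typing import Callable, List, Optional, Tuple

Records = List[int]
Stage = Tuple[str, Callable[[Records], Records]]


def first_bad_stage(data: Records, stages: List[Stage],
                    is_correct: Callable[[Records], bool]) -> Optional[str]:
    """Run stages in order; return the first stage whose output fails the check."""
    for name, stage in stages:
        data = stage(data)
        if not is_correct(data):
            return name
    return None


# Hypothetical stages; Validation wrongly drops negative values.
stages: List[Stage] = [
    ("Ingestion", lambda rows: list(rows)),
    ("Transform", lambda rows: [r * 2 for r in rows]),
    ("Validation", lambda rows: [r for r in rows if r >= 0]),
    ("Storage", lambda rows: rows),
]

source = [3, -1, 7]
# Boundary check (an assumption for this sketch): no record may be lost.
culprit_stage = first_bad_stage(source, stages,
                                lambda rows: len(rows) == len(source))
print(culprit_stage)
```

This sketch checks boundaries in order. With many stages and saved intermediate outputs, the boundaries themselves could be binary-searched instead.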
Environment Bisection
Isolating environment-specific factors.
When to Use
- "Works on my machine" situations
- Environment-dependent bugs
- Deployment issues
Process
1. List environment differences (OS, versions, config, resources)
2. Create minimal diff between working and failing environments
3. Test with progressive alignment
4. Identify minimum difference causing failure
Difference Checklist
| Category | Working | Failing |
|---|---|---|
| OS/Version | | |
| Runtime version | | |
| Dependencies | | |
| Config files | | |
| Environment variables | | |
| Network/ports | | |
| Permissions | | |
| Resource limits | | |
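Filling in the checklist and aligning the environments step by step can be sketched as follows. The helpers `env_diff` and `align_until_working`, the keys, and the values are hypothetical, and the sketch assumes both environments define the same keys.

```python
from typing import Callable, Dict, List, Tuple

Env = Dict[str, str]


def env_diff(working: Env, failing: Env) -> Dict[str, Tuple[str, str]]:
    """Return {key: (working_value, failing_value)} for every differing key."""
    keys = sorted(set(working) | set(failing))
    return {k: (working.get(k, "<unset>"), failing.get(k, "<unset>"))
            for k in keys if working.get(k) != failing.get(k)}


def align_until_working(working: Env, failing: Env,
                        works: Callable[[Env], bool]) -> List[str]:
    """Copy differing values from working into failing, one at a time.

    Returns the keys aligned when the check first passes; the last key is the
    one that tipped it. Returns all differing keys if the check never passes.
    """
    env = dict(failing)
    aligned = []
    for key in env_diff(working, failing):
        env[key] = working[key]
        aligned.append(key)
        if works(env):
            break
    return aligned


# Hypothetical environments; the failure depends only on the time zone.
working = {"os": "ubuntu-22.04", "runtime": "3.11", "TZ": "UTC"}
failing = {"os": "ubuntu-22.04", "runtime": "3.12", "TZ": "Europe/Berlin"}

diff = env_diff(working, failing)
aligned = align_until_working(working, failing, lambda env: env["TZ"] == "UTC")
print(diff)
print(aligned)
```

In a real investigation `works` would deploy or run the application in the adjusted environment instead of inspecting a dictionary.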
Technique Selection Guide
| Situation | Recommended Technique |
|---|---|
| Many commits to check | Binary search (git bisect) |
| Multiple config options | Variable isolation |
| Finite component list | Process of elimination |
| Multi-stage pipeline | Divide and conquer |
| "Works elsewhere" | Environment bisection |
| Unknown scope | Start with divide and conquer, then specialize |
Combining Techniques
Multiple techniques are often used together:
1. Divide and conquer: narrow to subsystem
2. Process of elimination: rule out components in subsystem
3. Variable isolation: identify specific configuration
4. Binary search: find when it broke
Each technique narrows scope; combine for efficiency.