# Test Quality Metrics Comprehensive guide to measuring and improving test quality through coverage and mutation testing. ## Coverage Metrics ### Line Coverage Percentage of code lines executed during test runs. **Target**: ≥80% overall, ≥90% for critical paths **TypeScript/Bun**: ```bash bun test --coverage # Output Coverage Summary: Statements : 85.2% ( 1420/1667 ) Branches : 78.5% ( 314/400 ) Functions : 82.1% ( 156/190 ) Lines : 85.2% ( 1420/1667 ) ``` **Rust**: ```bash # Using cargo-tarpaulin cargo tarpaulin --out Html --output-dir coverage/ # Using cargo-llvm-cov cargo llvm-cov --html ``` ### Branch Coverage Percentage of decision branches (if/else, switch, ternary) executed. **Target**: ≥75% Example showing uncovered branch: ```typescript function divide(a: number, b: number): number { if (b === 0) { // Branch covered throw new Error('Division by zero') } return a / b // Branch covered } // Test only covers success path test('divides numbers', () => { expect(divide(10, 2)).toBe(5) }) // Coverage: 50% branches (only success branch covered) ``` Fix with both branches: ```typescript test('divides numbers', () => { expect(divide(10, 2)).toBe(5) }) test('throws on division by zero', () => { expect(() => divide(10, 0)).toThrow('Division by zero') }) // Coverage: 100% branches ``` ### Function Coverage Percentage of functions called during tests. **Target**: ≥80% Uncovered functions often indicate: - Dead code that should be removed - Missing test cases - Helper functions only used in uncovered paths ### Interpreting Coverage High coverage ≠ high quality. Coverage shows what's tested, not how well. **Example of misleading coverage**: ```typescript function processPayment(amount: number): Result { if (amount <= 0) { return { type: 'error', code: 'INVALID_AMOUNT' } } const result = chargeCard(amount) return { type: 'success', transactionId: result.id } } // Bad test with 100% coverage test('processes payment', () => { processPayment(100) processPayment(-10) }) // No assertions! 100% coverage but 0% verification ``` Coverage shows code was executed, not that it was verified correct. ## Mutation Testing Mutation testing verifies test quality by introducing small bugs and checking if tests catch them. ### How It Works 1. **Mutant Generation**: Tool mutates source code (e.g., `===` → `!==`, `+` → `-`) 2. **Test Execution**: Run tests against each mutant 3. **Classification**: - **Killed**: Test fails (good — test caught the bug) - **Survived**: Test passes (bad — test missed the bug) - **Timeout**: Mutant caused infinite loop - **No Coverage**: Line not executed by tests ### Mutation Score ``` Mutation Score = (Killed Mutants / Total Mutants) × 100% ``` **Target**: ≥75% ### TypeScript Mutation Testing Using Stryker: **Install**: ```bash bun add -d @stryker-mutator/core @stryker-mutator/typescript-checker ``` **Configuration** (`stryker.conf.json`): ```json { "mutator": "typescript", "packageManager": "bun", "reporters": ["html", "clear-text", "progress"], "testRunner": "bun", "coverageAnalysis": "perTest", "mutate": [ "src/**/*.ts", "!src/**/*.test.ts", "!src/**/*.spec.ts" ], "thresholds": { "high": 80, "low": 60, "break": 50 } } ``` **Run**: ```bash bun x stryker run # Output Mutation testing complete: Killed: 78 Survived: 12 Timeout: 2 No Coverage: 8 Mutation Score: 78.0% ``` **Common Mutations**: | Original | Mutant | Catches | |----------|--------|---------| | `===` | `!==` | Equality assertions | | `>` | `>=` | Boundary tests | | `+` | `-` | Arithmetic verification | | `&&` | `||` | Logic tests | | `true` | `false` | Boolean verification | | `0` | `1` | Zero handling | | `return x` | `return undefined` | Return value tests | **Example Analysis**: ```typescript function calculateDiscount(price: number, isPremium: boolean): number { if (isPremium) { return price * 0.8 // 20% discount } return price } // Weak test test('calculates discount', () => { calculateDiscount(100, true) calculateDiscount(100, false) }) // Mutation: 0.8 → 0.9 // Status: Survived (no assertion) ``` Fix with assertions: ```typescript test('applies 20% discount for premium users', () => { expect(calculateDiscount(100, true)).toBe(80) }) test('no discount for regular users', () => { expect(calculateDiscount(100, false)).toBe(100) }) // Mutation: 0.8 → 0.9 // Status: Killed (test fails with 90 !== 80) ``` ### Rust Mutation Testing Using `cargo-mutants`: **Install**: ```bash cargo install cargo-mutants ``` **Run**: ```bash cargo mutants # Output Mutation testing results: caught: 45 missed: 5 timeout: 1 unviable: 2 score: 90.0% ``` **Common Mutations**: | Original | Mutant | Catches | |----------|--------|---------| | `==` | `!=` | Equality tests | | `>` | `>=` | Boundary tests | | `&&` | `||` | Logic tests | | `Some(x)` | `None` | Option handling | | `Ok(x)` | `Err(...)` | Result handling | | `+` | `-` | Arithmetic verification | **Example**: ```rust fn calculate_discount(price: i32, is_premium: bool) -> i32 { if is_premium { price * 80 / 100 // 20% discount } else { price } } // Weak test #[test] fn test_discount() { calculate_discount(100, true); calculate_discount(100, false); } // Mutation: 80 → 90 // Status: missed (no assertion) ``` Fix: ```rust #[test] fn applies_discount_for_premium() { assert_eq!(calculate_discount(100, true), 80); } #[test] fn no_discount_for_regular() { assert_eq!(calculate_discount(100, false), 100); } // Mutation: 80 → 90 // Status: caught (assertion fails) ``` ## Quality Standards Matrix | Metric | Minimum | Good | Excellent | |--------|---------|------|-----------| | Line Coverage | 70% | 80% | 90% | | Branch Coverage | 65% | 75% | 85% | | Function Coverage | 75% | 85% | 95% | | Mutation Score | 60% | 75% | 85% | | Test Execution Time | <10s | <5s | <2s | ## Improving Test Quality ### Weak Assertion Detection **Problem**: Tests execute code but don't verify results ```typescript // ❌ Weak - no verification test('processes order', () => { processOrder({ items: [item1, item2] }) }) ``` **Solution**: ```typescript // ✓ Strong - verifies result test('processes order', () => { const result = processOrder({ items: [item1, item2] }) expect(result.type).toBe('success') expect(result.total).toBe(150) }) ``` ### Missing Edge Cases Use mutation testing to find gaps: ```typescript function validateAge(age: number): boolean { return age >= 18 // Mutant: >= → > } // Current test test('validates age', () => { expect(validateAge(20)).toBe(true) expect(validateAge(16)).toBe(false) }) // Mutation survived: >= → > // Missing: boundary test for exactly 18 ``` Add boundary test: ```typescript test('accepts exactly 18', () => { expect(validateAge(18)).toBe(true) }) // Now mutation is caught ``` ### Test Redundancy Multiple tests verifying same thing: ```typescript // Redundant tests test('validates positive number', () => { expect(isPositive(5)).toBe(true) }) test('validates another positive number', () => { expect(isPositive(10)).toBe(true) }) test('validates yet another positive number', () => { expect(isPositive(100)).toBe(true) }) ``` Consolidate: ```typescript test.each([5, 10, 100])('validates positive number %i', (num) => { expect(isPositive(num)).toBe(true) }) ``` ## Continuous Quality Monitoring ### CI/CD Integration **TypeScript**: ```yaml # .github/workflows/test.yml - name: Run tests with coverage run: bun test --coverage - name: Check coverage thresholds run: | coverage=$(bun test --coverage --json | jq '.coverage.total.statements.pct') if (( $(echo "$coverage < 80" | bc -l) )); then echo "Coverage $coverage% below 80% threshold" exit 1 fi - name: Run mutation testing run: bun x stryker run # Fail if mutation score < 75% ``` **Rust**: ```yaml # .github/workflows/test.yml - name: Run tests with coverage run: cargo tarpaulin --fail-under 80 - name: Run mutation testing run: cargo mutants continue-on-error: true # Warning only initially ``` ### Tracking Over Time Monitor trends: ```bash # Generate coverage badge coverage=$(bun test --coverage --json | jq '.coverage.total.statements.pct') echo "Coverage: $coverage%" > coverage.txt # Track mutation score mutation=$(bun x stryker run --json | jq '.mutationScore') echo "Mutation Score: $mutation%" > mutation.txt ``` ## Advanced Techniques ### Differential Coverage Only measure coverage on changed code: ```bash # Get changed files git diff --name-only main... > changed.txt # Run coverage on changed files bun test --coverage --changed-files changed.txt ``` ### Coverage Ratcheting Prevent coverage from decreasing: ```bash # Save current coverage current=$(bun test --coverage --json | jq '.coverage.total.statements.pct') echo "$current" > .baseline-coverage # On future runs, compare baseline=$(cat .baseline-coverage) if (( $(echo "$current < $baseline" | bc -l) )); then echo "Coverage decreased from $baseline% to $current%" exit 1 fi ``` ### Mutation Testing Optimization Run only on changed code: ```bash # Stryker incremental mode bun x stryker run --incremental # cargo-mutants on specific files cargo mutants --file src/auth/mod.rs ``` ## Common Pitfalls ### Pitfall 1: Chasing 100% Coverage **Problem**: Diminishing returns past 90%, testing trivial code ```typescript // Trivial getter - not worth testing class User { get email(): string { return this._email } } ``` **Solution**: Focus on behavior, not line count. Exclude trivial code from coverage requirements. ### Pitfall 2: Gaming Metrics **Problem**: Tests that execute code without verification ```typescript // ❌ High coverage, zero value test('calls all functions', () => { func1() func2() func3() }) ``` **Solution**: Use mutation testing to catch weak assertions. ### Pitfall 3: Slow Mutation Testing **Problem**: Full mutation testing takes hours **Solution**: Run incrementally or in CI only: ```bash # Local: Quick feedback on changed files bun x stryker run --mutate "src/auth/**/*.ts" # CI: Full suite bun x stryker run ``` ## Quality Metrics Dashboard Example report format: ``` Test Quality Report =================== Coverage: Statements: 85.2% ░░░░░░░░▓▓ Branches: 78.5% ░░░░░░░▓▓▓ Functions: 82.1% ░░░░░░░░▓▓ Mutation Testing: Score: 78.0% ░░░░░░░▓▓▓ Killed: 78 Survived: 12 No Cov: 8 Performance: Unit Tests: 2.3s ✓ Total: 8.7s ✓ Status: ✓ All thresholds met ``` ## Actionable Improvement Plan 1. **Week 1**: Establish baseline - Run coverage analysis - Run mutation testing - Document current state 2. **Week 2-3**: Fix critical gaps - Add tests for uncovered critical paths - Fix survived mutants in high-risk code - Target 80% coverage, 75% mutation score 3. **Week 4**: Automate - Add CI coverage checks - Set up coverage ratcheting - Schedule weekly mutation testing 4. **Ongoing**: Maintain - Review coverage on each PR - Run mutation testing monthly - Gradually raise thresholds ## Resources TypeScript: - [Stryker Documentation](https://stryker-mutator.io) - [Bun Test Coverage](https://bun.sh/docs/cli/test#coverage) Rust: - [cargo-tarpaulin](https://github.com/xd009642/tarpaulin) - [cargo-llvm-cov](https://github.com/taiki-e/cargo-llvm-cov) - [cargo-mutants](https://github.com/sourcefrog/cargo-mutants)