Sample Output: session_analysis_report.md

Generated by /analyze-project skill on a ~3-week project with ~50 substantive sessions.

(Trimmed for demo; real reports include full per-conversation breakdown and more cohorts.)

📊 Session Analysis Report — Sample AI Video Studio

Generated: 2026-03-13
Conversations Analyzed: 54 substantive (with artifacts)
Date Range: Feb 18 – Mar 13, 2026

Executive Summary

| Metric | Value | Rating |
|---|---|---|
| First-Shot Success Rate | 52% | 🟡 |
| Completion Rate | 70% | 🟢 |
| Avg Scope Growth | +58% | 🟡 |
| Replan Rate | 30% | 🟢 |
| Median Duration | ~35 min | 🟢 |
| Avg Revision Intensity | 4.8 versions | 🟡 |
| Abandoned Rate | 22% | 🟡 |

Narrative: High velocity with strong completion on workflow-driven tasks. The main friction is post-success scope expansion by the human: users add "while we're here" features after the initial work succeeds, turning narrow tasks into multi-phase epics. This is primarily a workflow-discipline problem rather than a prompt or agent problem.

Root Cause Breakdown (non-clean sessions only)

| Root Cause | % | Notes |
|---|---|---|
| Human Scope Change | 37% | New features/epics added mid-session after success |
| Legitimate Task Complexity | 26% | Multi-phase builds with expected iteration |
| Repo Fragility | 15% | Hidden coupling, pre-existing bugs |
| Verification Churn | 11% | Late test/build failures |
| Spec Ambiguity | 7% | Vague initial ask |
| Agent Architectural Error | 4% | Rare wrong approach |

Confidence: High for top two (direct evidence from version diffs).
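
The percentages above read as shares of non-clean sessions by primary root cause. A minimal sketch of such a tally follows, assuming each non-clean session carries exactly one root-cause label. The function name `root_cause_shares` and the per-label counts in the usage lines are hypothetical: the counts were picked only because they round to the same percentages as the table, and they are not taken from the report's underlying data.

```python
from collections import Counter


def root_cause_shares(labels):
    """Return {root_cause: whole-number percent} for a list of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders causes from most to least frequent.
    return {cause: round(100 * n / total) for cause, n in counts.most_common()}


# Hypothetical counts (an assumption, not report data) that happen to
# round to the shares listed in the table above.
example_labels = (
    ["Human Scope Change"] * 10
    + ["Legitimate Task Complexity"] * 7
    + ["Repo Fragility"] * 4
    + ["Verification Churn"] * 3
    + ["Spec Ambiguity"] * 2
    + ["Agent Architectural Error"] * 1
)
shares = root_cause_shares(example_labels)
```

Rounding each share independently means the column will not always sum to exactly 100; a real report generator might want to state that explicitly.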

Scope Change Analysis Highlights

Human-Added (most common): Starts narrow → grows after Phase 1 succeeds (e.g., T2E QA → A/B testing + demos + editor tools).
Necessary Discovered: Hidden deps, missing packages, env issues (e.g., auth bcrypt blocking E2E).
Agent-Introduced: Very rare (1 case of over-creating components).

Rework Shape Summary

  • Clean execution: 52%
  • Progressive expansion: 18% (dominant failure mode)
  • Early replan → stable: 11%
  • Late verification churn: 7%
  • Exploratory/research: 7%
  • Abandoned mid-flight: 4%

Pattern: Progressive expansion often follows successful implementation — user adds adjacent work in same session.

Friction Hotspots (top areas)

| Subsystem | Sessions | Avg Revisions | Main Cause |
|---|---|---|---|
| production.py + domain | 8 | 6.2 | Hidden coupling |
| fal.py (model adapter) | 7 | 5.0 | Legitimate complexity |
| billing.py + tests | 6 | 5.5 | Verification churn |
| frontend/ build | 5 | 7.0 | Missing deps/types |
| Auth/bcrypt | 3 | 4.7 | Blocks E2E testing |

Non-Obvious Findings (top 3)

  1. Post-Success Expansion Dominates: Most scope growth happens after the initial implementation succeeds, not because of bad planning. (High confidence)
  2. File Targeting > Acceptance Criteria: Prompts that name no specific files correlate more strongly with replanning (44% vs 12%) than prompts that lack acceptance criteria. Naming files anchors the agent's research early. (High)
  3. Frontend Build Is a Silent Killer: Late TypeScript/import failures repeatedly add 2–4 cycles. No pre-flight check exists. (High)
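
Finding 2 implies that a prompt can be screened for concrete file references before work starts. The sketch below shows one way that screening might look. The regular expression, the extension list, and the names `find_file_refs` and `check_file_targeting` are illustrative assumptions, not part of the /analyze-project skill.

```python
import re

# Assumed heuristic: a file reference is a path-like token ending in a
# known extension, or a directory token such as "frontend/".
FILE_REF = re.compile(
    r"[\w./-]+\.(?:py|tsx?|jsx?|json|md)\b"  # e.g. billing.py, src/app/main.ts
    r"|[\w.-]+/(?!\w)"                       # e.g. frontend/ (but not and/or)
)


def find_file_refs(prompt: str) -> list[str]:
    """Return path-like tokens found in the prompt, in order of appearance."""
    return [m.group(0) for m in FILE_REF.finditer(prompt)]


def check_file_targeting(prompt: str) -> dict:
    """Flag prompts that name no file or directory."""
    refs = find_file_refs(prompt)
    warning = None if refs else "No file or module referenced; consider naming targets."
    return {"ok": bool(refs), "refs": refs, "warning": warning}
```

For example, `check_file_targeting("Make the export faster")` returns `ok` as `False` with a warning, while a prompt that mentions `billing.py` passes. A regex like this will miss module names written without an extension, so it is a first filter at best.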

Recommendations (top 4)

  1. Split Sessions After Phases: Start a new conversation after each successful completion to avoid context bloat and scope creep. Expected: +13% first-shot success. (High)
  2. Enforce File Targeting: Add a pre-check in the prompt optimizer to flag missing file/module refs. Expected: halve the replan rate. (High)
  3. Add Frontend Preflight: Run npm run build early in frontend-touching sessions. This eliminates common late blockers. (High)
  4. Fix Auth Test Fixture: Seed test users with plain passwords or bypass bcrypt for local E2E. This unblocks browser testing. (High)
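
Recommendation 3 could be wired up roughly as follows. This is a sketch under stated assumptions: the frontend lives in a top-level `frontend/` directory (as the hotspot table suggests), the session knows which paths it will touch, and the helper names `touches_frontend` and `frontend_preflight` are hypothetical. Only `npm run build` comes from the report itself.

```python
import subprocess
from pathlib import Path


def touches_frontend(changed_paths, frontend_dir="frontend"):
    """True if any changed path lies under the frontend directory."""
    return any(Path(p).parts[:1] == (frontend_dir,) for p in changed_paths)


def frontend_preflight(changed_paths, frontend_dir="frontend", run=subprocess.run):
    """Run `npm run build` up front when a session touches the frontend.

    Returns (ok, output). `run` is injectable so the check can be tested
    without invoking npm.
    """
    if not touches_frontend(changed_paths, frontend_dir):
        return True, "skipped: no frontend paths"
    result = run(
        ["npm", "run", "build"],
        cwd=frontend_dir,
        capture_output=True,
        text=True,
    )
    # A non-zero exit code means the build failed; surface its output early.
    return result.returncode == 0, result.stdout + result.stderr
```

Calling `frontend_preflight(["frontend/src/App.tsx"])` at the start of a session would surface missing dependencies or type errors before any implementation work, rather than at final verification.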

This sample shows the forensic style: evidence-backed, confidence-rated, focused on actionable patterns rather than raw counts.