--- name: improve-codebase-architecture description: Scan a codebase for deepening opportunities, present them as a visual HTML report, then grill through whichever one you pick. disable-model-invocation: true category: "development" risk: "safe" source: "community" source_repo: "mattpocock/skills" source_type: "community" date_added: "2026-06-19" author: "Matt Pocock" license: "MIT" license_source: "https://github.com/mattpocock/skills/blob/main/LICENSE" tags: - engineering - workflow - coding-agents tools: - claude-code - codex-cli - cursor --- # Improve Codebase Architecture ## When to Use Use when this workflow matches the user request: Scan a codebase for deepening opportunities, present them as a visual HTML report, then grill through whichever one you pick. _Source: [mattpocock/skills](https://github.com/mattpocock/skills) (MIT)._ Surface architectural friction and propose **deepening opportunities** — refactors that turn shallow modules into deep ones. The aim is testability and AI-navigability. This command is _informed_ by the project's domain model and built on a shared design vocabulary: - Run the `/codebase-design` skill for the architecture vocabulary (**module**, **interface**, **depth**, **seam**, **adapter**, **leverage**, **locality**) and its principles (the deletion test, "the interface is the test surface", "one adapter = hypothetical seam, two = real"). Use these terms exactly in every suggestion — don't drift into "component," "service," "API," or "boundary." - The domain language in `CONTEXT.md` gives names to good seams; ADRs in `docs/adr/` record decisions this command should not re-litigate. ## Process ### 1. Explore Read the project's domain glossary (`CONTEXT.md`) and any ADRs in the area you're touching first. Then use the Agent tool with `subagent_type=Explore` to walk the codebase. Don't follow rigid heuristics — explore organically and note where you experience friction: - Where does understanding one concept require bouncing between many small modules? - Where are modules **shallow** — interface nearly as complex as the implementation? - Where have pure functions been extracted just for testability, but the real bugs hide in how they're called (no **locality**)? - Where do tightly-coupled modules leak across their seams? - Which parts of the codebase are untested, or hard to test through their current interface? Apply the **deletion test** to anything you suspect is shallow: would deleting it concentrate complexity, or just move it? A "yes, concentrates" is the signal you want. ### 2. Present candidates as an HTML report Write a self-contained HTML file to the OS temp directory so nothing lands in the repo. Resolve the temp dir from `$TMPDIR`, falling back to `/tmp` (or `%TEMP%` on Windows), and write to `/architecture-review-.html` so each run gets a fresh file. Open it for the user — `xdg-open ` on Linux, `open ` on macOS, `start ` on Windows — and tell them the absolute path. The report uses **Tailwind via CDN** for layout and styling, and **Mermaid via CDN** for diagrams where a graph/flow/sequence reliably communicates the structure. Mix Mermaid with hand-crafted CSS/SVG visuals — use Mermaid when relationships are graph-shaped (call graphs, dependencies, sequences), and hand-built divs/SVG when you want something more editorial (mass diagrams, cross-sections, collapse animations). Each candidate gets a **before/after visualisation**. Be visual. For each candidate, render a card with: - **Files** — which files/modules are involved - **Problem** — why the current architecture is causing friction - **Solution** — plain English description of what would change - **Benefits** — explained in terms of locality and leverage, and how tests would improve - **Before / After diagram** — side-by-side, custom-drawn, illustrating the shallowness and the deepening - **Recommendation strength** — one of `Strong`, `Worth exploring`, `Speculative`, rendered as a badge End the report with a **Top recommendation** section: which candidate you'd tackle first and why. **Use CONTEXT.md vocabulary for the domain, and the `/codebase-design` vocabulary for the architecture.** If `CONTEXT.md` defines "Order," talk about "the Order intake module" — not "the FooBarHandler," and not "the Order service." **ADR conflicts**: if a candidate contradicts an existing ADR, only surface it when the friction is real enough to warrant revisiting the ADR. Mark it clearly in the card (e.g. a warning callout: _"contradicts ADR-0007 — but worth reopening because…"_). Don't list every theoretical refactor an ADR forbids. See [HTML-REPORT.md](HTML-REPORT.md) for the full HTML scaffold, diagram patterns, and styling guidance. Do NOT propose interfaces yet. After the file is written, ask the user: "Which of these would you like to explore?" ### 3. Grilling loop Once the user picks a candidate, run the `/grilling` skill to walk the design tree with them — constraints, dependencies, the shape of the deepened module, what sits behind the seam, what tests survive. Side effects happen inline as decisions crystallize — run the `/domain-modeling` skill to keep the domain model current as you go: - **Naming a deepened module after a concept not in `CONTEXT.md`?** Add the term to `CONTEXT.md`. Create the file lazily if it doesn't exist. - **Sharpening a fuzzy term during the conversation?** Update `CONTEXT.md` right there. - **User rejects the candidate with a load-bearing reason?** Offer an ADR, framed as: _"Want me to record this as an ADR so future architecture reviews don't re-suggest it?"_ Only offer when the reason would actually be needed by a future explorer to avoid re-suggesting the same thing — skip ephemeral reasons ("not worth it right now") and self-evident ones. - **Want to explore alternative interfaces for the deepened module?** Run the `/codebase-design` skill and use its design-it-twice parallel sub-agent pattern. ## Limitations - Requires the upstream tool, account, API key, or local setup when the workflow names one. - Does not authorize destructive, production, paid, or external-message actions without explicit user approval. - Validate generated artifacts or recommendations against the user's real sources before treating them as final.