Compare commits

...

2 Commits

Author SHA1 Message Date
ci-bot e3faaaf69d 📦 deps(skills): sync superpowers
CI: Update Third-party Superpowers / Update thirdparty/skill snapshot (push): successful in 1m15s
2026-03-10 01:59:03 +00:00
csh 3c9b306b64 🐛 fix(playbook): address reported repo issues
CI: Update Third-party Superpowers / Update thirdparty/skill snapshot (push): successful in 1m16s
normalize Windows path-like TOML config values, regenerate .agents/index.md on sync, and keep the SKILLS.md superpowers section route-only.

also update ignore rules and refresh repo-issues.md to reflect the current status and verification evidence.
2026-03-10 09:57:22 +08:00
24 changed files with 1148 additions and 160 deletions


@@ -60,58 +60,7 @@ done
printf "%s\n" "${names[@]}" | sort > "$SUPERPOWERS_LIST"
-update_block() {
-  local file="$1"
-  local start="<!-- superpowers:skills:start -->"
-  local end="<!-- superpowers:skills:end -->"
-  local tmp
-  tmp="$(mktemp)"
-  {
-    echo "### Third-party Skills (superpowers)"
-    echo ""
-    echo "$start"
-    while IFS= read -r name; do
-      [ -n "$name" ] || continue
-      echo "- $name"
-    done < "$SUPERPOWERS_LIST"
-    echo "$end"
-  } > "$tmp"
-  if grep -q "$start" "$file"; then
-    awk -v start="$start" -v end="$end" -v block="$tmp" '
-      BEGIN {
-        while ((getline line < block) > 0) { buf[++n] = line }
-        close(block)
-        inblock=0
-        replaced=0
-      }
-      {
-        if (!replaced && $0 == start) {
-          for (i=1; i<=n; i++) print buf[i]
-          inblock=1
-          replaced=1
-          next
-        }
-        if (inblock) {
-          if ($0 == end) { inblock=0 }
-          next
-        }
-        print
-      }
-    ' "$file" > "${file}.tmp"
-    mv "${file}.tmp" "$file"
-  else
-    echo "" >> "$file"
-    cat "$tmp" >> "$file"
-  fi
-  rm -f "$tmp"
-}
-update_block "SKILLS.md"
-git add codex/skills SKILLS.md "$SUPERPOWERS_LIST"
git add codex/skills "$SUPERPOWERS_LIST"
if git diff --cached --quiet; then
  echo "No changes to sync."

.gitignore (vendored): 4 changes

@@ -21,3 +21,7 @@ tags
reports/
.worktrees/
scripts/__pycache__
tests/__pycache__
tests/cli/__pycache__


@@ -159,32 +159,6 @@ python docs/standards/playbook/scripts/playbook.py -config playbook.toml
## 9. Third-party Skills (superpowers)
Source: `codex/skills/.sources/superpowers.list` (the third-party source list).
This section lists only superpowers-family skills, kept separate from this Playbook's native skills.
-### Third-party Skills (superpowers)
-### Third-party Skills (superpowers)
-### Third-party Skills (superpowers)
-### Third-party Skills (superpowers)
-<!-- superpowers:skills:start -->
-- \
-- \
-- \
-- \
-- \
-- \
-- \
-- \
-- \
-- \
-- \
-- \
-- \
-- \
-<!-- superpowers:skills:end -->
---


@@ -5,44 +5,118 @@ description: "You MUST use this before any creative work - creating features, bu
# Brainstorming Ideas Into Designs
## Overview
Help turn ideas into fully formed designs and specs through natural collaborative dialogue.
-Start by understanding the current project context, then ask questions one at a time to refine the idea. Once you understand what you're building, present the design in small sections (200-300 words), checking after each section whether it looks right so far.
Start by understanding the current project context, then ask questions one at a time to refine the idea. Once you understand what you're building, present the design and get user approval.
<HARD-GATE>
Do NOT invoke any implementation skill, write any code, scaffold any project, or take any implementation action until you have presented a design and the user has approved it. This applies to EVERY project regardless of perceived simplicity.
</HARD-GATE>
## Anti-Pattern: "This Is Too Simple To Need A Design"
Every project goes through this process. A todo list, a single-function utility, a config change — all of them. "Simple" projects are where unexamined assumptions cause the most wasted work. The design can be short (a few sentences for truly simple projects), but you MUST present it and get approval.
## Checklist
You MUST create a task for each of these items and complete them in order:
1. **Explore project context** — check files, docs, recent commits
2. **Offer visual companion** (if topic will involve visual questions) — this is its own message, not combined with a clarifying question. See the Visual Companion section below.
3. **Ask clarifying questions** — one at a time, understand purpose/constraints/success criteria
4. **Propose 2-3 approaches** — with trade-offs and your recommendation
5. **Present design** — in sections scaled to their complexity, get user approval after each section
6. **Write design doc** — save to `docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md` and commit
7. **Transition to implementation** — invoke writing-plans skill to create implementation plan
## Process Flow
```dot
digraph brainstorming {
"Explore project context" [shape=box];
"Visual questions ahead?" [shape=diamond];
"Offer Visual Companion\n(own message, no other content)" [shape=box];
"Ask clarifying questions" [shape=box];
"Propose 2-3 approaches" [shape=box];
"Present design sections" [shape=box];
"User approves design?" [shape=diamond];
"Write design doc" [shape=box];
"Invoke writing-plans skill" [shape=doublecircle];
"Explore project context" -> "Visual questions ahead?";
"Visual questions ahead?" -> "Offer Visual Companion\n(own message, no other content)" [label="yes"];
"Visual questions ahead?" -> "Ask clarifying questions" [label="no"];
"Offer Visual Companion\n(own message, no other content)" -> "Ask clarifying questions";
"Ask clarifying questions" -> "Propose 2-3 approaches";
"Propose 2-3 approaches" -> "Present design sections";
"Present design sections" -> "User approves design?";
"User approves design?" -> "Present design sections" [label="no, revise"];
"User approves design?" -> "Write design doc" [label="yes"];
"Write design doc" -> "Invoke writing-plans skill";
}
```
**The terminal state is invoking writing-plans.** Do NOT invoke frontend-design, mcp-builder, or any other implementation skill. The ONLY skill you invoke after brainstorming is writing-plans.
## The Process
**Understanding the idea:**
- Check out the current project state first (files, docs, recent commits)
-- Ask questions one at a time to refine the idea
- Before asking detailed questions, assess scope: if the request describes multiple independent subsystems (e.g., "build a platform with chat, file storage, billing, and analytics"), flag this immediately. Don't spend questions refining details of a project that needs to be decomposed first.
- If the project is too large for a single spec, help the user decompose into sub-projects: what are the independent pieces, how do they relate, what order should they be built? Then brainstorm the first sub-project through the normal design flow. Each sub-project gets its own spec → plan → implementation cycle.
- For appropriately-scoped projects, ask questions one at a time to refine the idea
- Prefer multiple choice questions when possible, but open-ended is fine too
- Only one question per message - if a topic needs more exploration, break it into multiple questions
- Focus on understanding: purpose, constraints, success criteria
**Exploring approaches:**
- Propose 2-3 different approaches with trade-offs
- Present options conversationally with your recommendation and reasoning
- Lead with your recommended option and explain why
**Presenting the design:**
- Once you believe you understand what you're building, present the design
-- Break it into sections of 200-300 words
- Scale each section to its complexity: a few sentences if straightforward, up to 200-300 words if nuanced
- Ask after each section whether it looks right so far
- Cover: architecture, components, data flow, error handling, testing
- Be ready to go back and clarify if something doesn't make sense
**Design for isolation and clarity:**
- Break the system into smaller units that each have one clear purpose, communicate through well-defined interfaces, and can be understood and tested independently
- For each unit, you should be able to answer: what does it do, how do you use it, and what does it depend on?
- Can someone understand what a unit does without reading its internals? Can you change the internals without breaking consumers? If not, the boundaries need work.
- Smaller, well-bounded units are also easier for you to work with - you reason better about code you can hold in context at once, and your edits are more reliable when files are focused. When a file grows large, that's often a signal that it's doing too much.
**Working in existing codebases:**
- Explore the current structure before proposing changes. Follow existing patterns.
- Where existing code has problems that affect the work (e.g., a file that's grown too large, unclear boundaries, tangled responsibilities), include targeted improvements as part of the design - the way a good developer improves code they're working in.
- Don't propose unrelated refactoring. Stay focused on what serves the current goal.
## After the Design
**Documentation:**
-- Write the validated design to `docs/plans/YYYY-MM-DD-<topic>-design.md`
- Write the validated design (spec) to `docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md`
- (User preferences for spec location override this default)
- Use elements-of-style:writing-clearly-and-concisely skill if available
- Commit the design document to git
-**Implementation (if continuing):**
-- Ask: "Ready to set up for implementation?"
-- Use superpowers:using-git-worktrees to create isolated workspace
-- Use superpowers:writing-plans to create detailed implementation plan
**Spec Review Loop:**
After writing the spec document:
1. Dispatch spec-document-reviewer subagent (see spec-document-reviewer-prompt.md)
2. If Issues Found: fix, re-dispatch, repeat until Approved
3. If loop exceeds 5 iterations, surface to human for guidance
**Implementation:**
- Invoke the writing-plans skill to create a detailed implementation plan
- Do NOT invoke any other skill. writing-plans is the next step.
## Key Principles
@@ -50,5 +124,24 @@ Start by understanding the current project context, then ask questions one at a
- **Multiple choice preferred** - Easier to answer than open-ended when possible
- **YAGNI ruthlessly** - Remove unnecessary features from all designs
- **Explore alternatives** - Always propose 2-3 approaches before settling
-- **Incremental validation** - Present design in sections, validate each
- **Incremental validation** - Present design, get approval before moving on
- **Be flexible** - Go back and clarify when something doesn't make sense
## Visual Companion
A browser-based companion for showing mockups, diagrams, and visual options during brainstorming. Available as a tool — not a mode. Accepting the companion means it's available for questions that benefit from visual treatment; it does NOT mean every question goes through the browser.
**Offering the companion:** When you anticipate that upcoming questions will involve visual content (mockups, layouts, diagrams), offer it once for consent:
> "Some of what we're working on might be easier to explain if I can show it to you in a web browser. I can put together mockups, diagrams, comparisons, and other visuals as we go. This feature is still new and can be token-intensive. Want to try it? (Requires opening a local URL)"
**This offer MUST be its own message.** Do not combine it with clarifying questions, context summaries, or any other content. The message should contain ONLY the offer above and nothing else. Wait for the user's response before continuing. If they decline, proceed with text-only brainstorming.
**Per-question decision:** Even after the user accepts, decide FOR EACH QUESTION whether to use the browser or the terminal. The test: **would the user understand this better by seeing it than reading it?**
- **Use the browser** for content that IS visual — mockups, wireframes, layout comparisons, architecture diagrams, side-by-side visual designs
- **Use the terminal** for content that is text — requirements questions, conceptual choices, tradeoff lists, A/B/C/D text options, scope decisions
A question about a UI topic is not automatically a visual question. "What does personality mean in this context?" is a conceptual question — use the terminal. "Which wizard layout works better?" is a visual question — use the browser.
If they agree to the companion, read the detailed guide before proceeding:
`skills/brainstorming/visual-companion.md`


@@ -0,0 +1,50 @@
# Spec Document Reviewer Prompt Template
Use this template when dispatching a spec document reviewer subagent.
**Purpose:** Verify the spec is complete, consistent, and ready for implementation planning.
**Dispatch after:** Spec document is written to docs/superpowers/specs/
```
Task tool (general-purpose):
description: "Review spec document"
prompt: |
You are a spec document reviewer. Verify this spec is complete and ready for planning.
**Spec to review:** [SPEC_FILE_PATH]
## What to Check
| Category | What to Look For |
|----------|------------------|
| Completeness | TODOs, placeholders, "TBD", incomplete sections |
| Coverage | Missing error handling, edge cases, integration points |
| Consistency | Internal contradictions, conflicting requirements |
| Clarity | Ambiguous requirements |
| YAGNI | Unrequested features, over-engineering |
| Scope | Focused enough for a single plan — not covering multiple independent subsystems |
| Architecture | Units with clear boundaries, well-defined interfaces, independently understandable and testable |
## CRITICAL
Look especially hard for:
- Any TODO markers or placeholder text
- Sections saying "to be defined later" or "will spec when X is done"
- Sections noticeably less detailed than others
- Units that lack clear boundaries or interfaces — can you understand what each unit does without reading its internals?
## Output Format
## Spec Review
**Status:** ✅ Approved | ❌ Issues Found
**Issues (if any):**
- [Section X]: [specific issue] - [why it matters]
**Recommendations (advisory):**
- [suggestions that don't block approval]
```
**Reviewer returns:** Status, Issues (if any), Recommendations


@@ -0,0 +1,260 @@
# Visual Companion Guide
Browser-based visual brainstorming companion for showing mockups, diagrams, and options.
## When to Use
Decide per-question, not per-session. The test: **would the user understand this better by seeing it than reading it?**
**Use the browser** when the content itself is visual:
- **UI mockups** — wireframes, layouts, navigation structures, component designs
- **Architecture diagrams** — system components, data flow, relationship maps
- **Side-by-side visual comparisons** — comparing two layouts, two color schemes, two design directions
- **Design polish** — when the question is about look and feel, spacing, visual hierarchy
- **Spatial relationships** — state machines, flowcharts, entity relationships rendered as diagrams
**Use the terminal** when the content is text or tabular:
- **Requirements and scope questions** — "what does X mean?", "which features are in scope?"
- **Conceptual A/B/C choices** — picking between approaches described in words
- **Tradeoff lists** — pros/cons, comparison tables
- **Technical decisions** — API design, data modeling, architectural approach selection
- **Clarifying questions** — anything where the answer is words, not a visual preference
A question *about* a UI topic is not automatically a visual question. "What kind of wizard do you want?" is conceptual — use the terminal. "Which of these wizard layouts feels right?" is visual — use the browser.
## How It Works
The server watches a directory for HTML files and serves the newest one to the browser. You write HTML content, the user sees it in their browser and can click to select options. Selections are recorded to a `.events` file that you read on your next turn.
**Content fragments vs full documents:** If your HTML file starts with `<!DOCTYPE` or `<html`, the server serves it as-is (just injects the helper script). Otherwise, the server automatically wraps your content in the frame template — adding the header, CSS theme, selection indicator, and all interactive infrastructure. **Write content fragments by default.** Only write full documents when you need complete control over the page.
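The fragment-vs-document check described above amounts to inspecting the start of the file. This is a hypothetical Python sketch of that rule, not the server's actual implementation, which may differ in detail:

```python
def is_full_document(html: str) -> bool:
    """Treat content as a full document if it starts with <!DOCTYPE or <html."""
    head = html.lstrip().lower()
    return head.startswith("<!doctype") or head.startswith("<html")
```

Anything else is treated as a fragment and wrapped in the frame template.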
## Starting a Session
```bash
# Start server with persistence (mockups saved to project)
${CLAUDE_PLUGIN_ROOT}/lib/brainstorm-server/start-server.sh --project-dir /path/to/project
# Returns: {"type":"server-started","port":52341,"url":"http://localhost:52341",
# "screen_dir":"/path/to/project/.superpowers/brainstorm/12345-1706000000"}
```
Save `screen_dir` from the response. Tell user to open the URL.
**Note:** Pass the project root as `--project-dir` so mockups persist in `.superpowers/brainstorm/` and survive server restarts. Without it, files go to `/tmp` and get cleaned up. Remind the user to add `.superpowers/` to `.gitignore` if it's not already there.
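As a rough illustration, the JSON status line printed by `start-server.sh` can be captured and parsed like this. This is a sketch under assumptions: the field names follow the example response above, the script path is environment-specific, and `start_brainstorm_server` is a hypothetical helper name:

```python
import json
import subprocess

def start_brainstorm_server(script_path: str, project_dir: str) -> dict:
    """Run start-server.sh and parse the JSON object it prints on stdout."""
    result = subprocess.run(
        [script_path, "--project-dir", project_dir],
        capture_output=True, text=True, check=True,
    )
    # Expecting a single JSON line like:
    # {"type":"server-started","port":...,"url":...,"screen_dir":...}
    return json.loads(result.stdout.strip().splitlines()[-1])
```

Save `info["screen_dir"]` for the session and tell the user to open `info["url"]`.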
**Codex behavior:** In Codex (`CODEX_CI=1`), `start-server.sh` auto-switches to foreground mode by default because background jobs may be reaped. Use `--background` only if your environment reliably preserves detached processes.
**If background processes are reaped in your environment:** run in foreground from a persistent terminal session:
```bash
${CLAUDE_PLUGIN_ROOT}/lib/brainstorm-server/start-server.sh --project-dir /path/to/project --foreground
```
In `--foreground` mode, the command stays attached and serves until interrupted.
If the URL is unreachable from your browser (common in remote/containerized setups), bind a non-loopback host:
```bash
${CLAUDE_PLUGIN_ROOT}/lib/brainstorm-server/start-server.sh \
--project-dir /path/to/project \
--host 0.0.0.0 \
--url-host localhost
```
Use `--url-host` to control what hostname is printed in the returned URL JSON.
## The Loop
1. **Write HTML** to a new file in `screen_dir`:
- Use semantic filenames: `platform.html`, `visual-style.html`, `layout.html`
- **Never reuse filenames** — each screen gets a fresh file
- Use Write tool — **never use cat/heredoc** (dumps noise into terminal)
- Server automatically serves the newest file
2. **Tell user what to expect and end your turn:**
- Remind them of the URL (every step, not just first)
- Give a brief text summary of what's on screen (e.g., "Showing 3 layout options for the homepage")
- Ask them to respond in the terminal: "Take a look and let me know what you think. Click to select an option if you'd like."
3. **On your next turn** — after the user responds in the terminal:
- Read `$SCREEN_DIR/.events` if it exists — this contains the user's browser interactions (clicks, selections) as JSON lines
- Merge with the user's terminal text to get the full picture
- The terminal message is the primary feedback; `.events` provides structured interaction data
4. **Iterate or advance** — if feedback changes current screen, write a new file (e.g., `layout-v2.html`). Only move to the next question when the current step is validated.
5. **Unload when returning to terminal** — when the next step doesn't need the browser (e.g., a clarifying question, a tradeoff discussion), push a waiting screen to clear the stale content:
```html
<!-- filename: waiting.html (or waiting-2.html, etc.) -->
<div style="display:flex;align-items:center;justify-content:center;min-height:60vh">
<p class="subtitle">Continuing in terminal...</p>
</div>
```
This prevents the user from staring at a resolved choice while the conversation has moved on. When the next visual question comes up, push a new content file as usual.
6. Repeat until done.
## Writing Content Fragments
Write just the content that goes inside the page. The server wraps it in the frame template automatically (header, theme CSS, selection indicator, and all interactive infrastructure).
**Minimal example:**
```html
<h2>Which layout works better?</h2>
<p class="subtitle">Consider readability and visual hierarchy</p>
<div class="options">
<div class="option" data-choice="a" onclick="toggleSelect(this)">
<div class="letter">A</div>
<div class="content">
<h3>Single Column</h3>
<p>Clean, focused reading experience</p>
</div>
</div>
<div class="option" data-choice="b" onclick="toggleSelect(this)">
<div class="letter">B</div>
<div class="content">
<h3>Two Column</h3>
<p>Sidebar navigation with main content</p>
</div>
</div>
</div>
```
That's it. No `<html>`, no CSS, no `<script>` tags needed. The server provides all of that.
## CSS Classes Available
The frame template provides these CSS classes for your content:
### Options (A/B/C choices)
```html
<div class="options">
<div class="option" data-choice="a" onclick="toggleSelect(this)">
<div class="letter">A</div>
<div class="content">
<h3>Title</h3>
<p>Description</p>
</div>
</div>
</div>
```
**Multi-select:** Add `data-multiselect` to the container to let users select multiple options. Each click toggles the item. The indicator bar shows the count.
```html
<div class="options" data-multiselect>
<!-- same option markup — users can select/deselect multiple -->
</div>
```
### Cards (visual designs)
```html
<div class="cards">
<div class="card" data-choice="design1" onclick="toggleSelect(this)">
<div class="card-image"><!-- mockup content --></div>
<div class="card-body">
<h3>Name</h3>
<p>Description</p>
</div>
</div>
</div>
```
### Mockup container
```html
<div class="mockup">
<div class="mockup-header">Preview: Dashboard Layout</div>
<div class="mockup-body"><!-- your mockup HTML --></div>
</div>
```
### Split view (side-by-side)
```html
<div class="split">
<div class="mockup"><!-- left --></div>
<div class="mockup"><!-- right --></div>
</div>
```
### Pros/Cons
```html
<div class="pros-cons">
<div class="pros"><h4>Pros</h4><ul><li>Benefit</li></ul></div>
<div class="cons"><h4>Cons</h4><ul><li>Drawback</li></ul></div>
</div>
```
### Mock elements (wireframe building blocks)
```html
<div class="mock-nav">Logo | Home | About | Contact</div>
<div style="display: flex;">
<div class="mock-sidebar">Navigation</div>
<div class="mock-content">Main content area</div>
</div>
<button class="mock-button">Action Button</button>
<input class="mock-input" placeholder="Input field">
<div class="placeholder">Placeholder area</div>
```
### Typography and sections
- `h2` — page title
- `h3` — section heading
- `.subtitle` — secondary text below title
- `.section` — content block with bottom margin
- `.label` — small uppercase label text
## Browser Events Format
When the user clicks options in the browser, their interactions are recorded to `$SCREEN_DIR/.events` (one JSON object per line). The file is cleared automatically when you push a new screen.
```jsonl
{"type":"click","choice":"a","text":"Option A - Simple Layout","timestamp":1706000101}
{"type":"click","choice":"c","text":"Option C - Complex Grid","timestamp":1706000108}
{"type":"click","choice":"b","text":"Option B - Hybrid","timestamp":1706000115}
```
The full event stream shows the user's exploration path — they may click multiple options before settling. The last `choice` event is typically the final selection, but the pattern of clicks can reveal hesitation or preferences worth asking about.
If `.events` doesn't exist, the user didn't interact with the browser — use only their terminal text.
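The read-and-merge step can be sketched as follows. This is illustrative only (the helper names `read_events` and `final_choice` are not part of the plugin); the event fields match the examples above:

```python
import json
from pathlib import Path

def read_events(screen_dir: str) -> list:
    """Return browser events as dicts, or [] if the user never interacted."""
    events_file = Path(screen_dir) / ".events"
    if not events_file.exists():
        return []
    events = []
    for line in events_file.read_text().splitlines():
        if line.strip():
            events.append(json.loads(line))
    return events

def final_choice(events: list):
    """The last click event is typically the final selection."""
    clicks = [e for e in events if e.get("type") == "click"]
    return clicks[-1]["choice"] if clicks else None
```

The full event list is still worth scanning: repeated clicks across options can reveal hesitation worth asking about in the terminal.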
## Design Tips
- **Scale fidelity to the question** — wireframes for layout, polish for polish questions
- **Explain the question on each page** — "Which layout feels more professional?" not just "Pick one"
- **Iterate before advancing** — if feedback changes current screen, write a new version
- **2-4 options max** per screen
- **Use real content when it matters** — for a photography portfolio, use actual images (Unsplash). Placeholder content obscures design issues.
- **Keep mockups simple** — focus on layout and structure, not pixel-perfect design
## File Naming
- Use semantic names: `platform.html`, `visual-style.html`, `layout.html`
- Never reuse filenames — each screen must be a new file
- For iterations: append version suffix like `layout-v2.html`, `layout-v3.html`
- Server serves newest file by modification time
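"Newest file by modification time" is a simple selection rule. A hypothetical sketch of what the server presumably does (the actual implementation may differ):

```python
import glob
import os

def newest_screen(screen_dir: str):
    """Pick the most recently modified .html file, or None if there are none."""
    pages = glob.glob(os.path.join(screen_dir, "*.html"))
    return max(pages, key=os.path.getmtime) if pages else None
```

This is why filenames must never be reused: writing a fresh file guarantees it has the newest mtime and becomes the served screen.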
## Cleaning Up
```bash
${CLAUDE_PLUGIN_ROOT}/lib/brainstorm-server/stop-server.sh $SCREEN_DIR
```
If the session used `--project-dir`, mockup files persist in `.superpowers/brainstorm/` for later reference. Only `/tmp` sessions get deleted on stop.
## Reference
- Frame template (CSS reference): `${CLAUDE_PLUGIN_ROOT}/lib/brainstorm-server/frame-template.html`
- Helper script (client-side): `${CLAUDE_PLUGIN_ROOT}/lib/brainstorm-server/helper.js`


@@ -7,12 +7,12 @@ description: Use when you have a written implementation plan to execute in a sep
## Overview
-Load plan, review critically, execute tasks in batches, report for review between batches.
Load plan, review critically, execute all tasks, report when complete.
-**Core principle:** Batch execution with checkpoints for architect review.
**Announce at start:** "I'm using the executing-plans skill to implement this plan."
**Note:** Tell your human partner that Superpowers works much better with access to subagents. The quality of its work will be significantly higher if run on a platform with subagent support (such as Claude Code or Codex). If subagents are available, use superpowers:subagent-driven-development instead of this skill.
## The Process
### Step 1: Load and Review Plan
@@ -21,8 +21,7 @@ Load plan, review critically, execute tasks in batches, report for review betwee
3. If concerns: Raise them with your human partner before starting
4. If no concerns: Create TodoWrite and proceed
-### Step 2: Execute Batch
### Step 2: Execute Tasks
-**Default: First 3 tasks**
For each task:
1. Mark as in_progress
@@ -30,19 +29,7 @@ For each task:
3. Run verifications as specified
4. Mark as completed
-### Step 3: Report
### Step 3: Complete Development
-When batch complete:
-- Show what was implemented
-- Show verification output
-- Say: "Ready for feedback."
-### Step 4: Continue
-Based on feedback:
-- Apply changes if needed
-- Execute next batch
-- Repeat until complete
-### Step 5: Complete Development
After all tasks complete and verified:
- Announce: "I'm using the finishing-a-development-branch skill to complete this work."
@@ -52,7 +39,7 @@ After all tasks complete and verified:
## When to Stop and Ask for Help
**STOP executing immediately when:**
-- Hit a blocker mid-batch (missing dependency, test fails, instruction unclear)
- Hit a blocker (missing dependency, test fails, instruction unclear)
- Plan has critical gaps preventing starting
- You don't understand an instruction
- Verification fails repeatedly
@@ -72,5 +59,12 @@ After all tasks complete and verified:
- Follow plan steps exactly
- Don't skip verifications
- Reference skills when plan says to
-- Between batches: just report and wait
- Stop when blocked, don't guess
- Never start implementation on main/master branch without explicit user consent
## Integration
**Required workflow skills:**
- **superpowers:using-git-worktrees** - REQUIRED: Set up isolated workspace before starting
- **superpowers:writing-plans** - Creates the plan this skill executes
- **superpowers:finishing-a-development-branch** - Complete development after all tasks


@@ -58,7 +58,7 @@ HEAD_SHA=$(git rev-parse HEAD)
[Dispatch superpowers:code-reviewer subagent]
WHAT_WAS_IMPLEMENTED: Verification and repair functions for conversation index
-PLAN_OR_REQUIREMENTS: Task 2 from docs/plans/deployment-plan.md
PLAN_OR_REQUIREMENTS: Task 2 from docs/superpowers/plans/deployment-plan.md
BASE_SHA: a7981ec
HEAD_SHA: 3df7661
DESCRIPTION: Added verifyIndex() and repairIndex() with 4 issue types


@@ -82,6 +82,39 @@ digraph process {
}
```
## Model Selection
Use the least powerful model that can handle each role to conserve cost and increase speed.
**Mechanical implementation tasks** (isolated functions, clear specs, 1-2 files): use a fast, cheap model. Most implementation tasks are mechanical when the plan is well-specified.
**Integration and judgment tasks** (multi-file coordination, pattern matching, debugging): use a standard model.
**Architecture, design, and review tasks**: use the most capable available model.
**Task complexity signals:**
- Touches 1-2 files with a complete spec → cheap model
- Touches multiple files with integration concerns → standard model
- Requires design judgment or broad codebase understanding → most capable model
## Handling Implementer Status
Implementer subagents report one of four statuses. Handle each appropriately:
**DONE:** Proceed to spec compliance review.
**DONE_WITH_CONCERNS:** The implementer completed the work but flagged doubts. Read the concerns before proceeding. If the concerns are about correctness or scope, address them before review. If they're observations (e.g., "this file is getting large"), note them and proceed to review.
**NEEDS_CONTEXT:** The implementer needs information that wasn't provided. Provide the missing context and re-dispatch.
**BLOCKED:** The implementer cannot complete the task. Assess the blocker:
1. If it's a context problem, provide more context and re-dispatch with the same model
2. If the task requires more reasoning, re-dispatch with a more capable model
3. If the task is too large, break it into smaller pieces
4. If the plan itself is wrong, escalate to the human
**Never** ignore an escalation or force the same model to retry without changes. If the implementer said it's stuck, something needs to change.
## Prompt Templates

- `./implementer-prompt.md` - Dispatch implementer subagent
@ -93,7 +126,7 @@ digraph process {
```
You: I'm using Subagent-Driven Development to execute this plan.

[Read plan file once: docs/superpowers/plans/feature-plan.md]
[Extract all 5 tasks with full text and context]
[Create TodoWrite with all tasks]
@ -199,6 +232,7 @@ Done!
## Red Flags

**Never:**
- Start implementation on main/master branch without explicit user consent
- Skip reviews (spec compliance OR code quality)
- Proceed with unfixed issues
- Dispatch multiple implementation subagents in parallel (conflicts)
@ -229,6 +263,7 @@ Done!
## Integration

**Required workflow skills:**
- **superpowers:using-git-worktrees** - REQUIRED: Set up isolated workspace before starting
- **superpowers:writing-plans** - Creates the plan this skill executes
- **superpowers:requesting-code-review** - Code review template for reviewer subagents
- **superpowers:finishing-a-development-branch** - Complete development after all tasks
@ -17,4 +17,10 @@ Task tool (superpowers:code-reviewer):
  DESCRIPTION: [task summary]
```
**In addition to standard code quality concerns, the reviewer should check:**
- Does each file have one clear responsibility with a well-defined interface?
- Are units decomposed so they can be understood and tested independently?
- Is the implementation following the file structure from the plan?
- Did this implementation create new files that are already large, or significantly grow existing files? (Don't flag pre-existing file sizes — focus on what this change contributed.)
**Code reviewer returns:** Strengths, Issues (Critical/Important/Minor), Assessment
@ -41,6 +41,36 @@ Task tool (general-purpose):
**While you work:** If you encounter something unexpected or unclear, **ask questions**.
It's always OK to pause and clarify. Don't guess or make assumptions.
## Code Organization
You reason best about code you can hold in context at once, and your edits are more
reliable when files are focused. Keep this in mind:
- Follow the file structure defined in the plan
- Each file should have one clear responsibility with a well-defined interface
- If a file you're creating is growing beyond the plan's intent, stop and report
it as DONE_WITH_CONCERNS — don't split files on your own without plan guidance
- If an existing file you're modifying is already large or tangled, work carefully
and note it as a concern in your report
- In existing codebases, follow established patterns. Improve code you're touching
the way a good developer would, but don't restructure things outside your task.
## When You're in Over Your Head
It is always OK to stop and say "this is too hard for me." Bad work is worse than
no work. You will not be penalized for escalating.
**STOP and escalate when:**
- The task requires architectural decisions with multiple valid approaches
- You need to understand code beyond what was provided and can't find clarity
- You feel uncertain about whether your approach is correct
- The task involves restructuring existing code in ways the plan didn't anticipate
- You've been reading file after file trying to understand the system without progress
**How to escalate:** Report back with status BLOCKED or NEEDS_CONTEXT. Describe
specifically what you're stuck on, what you've tried, and what kind of help you need.
The controller can provide more context, re-dispatch with a more capable model,
or break the task into smaller pieces.
## Before Reporting Back: Self-Review

Review your work with fresh eyes. Ask yourself:
@ -70,9 +100,14 @@ Task tool (general-purpose):
## Report Format

When done, report:
- **Status:** DONE | DONE_WITH_CONCERNS | BLOCKED | NEEDS_CONTEXT
- What you implemented (or what you attempted, if blocked)
- What you tested and test results
- Files changed
- Self-review findings (if any)
- Any issues or concerns
Use DONE_WITH_CONCERNS if you completed the work but have doubts about correctness.
Use BLOCKED if you cannot complete the task. Use NEEDS_CONTEXT if you need
information that wasn't provided. Never silently produce work you're unsure about.
```
codex/skills/systematic-debugging/find-polluter.sh Normal file → Executable file
@ -210,8 +210,9 @@ Ready to implement auth feature
**Called by:**
- **brainstorming** (Phase 4) - REQUIRED when design is approved and implementation follows
- **subagent-driven-development** - REQUIRED before executing any tasks
- **executing-plans** - REQUIRED before executing any tasks
- Any skill needing isolated workspace

**Pairs with:**
- **finishing-a-development-branch** - REQUIRED for cleanup after work complete
- **executing-plans** or **subagent-driven-development** - Work happens in this worktree
@ -3,6 +3,10 @@ name: using-superpowers
description: Use when starting any conversation - establishes how to find and use skills, requiring Skill tool invocation before ANY response including clarifying questions
---
<SUBAGENT-STOP>
If you were dispatched as a subagent to execute a specific task, skip this skill.
</SUBAGENT-STOP>
<EXTREMELY-IMPORTANT>
If you think there is even a 1% chance a skill might apply to what you are doing, you ABSOLUTELY MUST invoke the skill.
@ -11,12 +15,26 @@ IF A SKILL APPLIES TO YOUR TASK, YOU DO NOT HAVE A CHOICE. YOU MUST USE IT.
This is not negotiable. This is not optional. You cannot rationalize your way out of this.
</EXTREMELY-IMPORTANT>
## Instruction Priority
Superpowers skills override default system prompt behavior, but **user instructions always take precedence**:
1. **User's explicit instructions** (CLAUDE.md, AGENTS.md, direct requests) — highest priority
2. **Superpowers skills** — override default system behavior where they conflict
3. **Default system prompt** — lowest priority
If CLAUDE.md or AGENTS.md says "don't use TDD" and a skill says "always use TDD," follow the user's instructions. The user is in control.
## How to Access Skills

**In Claude Code:** Use the `Skill` tool. When you invoke a skill, its content is loaded and presented to you—follow it directly. Never use the Read tool on skill files.

**In other environments:** Check your platform's documentation for how skills are loaded.
## Platform Adaptation
Skills use Claude Code tool names. Non-CC platforms: see `references/codex-tools.md` for tool equivalents.
# Using Skills

## The Rule
@ -26,6 +44,9 @@ This is not negotiable. This is not optional. You cannot rationalize your way ou
```dot
digraph skill_flow {
    "User message received" [shape=doublecircle];
    "About to EnterPlanMode?" [shape=doublecircle];
    "Already brainstormed?" [shape=diamond];
    "Invoke brainstorming skill" [shape=box];
    "Might any skill apply?" [shape=diamond];
    "Invoke Skill tool" [shape=box];
    "Announce: 'Using [skill] to [purpose]'" [shape=box];
@ -34,6 +55,11 @@ digraph skill_flow {
    "Follow skill exactly" [shape=box];
    "Respond (including clarifications)" [shape=doublecircle];

    "About to EnterPlanMode?" -> "Already brainstormed?";
    "Already brainstormed?" -> "Invoke brainstorming skill" [label="no"];
    "Already brainstormed?" -> "Might any skill apply?" [label="yes"];
    "Invoke brainstorming skill" -> "Might any skill apply?";
    "User message received" -> "Might any skill apply?";
    "Might any skill apply?" -> "Invoke Skill tool" [label="yes, even 1%"];
    "Might any skill apply?" -> "Respond (including clarifications)" [label="definitely not"];
@ -0,0 +1,25 @@
# Codex Tool Mapping
Skills use Claude Code tool names. When you encounter these in a skill, use your platform equivalent:
| Skill references | Codex equivalent |
|-----------------|------------------|
| `Task` tool (dispatch subagent) | `spawn_agent` |
| Multiple `Task` calls (parallel) | Multiple `spawn_agent` calls |
| Task returns result | `wait` |
| Task completes automatically | `close_agent` to free slot |
| `TodoWrite` (task tracking) | `update_plan` |
| `Skill` tool (invoke a skill) | Skills load natively — just follow the instructions |
| `Read`, `Write`, `Edit` (files) | Use your native file tools |
| `Bash` (run commands) | Use your native shell tools |
## Subagent dispatch requires collab
Add to your Codex config (`~/.codex/config.toml`):
```toml
[features]
collab = true
```
This enables `spawn_agent`, `wait`, and `close_agent` for skills like `dispatching-parallel-agents` and `subagent-driven-development`.
@ -15,7 +15,23 @@ Assume they are a skilled developer, but know almost nothing about our toolset o
**Context:** This should be run in a dedicated worktree (created by brainstorming skill).
**Save plans to:** `docs/superpowers/plans/YYYY-MM-DD-<feature-name>.md`
- (User preferences for plan location override this default)
## Scope Check
If the spec covers multiple independent subsystems, it should have been broken into sub-project specs during brainstorming. If it wasn't, suggest breaking this into separate plans — one per subsystem. Each plan should produce working, testable software on its own.
## File Structure
Before defining tasks, map out which files will be created or modified and what each one is responsible for. This is where decomposition decisions get locked in.
- Design units with clear boundaries and well-defined interfaces. Each file should have one clear responsibility.
- You reason best about code you can hold in context at once, and your edits are more reliable when files are focused. Prefer smaller, focused files over large ones that do too much.
- Files that change together should live together. Split by responsibility, not by technical layer.
- In existing codebases, follow established patterns. If the codebase uses large files, don't unilaterally restructure - but if a file you're modifying has grown unwieldy, including a split in the plan is reasonable.
This structure informs the task decomposition. Each task should produce self-contained changes that make sense independently.
## Bite-Sized Task Granularity
@ -33,7 +49,7 @@ Assume they are a skilled developer, but know almost nothing about our toolset o
```markdown
# [Feature Name] Implementation Plan
> **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** [One sentence describing what this builds]
@ -46,7 +62,7 @@ Assume they are a skilled developer, but know almost nothing about our toolset o
## Task Structure
````markdown
### Task N: [Component Name]

**Files:**
@ -54,7 +70,7 @@ Assume they are a skilled developer, but know almost nothing about our toolset o
- Modify: `exact/path/to/existing.py:123-145`
- Test: `tests/exact/path/to/test.py`
- [ ] **Step 1: Write the failing test**
```python
def test_specific_behavior():
@ -62,30 +78,30 @@ def test_specific_behavior():
    assert result == expected
```
- [ ] **Step 2: Run test to verify it fails**
Run: `pytest tests/path/test.py::test_name -v`
Expected: FAIL with "function not defined"
- [ ] **Step 3: Write minimal implementation**
```python
def function(input):
    return expected
```
- [ ] **Step 4: Run test to verify it passes**
Run: `pytest tests/path/test.py::test_name -v`
Expected: PASS
- [ ] **Step 5: Commit**
```bash
git add tests/path/test.py src/path/file.py
git commit -m "feat: add specific feature"
```
````
## Remember

- Exact file paths always
@ -94,23 +110,38 @@ git commit -m "feat: add specific feature"
- Reference relevant skills with @ syntax
- DRY, YAGNI, TDD, frequent commits
## Plan Review Loop
After completing each chunk of the plan:
1. Dispatch plan-document-reviewer subagent (see plan-document-reviewer-prompt.md) for the current chunk
- Provide: chunk content, path to spec document
2. If ❌ Issues Found:
- Fix the issues in the chunk
- Re-dispatch reviewer for that chunk
- Repeat until ✅ Approved
3. If ✅ Approved: proceed to next chunk (or execution handoff if last chunk)
**Chunk boundaries:** Use `## Chunk N: <name>` headings to delimit chunks. Each chunk should be ≤1000 lines and logically self-contained.
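A minimal sketch of enforcing the chunk convention above (the heading regex and the `chunk_sizes` helper are assumptions for illustration, not part of the skill):

```python
import re

# Count body lines per "## Chunk N: <name>" heading; anything over the
# 1000-line budget should be split further.
def chunk_sizes(plan_text: str) -> dict[str, int]:
    sizes: dict[str, int] = {}
    current = None
    for line in plan_text.splitlines():
        match = re.match(r"## Chunk \d+: (.+)", line)
        if match:
            current = match.group(1)
            sizes[current] = 0
        elif current is not None:
            sizes[current] += 1
    return sizes

plan = "## Chunk 1: setup\na\nb\n## Chunk 2: api\nc\n"
print(chunk_sizes(plan))                                      # {'setup': 2, 'api': 1}
print([n for n, s in chunk_sizes(plan).items() if s > 1000])  # []
```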
**Review loop guidance:**
- Same agent that wrote the plan fixes it (preserves context)
- If loop exceeds 5 iterations, surface to human for guidance
- Reviewers are advisory - explain disagreements if you believe feedback is incorrect
## Execution Handoff

-After saving the plan, offer execution choice:
-**"Plan complete and saved to `docs/plans/<filename>.md`. Two execution options:**
-**1. Subagent-Driven (this session)** - I dispatch fresh subagent per task, review between tasks, fast iteration
-**2. Parallel Session (separate)** - Open new session with executing-plans, batch execution with checkpoints
-**Which approach?"**
-**If Subagent-Driven chosen:**
-- **REQUIRED SUB-SKILL:** Use superpowers:subagent-driven-development
-- Stay in this session
-- Fresh subagent per task + code review
-**If Parallel Session chosen:**
-- Guide them to open new session in worktree
-- **REQUIRED SUB-SKILL:** New session uses superpowers:executing-plans
+After saving the plan:
+**"Plan complete and saved to `docs/superpowers/plans/<filename>.md`. Ready to execute?"**
+**Execution path depends on harness capabilities:**
+**If harness has subagents (Claude Code, etc.):**
+- **REQUIRED:** Use superpowers:subagent-driven-development
+- Do NOT offer a choice - subagent-driven is the standard approach
+- Fresh subagent per task + two-stage review
+**If harness does NOT have subagents:**
+- Execute plan in current session using superpowers:executing-plans
+- Batch execution with checkpoints for review
@ -0,0 +1,52 @@
# Plan Document Reviewer Prompt Template
Use this template when dispatching a plan document reviewer subagent.
**Purpose:** Verify the plan chunk is complete, matches the spec, and has proper task decomposition.
**Dispatch after:** Each plan chunk is written
```
Task tool (general-purpose):
description: "Review plan chunk N"
prompt: |
You are a plan document reviewer. Verify this plan chunk is complete and ready for implementation.
**Plan chunk to review:** [PLAN_FILE_PATH] - Chunk N only
**Spec for reference:** [SPEC_FILE_PATH]
## What to Check
| Category | What to Look For |
|----------|------------------|
| Completeness | TODOs, placeholders, incomplete tasks, missing steps |
| Spec Alignment | Chunk covers relevant spec requirements, no scope creep |
| Task Decomposition | Tasks atomic, clear boundaries, steps actionable |
| File Structure | Files have clear single responsibilities, split by responsibility not layer |
| File Size | Would any new or modified file likely grow large enough to be hard to reason about as a whole? |
| Task Syntax | Checkbox syntax (`- [ ]`) on steps for tracking |
| Chunk Size | Each chunk under 1000 lines |
## CRITICAL
Look especially hard for:
- Any TODO markers or placeholder text
- Steps that say "similar to X" without actual content
- Incomplete task definitions
- Missing verification steps or expected outputs
- Files planned to hold multiple responsibilities or likely to grow unwieldy
## Output Format
## Plan Review - Chunk N
**Status:** Approved | Issues Found
**Issues (if any):**
- [Task X, Step Y]: [specific issue] - [why it matters]
**Recommendations (advisory):**
- [suggestions that don't block approval]
```
**Reviewer returns:** Status, Issues (if any), Recommendations
@ -9,7 +9,7 @@ description: Use when creating new skills, editing existing skills, or verifying
**Writing skills IS Test-Driven Development applied to process documentation.**

**Personal skills live in agent-specific directories (`~/.claude/skills` for Claude Code, `~/.agents/skills/` for Codex)**

You write test cases (pressure scenarios with subagents), watch them fail (baseline behavior), write the skill (documentation), watch tests pass (agents comply), and refactor (close loopholes).
codex/skills/writing-skills/render-graphs.js Normal file → Executable file
repo-issues.md Normal file
@ -0,0 +1,326 @@
# Repository Issue List

This document catalogs confirmed issues in the current repository, focusing on reproducible symptoms, root causes, scope of impact, and suggested fix directions.

## Scope

- Repository: `playbook`
- Analysis date: 2026-03-09
- Analysis environment: Windows + PowerShell + Python 3.12

## Issue Overview

| ID | Severity | Topic | Main Impact |
| --- | -------- | ---------------------------------------------- | --------------------------------------- |
| 1 | High | TOML config parsing breaks on Windows | Most CLI actions cannot run |
| 2 | Medium | Third-party content in `SKILLS.md` maintained in duplicate | Docs drift easily; test contract is wrong |
| 3 | Medium | `.agents/index.md` not refreshed when language set changes | Generated artifacts become internally inconsistent |
| 4 | Low | Local verification docs assume a POSIX shell | Platform prerequisites unclear; misleads users on new environments |
| 5 | Low | `load_config()` entry-point test was missing (now fixed) | No need to add a Windows runner for now |
| 6 | Low | Python cache ignore rules added (now mitigated) | Workspace no longer dirtied by test-generated pycache |
## Issue 1: TOML Config Parsing Breaks on Windows

### Location
- `scripts/playbook.py`
- Key entry points:
  - `load_config()`
  - `main()`

### Symptom
On the current Windows environment, when `playbook.py` parses the config file with `tomllib.loads()`, backslash paths inside double-quoted strings cause an immediate exception.

Typical error:

```text
tomllib.TOMLDecodeError: Invalid hex value (at line 3, column 21)
```

### Root Cause
The current logic:

1. Whenever `tomllib` is available, call `tomllib.loads(raw)` directly.
2. Windows temp directory paths typically look like `D:\...\tmp`.
3. When these paths are written verbatim into double-quoted TOML strings, TOML treats the backslashes as escape prefixes.
4. Parsing therefore fails before any action runs.

The repository does implement `loads_toml_minimal()`, but it is only used when `tomllib` is absent; there is no fallback when `tomllib` exists but parsing fails.

### Impact
Blocks the following actions:
- `vendor`
- `sync_memory_bank`
- `sync_rules`
- `sync_prompts`
- `sync_standards`
- `install_skills`
- `format_md`

### Evidence
- `scripts/playbook.py`
- `tests/cli/test_playbook_cli.py`
- `tests/test_format_md_action.py`
- `tests/test_gitattributes_modes.py`
- `tests/test_no_backup_flags.py`
- `tests/test_sync_directory_actions.py`
- `tests/test_sync_templates_placeholders.py`
- `tests/test_vendor_snapshot_templates.py`

### Suggested Fix
- Fall back to `loads_toml_minimal()` on `tomllib.TOMLDecodeError`.
- Or require Windows paths in TOML to use single quotes or doubled backslashes.
- Ideally also add Windows-path tests against `load_config()` itself, not just the fallback parser.
## Issue 2: Third-party Content in `SKILLS.md` Maintained in Duplicate

### Location
- `SKILLS.md`
- `codex/skills/.sources/superpowers.list`
- `.gitea/ci/sync_superpowers.sh`
- `tests/test_superpowers_list_sync.py`

### Symptom
The Third-party Skills (superpowers) section of `SKILLS.md` carries two responsibilities at once:

1. Declaring that third-party skills come from `codex/skills/.sources/superpowers.list`.
2. Embedding a second copy of the third-party skills list inside the document.

This design creates two sources of information that must be kept in sync; whenever the sync script is not run correctly or its output is not committed, the document and the source list drift apart, and the tests fail with them.

### Root Cause
The single source of truth for third-party skills should be `codex/skills/.sources/superpowers.list`, yet the repository also required `.gitea/ci/sync_superpowers.sh` to write that list back into `SKILLS.md`. The document played both "routing page" and "list copy" roles, creating duplicate maintenance.

### Impact
- `SKILLS.md` easily falls out of sync with the real source list.
- Consistency tests fail around the wrong contract.
- Vendored snapshots carry a redundant, easily stale copy of the third-party list.

### Evidence
- `SKILLS.md`
- `codex/skills/.sources/superpowers.list`
- `tests/test_superpowers_list_sync.py`
- `.gitea/ci/sync_superpowers.sh`

### Suggested Fix
- Keep `SKILLS.md`, but demote its Third-party Skills (superpowers) section to a routing page.
- Keep only the source pointer in `SKILLS.md`: `codex/skills/.sources/superpowers.list`.
- Stop having the sync script write the third-party skills list back into `SKILLS.md`.
- Change the test contract to verify "route-only" instead of checking that the embedded list in `SKILLS.md` exactly matches the source list.
## Issue 3: `.agents/index.md` Not Refreshed When the Language Set Changes

### Location
- `scripts/playbook.py`
- Key functions:
  - `sync_standards_action()`
  - `create_agents_index()`

### Symptom
The first `sync_standards` run creates `.agents/index.md`, but subsequent runs do not refresh it when the synced language set changes.

Reproduced scenario:

1. First sync: `langs = ["tsl"]`
2. Second sync: `langs = ["tsl", "cpp"]`

Result:
- `AGENTS.md` is updated to list both TSL and C++.
- `.agents/index.md` still contains only the TSL entry from the first run.

### Root Cause
`create_agents_index()` returns immediately whenever `.agents/index.md` already exists, without rewriting the file or updating any block.

### Impact
- Artifacts produced by a single sync run are internally inconsistent.
- `AGENTS.md` and `.agents/index.md` drift apart over time.
- Later agents or human readers may wrongly conclude that only single-language rules are in effect.

### Evidence
- `scripts/playbook.py`
- Manual reproduction: two consecutive `sync_standards` runs

### Suggested Fix
- Make `.agents/index.md` idempotently regenerated.
- Or give it a block-update mechanism like the one used for `AGENTS.md`.
- Ideally add a regression test covering the "second sync adds a language" scenario.
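A minimal sketch of the idempotent-regeneration option (the function shape is simplified from `create_agents_index()` in `scripts/playbook.py` and is illustrative, not the actual implementation):

```python
import tempfile
from pathlib import Path

# Sketch of the idempotent-rewrite fix: always regenerate index.md from the
# current language set instead of returning early when the file exists.
def create_agents_index(agents_root: Path, langs: list[str]) -> None:
    agents_root.mkdir(parents=True, exist_ok=True)
    lines = ["# .agents", ""] + [f"- `.agents/{lang}/index.md`" for lang in langs]
    # No early return on existence: the file always reflects `langs`.
    (agents_root / "index.md").write_text("\n".join(lines) + "\n", encoding="utf-8")

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / ".agents"
    create_agents_index(root, ["tsl"])
    create_agents_index(root, ["tsl", "cpp"])  # second sync adds a language
    content = (root / "index.md").read_text(encoding="utf-8")

print("`.agents/cpp/index.md`" in content)  # True
```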
## Issue 4: Local Verification Docs Assume a POSIX Shell Without Stating the Prerequisite

### Location
- `CONTRIBUTING.md`
- `tests/README.md`
- `tests/templates/*.sh`
- `tests/integration/check_doc_links.sh`

### Symptom
The repository docs assume several `sh ...` commands can be run directly, for example:

```text
sh tests/integration/check_doc_links.sh
```

On a Windows environment without Git Bash / WSL / Git for Windows, such commands fail outright, for example:

```text
sh: The term 'sh' is not recognized
```

In the current analysis environment `sh` is actually available (via Git for Windows), so the error above did not reproduce locally. The core problem is not that "the repository cannot run on Windows" but that the docs never state these checks assume a POSIX shell.

### Root Cause
Template validation and documentation checks are implemented as POSIX shell scripts, and the docs never declare that running them requires an `sh` environment (e.g. Git Bash / WSL / Git for Windows). The repository also provides no PowerShell or Python alternative entry points.

### Impact
- Users on new environments may assume any Windows PowerShell session can run these local checks directly.
- The local development experience differs from CI (Linux) in its environment prerequisites, but the docs never flag this.
- This is a documentation gap, not a blocking failure in the current deployment or CI pipeline.

### Evidence
- `tests/README.md`
- `CONTRIBUTING.md`
- `tests/templates/validate_python_templates.sh`
- `tests/templates/validate_cpp_templates.sh`
- `tests/templates/validate_ci_templates.sh`
- `tests/templates/validate_project_templates.sh`
- `tests/integration/check_doc_links.sh`

### Suggested Fix
- State explicitly in `CONTRIBUTING.md` and `tests/README.md` that these local checks require an `sh` environment.
- Recommend Git Bash / WSL / Git for Windows, and note that Linux CI is the source of truth.
- Only consider adding PowerShell or Python alternatives if the repository explicitly commits to supporting Windows-native local development.
## Issue 5: `load_config()` Entry-Point Test Was Missing (Now Fixed)

### Location
- `.gitea/workflows/test.yml`
- `scripts/playbook.py`
- `tests/test_toml_edge_cases.py`

### Symptom
The original framing of this issue — "CI only runs Ubuntu, so Windows regressions go undetected indefinitely" — no longer fully holds.

The repository still has only an `ubuntu-22.04` runner, but it now includes regression tests against the main entry point `load_config()`, with test inputs that use Windows-style paths directly. Because the failure point here is TOML string parsing rather than Windows system calls, these tests guard against regressions just as effectively on Linux.

### Root Cause
The real problem was never the lack of a Windows runner per se, but that:
- Tests covered only the fallback parser `loads_toml_minimal()`.
- They never covered the real entry point `load_config()`.
- This created a false impression that the main path was protected.

That gap is now closed by the `load_config()` regression tests in `tests/test_toml_edge_cases.py`.

### Impact
- The historical risk of missed regressions has dropped significantly.
- Within the current boundary — deployment must be cross-platform, while tests and workflows default to Linux — it is no longer a standalone blocker.
- A Windows runner only needs reconsidering if the deployment pipeline later gains logic that truly depends on Windows OS behavior.

### Evidence
- `.gitea/workflows/test.yml`
- `scripts/playbook.py`
- `tests/test_toml_edge_cases.py`
- `python -m unittest tests.test_toml_edge_cases -v`

### Recommendation
- Keep the Linux-only CI as-is.
- Keep the Windows-path regression tests for `load_config()`.
- Stop treating "add a Windows runner" as the default fix for this issue; introduce one only if genuine OS-level differences appear.
## Issue 6: Python Cache Ignore Rules Added (Now Mitigated)

### Location
- `.gitignore`

### Symptom
This issue as originally described also no longer holds. `.gitignore` now explicitly ignores the Python cache directories this repository produces:
- `scripts/__pycache__`
- `tests/__pycache__`
- `tests/cli/__pycache__`

`git check-ignore -v` confirms that all of these paths are currently ignored.

### Root Cause
The historical problem really was missing cache ignore entries; the repository has since added targeted rules. Since Python files currently live only under `scripts/`, `tests/`, and `tests/cli/`, these rules cover the actual artifact paths.

### Impact
- The workspace is no longer dirtied by test-generated `__pycache__` directories.
- What remains is a stylistic concern — the rules are targeted rather than generic — not a confirmed issue.

### Evidence
- `.gitignore`
- `git check-ignore -v scripts/__pycache__/x.pyc tests/__pycache__/x.pyc tests/cli/__pycache__/x.pyc`
- `rg --files -g "*.py"`

### Recommendation
- Keep the existing rules as-is.
- If new Python directories appear later, consider generalizing to `__pycache__/`, `*.pyc`, `*.pyo`.
## Verification Performed

Executed:

```text
python -m unittest discover -s tests/cli -v
python -m unittest discover -s tests -p "test_*.py" -v
```

Result summary:
- `tests/cli`: 3 passed, 6 failed
- `tests`: 13 passed, 14 failed

Could not be completed in the current environment:

```text
sh tests/integration/check_doc_links.sh
```

Reason:
- The current Windows environment lacks `sh`.

## Suggested Fix Priority

1. Fix Issue 1 first; otherwise most CLI regression tests cannot verify anything meaningful.
2. Then fix Issues 2 and 3, which directly affect the correctness of generated output.
3. Then handle Issues 4 and 5: fill in platform notes and test coverage.
4. Handle Issue 6 last; it is low-risk but high-frequency noise.
@ -21,6 +21,7 @@ ORDER = [
]

SCRIPT_DIR = Path(__file__).resolve().parent
PLAYBOOK_ROOT = SCRIPT_DIR.parent
PATH_CONFIG_KEYS = {"project_root", "target_dir", "agents_home", "codex_home"}
def usage() -> str:
@ -141,8 +142,63 @@ def loads_toml_minimal(raw: str) -> dict:
    return data
def normalize_path_config_strings(raw: str) -> str:
    normalized_lines: list[str] = []
    for line in raw.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in line:
            normalized_lines.append(line)
            continue
        key_part, value_part = line.split("=", 1)
        key = key_part.strip()
        if key not in PATH_CONFIG_KEYS:
            normalized_lines.append(line)
            continue
        value = strip_inline_comment(value_part.strip())
        if len(value) < 2 or value[0] != '"' or value[-1] != '"' or "\\" not in value[1:-1]:
            normalized_lines.append(line)
            continue
        inner = value[1:-1]
        has_lone_backslash = False
        probe_idx = 0
        while probe_idx < len(inner):
            if inner[probe_idx] != "\\":
                probe_idx += 1
                continue
            if probe_idx + 1 < len(inner) and inner[probe_idx + 1] == "\\":
                probe_idx += 2
                continue
            has_lone_backslash = True
            break
        if not has_lone_backslash:
            normalized_lines.append(line)
            continue
        escaped: list[str] = []
        idx = 0
        while idx < len(inner):
            ch = inner[idx]
            if ch != "\\":
                escaped.append(ch)
                idx += 1
                continue
            if idx + 1 < len(inner) and inner[idx + 1] == "\\":
                escaped.extend(["\\", "\\"])
                idx += 2
                continue
            escaped.extend(["\\", "\\"])
            idx += 1
        normalized_lines.append(f'{key_part}= "{"".join(escaped)}"')
    suffix = "\n" if raw.endswith("\n") else ""
    return "\n".join(normalized_lines) + suffix
def load_config(path: Path) -> dict:
    raw = normalize_path_config_strings(path.read_text(encoding="utf-8"))
    if tomllib is not None:
        return tomllib.loads(raw)
    return loads_toml_minimal(raw)
@ -830,8 +886,6 @@ def update_agents_block(agents_md: Path, block_lines: list[str]) -> None:
def create_agents_index(agents_root: Path, langs: list[str], docs_prefix: str | None) -> None:
    agents_index = agents_root / "index.md"
-    if agents_index.exists():
-        return
    lines = [
        "# .agents多语言",
        "",
@ -859,7 +913,7 @@ def create_agents_index(agents_root: Path, langs: list[str], docs_prefix: str |
        f"- {docs_prefix or 'docs/standards/playbook/docs/'}",
    ]
    agents_index.write_text("\n".join(lines) + "\n", encoding="utf-8")
    log("Synced .agents/index.md")


def rewrite_agents_docs_links(agents_dir: Path, docs_prefix: str) -> None:
@ -101,6 +101,46 @@ langs = ["tsl"]
            self.assertEqual(result.returncode, 0)
            self.assertTrue(agents_index.is_file())
    def test_sync_standards_updates_agents_index_when_langs_expand(self):
        with tempfile.TemporaryDirectory() as tmp_dir:
            root = Path(tmp_dir)
            first_config = root / "playbook-first.toml"
            first_config.write_text(
                f"""
[playbook]
project_root = "{tmp_dir}"
[sync_standards]
langs = ["tsl"]
no_backup = true
""",
                encoding="utf-8",
            )
            first_result = run_cli("-config", str(first_config))
            self.assertEqual(first_result.returncode, 0)

            second_config = root / "playbook-second.toml"
            second_config.write_text(
                f"""
[playbook]
project_root = "{tmp_dir}"
[sync_standards]
langs = ["tsl", "cpp"]
no_backup = true
""",
                encoding="utf-8",
            )
            second_result = run_cli("-config", str(second_config))
            self.assertEqual(second_result.returncode, 0)

            agents_index = (root / ".agents" / "index.md").read_text(encoding="utf-8")
            self.assertIn("`.agents/tsl/index.md`", agents_index)
            self.assertIn("`.agents/cpp/index.md`", agents_index)
    def test_sync_standards_agents_block_has_blank_lines(self):
        with tempfile.TemporaryDirectory() as tmp_dir:
            config_body = f"""
@ -4,6 +4,7 @@ from pathlib import Path
ROOT = Path(__file__).resolve().parents[1]
SKILLS_MD = ROOT / "SKILLS.md"
SOURCES_LIST = ROOT / "codex" / "skills" / ".sources" / "superpowers.list"
SOURCE_REF = "来源:`codex/skills/.sources/superpowers.list`(第三方来源清单)。"
def read_sources_list() -> list[str]:
@ -14,28 +15,20 @@ def read_sources_list() -> list[str]:
    ]
-def read_skills_md_list() -> list[str]:
-    lines = SKILLS_MD.read_text(encoding="utf-8").splitlines()
-    start = "<!-- superpowers:skills:start -->"
-    end = "<!-- superpowers:skills:end -->"
-    try:
-        start_idx = lines.index(start) + 1
-        end_idx = lines.index(end)
-    except ValueError as exc:
-        raise AssertionError("superpowers markers missing in SKILLS.md") from exc
-    items = []
-    for line in lines[start_idx:end_idx]:
-        stripped = line.strip()
-        if not stripped.startswith("-"):
-            continue
-        items.append(stripped.lstrip("- ").strip())
-    return items
+def read_skills_md() -> str:
+    return SKILLS_MD.read_text(encoding="utf-8")
class SuperpowersListSyncTests(unittest.TestCase):
-    def test_superpowers_list_matches_skills_md(self):
-        self.assertEqual(read_sources_list(), read_skills_md_list())
+    def test_superpowers_section_routes_to_source_list(self):
+        self.assertTrue(read_sources_list())
+        text = read_skills_md()
+        self.assertIn(SOURCE_REF, text)
+        self.assertEqual(text.count("Third-party Skills (superpowers)"), 1)
+        self.assertNotIn("<!-- superpowers:skills:start -->", text)
+        self.assertNotIn("<!-- superpowers:skills:end -->", text)
if __name__ == "__main__":
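With this change the test no longer diffs SKILLS.md against the generated name list; it only checks that the superpowers section is route-only. The resulting SKILLS.md section (not shown in this diff; a sketch of the shape the assertions imply) would reduce to a single heading plus the source pointer, with no marker comments and no inline skill names:

```markdown
### Third-party Skills (superpowers)

来源:`codex/skills/.sources/superpowers.list`(第三方来源清单)。
```

The string on the second line is the exact literal the test binds to `SOURCE_REF` ("Source: `codex/skills/.sources/superpowers.list` (third-party source list)."), so the sync script and SKILLS.md must agree on it byte-for-byte.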

View File

@@ -1,4 +1,6 @@
import tempfile
import unittest
from pathlib import Path
from scripts import playbook
@@ -18,6 +20,44 @@ key = 1
        with self.assertRaises(ValueError):
            playbook.loads_toml_minimal(raw)
    def test_load_config_preserves_windows_users_path_in_basic_string(self):
        with tempfile.TemporaryDirectory() as tmp_dir:
            config_path = Path(tmp_dir) / "playbook.toml"
            config_path.write_text(
                '[playbook]\nproject_root = "C:\\Users\\demo\\workspace"\n',
                encoding="utf-8",
            )
            data = playbook.load_config(config_path)
            self.assertEqual(data["playbook"]["project_root"], r"C:\Users\demo\workspace")

    def test_load_config_preserves_windows_escape_like_segments_for_path_keys(self):
        with tempfile.TemporaryDirectory() as tmp_dir:
            config_path = Path(tmp_dir) / "playbook.toml"
            config_path.write_text(
                '[playbook]\nproject_root = "C:\\tmp\\notes"\n\n'
                '[install_skills]\nagents_home = "D:\\new\\tab"\n',
                encoding="utf-8",
            )
            data = playbook.load_config(config_path)
            self.assertEqual(data["playbook"]["project_root"], r"C:\tmp\notes")
            self.assertEqual(data["install_skills"]["agents_home"], r"D:\new\tab")

    def test_load_config_keeps_already_escaped_windows_path(self):
        with tempfile.TemporaryDirectory() as tmp_dir:
            config_path = Path(tmp_dir) / "playbook.toml"
            config_path.write_text(
                '[playbook]\nproject_root = "C:\\\\Users\\\\demo\\\\workspace"\n',
                encoding="utf-8",
            )
            data = playbook.load_config(config_path)
            self.assertEqual(data["playbook"]["project_root"], r"C:\Users\demo\workspace")
if __name__ == "__main__":
    unittest.main()
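Taken together, the three new tests pin down how `playbook.load_config` treats Windows path-like TOML basic strings: escape-like sequences such as `\U`, `\t`, or `\n` inside a drive-letter path are kept verbatim rather than decoded (or rejected, as strict TOML would do), while a properly escaped path with doubled backslashes collapses to single ones. The real normalization in `scripts/playbook.py` is not shown in this diff; a minimal sketch of just that value-level behavior, under those assumptions, could be:

```python
import re

# Drive-letter prefix like "C:\" or "C:/" marks a value as Windows-path-like.
_WIN_PATH = re.compile(r"^[A-Za-z]:[\\/]")


def normalize_path_value(raw: str) -> str:
    """Hypothetical sketch of the behavior the tests above pin down.

    `raw` is the text between the quotes of a TOML basic string. This is not
    the real playbook implementation, only an illustration of its contract.
    """
    if _WIN_PATH.match(raw):
        if "\\\\" in raw:
            # Properly escaped TOML path: collapse \\ back to a single \.
            return raw.replace("\\\\", "\\")
        # Escape-like but path-shaped ("C:\tmp", "D:\new"): keep verbatim,
        # instead of decoding \t or \n as TOML escapes.
        return raw
    return raw
```

The design choice the tests encode is to special-case only path-shaped values, so ordinary strings still go through normal escape handling and a hand-written Windows path in a config file round-trips unchanged.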