--- name: competitor-analysis description: "Research competitors with Browserbase discovery, enrichment lanes, screenshots, matrices, and HTML reports." license: MIT compatibility: Requires the browse CLI (npm install -g browse) and BROWSERBASE_API_KEY env var allowed-tools: Bash Agent AskUserQuestion metadata: author: browserbase version: "0.2.0" category: "marketing" risk: "safe" source: "official" source_repo: "browserbase/skills" source_type: "official" date_added: "2026-06-19" author: "Browserbase" license_source: "https://github.com/browserbase/skills/blob/main/skills/competitor-analysis/LICENSE.txt" tags: - competitor-analysis - browserbase - market-research - browser-automation tools: - claude-code - codex-cli - cursor --- # Competitor Analysis ## When to Use Use when the user needs structured competitor research with Browserbase discovery, enrichment lanes, screenshots, comparison matrices, and a final HTML report. _Source: [browserbase/skills](https://github.com/browserbase/skills) (MIT)._ Analyze a user's competitors. Uses Browserbase Search API for discovery and a 4-lane Plan→Research→Synthesize pattern for enrichment — outputting an HTML report with overview, per-competitor deep dives, a side-by-side feature/pricing matrix, and a chronological mentions feed. **Required**: `BROWSERBASE_API_KEY` env var and the `browse` CLI installed (`npm install -g browse`). **First-run setup**: On the first run you'll be prompted to approve `browse cloud fetch`, `browse cloud search`, `cat`, `mkdir`, `sed`, etc. Select **"Yes, and don't ask again for: browse cloud fetch:\*"** (or equivalent) for each. To permanently approve, add these to your `~/.claude/settings.json` under `permissions.allow`: ```json "Bash(browse:*)", "Bash(bunx:*)", "Bash(bun:*)", "Bash(node:*)", "Bash(cat:*)", "Bash(mkdir:*)", "Bash(sed:*)", "Bash(head:*)", "Bash(tr:*)", "Bash(rm:*)" ``` **Path rules**: Always use full literal paths in Bash — NOT `~` or `$HOME`. Resolve the home directory once and use it everywhere. When building subagent prompts, replace `{SKILL_DIR}` with the full literal path. **Output directory**: All output goes to `~/Desktop/{company_slug}_competitors_{YYYY-MM-DD}/`. This directory contains one `.md` file per competitor plus the generated HTML views and CSV. **CRITICAL — Tool restrictions (applies to main agent AND all subagents)**: - All web searches: use `browse cloud search`. NEVER WebSearch. - All page fetches: use `browse cloud fetch --allow-redirects` (returns markdown by default; add `--format raw` if you need the original HTML, then pipe through `sed ... | tr -s ' \n'` to extract text). NEVER WebFetch. 1 MB response limit — fall back to `browse get markdown` (after `browse open --remote`) for JS-heavy pages. - All research output: subagents write **one markdown file per competitor** to `{OUTPUT_DIR}/{competitor-slug}.md` using bash heredoc. NEVER use the Write tool or `python3 -c`. See `references/example-research.md` for the file format. - Report compilation: use `node {SKILL_DIR}/scripts/compile_report.mjs {OUTPUT_DIR} --user-company "{user_company}" --open` — generates `index.html`, `competitors/*.html`, `matrix.html`, `mentions.html`, `results.csv` in one step and opens overview. - URL deduplication: `node {SKILL_DIR}/scripts/list_urls.mjs /tmp --prefix competitor`. - **Subagents must use ONLY the Bash tool.** - **Main agent NEVER reads raw discovery JSON batch files.** **CRITICAL — Minimize permission prompts**: - Subagents MUST batch ALL file writes into a SINGLE Bash call using chained heredocs. - Batch ALL searches and ALL fetches into single Bash calls via `&&` chaining. ## Pipeline Overview Follow these 8 steps in order. Do not skip or reorder. 1. **User Company Research** — Deeply understand the user's company, produce `precise_category` + `category_include_keywords` + `exclusion_list` 2. **Depth Mode + Seed Input** — Choose depth, accept optional seed competitor URLs 3. **Discovery (3 parallel waves)** — Wave A (alternatives), Wave B (precise category), Wave C (comparison-page graph via "X vs Y" title parsing) 4. **Gate** — `scripts/gate_candidates.mjs` fetches each candidate's hero text (via `browse cloud fetch`) and drops wrong-category URLs 5. **Confirm enrichment set with the user** — Present PASS / UNKNOWN / rejected-brand-matches via `AskUserQuestion`. User ticks the real ones, adds any the discovery missed. Skipping this step is wasteful because enrichment is expensive (25 subagents × depth budget) and the gate is imperfect (JS-heavy homepages, Cloudflare challenges, semantic-variant taglines) 6. **Deep Enrichment (5 subagents per competitor in deep/deeper modes)** — Marketing, Discussion, Social, News, Technical — each lane a separate subagent writing to `partials/`; then `merge_partials.mjs` consolidates. In deep/deeper modes, **Step 5d** adds a 6th Battle Card synthesis lane AFTER Step 5c fact-check completes — produces per-competitor Landmines / Objection Handlers / Talk Tracks grounded in cited evidence. 7. **Screenshots** — `capture_screenshots.mjs` via the `browse` CLI captures a 1280×800 homepage hero per competitor 8. **HTML Report** — Overview + per-competitor (with embedded hero screenshot + Battle Card card) + matrix + mentions views --- ## Step 0: Setup Output Directory ```bash OUTPUT_DIR=~/Desktop/{company_slug}_competitors_{YYYY-MM-DD} mkdir -p "$OUTPUT_DIR" ``` Replace `{company_slug}` with the user's company name (lowercase, hyphenated) and `{YYYY-MM-DD}` with today's date. Pass `{OUTPUT_DIR}` as a full literal path to every subagent. Clean up discovery batch files from prior runs: ```bash rm -f /tmp/competitor_discovery_batch_*.json ``` **Re-runs must start from a clean `$OUTPUT_DIR`.** `compile_report.mjs` ingests *every* `{slug}.md` in the directory, and `merge_partials.mjs` only overwrites the slugs in the current set — it never deletes ones dropped from a new enrichment set. Since the directory is keyed by date, a same-day re-run with a different competitor set would leave stale competitors in the overview, matrix, CSV, and screenshots. Either use a fresh directory or clear the prior per-competitor files first: ```bash rm -f "$OUTPUT_DIR"/*.md && rm -rf "$OUTPUT_DIR"/partials "$OUTPUT_DIR"/screenshots ``` ## Step 1: User Company Research This step sets the baseline for what "competitor" means AND produces the verified data the Step 5b matrix will use for the `userCompany` row. **Rule**: The user's company gets the same 5-lane research depth as competitors. Do NOT fill `userCompany` in matrix.json from memory — it will ship false claims to the user's own team. On a search-API run (user company Exa, 2026-04-23), skipping this step produced a matrix that claimed Exa had a "published uptime SLA" (there is no numeric public SLA — only a status page) and marked its MIT-licensed Python SDK as `open-source: false` (the repo is github.com/exa-labs/exa-py, LICENSE confirmed MIT). Both errors would have surfaced in the "Where you're winning" card as fabricated moats. Process: 1. Ask the user for their company name or URL. 2. **Check for an existing profile** at `{SKILL_DIR}/profiles/{company-slug}.json`. If it exists, load it and confirm with the user: "I have your profile from {researched_at}. Still accurate?" — if yes, skip to Step 2 BUT still run the partial-lane enrichment below so matrix synthesis has fresh feature evidence. The profile format is shared with `company-research` (same shape). If a user already has a profile saved under `company-research/profiles/`, you may copy it into this skill's profiles directory rather than re-researching. 3. **Run the full 5-lane enrichment on the user's company** — identical to the competitor pattern in Step 5. For each lane, spawn a Bash-only subagent that writes to `{OUTPUT_DIR}/partials/{user-slug}.{lane}.md`: - **marketing** — tagline, positioning, pricing tiers, features, integrations, open-source components (SDK repos + licenses), regions offered, compliance (SOC 2 / HIPAA / trust portal URL) - **technical** — REST + streaming API support (with docs URLs), SDK languages, MCP server URL, neural vs keyword retrieval modes, reranking / highlights / live-crawl specifics, published uptime SLA (actual %, not status page), third-party retrieval-quality benchmarks - **discussion**, **social**, **news** — optional in quick mode, recommended in deep+ See `references/research-patterns.md` → "Self-Research" for sub-questions. Each finding MUST cite a URL. 4. Run `merge_partials.mjs` on the user's partials too — produces `{OUTPUT_DIR}/{user-slug}.md`, the canonical source Step 5b reads from for `userCompany` flags. 5. Synthesize into a profile: Company, Product, Existing Customers, Competitors (seed list), Use Cases, **precise_category**, **category_include_keywords**, **exclusion_list**. Do NOT include ICP — this skill doesn't need it. - `precise_category`: one sentence describing the category. e.g., "AI web search API for agents with neural + keyword retrieval". Avoid vague words like "tools" / "platform". - `category_include_keywords`: 8-15 phrases a direct competitor's marketing would likely contain (hero or title). Include semantic variants. - `exclusion_list`: phrases that indicate a *different* category — used by the gate to reject false positives (e.g. `antidetect browser`, `scraping api`, `screenshot api`, `residential proxy`). See `references/research-patterns.md` → "Synthesis Output" for the exact format and Exa as a worked example. 6. Present the profile + the user-company `.md` to the user for confirmation. Do not proceed until confirmed. 7. **Save the confirmed profile** to `{SKILL_DIR}/profiles/{company-slug}.json`. ## Step 2: Depth Mode + Seed Input Ask clarifying questions via `AskUserQuestion` with checkboxes: - **Known competitors?** Text area for URLs/names (optional — discovery will find more). - **Depth mode?** - `quick` — marketing surface only, many competitors, ~2-3 tool calls each - `deep` — + external signal (mentions, reviews, news), ~5-8 tool calls each - `deeper` — + public benchmarks + strategic diff vs user's company, ~10-15 tool calls each - **Target count?** Rough number of competitors to research (e.g., 10 / 20 / 50). This is the ONLY user interaction. After this, execute silently until the report is ready. | Mode | Research per competitor | Best for | |------|--------------------------|----------| | `quick` | Lane 1 only (homepage + pricing) | Scanning ~30-50 competitors fast | | `deep` | Lanes 1+2 | ~15-25 competitors with external signal | | `deeper` | All 4 lanes (+ benchmarks + strategic diff) | ~5-15 competitors with full intel | ## Step 3: Discovery (3 parallel waves) **Formula**: `ceil(target_count / 20)` queries per wave. Over-discover ~3x because the gate drops ~40-60%. Evaluation on a search-API run shows all three waves are additive — skip any and you lose real competitors: **Wave A — Generic alternatives** (broad; heavy aggregator noise, filtered out later) - `"alternatives to {user_company}"` - `"{user_company} competitors"` **Wave B — Precise category** (uses `precise_category` from the profile) - `"{precise_category}"` verbatim - 2-3 queries composed from the most distinctive tokens (e.g. `"web search api for ai agents"`, `"retrieval API for LLMs"`) **Wave C — Comparison-page graph** (highest precision) - `"{user_company} vs"` - `"{seed1} vs"`, `"{seed2} vs"`, `"{seed3} vs"` (seeds from the profile's `competitors` list) - After the searches, run `scripts/extract_vs_names.mjs` to parse `"X vs Y"` patterns from result titles — this uniquely surfaces competitors that don't appear as URL hits. **Process**: 1. Issue **3 parallel `browse cloud search` Bash calls** (one per wave) in a SINGLE message — NOT subagents. Each Bash call chains its 2-4 queries with `&&`. See `references/workflow.md` → "Discovery — parallel Bash, not subagents" for the exact recipe. Subagents are too heavy for a workload of 6-12 `browse cloud search` calls. 2. After all waves complete: ```bash node {SKILL_DIR}/scripts/list_urls.mjs /tmp --prefix competitor > /tmp/competitor_urls.txt node {SKILL_DIR}/scripts/extract_vs_names.mjs /tmp --prefix competitor \ --seed "{user_company},{seed1},{seed2},{seed3}" \ > /tmp/competitor_vs_names.jsonl ``` 3. **Filter** `/tmp/competitor_urls.txt` — remove blog posts, news, AI-tool directories (seektool.ai, respan.ai, agentsindex.ai, toolradar.com, aitoolsatlas.ai, vibecodedthis.com, etc.), review aggregators (g2.com, capterra.com), databases (crunchbase.com, tracxn.com), user's own domain. See `references/workflow.md` for the full noise-domain list. 4. For `vs_names` entries that have a resolved `domain`, add them. For unresolved names, optionally run `browse cloud search "{name}" --num-results 3` and pick the top root domain. 5. Merge with user-provided seed URLs. Dedup by hostname → `/tmp/competitor_candidates.txt`. ## Step 4: Gate (category-fit filter) Drop candidates whose marketing identifies them as a *different* category before enrichment burns tool calls on them. ```bash cat /tmp/competitor_candidates.txt \ | node {SKILL_DIR}/scripts/gate_candidates.mjs \ --include "{profile.category_include_keywords joined with commas}" \ --exclude "{profile.exclusion_list joined with commas}" \ --concurrency 6 \ > /tmp/competitor_gated.jsonl grep '"status":"PASS"' /tmp/competitor_gated.jsonl \ | node -e 'require("fs").readFileSync(0,"utf-8").split("\n").filter(Boolean).forEach(l => { try { console.log(JSON.parse(l).url); } catch {} })' \ > /tmp/competitor_passed.txt ``` The gate fetches each candidate's homepage via `browse cloud fetch --allow-redirects --format raw`, extracts the first 800 chars of visible text, and classifies position-aware: exclude in `` → REJECT; include in `<title>` → PASS; hybrid title → hero200 tiebreak; otherwise fall through. **Evaluated on a search-API run** with 12 mixed candidates: 7/7 real competitors passed, 4/4 wrong-category rejected, 1 known-hybrid edge case rejected. ## Step 4.5: Confirm enrichment set with the user **This step is mandatory. Do NOT skip to enrichment just because the gate ran.** Enrichment is expensive: 5 competitors × 5 lane-subagents = 25 subagents, ~10-15 minutes of wall clock, ~300 `browse cloud` calls. Running it on the wrong set wastes all of that. The gate also has known blind spots: - **JS-heavy homepages** (e.g. Tavily, Firecrawl) — `browse cloud fetch` returns near-empty text, so keyword matching has nothing to match on → REJECT or UNKNOWN - **Cloudflare challenge pages** (e.g. Perplexity) — title becomes "Just a moment..." → no category signal - **Semantic variants** — "search foundation" / "retrieval backbone" don't lexically match a list centered on "search API" - **Domain ambiguity** — `brave.com` (the browser) vs `api-dashboard.search.brave.com` (the actual API product) can confuse classification The user almost always has domain knowledge the skill lacks. Ask them. **Process** — the main agent: 1. Read `/tmp/competitor_gated.jsonl` and group rows: - **PASS bucket**: everything with status=PASS. - **UNKNOWN bucket**: status=UNKNOWN (fetch failed — always surface, these are the silent misses). - **Rejected-brand bucket**: top ~10 REJECT rows whose title mentions a well-known brand pattern (e.g. contains the token from a user-supplied seed list, or appears frequently in the Wave C "X vs Y" graph). 2. Present the buckets to the user, one table per bucket, with URL + title + reason (for rejects). 3. Use `AskUserQuestion` with a checkbox list of all candidates across the three buckets, plus a free-text "add more" field. The prompt should be explicit: > "Here are the gate's picks plus a few it was unsure about. Tick the ones that are real competitors in your space, and paste any URLs I missed (comma-separated). Enrichment will run on ONLY the ticked set." 4. Write the confirmed set to `/tmp/competitor_enrichment_set.txt` (one URL per line). This is the input for Step 5 — not `/tmp/competitor_passed.txt`. **If the user doesn't respond** or explicitly says "just run it", fall back to `/tmp/competitor_passed.txt` as-is, but warn in chat that the run may waste budget on wrong-category hits. **Exa test, 2026-04-24**: gate auto-passed 22 of 101 candidates but missed Tavily (generic title), Jina AI (semantic mismatch — "search foundation"), Firecrawl (JS-heavy fetch failure), and Perplexity (Cloudflare challenge). All four are real direct competitors. This step catches them. ## Step 5: Deep Enrichment Two modes. See `references/workflow.md` for prompt templates and wave management. See `references/research-patterns.md` for the lane-by-lane methodology. ### Quick mode — single subagent per batch - Input: `/tmp/competitor_enrichment_set.txt` (user-confirmed set from Step 4.5), ~8 competitors per subagent. - One subagent runs Lane A only (marketing surface). 2-3 tool calls each. - Writes directly to `{OUTPUT_DIR}/{slug}.md`. ### Deep / Deeper mode — 5 subagents PER competitor (parallel lane fan-out) For each competitor, launch 5 parallel subagents, one per lane: - **A. Marketing** (`marketing`): pricing, features, positioning, integrations, customers, team, funding, HQ. Owns canonical frontmatter. - **B. Discussion** (`discussion`): Reddit, HN, forums, Dev.to, Hashnode. Broad queries beyond `site:` — also `"{competitor}" review 2026`, `"{competitor}" issues OR problems`, `"{competitor}" discussion`. - **C. Social** (`social`): LinkedIn posts, YouTube videos, Twitter/X. Snippets only — do NOT fetch. - **D. News & Comparisons** (`news`): TechCrunch, Verge, VentureBeat, Forbes, Businesswire, Substack, blog reviews. Every mention needs a date. - **E. Technical & Benchmarks** (`technical`): GitHub benchmark repos/PRs, performance posts. Writes Benchmarks + technical Findings. Budget per lane: deep = 5-8 tool calls, deeper = 10-15. **Launch ALL competitor × lane subagents in a SINGLE Agent tool message.** For 10 competitors × 5 lanes = 50 parallel Agent calls in one message. Do NOT split into batches per competitor or per lane — wall clock collapses to the slowest single agent (~3-5 min). Splitting into 5 rounds of 10 cost 25 minutes of wall clock vs 5 minutes parallel on a real measured run; do not do it. Each subagent writes a partial to `{OUTPUT_DIR}/partials/{slug}.{lane}.md`. **Critical**: Pass the user's company name, product, and key features verbatim into every subagent prompt so the technical lane can do strategic diffing. Pass the full literal `{OUTPUT_DIR}` path to every subagent. ### Merge partials → canonical per-competitor file After all subagents for all competitors complete: ```bash node {SKILL_DIR}/scripts/merge_partials.mjs {OUTPUT_DIR} ``` Unions the 5 partials per competitor into one `{OUTPUT_DIR}/{slug}.md` — dedup'd Mentions (sorted by date desc), dedup'd Benchmarks, merged Findings, canonical frontmatter from the marketing lane. ### Synthesize the comparison matrix (write `matrix.json`) **Subagents write `key_features` and `integrations` as prose**, not as pipe-separated atomic feature labels. So a naive `|`-split axis becomes one-blob-per-competitor with no overlap — the rendered matrix shows a useless diagonal. The main agent fixes this by synthesizing a **shared taxonomy** across competitors and writing `{OUTPUT_DIR}/matrix.json`. `compile_report.mjs` auto-detects this file and renders the matrix from it instead of from the pipe split. **Process** — main agent: 1. Read ALL `{slug}.md` files, INCLUDING the user's company file `{user-slug}.md` produced in Step 1. The user is competitor #0 for matrix purposes — treat with identical rigor. 2. Produce a canonical list of 12-20 *atomic* features — each must be a yes/no proposition a competitor either has or doesn't (e.g. "MCP server", "SOC 2", "Site crawler", "Reranker"). Avoid sentence-length features. Avoid features only one competitor has. 3. Produce a canonical list of 10-20 integrations (frameworks, marketplaces, SDK languages). 4. For each company INCLUDING THE USER, map each taxonomy entry to `true` / `false` based on the enrichment data in their `.md` file. **Every flag must be traceable to a Research Findings bullet with a cited URL.** If the user's file says "exa-py MIT-licensed (github.com/exa-labs/exa-py)", the Open-source feature is `true` with that URL as the source. If not mentioned, leave `false`. 5. Write the result to `{OUTPUT_DIR}/matrix.json` in this shape: ```json { "category": "AI search APIs", "features": [{ "name": "Web Search API", "description": "..." }, ...], "integrations": [{ "name": "LangChain" }, ...], "userCompany": { "name": "Exa", "winningSummary": "Exa's moats are its first-party neural index and the integrated Research API — no one else in the set ships a semantic/embeddings-native retrieval primitive alongside a multi-step agentic research endpoint. It's also the only provider with a crawler product bundled in, and ties with SerpAPI on breadth of SDK language coverage.", "losingSummary": "Exa trails competitors on operational transparency — SerpAPI, Serper, and Tavily all publish hourly throughput SLAs, and Exa lacks a dedicated news endpoint that SerpAPI, Serper, and You.com all ship. Image/visual search is also missing vs 4 of 5 competitors.", "features": { "Web Search API": true, "Site crawler": true, ... }, "integrations": { "LangChain": true, ... } }, "competitors": { "tavily": { "features": { "Web Search API": true, "Site crawler": true, ... }, "integrations": { "LangChain": true, "Databricks Marketplace": true, ... } }, "serpapi": { "features": {...}, "integrations": {...} } } } ``` **`userCompany` is required**. The overview page renders two cards — "Where {user} is winning" and "Where {user} is losing". Populate `userCompany.features` and `userCompany.integrations` from the self-research profile (Step 1). Without this field those two cards don't render. **Write order (two passes — this resolves the apparent ordering tension below).** In this step (5b) write all `features` / `integrations` cells for `userCompany` and every competitor, plus a **draft** `winningSummary` / `losingSummary`. The drafts exist only to tell the Step 5c fact-checker which claims are high-stakes (it prioritizes cells named in the summaries). After Step 5c flips cells on verified evidence, **rewrite** the two summaries so the prose reflects only fact-checked cells. The JSON shape above shows the finalized post-fact-check object. **`userCompany.winningSummary` / `losingSummary` are strongly preferred** (analyst-style prose, 2-4 sentences each). When present, the cards render as paragraphs instead of bulleted lists — reads like a briefing, not a spreadsheet. If absent, the cards fall back to a bulleted list of winning/losing items with who-else-has-it. If this step is skipped, the matrix view falls back to the raw pipe-split axis (useless for atomic comparison) and the strategic summary doesn't render. Do not skip. ### Fact-check the matrix — spot-check the high-stakes cells (default) **Do not trust the taxonomy pass alone for high-stakes cells.** It is LLM inference from prose and will hallucinate moats. Observed during a search-API run (2026-04-23): matrix.json claimed SOC 2 was unique to the user's company; verification showed three of the other competitors also have SOC 2 Type II. But verifying every cell is the opposite mistake. A 7-company × 33-axis matrix has 231 cells. The Apr 2026 search-API run got stuck at 111+ tool calls in fact-check before interrupt — the subagent kept going on table-stakes cells (REST API, JSON responses, Python SDK) that are universal in the category. **Default = spot-check, not full sweep.** Only verify cells that meaningfully change the strategic narrative. Launch a single fact-check subagent (Bash-only) with **a hard 25-call budget** that targets ONLY these high-stakes axes: 1. **Every `userCompany.features` and `userCompany.integrations` cell** (the user's own moats — these go straight into "Where you're winning" prose). Typical: 17 + 16 = 33 cells, but most are obvious (your own product). Focus on: - Anything claimed as a *moat* in `winningSummary` - Anything claimed as a *gap* in `losingSummary` - Compliance (SOC 2, HIPAA, ISO 27001, GDPR) - Open-source license claims (MIT / Apache 2.0 / AGPL — observed wrong on a competitor's SDK) - Published uptime SLA (status page ≠ SLA) 2. **Across competitors, only the cells that drive the win/loss summary**: - For each "Winning" claim, verify the user has it AND verify the competitors don't. - For each "Losing" claim, verify the named competitors do have it. - Compliance + license + SLA across all competitors (high-trust, frequently wrong). 3. **Do NOT verify**: - Universal table-stakes (REST API, JSON responses, Python SDK, API-key auth) — every search API has these. - `false` cells with no claim being made (no moat lost or won). - Integration cells unless they appear in the win/loss summary. ``` You are a matrix spot-check subagent. Budget: 25 browse cloud calls TOTAL across all cells. Stop and return what you have when you hit the budget — partial fact-check is better than blocking the rest of the pipeline. TOOL RULES: Bash ONLY. browse cloud search + browse cloud fetch. Count your calls; stop at 25. PRIORITY ORDER (highest-stakes first — work down until budget): 1. Every cell that appears in userCompany.winningSummary or losingSummary 2. Compliance cells (SOC 2, HIPAA, ISO 27001) for user + every competitor 3. Open-source / self-hostable + license cells across all competitors 4. Pricing tier numbers ($X/mo, /hr) for user + competitors named in summaries 5. Funding / employee_estimate fields (only if cited in summaries) Skip: - Universal cells (REST API, JSON responses, Python SDK, API-key auth, etc.) - `false` cells where no claim is being made - Integration matrix cells unless they appear in summaries For each cell verified: - If `true` — find one source URL (docs, trust portal, GitHub LICENSE, etc). - If `false` — one targeted browse cloud search. Flip ONLY on first-party evidence. Output: matrix.json with `sources: { "Feature": "https://..." }` on the verified cells (other cells stay as-is). Cells-changed log to {OUTPUT_DIR}/matrix_fact_check.md with each flip + URL + quoted evidence. Report back: "spot-check: N cells verified, M flipped, B/25 budget used". ``` **Full-sweep mode (opt-in, slower)**: if the user explicitly says "full fact check" or for a high-stakes deliverable (board deck, press release), set the budget to 80 calls and verify every non-universal cell. Default is spot-check. After the subagent completes, re-read matrix.json, recompile, and surface `matrix_fact_check.md` delta to the user. The summary is much more trustworthy with spot-check than without — and ships in 3-5 minutes instead of stalling the pipeline. ### Step 5d: Battle Card synthesis (deep/deeper only, after Step 5c) **Depends on fact-checked matrix.json from Step 5c.** This is a sales-enablement lane. For each competitor, launch a Bash-only synthesis subagent (no new `browse cloud` calls) that reads all 5 existing partials + the user's merged `.md` + fact-checked `matrix.json`, and produces per-competitor Landmines / Objection Handlers / Talk Tracks grounded in cited evidence. Prompt template: `references/battle-card-subagent.md` (substitute `{COMPETITOR_SLUG}` / `{COMPETITOR_NAME}` / `{USER_COMPANY_NAME}` / `{USER_WINNING_SUMMARY}` per competitor). Format spec: `references/battle-card.md`. Output: `{OUTPUT_DIR}/partials/{slug}.battle.md` with a `## Battle Card` section. **Re-run the merge after this lane completes.** The Step 5 merge ran *before* the battle partials existed, so the consolidated `{slug}.md` files don't contain them yet. Re-run: ```bash node {SKILL_DIR}/scripts/merge_partials.mjs {OUTPUT_DIR} ``` This unions each `{slug}.battle.md` into its consolidated `{slug}.md` (the `battle` lane is already handled by `merge_partials.mjs`). `compile_report.mjs` reads the `## Battle Card` section from `{slug}.md` and renders it as a brand-accented card on the per-competitor HTML page. **Skip this re-merge and the battle cards never appear in the report.** **Why this lane is synthesis-only** — battle cards must be grounded in facts that already survived Step 5c. Letting the subagent do fresh `browse cloud` searches would reintroduce the hallucinated-moat problem the fact-check step exists to prevent. The subagent's adversarial self-check explicitly rejects claims not traceable to an input partial bullet or a `sources`-backed matrix cell. Parallelism: 1 subagent per competitor, all in one Agent-tool message (synthesis is fast, ~3-5 Bash calls per subagent). Skip this step in `quick` mode — there isn't enough research depth to ground the cards credibly. ## Step 6: Screenshots Capture a homepage hero screenshot per competitor: ```bash node {SKILL_DIR}/scripts/capture_screenshots.mjs {OUTPUT_DIR} --mode remote ``` Uses the `browse` CLI (`npm install -g browse`). The `--mode` flag selects the browser session: `remote` (default) drives a Browserbase session — best for protected/bot-detecting homepages and the only option without local Chrome; `local` uses Chrome on your machine. The script passes the corresponding `--remote` / `--local` flag on each `browse` command, so there is no separate environment-config step to run. Writes one PNG per competitor to `{OUTPUT_DIR}/screenshots/{slug}-hero.png`. The compile step in Step 7 auto-embeds the hero on each per-competitor HTML page. Cost: ~10-20s per competitor. ~60s for 5 competitors. ## Step 7: HTML Report 1. **Generate all views + CSV** (opens overview in browser): ```bash node {SKILL_DIR}/scripts/compile_report.mjs {OUTPUT_DIR} --user-company "{user_company}" --open ``` Produces: - `{OUTPUT_DIR}/index.html` — overview: competitor table with tagline, pricing summary, key features, strategic diff - `{OUTPUT_DIR}/competitors/{slug}.html` — per-competitor deep dive (all sections) - `{OUTPUT_DIR}/matrix.html` — side-by-side feature/pricing matrix - `{OUTPUT_DIR}/mentions.html` — chronological feed with source-type pills + client-side filter - `{OUTPUT_DIR}/results.csv` — flat spreadsheet 2. **Present a chat summary**: ``` ## Competitor Analysis Complete - **Competitors researched**: {count} - **Depth mode**: {mode} - **Mentions collected**: {total mentions} across {source types count} source types - **Public benchmarks found**: {count} - **Opened in browser**: ~/Desktop/{company_slug}_competitors_{date}/index.html ``` 3. Show the **overview table** in chat: ``` | Competitor | Positioning | Pricing | Key Features | Strategic Diff | |------------|-------------|---------|--------------|----------------| | Rival Co | AI-native web search API | $99/mo entry | semantic search, reranking, crawler | Similar retrieval; cheaper entry | ``` 4. Call out the top 3-5 most interesting findings — e.g., "3 competitors have public benchmarks; Rival Co is cheapest; Foo Inc launched a dedicated news-search endpoint 2 weeks ago." Offer to dig deeper into any specific competitor or re-run with different depth. ## Limitations - Requires the upstream tool, account, API key, or local setup when the workflow names one. - Does not authorize destructive, production, paid, or external-message actions without explicit user approval. - Validate generated artifacts or recommendations against the user's real sources before treating them as final.