playbook/antigravity-awesome-skills/skills/competitor-analysis/SKILL.md

435 lines
31 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
name: competitor-analysis
description: "Research competitors with Browserbase discovery, enrichment lanes, screenshots, matrices, and HTML reports."
license: MIT
compatibility: Requires the browse CLI (npm install -g browse) and BROWSERBASE_API_KEY env var
allowed-tools: Bash Agent AskUserQuestion
metadata:
author: browserbase
version: "0.2.0"
category: "marketing"
risk: "safe"
source: "official"
source_repo: "browserbase/skills"
source_type: "official"
date_added: "2026-06-19"
author: "Browserbase"
license_source: "https://github.com/browserbase/skills/blob/main/skills/competitor-analysis/LICENSE.txt"
tags:
- competitor-analysis
- browserbase
- market-research
- browser-automation
tools:
- claude-code
- codex-cli
- cursor
---
# Competitor Analysis
## When to Use
Use when the user needs structured competitor research with Browserbase discovery, enrichment lanes, screenshots, comparison matrices, and a final HTML report.
_Source: [browserbase/skills](https://github.com/browserbase/skills) (MIT)._
Analyze a user's competitors. Uses Browserbase Search API for discovery and a 4-lane Plan→Research→Synthesize pattern for enrichment — outputting an HTML report with overview, per-competitor deep dives, a side-by-side feature/pricing matrix, and a chronological mentions feed.
**Required**: `BROWSERBASE_API_KEY` env var and the `browse` CLI installed (`npm install -g browse`).
**First-run setup**: On the first run you'll be prompted to approve `browse cloud fetch`, `browse cloud search`, `cat`, `mkdir`, `sed`, etc. Select **"Yes, and don't ask again for: browse cloud fetch:\*"** (or equivalent) for each. To permanently approve, add these to your `~/.claude/settings.json` under `permissions.allow`:
```json
"Bash(browse:*)", "Bash(bunx:*)", "Bash(bun:*)", "Bash(node:*)",
"Bash(cat:*)", "Bash(mkdir:*)", "Bash(sed:*)", "Bash(head:*)", "Bash(tr:*)", "Bash(rm:*)"
```
**Path rules**: Always use full literal paths in Bash — NOT `~` or `$HOME`. Resolve the home directory once and use it everywhere. When building subagent prompts, replace `{SKILL_DIR}` with the full literal path.
**Output directory**: All output goes to `~/Desktop/{company_slug}_competitors_{YYYY-MM-DD}/`. This directory contains one `.md` file per competitor plus the generated HTML views and CSV.
**CRITICAL — Tool restrictions (applies to main agent AND all subagents)**:
- All web searches: use `browse cloud search`. NEVER WebSearch.
- All page fetches: use `browse cloud fetch --allow-redirects` (returns markdown by default; add `--format raw` if you need the original HTML, then pipe through `sed ... | tr -s ' \n'` to extract text). NEVER WebFetch. 1 MB response limit — fall back to `browse get markdown` (after `browse open <url> --remote`) for JS-heavy pages.
- All research output: subagents write **one markdown file per competitor** to `{OUTPUT_DIR}/{competitor-slug}.md` using bash heredoc. NEVER use the Write tool or `python3 -c`. See `references/example-research.md` for the file format.
- Report compilation: use `node {SKILL_DIR}/scripts/compile_report.mjs {OUTPUT_DIR} --user-company "{user_company}" --open` — generates `index.html`, `competitors/*.html`, `matrix.html`, `mentions.html`, `results.csv` in one step and opens overview.
- URL deduplication: `node {SKILL_DIR}/scripts/list_urls.mjs /tmp --prefix competitor`.
- **Subagents must use ONLY the Bash tool.**
- **Main agent NEVER reads raw discovery JSON batch files.**
**CRITICAL — Minimize permission prompts**:
- Subagents MUST batch ALL file writes into a SINGLE Bash call using chained heredocs.
- Batch ALL searches and ALL fetches into single Bash calls via `&&` chaining.
## Pipeline Overview
Follow these 8 steps in order. Do not skip or reorder.
1. **User Company Research** — Deeply understand the user's company, produce `precise_category` + `category_include_keywords` + `exclusion_list`
2. **Depth Mode + Seed Input** — Choose depth, accept optional seed competitor URLs
3. **Discovery (3 parallel waves)** — Wave A (alternatives), Wave B (precise category), Wave C (comparison-page graph via "X vs Y" title parsing)
4. **Gate**`scripts/gate_candidates.mjs` fetches each candidate's hero text (via `browse cloud fetch`) and drops wrong-category URLs
5. **Confirm enrichment set with the user** — Present PASS / UNKNOWN / rejected-brand-matches via `AskUserQuestion`. User ticks the real ones, adds any the discovery missed. Skipping this step is wasteful because enrichment is expensive (25 subagents × depth budget) and the gate is imperfect (JS-heavy homepages, Cloudflare challenges, semantic-variant taglines)
6. **Deep Enrichment (5 subagents per competitor in deep/deeper modes)** — Marketing, Discussion, Social, News, Technical — each lane a separate subagent writing to `partials/`; then `merge_partials.mjs` consolidates. In deep/deeper modes, **Step 5d** adds a 6th Battle Card synthesis lane AFTER Step 5c fact-check completes — produces per-competitor Landmines / Objection Handlers / Talk Tracks grounded in cited evidence.
7. **Screenshots**`capture_screenshots.mjs` via the `browse` CLI captures a 1280×800 homepage hero per competitor
8. **HTML Report** — Overview + per-competitor (with embedded hero screenshot + Battle Card card) + matrix + mentions views
---
## Step 0: Setup Output Directory
```bash
OUTPUT_DIR=~/Desktop/{company_slug}_competitors_{YYYY-MM-DD}
mkdir -p "$OUTPUT_DIR"
```
Replace `{company_slug}` with the user's company name (lowercase, hyphenated) and `{YYYY-MM-DD}` with today's date. Pass `{OUTPUT_DIR}` as a full literal path to every subagent.
Clean up discovery batch files from prior runs:
```bash
rm -f /tmp/competitor_discovery_batch_*.json
```
**Re-runs must start from a clean `$OUTPUT_DIR`.** `compile_report.mjs` ingests *every* `{slug}.md` in the directory, and `merge_partials.mjs` only overwrites the slugs in the current set — it never deletes ones dropped from a new enrichment set. Since the directory is keyed by date, a same-day re-run with a different competitor set would leave stale competitors in the overview, matrix, CSV, and screenshots. Either use a fresh directory or clear the prior per-competitor files first:
```bash
rm -f "$OUTPUT_DIR"/*.md && rm -rf "$OUTPUT_DIR"/partials "$OUTPUT_DIR"/screenshots
```
## Step 1: User Company Research
This step sets the baseline for what "competitor" means AND produces the verified data the Step 5b matrix will use for the `userCompany` row.
**Rule**: The user's company gets the same 5-lane research depth as competitors. Do NOT fill `userCompany` in matrix.json from memory — it will ship false claims to the user's own team. On a search-API run (user company Exa, 2026-04-23), skipping this step produced a matrix that claimed Exa had a "published uptime SLA" (there is no numeric public SLA — only a status page) and marked its MIT-licensed Python SDK as `open-source: false` (the repo is github.com/exa-labs/exa-py, LICENSE confirmed MIT). Both errors would have surfaced in the "Where you're winning" card as fabricated moats.
Process:
1. Ask the user for their company name or URL.
2. **Check for an existing profile** at `{SKILL_DIR}/profiles/{company-slug}.json`. If it exists, load it and confirm with the user: "I have your profile from {researched_at}. Still accurate?" — if yes, skip to Step 2 BUT still run the partial-lane enrichment below so matrix synthesis has fresh feature evidence.
The profile format is shared with `company-research` (same shape). If a user already has a profile saved under `company-research/profiles/`, you may copy it into this skill's profiles directory rather than re-researching.
3. **Run the full 5-lane enrichment on the user's company** — identical to the competitor pattern in Step 5. For each lane, spawn a Bash-only subagent that writes to `{OUTPUT_DIR}/partials/{user-slug}.{lane}.md`:
- **marketing** — tagline, positioning, pricing tiers, features, integrations, open-source components (SDK repos + licenses), regions offered, compliance (SOC 2 / HIPAA / trust portal URL)
- **technical** — REST + streaming API support (with docs URLs), SDK languages, MCP server URL, neural vs keyword retrieval modes, reranking / highlights / live-crawl specifics, published uptime SLA (actual %, not status page), third-party retrieval-quality benchmarks
- **discussion**, **social**, **news** — optional in quick mode, recommended in deep+
See `references/research-patterns.md` → "Self-Research" for sub-questions. Each finding MUST cite a URL.
4. Run `merge_partials.mjs` on the user's partials too — produces `{OUTPUT_DIR}/{user-slug}.md`, the canonical source Step 5b reads from for `userCompany` flags.
5. Synthesize into a profile: Company, Product, Existing Customers, Competitors (seed list), Use Cases, **precise_category**, **category_include_keywords**, **exclusion_list**. Do NOT include ICP — this skill doesn't need it.
- `precise_category`: one sentence describing the category. e.g., "AI web search API for agents with neural + keyword retrieval". Avoid vague words like "tools" / "platform".
- `category_include_keywords`: 8-15 phrases a direct competitor's marketing would likely contain (hero or title). Include semantic variants.
- `exclusion_list`: phrases that indicate a *different* category — used by the gate to reject false positives (e.g. `antidetect browser`, `scraping api`, `screenshot api`, `residential proxy`).
See `references/research-patterns.md` → "Synthesis Output" for the exact format and Exa as a worked example.
6. Present the profile + the user-company `.md` to the user for confirmation. Do not proceed until confirmed.
7. **Save the confirmed profile** to `{SKILL_DIR}/profiles/{company-slug}.json`.
## Step 2: Depth Mode + Seed Input
Ask clarifying questions via `AskUserQuestion` with checkboxes:
- **Known competitors?** Text area for URLs/names (optional — discovery will find more).
- **Depth mode?**
- `quick` — marketing surface only, many competitors, ~2-3 tool calls each
- `deep` — + external signal (mentions, reviews, news), ~5-8 tool calls each
- `deeper` — + public benchmarks + strategic diff vs user's company, ~10-15 tool calls each
- **Target count?** Rough number of competitors to research (e.g., 10 / 20 / 50).
This is the ONLY user interaction. After this, execute silently until the report is ready.
| Mode | Research per competitor | Best for |
|------|--------------------------|----------|
| `quick` | Lane 1 only (homepage + pricing) | Scanning ~30-50 competitors fast |
| `deep` | Lanes 1+2 | ~15-25 competitors with external signal |
| `deeper` | All 4 lanes (+ benchmarks + strategic diff) | ~5-15 competitors with full intel |
## Step 3: Discovery (3 parallel waves)
**Formula**: `ceil(target_count / 20)` queries per wave. Over-discover ~3x because the gate drops ~40-60%.
Evaluation on a search-API run shows all three waves are additive — skip any and you lose real competitors:
**Wave A — Generic alternatives** (broad; heavy aggregator noise, filtered out later)
- `"alternatives to {user_company}"`
- `"{user_company} competitors"`
**Wave B — Precise category** (uses `precise_category` from the profile)
- `"{precise_category}"` verbatim
- 2-3 queries composed from the most distinctive tokens (e.g. `"web search api for ai agents"`, `"retrieval API for LLMs"`)
**Wave C — Comparison-page graph** (highest precision)
- `"{user_company} vs"`
- `"{seed1} vs"`, `"{seed2} vs"`, `"{seed3} vs"` (seeds from the profile's `competitors` list)
- After the searches, run `scripts/extract_vs_names.mjs` to parse `"X vs Y"` patterns from result titles — this uniquely surfaces competitors that don't appear as URL hits.
**Process**:
1. Issue **3 parallel `browse cloud search` Bash calls** (one per wave) in a SINGLE message — NOT subagents. Each Bash call chains its 2-4 queries with `&&`. See `references/workflow.md` → "Discovery — parallel Bash, not subagents" for the exact recipe. Subagents are too heavy for a workload of 6-12 `browse cloud search` calls.
2. After all waves complete:
```bash
node {SKILL_DIR}/scripts/list_urls.mjs /tmp --prefix competitor > /tmp/competitor_urls.txt
node {SKILL_DIR}/scripts/extract_vs_names.mjs /tmp --prefix competitor \
--seed "{user_company},{seed1},{seed2},{seed3}" \
> /tmp/competitor_vs_names.jsonl
```
3. **Filter** `/tmp/competitor_urls.txt` — remove blog posts, news, AI-tool directories (seektool.ai, respan.ai, agentsindex.ai, toolradar.com, aitoolsatlas.ai, vibecodedthis.com, etc.), review aggregators (g2.com, capterra.com), databases (crunchbase.com, tracxn.com), user's own domain. See `references/workflow.md` for the full noise-domain list.
4. For `vs_names` entries that have a resolved `domain`, add them. For unresolved names, optionally run `browse cloud search "{name}" --num-results 3` and pick the top root domain.
5. Merge with user-provided seed URLs. Dedup by hostname → `/tmp/competitor_candidates.txt`.
## Step 4: Gate (category-fit filter)
Drop candidates whose marketing identifies them as a *different* category before enrichment burns tool calls on them.
```bash
cat /tmp/competitor_candidates.txt \
| node {SKILL_DIR}/scripts/gate_candidates.mjs \
--include "{profile.category_include_keywords joined with commas}" \
--exclude "{profile.exclusion_list joined with commas}" \
--concurrency 6 \
> /tmp/competitor_gated.jsonl
grep '"status":"PASS"' /tmp/competitor_gated.jsonl \
| node -e 'require("fs").readFileSync(0,"utf-8").split("\n").filter(Boolean).forEach(l => { try { console.log(JSON.parse(l).url); } catch {} })' \
> /tmp/competitor_passed.txt
```
The gate fetches each candidate's homepage via `browse cloud fetch --allow-redirects --format raw`, extracts the first 800 chars of visible text, and classifies position-aware: exclude in `<title>` → REJECT; include in `<title>` → PASS; hybrid title → hero200 tiebreak; otherwise fall through.
**Evaluated on a search-API run** with 12 mixed candidates: 7/7 real competitors passed, 4/4 wrong-category rejected, 1 known-hybrid edge case rejected.
## Step 4.5: Confirm enrichment set with the user
**This step is mandatory. Do NOT skip to enrichment just because the gate ran.**
Enrichment is expensive: 5 competitors × 5 lane-subagents = 25 subagents, ~10-15 minutes of wall clock, ~300 `browse cloud` calls. Running it on the wrong set wastes all of that. The gate also has known blind spots:
- **JS-heavy homepages** (e.g. Tavily, Firecrawl) — `browse cloud fetch` returns near-empty text, so keyword matching has nothing to match on → REJECT or UNKNOWN
- **Cloudflare challenge pages** (e.g. Perplexity) — title becomes "Just a moment..." → no category signal
- **Semantic variants** — "search foundation" / "retrieval backbone" don't lexically match a list centered on "search API"
- **Domain ambiguity** — `brave.com` (the browser) vs `api-dashboard.search.brave.com` (the actual API product) can confuse classification
The user almost always has domain knowledge the skill lacks. Ask them.
**Process** — the main agent:
1. Read `/tmp/competitor_gated.jsonl` and group rows:
- **PASS bucket**: everything with status=PASS.
- **UNKNOWN bucket**: status=UNKNOWN (fetch failed — always surface, these are the silent misses).
- **Rejected-brand bucket**: top ~10 REJECT rows whose title mentions a well-known brand pattern (e.g. contains the token from a user-supplied seed list, or appears frequently in the Wave C "X vs Y" graph).
2. Present the buckets to the user, one table per bucket, with URL + title + reason (for rejects).
3. Use `AskUserQuestion` with a checkbox list of all candidates across the three buckets, plus a free-text "add more" field. The prompt should be explicit:
> "Here are the gate's picks plus a few it was unsure about. Tick the ones that are real competitors in your space, and paste any URLs I missed (comma-separated). Enrichment will run on ONLY the ticked set."
4. Write the confirmed set to `/tmp/competitor_enrichment_set.txt` (one URL per line). This is the input for Step 5 — not `/tmp/competitor_passed.txt`.
**If the user doesn't respond** or explicitly says "just run it", fall back to `/tmp/competitor_passed.txt` as-is, but warn in chat that the run may waste budget on wrong-category hits.
**Exa test, 2026-04-24**: gate auto-passed 22 of 101 candidates but missed Tavily (generic title), Jina AI (semantic mismatch — "search foundation"), Firecrawl (JS-heavy fetch failure), and Perplexity (Cloudflare challenge). All four are real direct competitors. This step catches them.
## Step 5: Deep Enrichment
Two modes. See `references/workflow.md` for prompt templates and wave management. See `references/research-patterns.md` for the lane-by-lane methodology.
### Quick mode — single subagent per batch
- Input: `/tmp/competitor_enrichment_set.txt` (user-confirmed set from Step 4.5), ~8 competitors per subagent.
- One subagent runs Lane A only (marketing surface). 2-3 tool calls each.
- Writes directly to `{OUTPUT_DIR}/{slug}.md`.
### Deep / Deeper mode — 5 subagents PER competitor (parallel lane fan-out)
For each competitor, launch 5 parallel subagents, one per lane:
- **A. Marketing** (`marketing`): pricing, features, positioning, integrations, customers, team, funding, HQ. Owns canonical frontmatter.
- **B. Discussion** (`discussion`): Reddit, HN, forums, Dev.to, Hashnode. Broad queries beyond `site:` — also `"{competitor}" review 2026`, `"{competitor}" issues OR problems`, `"{competitor}" discussion`.
- **C. Social** (`social`): LinkedIn posts, YouTube videos, Twitter/X. Snippets only — do NOT fetch.
- **D. News & Comparisons** (`news`): TechCrunch, Verge, VentureBeat, Forbes, Businesswire, Substack, blog reviews. Every mention needs a date.
- **E. Technical & Benchmarks** (`technical`): GitHub benchmark repos/PRs, performance posts. Writes Benchmarks + technical Findings.
Budget per lane: deep = 5-8 tool calls, deeper = 10-15.
**Launch ALL competitor × lane subagents in a SINGLE Agent tool message.** For 10 competitors × 5 lanes = 50 parallel Agent calls in one message. Do NOT split into batches per competitor or per lane — wall clock collapses to the slowest single agent (~3-5 min). Splitting into 5 rounds of 10 cost 25 minutes of wall clock vs 5 minutes parallel on a real measured run; do not do it.
Each subagent writes a partial to `{OUTPUT_DIR}/partials/{slug}.{lane}.md`.
**Critical**: Pass the user's company name, product, and key features verbatim into every subagent prompt so the technical lane can do strategic diffing. Pass the full literal `{OUTPUT_DIR}` path to every subagent.
### Merge partials → canonical per-competitor file
After all subagents for all competitors complete:
```bash
node {SKILL_DIR}/scripts/merge_partials.mjs {OUTPUT_DIR}
```
Unions the 5 partials per competitor into one `{OUTPUT_DIR}/{slug}.md` — dedup'd Mentions (sorted by date desc), dedup'd Benchmarks, merged Findings, canonical frontmatter from the marketing lane.
### Synthesize the comparison matrix (write `matrix.json`)
**Subagents write `key_features` and `integrations` as prose**, not as pipe-separated atomic feature labels. So a naive `|`-split axis becomes one-blob-per-competitor with no overlap — the rendered matrix shows a useless diagonal.
The main agent fixes this by synthesizing a **shared taxonomy** across competitors and writing `{OUTPUT_DIR}/matrix.json`. `compile_report.mjs` auto-detects this file and renders the matrix from it instead of from the pipe split.
**Process** — main agent:
1. Read ALL `{slug}.md` files, INCLUDING the user's company file `{user-slug}.md` produced in Step 1. The user is competitor #0 for matrix purposes — treat with identical rigor.
2. Produce a canonical list of 12-20 *atomic* features — each must be a yes/no proposition a competitor either has or doesn't (e.g. "MCP server", "SOC 2", "Site crawler", "Reranker"). Avoid sentence-length features. Avoid features only one competitor has.
3. Produce a canonical list of 10-20 integrations (frameworks, marketplaces, SDK languages).
4. For each company INCLUDING THE USER, map each taxonomy entry to `true` / `false` based on the enrichment data in their `.md` file. **Every flag must be traceable to a Research Findings bullet with a cited URL.** If the user's file says "exa-py MIT-licensed (github.com/exa-labs/exa-py)", the Open-source feature is `true` with that URL as the source. If not mentioned, leave `false`.
5. Write the result to `{OUTPUT_DIR}/matrix.json` in this shape:
```json
{
"category": "AI search APIs",
"features": [{ "name": "Web Search API", "description": "..." }, ...],
"integrations": [{ "name": "LangChain" }, ...],
"userCompany": {
"name": "Exa",
"winningSummary": "Exa's moats are its first-party neural index and the integrated Research API — no one else in the set ships a semantic/embeddings-native retrieval primitive alongside a multi-step agentic research endpoint. It's also the only provider with a crawler product bundled in, and ties with SerpAPI on breadth of SDK language coverage.",
"losingSummary": "Exa trails competitors on operational transparency — SerpAPI, Serper, and Tavily all publish hourly throughput SLAs, and Exa lacks a dedicated news endpoint that SerpAPI, Serper, and You.com all ship. Image/visual search is also missing vs 4 of 5 competitors.",
"features": { "Web Search API": true, "Site crawler": true, ... },
"integrations": { "LangChain": true, ... }
},
"competitors": {
"tavily": {
"features": { "Web Search API": true, "Site crawler": true, ... },
"integrations": { "LangChain": true, "Databricks Marketplace": true, ... }
},
"serpapi": { "features": {...}, "integrations": {...} }
}
}
```
**`userCompany` is required**. The overview page renders two cards — "Where {user} is winning" and "Where {user} is losing". Populate `userCompany.features` and `userCompany.integrations` from the self-research profile (Step 1). Without this field those two cards don't render.
**Write order (two passes — this resolves the apparent ordering tension below).** In this step (5b) write all `features` / `integrations` cells for `userCompany` and every competitor, plus a **draft** `winningSummary` / `losingSummary`. The drafts exist only to tell the Step 5c fact-checker which claims are high-stakes (it prioritizes cells named in the summaries). After Step 5c flips cells on verified evidence, **rewrite** the two summaries so the prose reflects only fact-checked cells. The JSON shape above shows the finalized post-fact-check object.
**`userCompany.winningSummary` / `losingSummary` are strongly preferred** (analyst-style prose, 2-4 sentences each). When present, the cards render as paragraphs instead of bulleted lists — reads like a briefing, not a spreadsheet. If absent, the cards fall back to a bulleted list of winning/losing items with who-else-has-it.
If this step is skipped, the matrix view falls back to the raw pipe-split axis (useless for atomic comparison) and the strategic summary doesn't render. Do not skip.
### Fact-check the matrix — spot-check the high-stakes cells (default)
**Do not trust the taxonomy pass alone for high-stakes cells.** It is LLM inference from prose and will hallucinate moats. Observed during a search-API run (2026-04-23): matrix.json claimed SOC 2 was unique to the user's company; verification showed three of the other competitors also have SOC 2 Type II.
But verifying every cell is the opposite mistake. A 7-company × 33-axis matrix has 231 cells. The Apr 2026 search-API run got stuck at 111+ tool calls in fact-check before interrupt — the subagent kept going on table-stakes cells (REST API, JSON responses, Python SDK) that are universal in the category.
**Default = spot-check, not full sweep.** Only verify cells that meaningfully change the strategic narrative.
Launch a single fact-check subagent (Bash-only) with **a hard 25-call budget** that targets ONLY these high-stakes axes:
1. **Every `userCompany.features` and `userCompany.integrations` cell** (the user's own moats — these go straight into "Where you're winning" prose). Typical: 17 + 16 = 33 cells, but most are obvious (your own product). Focus on:
- Anything claimed as a *moat* in `winningSummary`
- Anything claimed as a *gap* in `losingSummary`
- Compliance (SOC 2, HIPAA, ISO 27001, GDPR)
- Open-source license claims (MIT / Apache 2.0 / AGPL — observed wrong on a competitor's SDK)
- Published uptime SLA (status page ≠ SLA)
2. **Across competitors, only the cells that drive the win/loss summary**:
- For each "Winning" claim, verify the user has it AND verify the competitors don't.
- For each "Losing" claim, verify the named competitors do have it.
- Compliance + license + SLA across all competitors (high-trust, frequently wrong).
3. **Do NOT verify**:
- Universal table-stakes (REST API, JSON responses, Python SDK, API-key auth) — every search API has these.
- `false` cells with no claim being made (no moat lost or won).
- Integration cells unless they appear in the win/loss summary.
```
You are a matrix spot-check subagent. Budget: 25 browse cloud calls TOTAL across all cells.
Stop and return what you have when you hit the budget — partial fact-check is
better than blocking the rest of the pipeline.
TOOL RULES: Bash ONLY. browse cloud search + browse cloud fetch. Count your calls; stop at 25.
PRIORITY ORDER (highest-stakes first — work down until budget):
1. Every cell that appears in userCompany.winningSummary or losingSummary
2. Compliance cells (SOC 2, HIPAA, ISO 27001) for user + every competitor
3. Open-source / self-hostable + license cells across all competitors
4. Pricing tier numbers ($X/mo, /hr) for user + competitors named in summaries
5. Funding / employee_estimate fields (only if cited in summaries)
Skip:
- Universal cells (REST API, JSON responses, Python SDK, API-key auth, etc.)
- `false` cells where no claim is being made
- Integration matrix cells unless they appear in summaries
For each cell verified:
- If `true` — find one source URL (docs, trust portal, GitHub LICENSE, etc).
- If `false` — one targeted browse cloud search. Flip ONLY on first-party evidence.
Output: matrix.json with `sources: { "Feature": "https://..." }` on the
verified cells (other cells stay as-is). Cells-changed log to
{OUTPUT_DIR}/matrix_fact_check.md with each flip + URL + quoted evidence.
Report back: "spot-check: N cells verified, M flipped, B/25 budget used".
```
**Full-sweep mode (opt-in, slower)**: if the user explicitly says "full fact check" or for a high-stakes deliverable (board deck, press release), set the budget to 80 calls and verify every non-universal cell. Default is spot-check.
After the subagent completes, re-read matrix.json, recompile, and surface `matrix_fact_check.md` delta to the user. The summary is much more trustworthy with spot-check than without — and ships in 3-5 minutes instead of stalling the pipeline.
### Step 5d: Battle Card synthesis (deep/deeper only, after Step 5c)
**Depends on fact-checked matrix.json from Step 5c.** This is a sales-enablement lane. For each competitor, launch a Bash-only synthesis subagent (no new `browse cloud` calls) that reads all 5 existing partials + the user's merged `.md` + fact-checked `matrix.json`, and produces per-competitor Landmines / Objection Handlers / Talk Tracks grounded in cited evidence.
Prompt template: `references/battle-card-subagent.md` (substitute `{COMPETITOR_SLUG}` / `{COMPETITOR_NAME}` / `{USER_COMPANY_NAME}` / `{USER_WINNING_SUMMARY}` per competitor). Format spec: `references/battle-card.md`.
Output: `{OUTPUT_DIR}/partials/{slug}.battle.md` with a `## Battle Card` section.
**Re-run the merge after this lane completes.** The Step 5 merge ran *before* the battle partials existed, so the consolidated `{slug}.md` files don't contain them yet. Re-run:
```bash
node {SKILL_DIR}/scripts/merge_partials.mjs {OUTPUT_DIR}
```
This unions each `{slug}.battle.md` into its consolidated `{slug}.md` (the `battle` lane is already handled by `merge_partials.mjs`). `compile_report.mjs` reads the `## Battle Card` section from `{slug}.md` and renders it as a brand-accented card on the per-competitor HTML page. **Skip this re-merge and the battle cards never appear in the report.**
**Why this lane is synthesis-only** — battle cards must be grounded in facts that already survived Step 5c. Letting the subagent do fresh `browse cloud` searches would reintroduce the hallucinated-moat problem the fact-check step exists to prevent. The subagent's adversarial self-check explicitly rejects claims not traceable to an input partial bullet or a `sources`-backed matrix cell.
Parallelism: 1 subagent per competitor, all in one Agent-tool message (synthesis is fast, ~3-5 Bash calls per subagent). Skip this step in `quick` mode — there isn't enough research depth to ground the cards credibly.
## Step 6: Screenshots
Capture a homepage hero screenshot per competitor:
```bash
node {SKILL_DIR}/scripts/capture_screenshots.mjs {OUTPUT_DIR} --mode remote
```
Uses the `browse` CLI (`npm install -g browse`). The `--mode` flag selects the browser session: `remote` (default) drives a Browserbase session — best for protected/bot-detecting homepages and the only option without local Chrome; `local` uses Chrome on your machine. The script passes the corresponding `--remote` / `--local` flag on each `browse` command, so there is no separate environment-config step to run. Writes one PNG per competitor to `{OUTPUT_DIR}/screenshots/{slug}-hero.png`. The compile step in Step 7 auto-embeds the hero on each per-competitor HTML page.
Cost: ~10-20s per competitor. ~60s for 5 competitors.
## Step 7: HTML Report
1. **Generate all views + CSV** (opens overview in browser):
```bash
node {SKILL_DIR}/scripts/compile_report.mjs {OUTPUT_DIR} --user-company "{user_company}" --open
```
Produces:
- `{OUTPUT_DIR}/index.html` — overview: competitor table with tagline, pricing summary, key features, strategic diff
- `{OUTPUT_DIR}/competitors/{slug}.html` — per-competitor deep dive (all sections)
- `{OUTPUT_DIR}/matrix.html` — side-by-side feature/pricing matrix
- `{OUTPUT_DIR}/mentions.html` — chronological feed with source-type pills + client-side filter
- `{OUTPUT_DIR}/results.csv` — flat spreadsheet
2. **Present a chat summary**:
```
## Competitor Analysis Complete
- **Competitors researched**: {count}
- **Depth mode**: {mode}
- **Mentions collected**: {total mentions} across {source types count} source types
- **Public benchmarks found**: {count}
- **Opened in browser**: ~/Desktop/{company_slug}_competitors_{date}/index.html
```
3. Show the **overview table** in chat:
```
| Competitor | Positioning | Pricing | Key Features | Strategic Diff |
|------------|-------------|---------|--------------|----------------|
| Rival Co | AI-native web search API | $99/mo entry | semantic search, reranking, crawler | Similar retrieval; cheaper entry |
```
4. Call out the top 3-5 most interesting findings — e.g., "3 competitors have public benchmarks; Rival Co is cheapest; Foo Inc launched a dedicated news-search endpoint 2 weeks ago." Offer to dig deeper into any specific competitor or re-run with different depth.
## Limitations
- Requires the upstream tool, account, API key, or local setup when the workflow names one.
- Does not authorize destructive, production, paid, or external-message actions without explicit user approval.
- Validate generated artifacts or recommendations against the user's real sources before treating them as final.