Candidate gates

The deterministic threshold expressions that turn observability signals into investigation candidates. Pure JS, no LLM. Thresholds live in lib/gates/*.mjs.

Total gates: 15. Budget cap: MAX_CODE_CANDIDATES = 6. Gate version: 1.8.0.

Gates

build_minutes_fanout

  • Threshold: Build Minutes share > 0.15 OR turbo-force-bypass finding present
  • Billing dimension: build
  • Scope: account
  • Source citation: vercel-optimize gate threshold

The Build Minutes line item exceeds 15% of the bill, or a Turborepo cache bypass was found. On monorepos, unchanged work should be skipped through Vercel skip-unaffected behavior, a verified Ignored Build Step, and a complete Turbo cache contract.


cold_start

  • Threshold: coldPct > 0.4 AND total >= 1000
  • Billing dimension: function-duration
  • Scope: route
  • Source citation: vercel-optimize gate threshold

Routes where > 40% of invocations are cold-start, at meaningful traffic (>=1,000 total invocations in window). Cold starts add 200-800ms per request and break the perceived latency budget on cache-miss paths. The 40% threshold is where cold-rate becomes a real signal vs Poisson noise on serverless. Sourced from vercel.function_invocation.count grouped by function_start_type.
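
A minimal sketch of how the cold-start share could be derived from per-start-type counts and compared with the threshold. The function name, the input shape, and the "cold" label for function_start_type are assumptions for illustration, not the actual lib/gates code.

```javascript
// Hypothetical helper; not the real gate implementation.
// `counts` maps a function_start_type value to an invocation count for one route.
// The "cold" key is an assumed label for cold-start invocations.
function coldStartGate(counts) {
  const cold = counts.cold ?? 0;
  const total = Object.values(counts).reduce((sum, n) => sum + n, 0);
  const coldPct = total > 0 ? cold / total : 0;
  // Threshold from the gate: coldPct > 0.4 AND total >= 1000.
  return { coldPct, total, fires: coldPct > 0.4 && total >= 1000 };
}
```

Note that both comparisons are as written in the threshold line: a route at exactly 40% cold starts does not fire, while a route at exactly 1,000 invocations can.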


cwv_poor

  • Threshold: LCP p75>2500 OR INP p75>200 OR CLS p75>0.1, AND speed_insights count > 50
  • Billing dimension: speed-insights
  • Scope: route
  • Source citation: https://web.dev/articles/vitals

Routes where Core Web Vitals fall outside Google's "Good" band on real-user traffic, that is, into "Needs Improvement" or "Poor". LCP > 2500ms, INP > 200ms, or CLS > 0.1 at p75 each hurt SEO and conversion. Surfaces one candidate per (route, metric) pair to keep recommendations focused.
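
The per-(route, metric) fan-out can be sketched as follows. The field names (route, count, p75) and the function name are hypothetical; only the numeric limits and the sample-count floor come from the threshold line.

```javascript
// p75 limits and sample floor taken from the threshold line above.
const CWV_LIMITS = { lcp: 2500, inp: 200, cls: 0.1 };
const MIN_SAMPLES = 50;

// Hypothetical input shape: [{ route, count, p75: { lcp, inp, cls } }].
// Returns one candidate per (route, metric) pair that exceeds its limit.
function cwvCandidates(routes) {
  const out = [];
  for (const r of routes) {
    if (!(r.count > MIN_SAMPLES)) continue; // too few Speed Insights samples
    for (const [metric, limit] of Object.entries(CWV_LIMITS)) {
      const p75 = r.p75?.[metric];
      if (typeof p75 === "number" && p75 > limit) {
        out.push({ route: r.route, metric, p75 });
      }
    }
  }
  return out;
}
```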


external_api_slow

  • Threshold: p75Ms > 2000 AND callCount >= 500
  • Billing dimension: function-duration
  • Scope: route
  • Source citation: vercel-optimize gate threshold

External API hostnames with p75 latency above 2 seconds AND at least 500 calls in the window. External API latency is a primary driver of function duration cost when the upstream is on a hot path; a slow upstream that is rarely called isn't worth a recommendation.


isr_overrevalidation

  • Threshold: writes/reads > 0.5 AND writes > 100
  • Billing dimension: isr
  • Scope: route
  • Source citation: https://vercel.com/docs/incremental-static-regeneration

ISR routes with more than 1 write per 2 reads and more than 100 writes in the window. The revalidate interval is too aggressive relative to read traffic, so many reads pay to regenerate. Investigate whether the page can tolerate a longer revalidate window or on-demand revalidation via revalidateTag.
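
As a sketch, the write/read ratio check might look like this. The function and field names are hypothetical. The zero-read behavior shown is a consequence of JavaScript division, not something the gate description specifies.

```javascript
// Hypothetical helper; threshold: writes/reads > 0.5 AND writes > 100.
function isrOverrevalidation({ reads, writes }) {
  // With reads === 0, writes / reads is Infinity in JavaScript, so a route
  // with more than 100 writes and no reads fires here. Whether the real gate
  // treats that case the same way is an assumption.
  return writes > 100 && writes / reads > 0.5;
}
```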


middleware_heavy

  • Threshold: middlewareInv/totalInv > 0.5 AND middlewareInv > 1000
  • Billing dimension: edge-requests
  • Scope: account
  • Source citation: https://nextjs.org/docs/app/building-your-application/routing/middleware

Middleware invocations cover > 50% of total requests at non-trivial volume. The matcher is probably broader than necessary; narrow it to the paths that actually need auth/rewrites/headers.


observability_events_attribution

  • Threshold: observabilityEventsShare > 0.20 (critical at > 0.30)
  • Billing dimension: observability-events
  • Scope: account
  • Source citation: vercel-optimize gate threshold

Observability Events line item exceeds 20% of total billed cost. High share usually traces to low cache hit rate, middleware-heavy traffic, or unconstrained custom-span cardinality. No sampling lever exists for Observability Plus; reduce upstream invocations instead.
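
The two-tier threshold can be illustrated with a small classifier. The function name and the "warning" label for the lower tier are assumptions; the source names only the "critical" tier.

```javascript
// Hypothetical helper; costs are in any consistent currency unit.
// Returns "critical" above a 30% share, "warning" above 20%, otherwise null.
function observabilityEventsSeverity(observabilityEventsCost, totalCost) {
  if (!(totalCost > 0)) return null; // no bill, no share to compute
  const share = observabilityEventsCost / totalCost;
  if (share > 0.30) return "critical";
  if (share > 0.20) return "warning"; // label is an assumption
  return null;
}
```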


platform_bot_protection

  • Threshold: botIdEnabled=false AND (botPct >= 0.05 OR edge_cost >= $25/window OR requests >= 14k/14d)
  • Billing dimension: edge-requests
  • Scope: account
  • Source citation: vercel-optimize gate threshold

When BotID is disabled AND there is evidence (observed bot bandwidth share, edge cost, or substantial request volume) that bot traffic is non-trivial. Bot traffic inflates edge request counts without delivering user value; staged bot protection can reduce waste on bot-heavy projects. Skipped on quiet projects with no bot evidence — the recommendation would be noise.
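
A sketch of the evidence check, assuming a flat input object. All field names (botPct, edgeCostUsd, requests14d) are hypothetical stand-ins for whatever the real gate reads.

```javascript
// Hypothetical helper mirroring the threshold line:
// botIdEnabled=false AND (botPct >= 0.05 OR edge cost >= $25 OR requests >= 14k per 14 days).
function botProtectionGate({ botIdEnabled, botPct, edgeCostUsd, requests14d }) {
  if (botIdEnabled) return false; // already protected, nothing to recommend
  // Missing fields compare as false, so a quiet project with no evidence is skipped.
  return botPct >= 0.05 || edgeCostUsd >= 25 || requests14d >= 14000;
}
```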


platform_fluid_compute

  • Threshold: fluid=false AND (any cold_start signal OR any route with (p95 > 1000ms AND inv > 1000))
  • Billing dimension: function-duration
  • Scope: account
  • Source citation: vercel-optimize gate threshold

When Fluid Compute is disabled on a project that shows cold-start pressure (high cold-start rate) or sustained slow function p95 on hot routes. Fluid Compute reduces cold starts via instance reuse — recommend turning it on at the project level rather than per-route.
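
The grouping of the conditions can be sketched as below, reading the slow-route clause as "p95 above 1000ms on a route with more than 1,000 invocations", which matches the description of hot routes. The input shape and names are hypothetical.

```javascript
// Hypothetical helper. `coldStartSignals` is an assumed list of routes already
// flagged by the cold_start gate; `routes` is an assumed per-route summary.
function fluidComputeGate({ fluid, coldStartSignals, routes }) {
  if (fluid) return false; // Fluid Compute already on
  const hotSlowRoute = routes.some((r) => r.p95Ms > 1000 && r.invocations > 1000);
  return coldStartSignals.length > 0 || hotSlowRoute;
}
```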


region_misconfig

  • Threshold: single-region pin found AND routes.length > 20 (scanner-only branch)
  • Billing dimension: function-duration
  • Scope: account
  • Source citation: vercel-optimize gate threshold

A single function region is pinned in vercel.json or per-route preferredRegion. Without per-region TTFB data (data gap), the gate can't quantify the geographic latency cost, but a single-region pin on a project with more than 20 routes is worth auditing against Speed Insights traffic geo.


route_errors

  • Threshold: count > 250 OR (totalRequests >= 1000 AND errorRate > 0.01)
  • Billing dimension: function-duration
  • Scope: route
  • Source citation: vercel-optimize gate threshold

Routes producing > 250 5xx errors over the window, or with > 1% error rate on at least 1,000 total requests. Errored function invocations still bill at full duration; high error rates also poison user experience.
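
A minimal sketch of the two branches, assuming errorCount is the number of 5xx responses for the route in the window. The names are hypothetical.

```javascript
// Hypothetical helper; threshold: count > 250 OR (totalRequests >= 1000 AND errorRate > 0.01).
function routeErrorsGate({ errorCount, totalRequests }) {
  const errorRate = totalRequests > 0 ? errorCount / totalRequests : 0;
  const absolute = errorCount > 250; // many errors regardless of rate
  const relative = totalRequests >= 1000 && errorRate > 0.01; // high rate at real volume
  return absolute || relative;
}
```

The absolute branch catches high-traffic routes whose error rate is small but whose error volume is not; the relative branch catches mid-traffic routes that fail often.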


scanner-driven

  • Threshold: per-kind: scanner matches.length >= threshold
  • Billing dimension: mixed
  • Scope: mixed
  • Source citation: vercel-optimize gate threshold

Configured kinds emitted from scanner output. Each requires a minimum match count to avoid noise. Findings on cold-path or unmappable files are dropped unless the underlying scanner is trafficIndependent.


slow_route

  • Threshold: (p95 > 500 AND inv >= 1400) OR (p95 > 1500 AND inv >= 250); disqualified when 5xx rate > 50%; Vercel Workflow runtime endpoints are hard-gated
  • Billing dimension: function-duration
  • Scope: route
  • Source citation: vercel-optimize gate threshold

Routes with p95 function duration above 500ms at meaningful traffic (>=1,400 invocations in window), OR severely slow routes (p95 above 1500ms) at lower volume (>=250 invocations). High duration drives both function-duration cost and user-perceived latency. Investigate sequential awaits, slow external APIs, missing caching, N+1 patterns. Routes with >50% 5xx rate are disqualified: those are reliability problems, not performance tuning targets, and surface via route_errors instead. Vercel Workflow runtime endpoints (/.well-known/workflow/v1/*) are hard-gated before launch because long-running step/flow requests are expected orchestration, not app-route bottlenecks.
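
The threshold, the disqualifier, and the hard gate can be combined in one sketch. The order of the checks, the field names, and the reason strings are assumptions for illustration.

```javascript
const WORKFLOW_PREFIX = "/.well-known/workflow/v1/";

// Hypothetical helper. errorRate5xx is a fraction in [0, 1].
function slowRouteGate({ route, p95Ms, invocations, errorRate5xx }) {
  // Hard gate: Workflow runtime endpoints are expected to be long-running.
  if (route.startsWith(WORKFLOW_PREFIX)) return { fires: false, reason: "workflow-runtime" };
  // Disqualifier: mostly-failing routes belong to route_errors instead.
  if (errorRate5xx > 0.5) return { fires: false, reason: "mostly-5xx" };
  const sustained = p95Ms > 500 && invocations >= 1400;
  const severe = p95Ms > 1500 && invocations >= 250;
  return { fires: sustained || severe, reason: null };
}
```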


uncached_route

  • Threshold: requests > 500 AND hitRate < 0.5 AND getShare > 0.2 (missing getShare is gated)
  • Billing dimension: edge-requests
  • Scope: route
  • Source citation: vercel-optimize gate threshold

Routes serving > 500 requests/period at < 50% cache hit AND more than 20% GET traffic. Each uncached GET request reaches the function, costing edge requests + function duration. Routes that are mostly POST/PUT/DELETE (Server Actions, mutations) are skipped, since 0% cache is correct behavior there. Routes with missing method-share data are gated instead of launched. Auth-gated routes are disqualified separately.
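
A sketch of the three possible outcomes, including the missing-data case. The return labels and field names are hypothetical, and checking for missing getShare before the other conditions is an assumption about ordering. The separate auth-gated disqualifier is not modeled here.

```javascript
// Hypothetical helper; threshold: requests > 500 AND hitRate < 0.5 AND getShare > 0.2.
// Returns "candidate", "skip", or "gated" (method-share data missing).
function uncachedRouteGate({ requests, hitRate, getShare }) {
  if (typeof getShare !== "number") return "gated"; // cannot tell GET from mutation traffic
  const fires = requests > 500 && hitRate < 0.5 && getShare > 0.2;
  return fires ? "candidate" : "skip";
}
```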


usage_spike_triage

  • Threshold: any-day total > 2x mean OR any-day SKU > 3x SKU mean
  • Billing dimension: mixed
  • Scope: account
  • Source citation: vercel-optimize gate threshold

A single day in the billing window deviates sharply from the window baseline. Triage branches: bot or AI crawler spike, viral moment, pricing-model migration (legacy SKU → new), code regression. Without daily-granularity data, this gate stays dormant.
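
One way to sketch the spike check over daily data is shown below. The input shape is hypothetical, and computing each mean over the whole window (spike day included) is an assumption; the source does not say how the baseline is formed.

```javascript
// Arithmetic mean of a non-empty array of numbers.
function mean(xs) {
  return xs.reduce((s, x) => s + x, 0) / xs.length;
}

// Hypothetical input: [{ date, skus: { [skuName]: cost } }], one entry per day.
// Returns the dates where the day total > 2x mean OR any SKU > 3x its own mean.
function usageSpikeDays(days) {
  if (days.length === 0) return []; // no daily data: gate stays dormant
  const totals = days.map((d) => Object.values(d.skus).reduce((s, x) => s + x, 0));
  const totalMean = mean(totals);
  const skuNames = [...new Set(days.flatMap((d) => Object.keys(d.skus)))];
  const skuMeans = Object.fromEntries(
    skuNames.map((name) => [name, mean(days.map((d) => d.skus[name] ?? 0))])
  );
  return days
    .filter((d, i) =>
      totals[i] > 2 * totalMean ||
      skuNames.some((name) => (d.skus[name] ?? 0) > 3 * skuMeans[name])
    )
    .map((d) => d.date);
}
```

The per-SKU branch matters when a small line item spikes without moving the daily total, which the total-only comparison would miss.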