Candidate gates

The deterministic threshold expressions that turn observability signals into investigation candidates. Pure JS, no LLM. Thresholds live in lib/gates/*.mjs.

Total gates: 15. Budget cap: MAX_CODE_CANDIDATES = 6. Gate version: 1.8.0.

Gates

`build_minutes_fanout`

Threshold: Build Minutes share > 0.15 OR turbo-force-bypass finding present
Billing dimension: build
Scope: account
Source citation: vercel-optimize gate threshold

The Build Minutes line item exceeds 15% of the bill, or a Turborepo cache bypass was detected. On monorepos, unchanged work should be skipped through Vercel skip-unaffected behavior, a verified Ignored Build Step, and a complete Turbo cache contract.

`cold_start`

Threshold: coldPct > 0.4 AND total >= 1000
Billing dimension: function-duration
Scope: route
Source citation: vercel-optimize gate threshold

Routes where > 40% of invocations are cold-start, at meaningful traffic (>=1,000 total invocations in window). Cold starts add 200-800ms per request and break the perceived latency budget on cache-miss paths. The 40% threshold is where cold-rate becomes a real signal vs Poisson noise on serverless. Sourced from vercel.function_invocation.count grouped by function_start_type.
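As a rough illustration of the threshold above, the check could be sketched as follows. The input shape (`cold` and `warm` counts per route) and the function name are assumptions for this example, not the actual contents of lib/gates/*.mjs.

```javascript
// Hypothetical sketch of the cold_start threshold.
// Assumes per-route counts already grouped by function_start_type.
const COLD_PCT_THRESHOLD = 0.4;
const MIN_TOTAL_INVOCATIONS = 1000;

function coldStartGate({ cold, warm }) {
  const total = cold + warm;
  // No traffic means no signal; avoid dividing by zero.
  if (total === 0) return { fire: false, coldPct: 0, total };
  const coldPct = cold / total;
  return {
    fire: coldPct > COLD_PCT_THRESHOLD && total >= MIN_TOTAL_INVOCATIONS,
    coldPct,
    total,
  };
}
```

Both conditions are strict in the way the threshold line states them: a route at exactly 40% cold does not fire, while a route at exactly 1,000 invocations is eligible.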

`cwv_poor`

Threshold: LCP p75>2500 OR INP p75>200 OR CLS p75>0.1, AND speed_insights count > 50
Billing dimension: speed-insights
Scope: route
Source citation: https://web.dev/articles/vitals

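Routes where Core Web Vitals miss Google's "Good" thresholds at p75 on real-user traffic: LCP > 2500ms, INP > 200ms, or CLS > 0.1. Values above these limits fall into the "Needs Improvement" or "Poor" band, and each can hurt SEO and conversion. Requires more than 50 Speed Insights samples. Surfaces one candidate per (route, metric) pair to keep recommendations focused.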
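The per-(route, metric) fan-out can be sketched like this. The `route` object shape (`path`, `sampleCount`, `p75`) and the function name are hypothetical; only the numeric limits come from the threshold line.

```javascript
// Hypothetical sketch: emit one candidate per (route, metric) pair.
const CWV_LIMITS = { lcp: 2500, inp: 200, cls: 0.1 };
const MIN_SAMPLES = 50;

function cwvCandidates(route) {
  // Too few Speed Insights samples: p75 is not trustworthy.
  if (!(route.sampleCount > MIN_SAMPLES)) return [];
  return Object.entries(CWV_LIMITS)
    // A missing metric compares as undefined > limit, which is false.
    .filter(([metric, limit]) => route.p75[metric] > limit)
    .map(([metric, limit]) => ({
      route: route.path,
      metric,
      p75: route.p75[metric],
      limit,
    }));
}
```

A route that fails both LCP and CLS would yield two separate candidates, so each recommendation can stay focused on a single metric.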

`external_api_slow`

Threshold: p75Ms > 2000 AND callCount >= 500
Billing dimension: function-duration
Scope: route
Source citation: vercel-optimize gate threshold

External API hostnames with p75 latency above 2 seconds AND at least 500 calls in the window. External API latency is a primary driver of function duration cost when the upstream is on a hot path; an isolated slow call doesn't justify a recommendation.

`isr_overrevalidation`

Threshold: writes/reads > 0.5 AND writes > 100
Billing dimension: isr
Scope: route
Source citation: https://vercel.com/docs/incremental-static-regeneration

ISR routes with more than 1 write per 2 reads and more than 100 writes in the window. The revalidate interval is too aggressive relative to read traffic, so many reads pay to regenerate. Investigate whether the page can tolerate a longer revalidate window or on-demand revalidation via revalidateTag.
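A minimal sketch of the ratio check, assuming per-route `reads` and `writes` counters. The function name is hypothetical, and the handling of zero reads is an assumption: the threshold line does not say what happens when the ratio is undefined.

```javascript
// Hypothetical sketch of the isr_overrevalidation threshold.
function isrOverrevalidationGate({ reads, writes }) {
  // Volume floor first: small write counts are noise.
  if (writes <= 100) return false;
  // ASSUMPTION: writes with zero reads are treated as over-revalidation,
  // since the write/read ratio is unbounded in that case.
  if (reads === 0) return true;
  return writes / reads > 0.5;
}
```

For example, 600 writes against 1,000 reads fires the gate, while 500 writes against 1,000 reads sits exactly on the boundary and does not.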

`middleware_heavy`

Threshold: middlewareInv/totalInv > 0.5 AND middlewareInv > 1000
Billing dimension: edge-requests
Scope: account
Source citation: https://nextjs.org/docs/app/building-your-application/routing/middleware

Middleware invocations cover > 50% of total requests at non-trivial volume. The matcher is probably broader than necessary; narrow it to the paths that actually need auth/rewrites/headers.

`observability_events_attribution`

Threshold: observabilityEventsShare > 0.20 (critical at > 0.30)
Billing dimension: observability-events
Scope: account
Source citation: vercel-optimize gate threshold

Observability Events line item exceeds 20% of total billed cost. High share usually traces to low cache hit rate, middleware-heavy traffic, or unconstrained custom-span cardinality. No sampling lever exists for Observability Plus; reduce upstream invocations instead.
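The two-tier threshold might be expressed as below. The function name, its arguments, and the `'warning'` label for the lower tier are assumptions for illustration; only the 0.20 and 0.30 cutoffs come from the threshold line.

```javascript
// Hypothetical sketch: map Observability Events share to a severity tier.
function observabilityEventsSeverity(observabilityCost, totalCost) {
  // Without a positive total there is no share to compute.
  if (!(totalCost > 0)) return null;
  const share = observabilityCost / totalCost;
  if (share > 0.30) return 'critical';
  if (share > 0.20) return 'warning'; // tier name is an assumption
  return null;
}
```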

`platform_bot_protection`

Threshold: botIdEnabled=false AND (botPct >= 0.05 OR edge_cost >= $25/window OR requests >= 14k/14d)
Billing dimension: edge-requests
Scope: account
Source citation: vercel-optimize gate threshold

When BotID is disabled AND there is evidence (observed bot bandwidth share, edge cost, or substantial request volume) that bot traffic is non-trivial. Bot traffic inflates edge request counts without delivering user value; staged bot protection can reduce waste on bot-heavy projects. Skipped on quiet projects with no bot evidence — the recommendation would be noise.
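One way to read the threshold is as a hard precondition (BotID off) followed by any-of evidence checks. The sketch below assumes a 14-day window and hypothetical field names (`botPct`, `edgeCostUsd`, `requests`); the reason labels are also illustrative.

```javascript
// Hypothetical sketch of the platform_bot_protection threshold.
// ASSUMPTION: the inputs already cover a 14-day window.
function botProtectionGate({ botIdEnabled, botPct = 0, edgeCostUsd = 0, requests = 0 }) {
  // BotID already on: nothing to recommend.
  if (botIdEnabled) return { fire: false, reasons: [] };
  const reasons = [];
  if (botPct >= 0.05) reasons.push('bot_share');
  if (edgeCostUsd >= 25) reasons.push('edge_cost');
  if (requests >= 14000) reasons.push('request_volume');
  // Quiet projects with no evidence produce no candidate.
  return { fire: reasons.length > 0, reasons };
}
```

Returning the list of reasons rather than a bare boolean keeps the evidence that triggered the gate available to whatever consumes the candidate.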

`platform_fluid_compute`

Threshold: fluid=false AND (any cold_start signal OR any route with (p95>1000ms AND inv>1000))
Billing dimension: function-duration
Scope: account
Source citation: vercel-optimize gate threshold

When Fluid Compute is disabled on a project that shows cold-start pressure (high cold-start rate) or sustained slow function p95 on hot routes. Fluid Compute reduces cold starts via instance reuse — recommend turning it on at the project level rather than per-route.
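A sketch of this account-level check, under the assumption that the threshold groups as "cold-start signal OR (slow p95 AND hot route)", which is how the description reads. The input names (`fluidEnabled`, `coldStartCandidates`, `routes`, `p95Ms`, `invocations`) are hypothetical.

```javascript
// Hypothetical sketch of the platform_fluid_compute threshold.
function fluidComputeGate({ fluidEnabled, coldStartCandidates, routes }) {
  // Fluid Compute already on: nothing to recommend.
  if (fluidEnabled) return false;
  const hasColdStartSignal = coldStartCandidates.length > 0;
  // ASSUMPTION: p95 and invocation conditions must hold on the same route.
  const hasSlowHotRoute = routes.some(
    (r) => r.p95Ms > 1000 && r.invocations > 1000
  );
  return hasColdStartSignal || hasSlowHotRoute;
}
```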

`region_misconfig`

Threshold: single-region pin found AND routes.length > 20 (scanner-only branch)
Billing dimension: function-duration
Scope: account
Source citation: vercel-optimize gate threshold

A single function region is pinned in vercel.json or per-route preferredRegion. Without per-region TTFB data (data gap), the gate can't quantify the geographic latency cost — but a single-region pin on a project with 20+ routes is worth auditing against Speed Insights traffic geo.

`route_errors`

Threshold: count > 250 OR (totalRequests >= 1000 AND errorRate > 0.01)
Billing dimension: function-duration
Scope: route
Source citation: vercel-optimize gate threshold

Routes producing > 250 5xx errors over the window, or with > 1% error rate on at least 1,000 total requests. Errored function invocations still bill at full duration; high error rates also poison user experience.
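The two branches of the threshold, an absolute error count and a rate with a traffic floor, could be sketched as follows. The function and field names are assumptions.

```javascript
// Hypothetical sketch of the route_errors threshold.
function routeErrorsGate({ errorCount, totalRequests }) {
  // Branch 1: absolute volume of 5xx errors.
  if (errorCount > 250) return true;
  // Branch 2: error rate, only meaningful with enough traffic.
  if (totalRequests < 1000) return false;
  return errorCount / totalRequests > 0.01;
}
```

The absolute branch catches high-traffic routes whose error rate is small but whose error volume is still costly; the rate branch catches moderate-traffic routes that fail often.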

`scanner-driven`

Threshold: per-kind: scanner matches.length >= threshold
Billing dimension: mixed
Scope: mixed
Source citation: vercel-optimize gate threshold

Configured kinds emitted from scanner output. Each requires a minimum match count to avoid noise. Findings on cold-path or unmappable files are dropped unless the underlying scanner is trafficIndependent.

`slow_route`

Threshold: (p95 > 500 AND inv >= 1400) OR (p95 > 1500 AND inv >= 250); disqualified when 5xx rate > 50%; Vercel Workflow runtime endpoints are hard-gated
Billing dimension: function-duration
Scope: route
Source citation: vercel-optimize gate threshold

Routes with p95 function duration above 500ms at meaningful traffic (>=1,400 invocations in window), OR catastrophically slow routes (p95 above 1500ms with at least 250 invocations). High duration drives both function-duration cost and user-perceived latency. Investigate sequential awaits, slow external APIs, missing caching, and N+1 patterns. Routes with >50% 5xx rate are disqualified: those are reliability problems, not performance tuning targets, and surface via route_errors instead. Vercel Workflow runtime endpoints (/.well-known/workflow/v1/*) are hard-gated before launch because long-running step/flow requests are expected orchestration, not app-route bottlenecks.
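The ordering of the checks matters here: the hard gate and the disqualifier run before the duration thresholds. A minimal sketch, with hypothetical field names and status labels (`'gated'`, `'disqualified'`, `'candidate'`, `'skip'`):

```javascript
// Hypothetical sketch of the slow_route decision order.
const WORKFLOW_PREFIX = '/.well-known/workflow/v1/';

function slowRouteGate({ path, p95Ms, invocations, errorRate5xx = 0 }) {
  // Workflow runtime endpoints are expected to be long-running.
  if (path.startsWith(WORKFLOW_PREFIX)) {
    return { status: 'gated', reason: 'workflow_runtime' };
  }
  // Mostly-failing routes are a reliability problem (see route_errors).
  if (errorRate5xx > 0.5) {
    return { status: 'disqualified', reason: 'high_5xx_rate' };
  }
  const sustained = p95Ms > 500 && invocations >= 1400;
  const severe = p95Ms > 1500 && invocations >= 250;
  return { status: sustained || severe ? 'candidate' : 'skip' };
}
```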

`uncached_route`

Threshold: requests > 500 AND hitRate < 0.5 AND getShare > 0.2 (missing getShare is gated)
Billing dimension: edge-requests
Scope: route
Source citation: vercel-optimize gate threshold

Routes serving > 500 requests/period at < 50% cache hit rate AND more than 20% GET traffic. Each uncached GET request reaches the function, costing edge requests + function duration. Routes that are mostly POST/PUT/DELETE (Server Actions, mutations) are skipped because 0% cache is correct behavior there. Routes with missing method-share data are gated instead of launched. Auth-gated routes are disqualified separately.
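The three-way outcome (launch, skip, or gate on missing data) can be sketched as below. Field names and return labels are hypothetical, and checking volume and hit rate before the missing-data gate is an assumption about ordering. The separate auth-gated disqualification is not modeled.

```javascript
// Hypothetical sketch of the uncached_route threshold.
function uncachedRouteGate({ requests, hitRate, getShare }) {
  // Cheap filters first: low volume or healthy cache means no candidate.
  if (!(requests > 500 && hitRate < 0.5)) return 'skip';
  // Missing method-share data: hold the candidate rather than guess.
  if (getShare == null) return 'gated';
  // Mutation-heavy routes are expected to be uncached.
  return getShare > 0.2 ? 'candidate' : 'skip';
}
```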

`usage_spike_triage`

Threshold: any-day total > 2x mean OR any-day SKU > 3x SKU mean
Billing dimension: mixed
Scope: account
Source citation: vercel-optimize gate threshold

A single day in the billing window deviates sharply from the window baseline. Triage branches: bot or AI crawler spike, viral moment, pricing-model migration (legacy SKU → new), code regression. Without daily-granularity data, this gate stays dormant.
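Spike detection against the window baseline might look like the sketch below. The input shape (an array of days, each with a `date` and a `skus` cost map) is an assumption, as is computing the mean over the whole window including the spike day itself.

```javascript
// Hypothetical sketch of the usage_spike_triage threshold.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

function usageSpikeDays(days) {
  // No daily-granularity data: the gate stays dormant.
  if (days.length === 0) return [];
  const totals = days.map((d) =>
    Object.values(d.skus).reduce((sum, v) => sum + v, 0)
  );
  const totalMean = mean(totals);
  const skuNames = [...new Set(days.flatMap((d) => Object.keys(d.skus)))];
  // A SKU absent on a given day counts as zero for its mean.
  const skuMeans = Object.fromEntries(
    skuNames.map((name) => [name, mean(days.map((d) => d.skus[name] ?? 0))])
  );
  return days.flatMap((d, i) => {
    const reasons = [];
    if (totals[i] > 2 * totalMean) reasons.push('total');
    for (const name of skuNames) {
      if ((d.skus[name] ?? 0) > 3 * skuMeans[name]) reasons.push(name);
    }
    return reasons.length > 0 ? [{ date: d.date, reasons }] : [];
  });
}
```

Because the spike day is included in its own baseline, very short windows dampen the signal; a real implementation might exclude the day under test or use a median, but the source does not say which.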
