---
name: nextjs-seo-indexing
description: "Fix SEO indexing issues, crawl budget problems, and Search Console coverage errors for Next.js apps. Covers canonical tags, noindex audits, sitemap health, static rendering, and internal linking."
category: seo
risk: safe
source: self
source_type: self
date_added: "2026-05-31"
author: Whoisabhishekadhikari
tags: [seo, indexing, nextjs, search-console, crawl-budget, canonical, sitemap]
tools: [claude, cursor, gemini, claude-code]
version: 1.0.0
---
# Next.js SEO Indexing & Crawl Budget Skill
Fix Google Search Console coverage issues, canonical problems, sitemap errors, and crawl budget waste in Next.js apps.

---
## When to Use
- Use when a Next.js site has Google Search Console coverage issues such as duplicate canonicals, accidental noindex, crawl waste, or discovered-but-not-indexed URLs.
- Use when auditing sitemap, robots.txt, redirect, internal-linking, or static-rendering problems before an SEO release.
- Use when you need framework-specific examples for Next.js App Router metadata, `generateMetadata`, `robots.js`, and sitemap routes.
---
## Understanding Search Console Coverage States
| Status | Meaning | Fix |
|--------|---------|-----|
| Crawled - currently not indexed | Google crawled the page but chose not to index it | Improve content quality, canonical signals, and internal links |
| Duplicate without user-selected canonical | Multiple URLs serve the same content and none declares a canonical | Add an explicit canonical pointing to the preferred URL |
| Excluded by 'noindex' tag | A `noindex` directive is present (meta tag or `X-Robots-Tag` header) | Remove the noindex if the page should be indexed |
| Duplicate, Google chose different canonical than user | Google prefers a different URL than the one you declared | Either adopt Google's choice, or strengthen signals (internal links, sitemap, redirects) for your preferred URL |
| Alternate page with proper canonical tag | Correct: a non-preferred duplicate that points to its canonical | Expected behavior, not a problem |
| Not found (404) | Page deleted or URL changed | Add a redirect or restore the page |
| Discovered - currently not indexed | Google knows the URL exists but hasn't crawled it yet | Improve internal linking and reduce crawl waste |
| Page with redirect | The URL redirects, so only the destination can be indexed | Expected for intentional redirects; otherwise shorten the chain and verify the destination |
---
## Step 1 — Canonical Audit
### Next.js App Router (metadata export)
```js
// app/blog/my-post/page.js
export const metadata = {
  title: 'My Post Title',
  alternates: {
    canonical: 'https://www.yourdomain.com/blog/my-post',
  },
};
```
### Next.js App Router (generateMetadata)
```js
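export async function generateMetadata({ params }) {
  // `params` is a Promise in Next.js 15+; awaiting it is also safe on older versions
  const { slug } = await params;
  return {
    alternates: {
      canonical: `https://www.yourdomain.com/blog/${slug}`,
    },
  };
}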
```
### Common canonical mistakes to fix:
```js
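// ❌ RISKY: relative URL (only resolved to an absolute URL if `metadataBase` is set in the root layout)
canonical: '/blog/my-post'
// ❌ WRONG: inconsistent trailing slashes, e.g. '/blog/my-post' on some pages and '/blog/my-post/' on others
// (pick one form and use it sitewide)
// ✓ CORRECT: absolute URL, consistent scheme + subdomain
canonical: 'https://www.yourdomain.com/blog/my-post'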
```
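One way to keep canonicals consistent is to build every one through a single helper. The sketch below is illustrative only: `SITE_ORIGIN`, the `buildCanonical` name, the no-trailing-slash policy, and the choice to drop query strings are all assumptions, not part of Next.js.

```js
// Hypothetical helper; SITE_ORIGIN and the normalization policy are assumptions.
const SITE_ORIGIN = 'https://www.yourdomain.com';

function buildCanonical(pathOrUrl) {
  // Resolve relative paths against the preferred origin.
  const url = new URL(pathOrUrl, SITE_ORIGIN);
  // Force the preferred scheme and host even if the input used another one.
  const preferred = new URL(SITE_ORIGIN);
  url.protocol = preferred.protocol;
  url.host = preferred.host;
  // Assumed policy: canonicals carry no query string or fragment.
  url.search = '';
  url.hash = '';
  // Assumed policy: no trailing slash, except on the root path.
  if (url.pathname.length > 1 && url.pathname.endsWith('/')) {
    url.pathname = url.pathname.slice(0, -1);
  }
  return url.toString();
}
```

With this policy, `buildCanonical('/blog/my-post/')` and `buildCanonical('http://yourdomain.com/blog/my-post?utm=1')` both return `https://www.yourdomain.com/blog/my-post`. If some pages depend on query parameters for their content, the query-stripping step would need an allowlist.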
---
## Step 2 — Noindex Audit
Find pages that are accidentally noindexed:
```bash
# Search for noindex in metadata (Next.js expresses it as `index: false`)
grep -rlE "noindex|index:[[:space:]]*false" --include="*.js" --include="*.ts" --include="*.jsx" --include="*.tsx" app/ pages/
# Check the root layout: a noindex here is inherited by ALL pages
grep -n "robots" app/layout.*
```
In the Next.js App Router, `robots` metadata set in the root layout is inherited by every page that does not override it. Set it there only if you want sitewide control.
```js
// app/layout.js — only set robots if you need sitewide control
export const metadata = {
  // ✓ Allow indexing
  robots: { index: true, follow: true },
  // ❌ This would noindex the entire site:
  // robots: { index: false }
};
```
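Source-level greps can miss a noindex that only appears in the rendered response, so it can help to check the deployed HTML and headers as well. The following is a rough sketch, not a full HTML parser: `findNoindexSignals` is a hypothetical helper that uses regular expressions and assumes the caller has already fetched the page body and response headers.

```js
// Hypothetical helper: reports noindex signals found in a page's HTML and headers.
function findNoindexSignals(html, headers = {}) {
  const signals = [];
  // Header-level directive, e.g. "X-Robots-Tag: noindex".
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() === 'x-robots-tag' && /noindex/i.test(String(value))) {
      signals.push(`header: ${value}`);
    }
  }
  // Meta-level directive, e.g. <meta name="robots" content="noindex, follow">.
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    const isRobots = /\bname\s*=\s*["']?(?:robots|googlebot)(?=["'\s\/>])/i.test(tag);
    const hasNoindex = /\bcontent\s*=\s*["'][^"']*noindex/i.test(tag);
    if (isRobots && hasNoindex) signals.push(`meta: ${tag}`);
  }
  return signals;
}
```

An empty array means neither signal was found; any entry names where the directive came from, which helps distinguish a layout-level metadata problem from a server or CDN header.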
---
## Step 3 — Sitemap Health
### Verify sitemap routes return 200 + valid XML
```bash
curl -sI https://www.yourdomain.com/sitemap.xml | grep -iE "^HTTP|^content-type"
curl -s https://www.yourdomain.com/sitemap.xml | head -20
```
### Next.js App Router sitemap (recommended pattern)
```js
// app/sitemap.js
export default async function sitemap() {
  const baseUrl = 'https://www.yourdomain.com';
  // Static pages
  const staticPages = [
    { url: baseUrl, lastModified: new Date(), changeFrequency: 'daily', priority: 1.0 },
    { url: `${baseUrl}/about`, lastModified: new Date(), changeFrequency: 'monthly', priority: 0.8 },
  ];
  // Dynamic pages (fetch from DB or CMS)
  const posts = await getPosts(); // your data fetch
  const dynamicPages = posts.map(post => ({
    url: `${baseUrl}/blog/${post.slug}`,
    lastModified: new Date(post.updatedAt),
    changeFrequency: 'weekly',
    priority: 0.7,
  }));
  return [...staticPages, ...dynamicPages];
}
```
```
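Before returning entries from a sitemap route, a small sanity check can catch relative URLs, duplicates, wrong hosts, and unparseable dates. This is a minimal sketch; `validateSitemapEntries` is a hypothetical helper, and the entry shape (`url`, `lastModified`) simply mirrors the objects used in the sitemap example.

```js
// Hypothetical helper: returns a list of human-readable problems (empty if none).
function validateSitemapEntries(entries, baseUrl) {
  const problems = [];
  const seen = new Set();
  const origin = new URL(baseUrl).origin;
  for (const entry of entries) {
    let parsed;
    try {
      parsed = new URL(entry.url); // throws for relative URLs
    } catch {
      problems.push(`not an absolute URL: ${entry.url}`);
      continue;
    }
    if (parsed.origin !== origin) problems.push(`wrong origin: ${entry.url}`);
    if (seen.has(parsed.href)) problems.push(`duplicate: ${entry.url}`);
    seen.add(parsed.href);
    const { lastModified } = entry;
    if (lastModified !== undefined && Number.isNaN(new Date(lastModified).getTime())) {
      problems.push(`invalid lastModified: ${entry.url}`);
    }
  }
  return problems;
}
```

Calling it on the combined array just before `return` and logging or throwing on a non-empty result would surface these issues at build time rather than in Search Console.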
### Multiple sitemaps (sitemap index)
```js
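// app/sitemap-tools/sitemap.js  (served under /sitemap-tools/)
// app/sitemap-blog/sitemap.js   (served under /sitemap-blog/)
// Each file default-exports a function that returns an array of URL entries.
// Submit each resulting sitemap URL in Search Console or list it in robots.txt.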
```
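The sitemap protocol caps a single sitemap file at 50,000 URLs, which is the usual reason to split. The sketch below shows only the chunking arithmetic; `chunkSitemapEntries` and `sitemapIds` are hypothetical helpers. Next.js documents a `generateSitemaps` export that returns a list of `{ id }` objects like the one produced here, but how the `id` reaches the sitemap function has varied between versions, so check the documentation for yours.

```js
// Per-file URL limit from the sitemap protocol.
const MAX_URLS_PER_SITEMAP = 50000;

// Hypothetical helper: split a flat list of entries into sitemap-sized chunks.
function chunkSitemapEntries(entries, size = MAX_URLS_PER_SITEMAP) {
  const chunks = [];
  for (let i = 0; i < entries.length; i += size) {
    chunks.push(entries.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical helper: one { id } per chunk, for a given total number of entries.
function sitemapIds(totalEntries, size = MAX_URLS_PER_SITEMAP) {
  const count = Math.ceil(totalEntries / size);
  return Array.from({ length: count }, (_, id) => ({ id }));
}
```

For example, 120,000 URLs would need three sitemap files, with ids 0 through 2.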
---
## Step 4 — Static Rendering Verification
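Canonical, robots, and other SEO tags must be present in the initial HTML response. Important pages should therefore be statically generated, or server-rendered with their metadata in the HTML, rather than filled in client-side.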
```bash
# Check build output: content pages should show ○ (static) or ● (SSG), not λ (dynamic; printed as ƒ in newer Next.js versions)
npm run build 2>&1 | grep -E "○|●|λ|ƒ|/blog|/tools"
```
```
○ /about (static)
● /blog/[slug] (SSG) ← good
λ /api/data (serverless) ← expected for APIs
```
If important pages are marked `λ` (or `ƒ` in newer Next.js versions), meaning they render on every request with no static generation, add:
```js
// app/blog/[slug]/page.js
export async function generateStaticParams() {
  const posts = await getPosts();
  return posts.map(post => ({ slug: post.slug }));
}
```
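The build-output check can be automated in CI by scanning for dynamic markers on routes that should be static. This is a sketch under assumptions: `findDynamicRoutes` is a hypothetical helper, it expects the captured text of `npm run build`, and it treats both `λ` and `ƒ` as "dynamic" because the marker differs between Next.js versions.

```js
// Hypothetical helper: list important routes that the build marked as dynamic.
function findDynamicRoutes(buildOutput, importantPrefixes) {
  const dynamic = [];
  for (const line of buildOutput.split('\n')) {
    // A route line has a marker symbol followed by a path starting with "/".
    const match = line.match(/([○●λƒ])\s+(\/\S*)/);
    if (!match) continue;
    const [, marker, route] = match;
    const isDynamic = marker === 'λ' || marker === 'ƒ';
    if (isDynamic && importantPrefixes.some(prefix => route.startsWith(prefix))) {
      dynamic.push(route);
    }
  }
  return dynamic;
}
```

Passing prefixes such as `['/blog', '/tools']` keeps API routes, which are expected to be dynamic, out of the report.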
---
## Step 5 — Internal Linking Audit
Pages with zero internal links are rarely indexed. Every important page should be reachable from:
1. Homepage or navigation
2. A sitemap
3. At least one other content page
```bash
# Find pages that have no inbound links from other pages
# (manual check — grep for the slug across all files)
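grep -r "/blog/my-orphan-post" --include="*.js" --include="*.ts" --include="*.jsx" --include="*.tsx" --include="*.md" --exclude-dir=node_modules --exclude-dir=.next . | grep -v "sitemap\|the-page-itself"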
```
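When the list of pages and their outbound links is available (from a crawl export or a CMS query), orphan detection reduces to counting inbound links. The sketch below assumes a hypothetical `linkGraph` object mapping each page path to the paths it links to; how that graph is produced is outside its scope.

```js
// Hypothetical helper: return important pages with no inbound link from another page.
function findOrphanPages(linkGraph, importantPages) {
  const inbound = new Map(importantPages.map(page => [page, 0]));
  for (const [source, targets] of Object.entries(linkGraph)) {
    // De-duplicate targets so repeated links from one page count once.
    for (const target of new Set(targets)) {
      // Self-links do not help discovery, so they are ignored.
      if (target !== source && inbound.has(target)) {
        inbound.set(target, inbound.get(target) + 1);
      }
    }
  }
  return importantPages.filter(page => inbound.get(page) === 0);
}
```

Sitemap entries are deliberately not counted here, in line with the grep above excluding sitemap matches.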
---
## Step 6 — Redirect Audit
```bash
# Find all redirects in Next.js config
grep -A 3 "redirects" next.config.js
# Check for redirect chains (A → B → C should be A → C)
# Follow a suspected chain and print every hop:
curl -sIL https://www.yourdomain.com/old-url | grep -iE "^HTTP|^location"
```
```js
// next.config.js: keep redirects flat (no chains)
module.exports = {
  async redirects() {
    return [
      {
        source: '/old-url',
        destination: '/new-url', // Must NOT itself redirect
        permanent: true, // 308 permanent redirect
      },
    ];
  },
};
```
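Chains inside the config itself can be found without any network requests by checking whether a rule's destination is also some rule's source. This is a minimal sketch: `findRedirectChains` is a hypothetical helper, and it only understands literal paths, not pattern sources such as `/old/:slug`.

```js
// Hypothetical helper: returns each chain (or loop) as an array of hops.
function findRedirectChains(rules) {
  const bySource = new Map(rules.map(rule => [rule.source, rule.destination]));
  const chains = [];
  for (const { source, destination } of rules) {
    const path = [source, destination];
    let next = bySource.get(destination);
    // Keep following while the current destination redirects again; stop on a loop.
    while (next !== undefined && !path.includes(next)) {
      path.push(next);
      next = bySource.get(next);
    }
    // More than two hops is a chain; a leftover `next` means the path loops back.
    if (path.length > 2 || next !== undefined) chains.push(path);
  }
  return chains;
}
```

For rules `/a → /b` and `/b → /c`, the result contains `['/a', '/b', '/c']`, and the fix is to point `/a` directly at `/c`.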
---
## Step 7 — robots.txt Check
```bash
curl -s https://www.yourdomain.com/robots.txt
```
```
# ✓ Good
User-agent: *
Allow: /
Sitemap: https://www.yourdomain.com/sitemap.xml
# ❌ Bad — disallows crawling of important content
Disallow: /blog/
Disallow: /tools/
```
```js
// app/robots.js (Next.js App Router)
export default function robots() {
  return {
    rules: { userAgent: '*', allow: '/' },
    sitemap: 'https://www.yourdomain.com/sitemap.xml',
  };
}
```
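To confirm that specific important paths are not blocked, the fetched robots.txt can be evaluated against them. The sketch below is a deliberate simplification: `isPathBlocked` is a hypothetical helper that reads only the `User-agent: *` group, treats rules as plain prefixes (no `*` or `$` wildcards), and picks the longest matching rule, with `Allow` winning ties.

```js
// Hypothetical, simplified check: is `path` disallowed for `User-agent: *`?
function isPathBlocked(robotsTxt, path) {
  let inStarGroup = false;
  let lastWasAgent = false;
  let best = { length: -1, allow: true }; // no matching rule means allowed
  for (const rawLine of robotsTxt.split('\n')) {
    const line = rawLine.split('#')[0].trim(); // strip comments
    const sep = line.indexOf(':');
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === 'user-agent') {
      // Consecutive User-agent lines share one group of rules.
      inStarGroup = (lastWasAgent && inStarGroup) || value === '*';
      lastWasAgent = true;
      continue;
    }
    lastWasAgent = false;
    const isRule = field === 'allow' || field === 'disallow';
    if (!inStarGroup || !isRule || value === '' || !path.startsWith(value)) continue;
    const allow = field === 'allow';
    if (value.length > best.length || (value.length === best.length && allow)) {
      best = { length: value.length, allow };
    }
  }
  return !best.allow;
}
```

Running it over the list of URLs from the sitemap gives a quick cross-check that nothing submitted for indexing is also disallowed for crawling.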
---
## Indexing Checklist
- [ ] All important pages have absolute canonical URLs
- [ ] No important pages accidentally noindexed
- [ ] Sitemap routes return 200 with valid XML
- [ ] Sitemap submitted to Google Search Console
- [ ] Important pages statically generated (○ or ●) in build output
- [ ] No redirect chains (A→B→C should be A→C)
- [ ] robots.txt allows important content
- [ ] Every important page has ≥1 internal inbound link
- [ ] `generateStaticParams` added for dynamic routes with known slugs
---
## Limitations
- Does not guarantee Google will index a page; final indexing decisions remain with the search engine.
- Requires access to the codebase, deployed URLs, and ideally Google Search Console data for confident diagnosis.
- Treat recommendations that change URL structure, redirects, or canonical policy as production-impacting and review them before deployment.