Next.js · 2026-05-07 · 35 min read

Google deindexed half of my Next.js site. Here's the four-phase recovery.

By GasPriceCheck


Current Situation Analysis

A programmatic Next.js 15 site (ISR on Vercel, Cloudflare DNS proxy) managing ~33,620 ZIP code pages experienced sudden, large-scale deindexing. Google Search Console (GSC) flagged 87 hard 404s, 61 soft 404s, and a significant backlog of "crawled, currently not indexed" URLs.

Pain Points & Failure Modes:

  • Thin SSR Content: Un-cached long-tail ZIP pages rendered only a hero, search box, and loading placeholder (~640 visible words, mostly global nav/footer). Google's crawler classified this as a "glorified 404 wearing a 200 costume," triggering soft 404s.
  • Redirect Hop Chains: Multi-layer infrastructure (Cloudflare handling http→https and apex→www, Vercel handling slug migrations) created 2-hop redirect chains. Google's crawler penalizes multi-hop redirects, causing validation failures and signal dilution.
  • Broken Internal Graph: Missing state abbreviation lookups and client-side-only nearby ZIP grids resulted in orphaned pages and zero internal link equity flow for programmatic pages.
  • Geocoding Data Gaps: Incomplete ZIP-to-lat/lng mappings caused distance calculations to return nonsensical values (0,0 coordinates), breaking nearby ZIP grids and user trust signals.

Why Traditional Methods Fail: Standard 404 cleanup or simple sitemap.xml resubmissions do not address the root cause: crawler-visible content depth and infrastructure-level redirect topology. Framework-level fixes alone cannot override CDN redirect stacking, and client-side data fetching remains invisible to server-side crawlers.

WOW Moment: Key Findings

| Approach | SSR Visible Word Count | Redirect Hop Count | Internal Links/Page | GSC Indexing Status |
| --- | --- | --- | --- | --- |
| Baseline (pre-fix) | ~640 | 2 | 0–1 | 61 soft 404s, 87 hard 404s |
| Phase 1 & 1.5 (redirect topology) | ~640 | 1 | 0–1 | Hard 404s cleared, soft 404s persist |
| Phase 2 & 3 (SSR enrichment + geocoding) | ~890 | 1 | 8–12 | 0 soft/hard 404s, full indexing |

Key Findings:

  • SSR Threshold: Programmatic pages require ~850+ unique, crawler-visible words to avoid soft 404 classification. Adding a server-rendered nearby ZIP grid and static save tips crossed this threshold.
  • Hop Count Sensitivity: Reducing redirects from 2 to 1 hop immediately resolved stuck hard 404s. Google's crawler abandons or deprioritizes chains exceeding a single redirect.
  • Internal Link Density: Increasing internal links from <1 to 8–12 per page significantly improved site graph traversal and contextual relevance signals.
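The word-count finding above can be spot-checked by stripping tags from the raw SSR response and counting tokens. A rough sketch (regex-based stripping is approximate, a production audit might use an HTML parser, and the ~850-word threshold is this site's empirical figure, not a documented Google limit):

```typescript
// Sketch: estimate crawler-visible word count from raw SSR HTML.
export function countVisibleWords(html: string): number {
  const text = html
    .replace(/<script[\s\S]*?<\/script>/gi, " ") // drop inline JS
    .replace(/<style[\s\S]*?<\/style>/gi, " ")   // drop inline CSS
    .replace(/<[^>]+>/g, " ")                    // strip remaining tags
    .replace(/&[a-z#0-9]+;/gi, " ");             // drop HTML entities
  return text.split(/\s+/).filter(Boolean).length;
}

// Usage: audit the server response, not the hydrated DOM, e.g.:
// const html = await (await fetch("https://www.gas-price-check.com/78701")).text();
// if (countVisibleWords(html) < 850) console.warn("soft 404 risk");
```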

Core Solution

Phase 1: Framework-Level Redirect Mapping

Hard 404s from historical slug changes (e.g. /austin → /austin-tx) were resolved using Next.js redirects() with permanent: true (308). URLs without clear successors were mapped to parent state pages to preserve link equity.

// next.config.js — permanent (308) redirects for stale city slugs
const STALE_CITY_REDIRECTS = [
  { source: '/austin', destination: '/austin-tx', permanent: true },
  { source: '/dallas', destination: '/dallas-tx', permanent: true },
  // ...64 more
];

module.exports = {
  async redirects() {
    return STALE_CITY_REDIRECTS;
  },
};

Phase 1.5: CDN Redirect Consolidation

Multi-layer redirects were collapsed into a single Cloudflare Redirect Rule, eliminating the 2-hop chain. Shown schematically (the actual rule uses Cloudflare's expression syntax):

If: hostname matches "gas-price-check.com" AND scheme is http
Then: 308 to https://www.gas-price-check.com/$path
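To verify the consolidation, hop counts can be read off `curl -sIL` header output: each 3xx status line is one hop. A sketch using a captured transcript in place of a live request:

```shell
# Sketch: count redirect hops in `curl -sIL` header output.
# A live audit would capture real headers, e.g.:
#   curl -sIL "https://gas-price-check.com/78701" > headers.txt
cat > headers.txt <<'EOF'
HTTP/2 308
location: https://www.gas-price-check.com/78701
HTTP/2 200
EOF

# Each 301/302/303/307/308 status line is one hop; the target is at most 1.
hops=$(grep -c '^HTTP[^ ]* 30[12378]' headers.txt)
echo "redirect hops: $hops"
```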

Phase 2: SSR Content Enrichment

Client-side data fetching was moved to server components to ensure crawler visibility. Unconditional local content and fallback link resolution were added to prevent thin content and broken graph edges.

// Before: client-side, invisible to Google
const nearby = useNearbyZips(zip);

// After: server-rendered, visible to Google
const nearby = await getNearbyZips(zip, 25);
return (
  <section>
    <h2>Nearby ZIP codes</h2>
    <ul>
      {nearby.map(z => <li key={z}><Link href={`/${z}`}>{z}</Link></li>)}
    </ul>
  </section>
);
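The article doesn't show `getNearbyZips` itself; a minimal server-side sketch, assuming a ZIP-to-coordinate table (the three entries here are illustrative) and a haversine radius filter:

```typescript
// Sketch of a server-side nearby-ZIP lookup. Names and data are illustrative;
// in production the table would come from the Phase 3 geocoding store.
type Coord = { lat: number; lng: number };

const ZIP_COORDS: Record<string, Coord> = {
  "78701": { lat: 30.2711, lng: -97.7437 },
  "78702": { lat: 30.2633, lng: -97.7144 },
  "75201": { lat: 32.7876, lng: -96.7995 },
};

// Great-circle distance in miles (haversine).
function distanceMiles(a: Coord, b: Coord): number {
  const R = 3958.8; // Earth radius in miles
  const toRad = (d: number) => (d * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLng = toRad(b.lng - a.lng);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLng / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(h));
}

// ZIPs within `radiusMiles`, nearest first, excluding the page's own ZIP.
export async function getNearbyZips(zip: string, radiusMiles: number): Promise<string[]> {
  const origin = ZIP_COORDS[zip];
  if (!origin) return []; // render an empty grid rather than (0,0) garbage
  return Object.entries(ZIP_COORDS)
    .filter(([z]) => z !== zip)
    .map(([z, c]) => [z, distanceMiles(origin, c)] as const)
    .filter(([, d]) => d <= radiusMiles)
    .sort((a, b) => a[1] - b[1])
    .map(([z]) => z);
}
```

Because this runs in a server component, every link in the grid lands in the initial HTML that Googlebot crawls.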

State backlink fallback implementation:

const stateName = getStateByAbbr(state) ?? getStateByName(state) ?? state;
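A stand-in for the lookup chain above, with hypothetical `getStateByAbbr`/`getStateByName` helpers over illustrative data, showing why the raw slug segment is kept as the last resort:

```typescript
// Sketch of the abbreviation → name → raw-slug fallback chain (data illustrative).
const STATES_BY_ABBR: Record<string, string> = { TX: "Texas", CA: "California" };
const STATE_NAMES = new Set(Object.values(STATES_BY_ABBR));

const getStateByAbbr = (s: string): string | undefined =>
  STATES_BY_ABBR[s.toUpperCase()];

const getStateByName = (s: string): string | undefined => {
  const name = s.charAt(0).toUpperCase() + s.slice(1).toLowerCase();
  return STATE_NAMES.has(name) ? name : undefined;
};

// Never render an undefined link label: fall back to the raw slug segment
// so the state backlink always exists and the page is never orphaned.
export const resolveStateName = (state: string): string =>
  getStateByAbbr(state) ?? getStateByName(state) ?? state;
```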

Phase 3: Geocoding Resolution & Caching

The ZIP-to-lat/lng resolver was updated to handle missing coordinates. A fallback geocoding service was integrated, and resolved coordinates were cached in Redis to prevent repeated API calls and ensure consistent distance calculations for the nearby ZIP grid.
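The resolver itself isn't shown in the article; a sketch under these assumptions: a primary static dataset, a fallback geocoding call, and Redis modeled as a minimal async get/set interface. Treating (0,0) as missing data is the key guard:

```typescript
// Sketch of the Phase 3 resolver. `primary`, `fallback`, and the cache
// interface are illustrative; the article used Redis for the cache layer.
type Coord = { lat: number; lng: number };

interface CoordCache {
  get(key: string): Promise<Coord | null>;
  set(key: string, value: Coord): Promise<void>;
}

// (0,0) means "missing coordinates", not a point in the Gulf of Guinea.
const isValid = (c: Coord | null | undefined): c is Coord =>
  !!c && !(c.lat === 0 && c.lng === 0);

export async function resolveCoords(
  zip: string,
  primary: (zip: string) => Promise<Coord | null>,
  fallback: (zip: string) => Promise<Coord | null>,
  cache: CoordCache,
): Promise<Coord | null> {
  const cached = await cache.get(`zip:coord:${zip}`);
  if (isValid(cached)) return cached;

  // Static dataset first; fallback geocoding service only on a miss.
  const coord =
    (await primary(zip).then((c) => (isValid(c) ? c : null))) ??
    (await fallback(zip).then((c) => (isValid(c) ? c : null)));

  if (coord) await cache.set(`zip:coord:${zip}`, coord); // cache only good data
  return coord;
}
```

Caching resolved coordinates keeps the nearby-ZIP grid deterministic across renders and avoids re-hitting the fallback API for the same ZIP.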

Pitfall Guide

  1. Ignoring Multi-Layer Redirect Chains: Framework and CDN redirect rules stack by default. Always audit hop counts with curl -IL <url> and consolidate transformations at the edge to prevent crawler abandonment.
  2. Relying on Client-Side Rendering for Crawler-Visible Content: Google's crawler indexes SSR HTML. Data fetching must occur in server components or getServerSideProps/getStaticProps to ensure content depth is visible during initial crawl.
  3. Treating Soft 404s as Hard 404s: Soft 404s indicate insufficient content depth, not missing routes. Fix by enriching SSR output with unique, page-specific data (grids, tips, metadata) rather than redirecting or deleting.
  4. Neglecting Internal Link Graph Consistency: Broken or missing internal links (e.g., undefined state names) create orphaned pages and dilute crawl budget. Implement fallback resolution and validate link integrity across all programmatic templates.
  5. Assuming Static/Default Geocoding Data is Complete: ZIP code boundaries and coordinates change. Relying on a single static source causes calculation errors and empty UI states. Implement fallback resolvers and cache results to maintain data integrity.
  6. Overlooking SSR Word Count Thresholds: Google evaluates content depth relative to page type. Programmatic pages typically require 800+ unique, crawler-visible words to avoid soft 404 classification. Audit SSR output, not just client-side DOM.

Deliverables

  • 📘 Recovery Blueprint: 4-phase diagnostic & remediation workflow covering GSC error classification, redirect topology optimization, SSR content enrichment, and data resolver fallbacks.
  • ✅ Crawl Health Checklist:
    • Audit GSC for hard/soft 404s and "crawled, not indexed" backlog
    • Map redirect chains with curl -IL and consolidate to single-hop
    • Verify SSR HTML contains 800+ unique, page-specific words
    • Validate internal link density (8–12 links/page) and fallback resolution
    • Test geocoding/distance calculations for edge-case ZIPs
  • ⚙️ Configuration Templates:
    • next.config.js redirect mapping structure (308 permanent)
    • Cloudflare Redirect Rule syntax for edge-level scheme/host normalization
    • Server component pattern for crawler-visible data grids with fallback resolution