Three post-deploy checks I run after every Cloudflare Pages build
Beyond Build Success: Validating Static Deployments in Production
Current Situation Analysis
Modern CI/CD pipelines are optimized for artifact generation, not runtime reality. When you deploy a static site generator (SSG) like Astro 5 to an edge network such as Cloudflare Pages, the pipeline celebrates a green build and exits. This creates a dangerous blind spot: the gap between successful compilation and actual edge availability. Developers routinely assume that a passing build equals a live, crawlable, and performant site. In practice, static deployments introduce a unique failure surface that only manifests after CDN propagation, routing rule evaluation, and third-party crawler interaction.
This problem is systematically overlooked because traditional testing strategies focus on unit coverage, integration mocks, or pre-deploy E2E flows. Those strategies validate code logic, not deployment topology. Routing files like `_redirects` are evaluated at the edge, not during the build phase. A misconfigured rewrite rule can silently block search engine crawlers while appearing perfectly functional in a browser, since browsers automatically follow HTTP 301/302 responses. Similarly, build-time data pipelines (e.g., querying Turso or SQLite during SSG compilation) can fail to populate expected content without throwing a build error, resulting in empty sub-sitemaps or missing route segments.
Real-world incident data from production static deployments shows that routing misconfigurations can persist for up to five days before detection. IndexNow verification failures often go unnoticed until search rankings drop, and performance regressions from CSS framework updates (like Tailwind v4 layout shifts) compound silently across deploys. The industry has over-indexed on build-time validation while under-investing in post-deploy artifact verification. For static and SSG architectures, the runtime is simply pre-rendered HTML, CSS, and JSON served from an edge cache. The failure surface is narrow but critical: crawlability, indexing velocity, and baseline performance. Validating these three dimensions after every deployment closes the gap between compilation success and production readiness.
Key Findings
The shift from build-gate validation to post-deploy artifact verification fundamentally changes how teams measure deployment health. Traditional CI/CD metrics focus on compile time, test coverage, and bundle size. Post-deploy validation shifts the metric focus to edge propagation latency, crawler accessibility, and indexing submission success. The following comparison illustrates the operational impact of adopting a targeted post-deploy validation pipeline versus relying solely on traditional gates or full E2E suites.
| Approach | Detection Latency | CDN Propagation Awareness | Maintenance Overhead | False Positive Rate |
|---|---|---|---|---|
| Traditional Build-Gate CI | High (misses edge routing) | None | Low | Low |
| Full E2E Testing Suite | Medium (mocks edge behavior) | Simulated only | High | Medium |
| Post-Deploy Validation Pipeline | Low (validates live edge) | Native | Low-Medium | Very Low |
This finding matters because it decouples deployment velocity from validation accuracy. Full E2E suites are expensive to maintain and often fail due to flaky network conditions or mocked edge behavior. Build gates catch syntax and logic errors but remain blind to routing rules, CDN caching headers, and third-party API dependencies. A targeted post-deploy pipeline validates exactly what matters for static architectures: whether crawlers can reach the sitemap, whether indexing APIs accept the live URLs, and whether performance baselines remain stable. The result is faster feedback loops, fewer false alarms, and immediate detection of edge-specific failures that would otherwise degrade SEO and user experience.
Core Solution
The validation pipeline consists of three independent checks, each targeting a specific failure mode in static edge deployments. The architecture prioritizes speed, accuracy, and separation of concerns.
Step 1: Sitemap Integrity & Reachability Validation
Search engines rely on `sitemap-index.xml` as the entry point for crawling. A single misconfigured `_redirects` rule can rewrite this path to a sub-sitemap, causing crawlers to receive a 301 instead of a 200. Browsers mask this behavior, but crawlers and validation tools require explicit success status codes.
The validation script performs two operations:
- Verifies that `sitemap-index.xml` returns HTTP 200 without following redirects.
- Parses the XML to extract sub-sitemap URLs and validates that each contains a minimum expected URL count.
```typescript
// scripts/validate-sitemap.ts
import { fetch } from 'undici';
import { parseStringPromise } from 'xml2js';

interface SitemapConfig {
  domain: string;
  minUrlCount: number;
}

async function validateSitemap(config: SitemapConfig): Promise<void> {
  const baseUrl = `https://${config.domain}`;

  // Check 1: index reachability without redirect following
  const indexResponse = await fetch(`${baseUrl}/sitemap-index.xml`, {
    redirect: 'manual',
    headers: { 'User-Agent': 'SitemapValidator/1.0' }
  });
  if (indexResponse.status !== 200) {
    throw new Error(`[${config.domain}] sitemap-index.xml returned ${indexResponse.status}`);
  }

  // Check 2: parse and validate sub-sitemap counts
  const indexXml = await indexResponse.text();
  const parsed = await parseStringPromise(indexXml);
  const sitemaps = parsed.sitemapindex.sitemap || [];

  for (const site of sitemaps) {
    const loc = site.loc[0];
    const subResponse = await fetch(loc, { redirect: 'manual' });
    if (subResponse.status !== 200) {
      // Skipping a non-200 sub-sitemap silently would defeat the check: fail loudly.
      throw new Error(`[${config.domain}] ${loc} returned ${subResponse.status}`);
    }
    const subXml = await subResponse.text();
    const subParsed = await parseStringPromise(subXml);
    const urlCount = subParsed.urlset.url?.length || 0;
    if (urlCount < config.minUrlCount) {
      throw new Error(
        `[${config.domain}] ${loc} contains ${urlCount} URLs (min: ${config.minUrlCount})`
      );
    }
    console.log(`[${config.domain}] ${loc} → ${urlCount} URLs ✓`);
  }
}

// Usage
const domains: SitemapConfig[] = [
  { domain: 'aiappdex.com', minUrlCount: 1000 },
  { domain: 'findindiegame.com', minUrlCount: 150 },
  { domain: 'ossfind.com', minUrlCount: 100 }
];

Promise.all(domains.map(validateSitemap)).catch(err => {
  console.error('Validation failed:', err.message);
  process.exit(1);
});
```
Architecture Rationale:

- `redirect: 'manual'` prevents automatic 301/302 following, exposing routing misconfigurations that browsers hide.
- XML parsing validates structural integrity, not just HTTP status. Empty sub-sitemaps indicate silent ETL or build pipeline failures.
- Thresholds are domain-specific, accounting for varying content volumes.
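Because this check runs against the live edge, a freshly deployed sitemap can fail transiently while the CDN is still propagating. A small retry wrapper smooths that out; this is a sketch of one possible helper (the `retries` and `delayMs` defaults are illustrative, not part of the original scripts):

```typescript
// Hypothetical retry wrapper: absorbs short CDN propagation delays by
// re-running a check a few times before surfacing the failure.
type AsyncCheck = () => Promise<void>;

async function withRetries(
  check: AsyncCheck,
  retries = 3,
  delayMs = 30_000
): Promise<void> {
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      await check();
      return; // check passed
    } catch (err) {
      if (attempt === retries) throw err; // attempts exhausted: surface the error
      console.warn(`Attempt ${attempt} failed, retrying in ${delayMs / 1000}s`);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Wrapping the call as `withRetries(() => validateSitemap(config), 5, 60_000)` gives a fresh deploy a few minutes to propagate before the pipeline reports a real failure.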
Step 2: IndexNow Batch Submission
IndexNow is a protocol that notifies search engines (Bing, Yandex, Naver, Seznam) of URL changes. It requires live, publicly accessible URLs and a verified key file (`/<key>.txt`) at the domain root. Submitting before CDN propagation completes results in 404 responses or stale content indexing.
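Since a missing or mis-deployed key file is the most common cause of a 403 from the API, a cheap pre-flight check can isolate it before any submission. A sketch (not part of the original scripts; it uses Node 18+'s global `fetch` rather than `undici`):

```typescript
// Pure decision: given the HTTP status and body of /<key>.txt, is it valid?
// The IndexNow spec requires the file to contain the key itself.
function keyFileIsValid(status: number, body: string, key: string): boolean {
  return status === 200 && body.trim() === key;
}

// Hypothetical pre-flight check: confirm the key file is live at the edge.
async function verifyKeyFile(domain: string, key: string): Promise<void> {
  const res = await fetch(`https://${domain}/${key}.txt`, { redirect: 'manual' });
  if (!keyFileIsValid(res.status, await res.text(), key)) {
    throw new Error(`[${domain}] /${key}.txt is missing, redirected, or mismatched`);
  }
}
```

Running this before the submission script turns an opaque 403 into an actionable "key file not deployed" error.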
```typescript
// scripts/submit-indexnow.ts
import { fetch } from 'undici';
import { parseStringPromise } from 'xml2js';

interface IndexNowConfig {
  domain: string;
  key: string;
  sitemapUrl: string;
}

async function submitToIndexNow(config: IndexNowConfig): Promise<void> {
  const response = await fetch(config.sitemapUrl);
  const xml = await response.text();
  const parsed = await parseStringPromise(xml);
  const urls = parsed.urlset.url?.map((u: any) => u.loc[0]) || [];

  if (urls.length === 0) {
    console.warn(`[${config.domain}] No URLs found in sitemap`);
    return;
  }

  const payload = {
    host: config.domain, // IndexNow expects the bare hostname, not a URL
    key: config.key,
    urlList: urls
  };

  const submitResponse = await fetch('https://api.indexnow.org/indexnow', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload)
  });

  if (submitResponse.status === 403) {
    throw new Error(
      `[${config.domain}] IndexNow 403: Verify /${config.key}.txt is deployed and accessible`
    );
  }
  if (!submitResponse.ok) {
    throw new Error(`[${config.domain}] IndexNow failed: ${submitResponse.status}`);
  }

  console.log(`[${config.domain}] Submitted ${urls.length} URLs → ${submitResponse.status}`);
}

// Usage
const configs: IndexNowConfig[] = [
  { domain: 'aiappdex.com', key: 'a1b2c3d4e5f6', sitemapUrl: 'https://aiappdex.com/sitemap-0.xml' },
  { domain: 'findindiegame.com', key: 'f7g8h9i0j1k2', sitemapUrl: 'https://findindiegame.com/sitemap-0.xml' },
  { domain: 'ossfind.com', key: 'l3m4n5o6p7q8', sitemapUrl: 'https://ossfind.com/sitemap-0.xml' }
];

Promise.all(configs.map(submitToIndexNow)).catch(err => {
  console.error('IndexNow submission failed:', err.message);
  process.exit(1);
});
```
Architecture Rationale:

- Decoupled from the build pipeline. Execution is triggered manually via `workflow_dispatch` after deployment succeeds, giving CDN propagation time to complete before submission.
- 403 detection explicitly checks for a missing key verification file, a common deployment oversight.
- Batch submission reduces API call volume and aligns with IndexNow rate limits.
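For sitemaps larger than the script above handles comfortably, the URL list can be split so each request stays within the protocol's documented 10,000-URL-per-POST limit. A sketch of one way to do it (`submitInBatches` is a hypothetical extension of the submission logic, using Node's global `fetch`):

```typescript
// Split a URL list into IndexNow-sized batches. The protocol documents a
// limit of 10,000 URLs per single POST.
function chunkUrls(urls: string[], size = 10_000): string[][] {
  const batches: string[][] = [];
  for (let i = 0; i < urls.length; i += size) {
    batches.push(urls.slice(i, i + size));
  }
  return batches;
}

// Hypothetical wrapper: one POST per batch, bare hostname as `host`.
async function submitInBatches(domain: string, key: string, urls: string[]): Promise<void> {
  for (const batch of chunkUrls(urls)) {
    const res = await fetch('https://api.indexnow.org/indexnow', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ host: domain, key, urlList: batch })
    });
    if (!res.ok) throw new Error(`[${domain}] batch submission failed: ${res.status}`);
  }
}
```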
Step 3: Scheduled Performance & Accessibility Baseline
Static sites should maintain stable performance metrics. Framework updates, CSS changes, or third-party script injections can introduce layout shifts or render-blocking resources. Lighthouse is used here as a trend monitor, not a deployment gate.
```yaml
# .github/workflows/lighthouse-baseline.yml
name: Weekly Lighthouse Baseline

on:
  schedule:
    - cron: '30 4 * * 1' # Monday 04:30 UTC
  workflow_dispatch:

jobs:
  audit:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        target:
          - domain: aiappdex.com
            path: /models/timm-vit-base-patch16-clip-224-openai/
          - domain: findindiegame.com
            path: /games/dredge-1562430/
          - domain: ossfind.com
            path: /alternatives/ghost/
    steps:
      - uses: actions/checkout@v4
      - name: Run Lighthouse CI
        uses: treosh/lighthouse-ci-action@v11
        with:
          urls: |
            https://${{ matrix.target.domain }}
            https://${{ matrix.target.domain }}${{ matrix.target.path }}
          uploadArtifacts: true
          temporaryPublicStorage: true
          config: |
            {
              "extends": "lighthouse:default",
              "settings": {
                "onlyCategories": ["performance", "accessibility", "best-practices"],
                "formFactor": "desktop"
              }
            }
```
Architecture Rationale:

- Weekly cron schedule balances monitoring frequency with static site update velocity. Daily runs are wasteful for pre-rendered content.
- Matrix strategy samples the homepage and deep routes, catching both global and route-specific regressions.
- `temporaryPublicStorage: true` enables historical diffing without permanent storage costs.
- No hard failure thresholds. Scores are treated as trend indicators; alerts trigger investigation, not deployment blocks.
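The alert-only policy can itself be scripted: compare a run against a stored baseline and report regressions without failing CI. Below is a sketch; the `Scores` shape and the thresholds (a 0.8 performance floor, 0.1 CLS ceiling, 0.05 drop tolerance) are illustrative choices, not values defined by Lighthouse:

```typescript
// Alert-only comparison of a Lighthouse run against a stored baseline.
interface Scores {
  performance: number; // Lighthouse category score, 0-1
  cls: number;         // cumulative layout shift
}

function findRegressions(current: Scores, baseline: Scores): string[] {
  const alerts: string[] = [];
  if (current.performance < 0.8) {
    alerts.push(`performance ${current.performance} below the 0.8 floor`);
  }
  if (current.cls > 0.1) {
    alerts.push(`CLS ${current.cls} exceeds 0.1`);
  }
  if (baseline.performance - current.performance > 0.05) {
    alerts.push(`performance dropped vs baseline ${baseline.performance}`);
  }
  return alerts; // log and notify on these; never call process.exit(1)
}
```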
Pitfall Guide
| Pitfall | Explanation | Fix |
|---|---|---|
| Following Redirects in Health Checks | Browsers and default HTTP clients auto-follow 301/302 responses, masking routing misconfigurations. Crawlers require explicit 200 status codes for sitemap discovery. | Use `redirect: 'manual'` (fetch) or `--max-redirs 0` (curl). Validate the status code before parsing the response body. |
| Premature IndexNow Submission | Submitting URLs before CDN propagation completes results in 404 responses or stale content indexing. Search engines may penalize repeated failed submissions. | Decouple submission from the build pipeline. Trigger manually via `workflow_dispatch` after deployment confirmation, or implement a 2-3 minute propagation delay. |
| Hard-Gating Lighthouse Scores | Static sites experience minor metric fluctuations due to network variance, third-party script loading, or Lighthouse sampling. Hard thresholds block deployments for negligible regressions. | Treat scores as trend monitors. Configure alert-only thresholds (e.g., Performance < 80, CLS > 0.1) and investigate regressions without blocking CI. |
| Ignoring Sub-Sitemap URL Counts | The main `sitemap-index.xml` may exist and return 200, but sub-sitemaps can be empty due to silent build pipeline failures or missing data sources. | Parse the XML structure and validate the URL count per sub-sitemap. Set domain-specific minimum thresholds based on expected content volume. |
| Misplaced `_redirects` Files | Cloudflare Pages evaluates `_redirects` from the publish directory root. Placing it in `public/` or `src/` without proper build copying results in ignored rules or unexpected rewrites. | Ensure `_redirects` is copied to the build output directory. Validate with `curl -I` against the live domain after deployment. |
| Assuming Build-Time DB Success Equals Runtime Content | SSG frameworks query databases during compilation. A failed query may return empty arrays without throwing errors, resulting in missing routes or empty sitemaps. | Validate output artifacts post-build. Check route count, sitemap size, and generated HTML files against expected baselines. |
| Over-Engineering Validation for Static Assets | Running full E2E tests or uptime monitors on pre-rendered static sites adds unnecessary complexity. Edge networks handle availability; the real risk is crawlability and indexing. | Focus validation on three dimensions: sitemap reachability, indexing submission, and performance baselines. Omit runtime API checks and user-flow tests for pure SSG deployments. |
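The build-time DB pitfall can be caught with a cheap artifact count run right after the build, before deploy. This is a sketch; the recursive-HTML-count approach and the idea of a per-site minimum are assumptions about a typical SSG output layout such as Astro's `dist/`:

```typescript
// Count generated HTML files in the build output and enforce a floor.
// Catches silent empty-query builds before they reach the edge.
import { readdir } from 'node:fs/promises';
import { join } from 'node:path';

async function countHtmlFiles(dir: string): Promise<number> {
  let count = 0;
  for (const entry of await readdir(dir, { withFileTypes: true })) {
    const full = join(dir, entry.name);
    if (entry.isDirectory()) {
      count += await countHtmlFiles(full); // recurse into nested routes
    } else if (entry.name.endsWith('.html')) {
      count++;
    }
  }
  return count;
}

async function assertRouteCount(distDir: string, minRoutes: number): Promise<void> {
  const found = await countHtmlFiles(distDir);
  if (found < minRoutes) {
    throw new Error(`build produced ${found} HTML files, expected at least ${minRoutes}`);
  }
}
```

Called as `assertRouteCount('dist', 1000)` in a post-build step, this turns a silently empty database query into a hard build failure.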
Production Bundle
Action Checklist
- Verify `sitemap-index.xml` returns HTTP 200 without redirect following
- Parse sub-sitemap XML and validate minimum URL count thresholds
- Confirm the IndexNow key verification file (`/<key>.txt`) is deployed and accessible
- Trigger IndexNow batch submission after CDN propagation completes
- Schedule weekly Lighthouse audits for the homepage and deep routes
- Store Lighthouse results in temporary public storage for historical diffing
- Configure alert-only thresholds for performance and accessibility regressions
- Document domain-specific sitemap thresholds and IndexNow keys in environment variables
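Moving the configuration into environment variables implies the scripts parse it at startup instead of hardcoding arrays. One simple encoding is a comma-separated `domain:minCount` secret; the format below is an assumption for illustration, not a convention of any tool:

```typescript
// Hypothetical parser for a SITEMAP_DOMAINS secret such as:
//   "aiappdex.com:1000,findindiegame.com:150,ossfind.com:100"
// SitemapConfig mirrors the shape used by the validation script.
interface SitemapConfig {
  domain: string;
  minUrlCount: number;
}

function parseDomainConfig(raw: string): SitemapConfig[] {
  return raw
    .split(',')
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0)
    .map((entry) => {
      const [domain, min] = entry.split(':');
      const minUrlCount = Number(min);
      if (!domain || !min || !Number.isInteger(minUrlCount)) {
        throw new Error(`malformed SITEMAP_DOMAINS entry: "${entry}"`);
      }
      return { domain, minUrlCount };
    });
}
```

The validation script can then build its `domains` array from `parseDomainConfig(process.env.SITEMAP_DOMAINS ?? '')`.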
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Small static site (<100 pages) | Sitemap reachability + weekly Lighthouse | IndexNow overhead outweighs benefits for low-volume sites. Focus on crawlability and performance baselines. | Near-zero infrastructure cost |
| Medium SSG with dynamic data (100-5000 pages) | Full three-check pipeline | Data pipelines can silently drop content. IndexNow accelerates indexing for frequently updated directories. | Low (GitHub Actions minutes + API calls) |
| Large e-commerce SSG (>5000 pages) | Pipeline + incremental sitemap validation + CDN cache purge monitoring | High URL volume requires batched IndexNow submissions. Cache invalidation timing impacts indexing accuracy. | Medium (increased CI minutes, potential CDN egress) |
| Pre-revenue experimental sites | Sitemap reachability only | Minimal SEO dependency. Validate core artifact integrity without indexing overhead. | Minimal |
Configuration Template
```yaml
# .github/workflows/post-deploy-validation.yml
name: Post-Deploy Validation

on:
  workflow_dispatch:
    inputs:
      environment:
        description: 'Target environment'
        required: true
        default: 'production'
        type: choice
        options:
          - production
          - staging

jobs:
  validate-sitemap:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npx tsx scripts/validate-sitemap.ts
        env:
          SITEMAP_DOMAINS: ${{ secrets.SITEMAP_DOMAINS }}
          SITEMAP_THRESHOLDS: ${{ secrets.SITEMAP_THRESHOLDS }}

  submit-indexnow:
    needs: validate-sitemap
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npx tsx scripts/submit-indexnow.ts
        env:
          INDEXNOW_KEYS: ${{ secrets.INDEXNOW_KEYS }}
          INDEXNOW_SITEMAPS: ${{ secrets.INDEXNOW_SITEMAPS }}
```
Quick Start Guide
1. Install dependencies: Add `undici`, `xml2js`, and `tsx` to your project. These provide fast HTTP fetching, reliable XML parsing, and TypeScript execution without a compilation step.
2. Configure environment secrets: Store domain lists, sitemap thresholds, and IndexNow keys in your repository secrets. Never hardcode credentials or domain configurations.
3. Create the validation scripts: Copy the TypeScript examples into a `scripts/` directory. Adjust domain arrays and thresholds to match your content volume.
4. Wire the workflow: Add the GitHub Actions template to `.github/workflows/`. Configure `workflow_dispatch` to trigger manually after the Cloudflare Pages deployment completes.
5. Validate and iterate: Run the workflow against a staging environment first. Verify that sitemap parsing, IndexNow submission, and Lighthouse audits execute without errors. Adjust thresholds based on historical baseline data.