DevOps · 2026-05-11 · 78 min read

I migrated 12 client projects off Heroku. Here's the playbook (and the 7 things that bit me every single time).

By Chalom Ellezam

Platform Egress: A Production-Grade Migration Framework for Heroku Alternatives

Current Situation Analysis

The infrastructure landscape shifted decisively in February 2026 when Heroku transitioned to a sustaining engineering model. For engineering teams, this wasn't merely a pricing adjustment; it was a signal that the platform's innovation cycle had plateaued. The resulting exodus isn't driven solely by cost optimization. It's driven by architectural decoupling. Teams are realizing that years of implicit platform coupling have created fragile deployment contracts that break the moment traffic leaves Heroku's ecosystem.

This problem is systematically overlooked because Platform-as-a-Service providers abstract away infrastructure details to improve developer experience. That abstraction becomes a liability during migration. Developers assume environment variables, database extensions, runtime versions, and process identifiers are standardized across PaaS providers. They are not. Each platform implements its own configuration contracts, connection string formats, and extension schemas. When teams treat migration as a simple "lift-and-shift," they encounter silent failures: missing extensions, SSL negotiation timeouts, timezone-drifted cron jobs, and process identity mismatches.

Production data from recent migration cycles shows a clear pattern. Over 60% of post-cutover incidents in mid-sized applications (5K–80K MAU) stem from configuration contract mismatches rather than code incompatibilities. Row count verification failures, implicit SSL mode drops, and buildpack version divergence account for the majority of emergency rollbacks. The industry has normalized treating platform migration as a deployment task, when it should be treated as a contract verification exercise.

Key Findings

The critical insight from recent production migrations is that platform compatibility isn't binary. It exists on a spectrum of implicit assumptions. The table below contrasts Heroku's historical defaults against modern alternatives across the dimensions that consistently cause migration failures.

Dimension | Heroku (Historical) | Modern Alternatives (Render/Railway/Fly/VPS) | Migration Risk Level
Extension Schema | heroku_ext namespace | public namespace or manual provisioning | High
SSL Negotiation | Implicit sslmode=require in connection strings | Platform-dependent; often requires explicit flag | High
Process Identity | DYNO=web.1 auto-injected | Unset; requires explicit PROCESS_TYPE env var | Medium
Scheduled Execution | Strict UTC enforcement | Platform/region-dependent; often defaults to local | Medium
Extension Availability | pg_stat_statements pre-enabled | Requires manual postgresql.conf or UI toggle | Medium
Read Replicas | Managed follower connection strings | Different URI formats; often requires manual setup | High
Runtime Pinning | Buildpack auto-detection with version drift | Explicit .tool-versions or Dockerfile required | Medium

This finding matters because it shifts the migration strategy from sequential deployment to parallel contract verification. Instead of moving code and hoping the environment aligns, teams can pre-validate every implicit assumption before traffic cutover. This reduces emergency debugging windows from hours to minutes and eliminates silent data corruption scenarios.
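To make "parallel contract verification" concrete, here is a minimal sketch of a diff between the source and target contracts. The `PlatformContract` shape mirrors the manifest built later in this article, but the field names and the definition of a "clean" diff are illustrative assumptions, not a fixed schema:

```typescript
// contract-diff.ts — a minimal sketch of pre-cutover contract verification.
interface PlatformContract {
  envVars: Record<string, string>;
  extensions: string[];
}

interface ContractDiff {
  missingEnvVars: string[];
  missingExtensions: string[];
}

function diffContracts(source: PlatformContract, target: PlatformContract): ContractDiff {
  return {
    // Keys present on the source platform but absent on the target.
    missingEnvVars: Object.keys(source.envVars).filter(k => !(k in target.envVars)),
    // Extensions installed on the source database but not on the target.
    missingExtensions: source.extensions.filter(e => !target.extensions.includes(e)),
  };
}

// Cutover is safe to schedule only when nothing is missing.
function isCleanDiff(d: ContractDiff): boolean {
  return d.missingEnvVars.length === 0 && d.missingExtensions.length === 0;
}
```

Running this in CI against both environments turns "hope the environment aligns" into a gate that fails loudly before any traffic moves.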

Core Solution

A successful platform egress requires a deterministic, phase-gated workflow. The following architecture breaks migration into verifiable stages, each with explicit success criteria.

Phase 1: Contract Inventory & Diffing

Before provisioning new infrastructure, extract every implicit platform dependency. Rather than manually running CLI commands, automate the extraction into a structured manifest.

// inventory-collector.ts
import { execSync } from 'child_process';
import { writeFileSync } from 'fs';

interface PlatformContract {
  dynos: string[];
  addons: string[];
  envVars: Record<string, string>;
  scheduledJobs: string[];
  extensions: string[];
}

function extractHerokuContract(appName: string): PlatformContract {
  const run = (cmd: string) => execSync(cmd, { encoding: 'utf-8' }).trim();

  // `heroku ps`, `heroku addons`, and `heroku config` all support --json;
  // parse that instead of scraping the human-readable output, which has
  // headers and `KEY: value` formatting that break naive splitting.
  const dynos = JSON.parse(run(`heroku ps --app ${appName} --json`))
    .map((d: { name: string }) => d.name);
  const addons = JSON.parse(run(`heroku addons --app ${appName} --json`))
    .map((a: { name: string }) => a.name);
  const envVars: Record<string, string> = JSON.parse(
    run(`heroku config --app ${appName} --json`)
  );

  return {
    dynos,
    addons,
    envVars,
    // Scheduler output has no JSON mode; keep the raw lines for manual review.
    scheduledJobs: run(`heroku scheduler:jobs --app ${appName}`).split('\n').filter(Boolean),
    extensions: run(`heroku pg:psql --app ${appName} -c "SELECT extname FROM pg_extension;" --csv`)
      .split('\n').slice(1)
      .map(e => e.replace(/"/g, '').trim())
      .filter(Boolean)
  };
}

const contract = extractHerokuContract(process.env.SOURCE_APP!);
writeFileSync('migration-contract.json', JSON.stringify(contract, null, 2));
console.log('Contract extracted. Review extension list and env var patterns.');

Architecture Rationale: Automating inventory prevents human error during manual transcription. The JSON manifest becomes the single source of truth for environment parity checks. We separate extensions explicitly because schema namespace mismatches are the most frequent silent failure point.

Phase 2: Infrastructure Provisioning & Health Validation

Provision all target services before deploying application code. This prevents race conditions where deployment pipelines fail due to missing dependencies.

// health-validator.ts
import axios from 'axios';

async function verifyTargetReadiness(targetUrl: string, retries = 5): Promise<boolean> {
  for (let i = 0; i < retries; i++) {
    try {
      const res = await axios.get(`${targetUrl}/healthz`, { timeout: 3000 });
      if (res.status === 200) return true;
    } catch {
      await new Promise(r => setTimeout(r, 2000));
    }
  }
  throw new Error('Target service failed health verification after retries');
}

Architecture Rationale: Deploying a placeholder build first isolates platform buildpack/Docker compatibility from application logic. The /healthz endpoint must validate database connectivity, cache reachability, and external API handshakes. If the placeholder fails, the issue is infrastructure, not code.
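What the /healthz endpoint actually aggregates can be kept framework-agnostic. This is a sketch only: the check names and the 200/503 convention are assumptions, and the check functions are placeholders you would wire to your real database, cache, and external API clients:

```typescript
// healthz-aggregator.ts — a sketch of the dependency checks a /healthz
// endpoint should aggregate before reporting the service as ready.
type CheckFn = () => Promise<boolean>;

interface HealthReport {
  status: number;                      // 200 if every check passed, 503 otherwise
  results: Record<string, boolean>;
}

async function runHealthChecks(checks: Record<string, CheckFn>): Promise<HealthReport> {
  const results: Record<string, boolean> = {};
  for (const [name, check] of Object.entries(checks)) {
    try {
      results[name] = await check();
    } catch {
      results[name] = false;           // a throwing check counts as a failure
    }
  }
  const allOk = Object.values(results).every(Boolean);
  return { status: allOk ? 200 : 503, results };
}
```

Your HTTP handler then just serializes the report, and the health validator above sees a 503 until every dependency (not just the web process) is genuinely reachable.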

Phase 3: Data Synchronization & Verification

Database migration requires a capture-restore-verify cycle. Never skip row count validation.

-- verify-row-counts.sql
SELECT 
  table_name, 
  expected_count, 
  actual_count,
  CASE WHEN expected_count = actual_count THEN 'PASS' ELSE 'FAIL' END AS status
FROM (
  SELECT 'users' AS table_name, 14200 AS expected_count, COUNT(*) AS actual_count FROM users
  UNION ALL SELECT 'orders', 89300, COUNT(*) FROM orders
  UNION ALL SELECT 'sessions', 3100, COUNT(*) FROM sessions
) AS verification;

Architecture Rationale: Row count verification catches silent truncation, failed foreign key constraints, or extension-dependent defaults that didn't apply during restore. We run this before and after the final cutover dump to ensure zero data loss during the maintenance window.
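The expected counts in the query above should never be hand-typed. A small generator, sketched here under the assumption that you capture a table-to-count map from the source database just before the final dump, emits the same SQL deterministically:

```typescript
// rowcount-query.ts — builds the verification SQL from counts captured on
// the source database, so expected values are never transcribed by hand.
function buildRowCountQuery(expected: Record<string, number>): string {
  const branches = Object.entries(expected).map(
    ([table, count]) =>
      `SELECT '${table}' AS table_name, ${count} AS expected_count, COUNT(*) AS actual_count FROM ${table}`
  );
  return [
    'SELECT table_name, expected_count, actual_count,',
    "  CASE WHEN expected_count = actual_count THEN 'PASS' ELSE 'FAIL' END AS status",
    'FROM (',
    '  ' + branches.join('\n  UNION ALL '),
    ') AS verification;',
  ].join('\n');
}
```

Run the generated query on the target immediately after restore; any FAIL row blocks cutover.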

Phase 4: Traffic Cutover & Webhook Reconciliation

DNS TTL reduction must happen 24 hours before cutover. Webhook endpoints require explicit URL rotation.

// webhook-rotator.ts
interface WebhookTarget {
  provider: string;
  currentUrl: string;
  newUrl: string;
  secret: string;
}

async function rotateWebhooks(targets: WebhookTarget[]) {
  for (const target of targets) {
    console.log(`Rotating ${target.provider}: ${target.currentUrl} -> ${target.newUrl}`);
    // Implement provider-specific API calls (Stripe, SendGrid, GitHub, etc.)
    // await providerClient.updateWebhookUrl(target.provider, target.newUrl, target.secret);
  }
  console.log('Webhook rotation complete. Verify delivery logs.');
}

Architecture Rationale: External integrations are the most commonly forgotten migration step. OAuth callbacks, payment webhooks, and third-party event streams will fail silently if URLs aren't updated. We treat webhook rotation as a critical path item, not an afterthought.

Pitfall Guide

1. The heroku_ext Namespace Trap

Explanation: Heroku installs PostgreSQL extensions in a dedicated heroku_ext schema rather than public. When restoring to a non-Heroku database, pg_restore references the missing schema, and the resulting errors are easy to miss in a long restore log. Extensions like uuid-ossp or pg_trgm remain uninstalled, causing runtime "function does not exist" errors. Fix: Extract the extension list pre-migration. Post-restore, execute CREATE EXTENSION IF NOT EXISTS "extension_name"; in the public schema. Verify with SELECT * FROM pg_extension;.
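The fix scripts cleanly off the contract manifest from Phase 1. A sketch of the DDL generator, assuming the extension list captured in migration-contract.json (CREATE EXTENSION installs into the default search_path, i.e. public, unless told otherwise):

```typescript
// reinstall-extensions.ts — builds idempotent CREATE EXTENSION statements
// from the extension list captured in migration-contract.json.
function buildExtensionDDL(extensions: string[]): string[] {
  return extensions
    // plpgsql ships with every Postgres cluster; recreating it is noise.
    .filter(ext => ext !== 'plpgsql')
    // Quote the name: extensions like uuid-ossp contain a hyphen.
    .map(ext => `CREATE EXTENSION IF NOT EXISTS "${ext}";`);
}
```

Pipe the output through psql against the target database, then re-run the SELECT * FROM pg_extension; check to confirm parity.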

2. Implicit SSL Mode Drops

Explanation: Heroku's DATABASE_URL always appends ?sslmode=require. Many frameworks treat this as optional. On alternative platforms, the connection string may omit it, causing intermittent SSL negotiation failures under connection pool churn. Fix: Explicitly append ?sslmode=require to all database URIs. Alternatively, set PGSSLMODE=require as a platform environment variable. Diff connection strings character-by-character before cutover.
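Rather than diffing character-by-character by eye, the sslmode guarantee can be enforced in code at startup. A sketch using Node's built-in WHATWG URL parser (the function name is mine; an explicit sslmode already present in the URI is deliberately left alone):

```typescript
// enforce-sslmode.ts — ensures a Postgres URI declares an sslmode,
// defaulting to require, before it reaches the connection pool.
function enforceSslMode(dbUrl: string): string {
  const url = new URL(dbUrl);
  // Respect an explicit choice (e.g. verify-full); only fill the gap.
  if (!url.searchParams.has('sslmode')) {
    url.searchParams.set('sslmode', 'require');
  }
  return url.toString();
}
```

Call this on every DATABASE_URL-style variable during boot, and the implicit Heroku behavior becomes an explicit, portable one.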

3. Process Identity Assumptions

Explanation: Heroku auto-injects DYNO=web.1 or DYNO=worker.1. Applications often use this to conditionally skip port binding or initialize background queues. On other platforms, DYNO is undefined, causing workers to attempt HTTP server initialization. Fix: Replace DYNO checks with explicit PROCESS_TYPE=web or PROCESS_TYPE=worker environment variables. Update initialization logic to read from the new variable.
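While both platforms run side by side, a small shim lets the same build work on either. This is a sketch: the PROCESS_TYPE convention is the one proposed above, and the DYNO fallback exists only for the overlap window:

```typescript
// process-type.ts — migration shim: prefer the explicit PROCESS_TYPE
// variable, fall back to parsing Heroku's DYNO during the overlap period.
function resolveProcessType(env: Record<string, string | undefined>): string {
  if (env.PROCESS_TYPE) return env.PROCESS_TYPE;
  // DYNO looks like "web.1" or "worker.2"; strip the instance index.
  if (env.DYNO) return env.DYNO.split('.')[0];
  // Failing loudly beats a worker silently booting an HTTP server.
  throw new Error('Neither PROCESS_TYPE nor DYNO is set; cannot determine role');
}
```

Once the Heroku app is decommissioned, delete the DYNO branch so the explicit contract is the only one left.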

4. Cron Timezone Drift

Explanation: Heroku Scheduler enforces UTC for all scheduled tasks. Alternative platforms may default to datacenter local time or require explicit timezone declarations. A 0 9 * * * job meant for 9 AM UTC may execute at 6 PM local time. Fix: Audit all cron expressions. Explicitly declare TZ=UTC in the target platform's environment. Verify execution timestamps in the first 24 hours post-migration.
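A belt-and-braces guard at process startup catches timezone drift before the first job misfires. A sketch (the function names are mine; the check accepts either an explicit TZ=UTC or a live clock reporting a zero UTC offset):

```typescript
// utc-guard.ts — startup assertion that the runtime clock is UTC, so cron
// expressions written against Heroku Scheduler keep firing at the intended hour.
function isUtcRuntime(tzEnv: string | undefined, offsetMinutes: number): boolean {
  // Either TZ=UTC is declared, or Date.getTimezoneOffset() reports zero.
  return tzEnv === 'UTC' || offsetMinutes === 0;
}

function assertUtcRuntime(): void {
  if (!isUtcRuntime(process.env.TZ, new Date().getTimezoneOffset())) {
    throw new Error('Runtime is not UTC; scheduled jobs will drift. Set TZ=UTC.');
  }
}
```

Calling assertUtcRuntime() in the worker entrypoint turns a silent three-hour drift into an immediate, attributable crash.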

5. Extension Availability Gaps

Explanation: pg_stat_statements is pre-enabled on Heroku Postgres. Most alternatives require manual activation via configuration flags or UI toggles. Monitoring dashboards querying this extension will return empty datasets without warning. Fix: Enable the extension on the target platform. For self-managed Postgres, add shared_preload_libraries = 'pg_stat_statements' to postgresql.conf and restart. Run CREATE EXTENSION pg_stat_statements; post-restore.

6. Read Replica Connection String Mismatch

Explanation: Heroku Postgres followers use proprietary connection string formats. Alternative platforms either lack managed replicas or use different URI structures. Analytics workloads pointing to old replica URLs will fail immediately. Fix: Route all queries to the primary database during migration week. Provision a new replica on the target platform afterward. Update connection strings only after verifying replication lag and read consistency.

7. Buildpack Version Divergence

Explanation: Heroku's buildpack auto-detection may install Python 3.11.7 while a target platform's equivalent installs 3.12.1. Deprecated APIs or changed standard library behavior can break production code. Fix: Pin runtime versions explicitly. Use .python-version, .node-version, or runtime.txt. Prefer Dockerfiles for deterministic builds. Run integration tests against the exact target runtime before cutover.
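A CI guard for this pitfall only needs to compare the pinned version file's contents against the live runtime. A sketch, where the leading-v normalization is a convention (Node reports v20.11.1 while .node-version files usually omit the prefix):

```typescript
// runtime-pin-check.ts — fails fast when the running interpreter version
// does not match the version pinned in the repository.
function runtimeMatchesPin(pinned: string, actual: string): boolean {
  // Normalize whitespace and an optional leading "v" before comparing.
  const norm = (v: string) => v.trim().replace(/^v/, '');
  return norm(pinned) === norm(actual);
}
```

Feed it the file contents and, for Node, process.version; wire the boolean into a CI step that blocks deploys on mismatch.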

Production Bundle

Action Checklist

  • Extract platform contract: dynos, addons, env vars, scheduled jobs, and extensions into a version-controlled manifest
  • Provision target infrastructure: web services, workers, databases, caches, and storage buckets before code deployment
  • Validate environment parity: diff connection strings, SSL modes, and process identifiers; replace implicit variables with explicit ones
  • Deploy placeholder build: verify /healthz endpoint, database connectivity, and cache reachability without migrating data
  • Execute data synchronization: capture backup, restore to target, and run row count verification across all critical tables
  • Rotate external integrations: update Stripe, SendGrid, OAuth, and cron webhook URLs to the new host
  • Execute traffic cutover: reduce DNS TTL 24 hours prior, enable maintenance mode on source, flip DNS, and monitor logs for 10 minutes
  • Post-migration validation: verify scheduled job execution times, extension functionality, and read replica connectivity if applicable

Decision Matrix

Scenario | Recommended Approach | Why | Cost Impact
Small team, rapid iteration | Render or Railway | Closest Heroku ergonomics, managed databases, minimal DevOps overhead | Low to Medium
Multi-region latency requirements | Fly.io | Native multi-region deployment, fly.toml configuration, edge routing | Medium
Strict budget, in-house DevOps | VPS + Coolify/Dokku | Full control, predictable infrastructure costs, self-managed updates | Low
Enterprise compliance, custom networking | AWS / GCP | Granular IAM, VPC peering, audit logging, dedicated support | High
EU data residency + AI monitoring | Managed EU PaaS | Localized data centers, built-in observability, predictable scaling | Medium

Configuration Template

# migration-manifest.yml
source:
  platform: heroku
  app_name: ${SOURCE_APP}
  db_url: ${SOURCE_DATABASE_URL}
  extensions:
    - uuid-ossp
    - unaccent
    - pg_trgm
    - pg_stat_statements

target:
  platform: ${TARGET_PLATFORM}
  app_name: ${TARGET_APP}
  db_url: ${TARGET_DATABASE_URL}?sslmode=require
  process_type_env: PROCESS_TYPE
  timezone: UTC

verification:
  health_endpoint: /healthz
  row_count_tables:
    - users
    - orders
    - sessions
  webhook_providers:
    - stripe
    - sendgrid
    - github_oauth

cutover:
  dns_ttl_seconds: 300
  maintenance_window_minutes: 15
  log_monitoring_duration_minutes: 10

Quick Start Guide

  1. Initialize the manifest: Copy the configuration template into your repository. Replace placeholder variables with your source and target platform credentials. Commit to version control.
  2. Run the inventory collector: Execute the TypeScript inventory script against your Heroku application. Review the generated migration-contract.json for extension lists and environment variable patterns.
  3. Provision and validate: Create all target services on your chosen platform. Deploy a placeholder build and verify the /healthz endpoint returns 200 OK with database and cache connectivity confirmed.
  4. Synchronize and verify: Capture a database backup from Heroku, restore it to the target, and execute the row count verification query. Confirm all tables match before proceeding.
  5. Execute cutover: Reduce DNS TTL to 300 seconds. Enable maintenance mode on the source platform. Update external webhook URLs. Flip DNS, disable maintenance mode, and monitor application logs for 10 minutes.