A few months ago I got tired of paying $29/month to a PDF SaaS for the ~3,000 invoices my side proje

By Codcompass Team · 4 min read

# HTML2DocHub: Pay-Per-Page PDF Generation Architecture & SDKs

## Current Situation Analysis

Traditional PDF generation SaaS platforms rely on rigid tiered subscriptions that misalign with variable workload patterns. At ~$29/month for ~3,000 rendered invoices, the effective cost per document balloons to roughly 10× the actual compute/render expense. This pricing model forces teams to either overpay during low-usage periods or hit hard caps during traffic spikes, with no linear scaling path.
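The "roughly 10×" figure can be checked with a few lines of arithmetic. This is a rough sketch that uses only numbers quoted in this article ($29/month, ~3,000 invoices, $0.001/page). It assumes one page per invoice and ignores any per-job minimum charge; both are simplifications made for this example.

```python
from decimal import Decimal

SUBSCRIPTION_PER_MONTH = Decimal("29.00")  # flat tier quoted in the article
INVOICES_PER_MONTH = 3000                  # approximate monthly volume quoted in the article
PER_PAGE_RATE = Decimal("0.001")           # pay-per-page rate quoted in the article
PAGES_PER_INVOICE = 1                      # assumption: single-page invoices

# What each invoice effectively costs on the flat subscription.
effective_per_doc = SUBSCRIPTION_PER_MONTH / INVOICES_PER_MONTH

# What the same month would cost when billed per rendered page.
pay_per_page_total = PER_PAGE_RATE * INVOICES_PER_MONTH * PAGES_PER_INVOICE

# How many times more expensive the subscription is for this workload.
ratio = SUBSCRIPTION_PER_MONTH / pay_per_page_total

print(f"Effective subscription cost per invoice: ${effective_per_doc:.4f}")
print(f"Pay-per-page total for the month: ${pay_per_page_total:.2f}")
print(f"Subscription / pay-per-page ratio: {ratio:.1f}x")
```

With multi-page invoices the ratio shrinks in proportion, so the comparison is most favourable for short documents.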

Failure modes compound when rendering jobs are embedded in at-least-once queue systems (Celery, BullMQ, Sidekiq). Without idempotency guarantees, network retries or worker restarts trigger duplicate renders, resulting in double-billing. Additionally, legacy rendering engines (wkhtmltopdf, WeasyPrint) lack native support for modern CSS specifications (flexbox gaps, CSS grid, custom properties, web fonts, dark mode), forcing developers to maintain fragile workarounds or accept broken layouts. Finally, monolithic SDK distributions often bundle Chromium binaries or require 200MB+ Lambda layers, inflating deployment size, cold start times, and maintenance overhead across ARM/x86 architectures.

## WOW Moment: Key Findings

| Approach | Cost per 10k Pages | Modern CSS/Layout Support | Deployment Footprint |
| --- | --- | --- | --- |
| Tiered SaaS Subscription | ~$29.00 (fixed) | High (proprietary engine) | N/A (managed) |
| Legacy CLI (wkhtmltopdf/WeasyPrint) | ~$0.00 (self-hosted) | Low (~40-60% CSS3 compliance) | ~150-250 MB (binary + deps) |
| HTML2DocHub (Pay-per-page + Playwright) | ~$10.00 ($0.001/page) | High (~98% CSS3/HTML5 compliance) | ~14.8 KB (pure HTTPS wrapper) |

Key Findings:

  • Sweet Spot: Pay-per-page billing combined with a policy-driven ledger eliminates subscription waste while maintaining enterprise-grade rendering fidelity.
  • Idempotency Impact: Idempotency keys eliminate retry-induced double-billing in at-least-once queue architectures.
  • Engine Selection: Playwright + Chromium natively resolves 5+ years of CSS standardization gaps, eliminating layout patching and reducing QA cycles.

## Core Solution

The architecture deliberately separates infrastructure from policy. FastAPI handles request routing and validation, while Playwright + Chromium manages headless rendering. Postgres tracks job state, billing events, and idempotency tokens; S3 stores rendered artifacts. The policy layer enforces per-page billing ($0.001/page, ₹0.10 minimum per completed job), transparent ledger tracking, and automatic failure forgiveness (failed renders are free).
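The billing rules in the policy layer (per-page rate, per-job minimum, free failures) can be expressed as one small function. The following is an illustrative sketch, not the service's actual implementation: the function name `job_charge`, the status string `"completed"`, and the requirement that rate and minimum share one currency unit are all assumptions of this example.

```python
from decimal import Decimal


def job_charge(
    page_count: int,
    status: str,
    rate_per_page: Decimal,
    minimum_per_job: Decimal,
) -> Decimal:
    """Return the charge for one render job under a per-page policy.

    Assumes rate_per_page and minimum_per_job are in the same currency unit.
    """
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    if status != "completed":
        # Failure forgiveness: anything that did not complete is free.
        return Decimal("0")
    # Bill per page, but never less than the per-job minimum.
    return max(rate_per_page * page_count, minimum_per_job)


# Placeholder tariff in an arbitrary currency unit (not the real price list).
RATE = Decimal("0.001")
MINIMUM = Decimal("0.01")

print(job_charge(50, "completed", RATE, MINIMUM))  # per-page total exceeds the minimum
print(job_charge(3, "completed", RATE, MINIMUM))   # minimum applies
print(job_charge(3, "failed", RATE, MINIMUM))      # failed render is free
```

Keeping this rule in a pure function, separate from routing and rendering, mirrors the infrastructure/policy split described above and makes the tariff easy to unit-test.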

### Python SDK (`pip install html2dochub`)

```python
from html2dochub import Client

# Render an HTML string to PDF with the synchronous client.
pdf = Client(api_key="sk_live_...").render(
    html="<h1>Hello</h1>",
    options={"format": "A4"},
)
```


Both `Client` (sync) and `AsyncClient` (asyncio) share an identical surface. The SDK implements automatic retries on `429` and `5xx` responses with exponential backoff, a typed exception hierarchy (`AuthenticationError`, `RateLimitError` with `retry_after`, `InsufficientFundsError`, `ValidationError`, `APIError`), and automatic idempotency key generation on every render. Requires Python 3.9+ with a single dependency: `httpx`.
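Because the SDK retries internally, callers should not need to reimplement this. For readers who want to see the general shape of the technique, here is a generic sketch of exponential backoff that honours a `retry_after` hint. `APIError` and `RateLimitError` below are local stand-ins defined for this example, not imports from `html2dochub`, and `call_with_backoff` with its parameters and delay values is hypothetical.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class APIError(Exception):
    """Stand-in for a retryable 5xx-style failure (not the SDK's class)."""


class RateLimitError(APIError):
    """Stand-in for a 429-style failure carrying a server-provided delay."""

    def __init__(self, retry_after: Optional[float] = None) -> None:
        super().__init__("rate limited")
        self.retry_after = retry_after


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on APIError with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except APIError as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential schedule: base, 2*base, 4*base, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Up to 10% jitter so many workers do not retry in lockstep.
            delay += random.uniform(0, delay * 0.1)
            # Prefer the server's hint when it asks for a longer wait.
            retry_after = getattr(exc, "retry_after", None)
            if retry_after is not None:
                delay = max(delay, retry_after)
            sleep(delay)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the schedule can be tested without real waiting, for example by passing `delays.append` for a list named `delays`.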

### Node.js / TypeScript SDK (`npm install @html2dochub/client`)

```typescript
import { Client } from "@html2dochub/client";

// Render an HTML string to PDF; top-level await requires an ES module.
const pdf = await new Client({ apiKey: "sk_live_..." }).render({
  html: "<h1>Hello</h1>",
  options: { format: "A4" },
});
```


The package has zero runtime dependencies and relies on the native `fetch` available in Node 18+. It ships as a dual ESM + CJS build (~14.8 KB packed) with full TypeScript declarations, and is compatible with Vercel, AWS Lambda, Cloud Run, and Cloudflare Workers. Framework-specific examples (Express, Next.js App Router, BullMQ) are available in the official documentation.

## Pitfall Guide
1. **Ignoring Idempotency in At-Least-Once Queues**: Queue workers guarantee delivery, not exactly-once execution. Without idempotency keys, transient network failures or worker crashes trigger duplicate renders, causing double-billing and inconsistent state. Always attach a deterministic idempotency key to each render request.
2. **Over-Provisioning with Tiered Subscriptions**: Fixed monthly tiers force teams to pay for unused capacity or hit hard limits during traffic spikes. Switch to per-page billing with a transparent ledger to align costs directly with actual render volume.
3. **Relying on Legacy Rendering Engines**: wkhtmltopdf and WeasyPrint predate modern CSS specifications. Flexbox gaps, CSS grid, custom properties, and web font loading frequently break or require polyfills. Playwright + Chromium provides native, up-to-date layout engine support.
4. **Bloating Deployment Footprints with Native Binaries**: Bundling Chromium or heavy PDF libraries into `node_modules` or Lambda layers increases cold starts, complicates cross-architecture builds (ARM/x86), and inflates CI/CD times. Use pure HTTPS wrappers that delegate rendering to a managed policy layer.
5. **Neglecting Rate Limit & Retry Semantics**: PDF rendering is compute-intensive. Blind retries without exponential backoff or `retry_after` header parsing will trigger cascading `429` failures. Implement typed exception handling and backoff strategies natively in the SDK.
6. **Skipping Cost Transparency & Ledger Tracking**: Without job-level billing attribution, finance and engineering teams cannot reconcile PDF spend. Ensure every render returns a job ID that maps to a line-item ledger entry for auditability and cross-referencing.
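For pitfall 1, a key is only useful across queue redeliveries if every delivery of the same job produces the same value. One way to get that, sketched below, is to hash the queue job ID together with the render payload. The helper name `idempotency_key` and the choice of SHA-256 over canonical JSON are assumptions of this example. How a caller-supplied key is passed to the SDK is not shown in this article, so check the official documentation for that detail.

```python
import hashlib
import json
from typing import Any, Dict


def idempotency_key(queue_job_id: str, html: str, options: Dict[str, Any]) -> str:
    """Derive a stable key from the queue job ID and the render payload."""
    # Canonical JSON: sorted keys and fixed separators, so dict ordering
    # and whitespace cannot change the resulting hash.
    canonical = json.dumps(
        {"job": queue_job_id, "html": html, "options": options},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two deliveries of the same queue job yield the same key...
first = idempotency_key("invoice-1042", "<h1>Hello</h1>", {"format": "A4"})
redelivered = idempotency_key("invoice-1042", "<h1>Hello</h1>", {"format": "A4"})

# ...while a different job yields a different key.
other = idempotency_key("invoice-1043", "<h1>Hello</h1>", {"format": "A4"})

print(first == redelivered, first == other)
```

Including the payload in the hash means an edited invoice re-rendered under the same job ID is treated as a new render. Drop `html` and `options` from the hashed object if the job ID alone should define identity.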

## Deliverables
- **Blueprint**: HTML2DocHub Architecture & Policy Layer Diagram (FastAPI routing → Playwright/Chromium rendering → Postgres state/idempotency ledger → S3 artifact storage → Per-page billing enforcement)
- **Checklist**: Production-Ready PDF Generation Integration
  - [ ] Configure SDK with API key & environment isolation
  - [ ] Implement idempotency key generation per queue job
  - [ ] Attach exponential backoff & `429`/`5xx` retry handlers
  - [ ] Validate CSS3/layout compatibility against Chromium engine
  - [ ] Cross-reference job IDs with billing ledger for cost auditing
  - [ ] Set up alerting for `InsufficientFundsError` & rate limit thresholds
- **Configuration Templates**: 
  - SDK initialization (Python `httpx` / Node `fetch`)
  - Idempotency & retry policy config (exponential backoff, max attempts, `retry_after` parsing)
  - Billing ledger schema (job_id, page_count, cost, status, timestamp, idempotency_key)
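
The ledger fields listed above can be sketched as a table with a uniqueness constraint on the idempotency key, so that a retried job cannot create a second billable row. This example uses Python's built-in `sqlite3` as a runnable stand-in for the Postgres store described in the article. The column types, the `record_job` helper, and the sample values are assumptions, and production Postgres DDL would differ.

```python
import sqlite3
from datetime import datetime, timezone

# Columns mirror the field list above; types are illustrative.
SCHEMA = """
CREATE TABLE ledger (
    job_id          TEXT PRIMARY KEY,
    page_count      INTEGER NOT NULL CHECK (page_count >= 0),
    cost            TEXT NOT NULL,        -- decimal kept as text to avoid float drift
    status          TEXT NOT NULL,
    timestamp       TEXT NOT NULL,        -- ISO 8601, UTC
    idempotency_key TEXT NOT NULL UNIQUE  -- one ledger row per logical render
)
"""


def record_job(
    conn: sqlite3.Connection,
    job_id: str,
    page_count: int,
    cost: str,
    status: str,
    idempotency_key: str,
) -> bool:
    """Insert a ledger row; return False if the key was already recorded."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO ledger VALUES (?, ?, ?, ?, ?, ?)",
        (
            job_id,
            page_count,
            cost,
            status,
            datetime.now(timezone.utc).isoformat(),
            idempotency_key,
        ),
    )
    conn.commit()
    # rowcount is 0 when the UNIQUE constraint caused the insert to be skipped.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

first = record_job(conn, "job_1", 2, "0.002", "completed", "key-abc")
retry = record_job(conn, "job_2", 2, "0.002", "completed", "key-abc")
print(first, retry)
```

The uniqueness constraint makes the database, rather than application code, the final guard against duplicate billing entries.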