|----------|-------------------|--------------------|--------------------|-----------------------|----------------------|
| Local Library (Ghostscript/pdf-lib) | ~60-70% | 250-400 MB | 2-5 sec (init + deps) | Limited/Unreliable | Language-bound |
| REST API (Forgelab) | ~80% | <10 MB | <100 ms (HTTP) | Fully Compatible | Any HTTP-capable language |
Key Findings:
- Client-side memory footprint drops by >95% by streaming files directly to the API.
- Deployment artifacts shrink by eliminating binary dependencies, enabling faster CI/CD pipelines.
- Consistent compression quality is guaranteed regardless of client environment or PDF complexity.
Core Solution
The Forgelab PDF API accepts a multipart/form-data POST request and returns a compressed binary stream. Authentication is handled via a simple header.
Endpoint:
POST https://www.forgelab.africa/api/pdf/compress
Headers: X-API-Key: your_api_key
Body (multipart/form-data): file (the PDF to compress)
Response: compressed PDF as a binary download.
curl
curl -X POST https://www.forgelab.africa/api/pdf/compress \
-H "X-API-Key: $FORGELAB_API_KEY" \
-F "file=@report.pdf" \
--output compressed.pdf
A 10 MB PDF typically comes back under 2 MB.
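That 10 MB to 2 MB example works out to an 80% reduction. To check the figure against your own files, a small helper like the one below could compute the reduction from the input and output sizes. The function name is hypothetical and not part of the Forgelab API.

```python
def size_reduction_percent(original_bytes: int, compressed_bytes: int) -> float:
    """Return how much smaller the compressed file is, as a percentage of the original."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return (original_bytes - compressed_bytes) * 100 / original_bytes


# 10 MB in, 2 MB out
print(size_reduction_percent(10 * 1024**2, 2 * 1024**2))  # 80.0
```

In practice the two sizes would come from os.path.getsize() on the input and output paths.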
Node.js
import fs from 'fs';
import fetch from 'node-fetch';
import FormData from 'form-data';

async function compressPdf(inputPath, outputPath) {
  // Stream the file from disk rather than reading it into memory first.
  const form = new FormData();
  form.append('file', fs.createReadStream(inputPath));

  const res = await fetch('https://www.forgelab.africa/api/pdf/compress', {
    method: 'POST',
    headers: {
      'X-API-Key': process.env.FORGELAB_API_KEY,
      ...form.getHeaders(), // multipart boundary and content type
    },
    body: form,
  });

  // Bail out before writing anything if the API reported an error.
  if (!res.ok) throw new Error(`Compress failed: ${res.status}`);

  // arrayBuffer() works in node-fetch v2 and v3; res.buffer() is deprecated in v3.
  const buffer = Buffer.from(await res.arrayBuffer());
  fs.writeFileSync(outputPath, buffer);
  console.log(`Saved: ${outputPath}`);
}

compressPdf('invoice.pdf', 'invoice-compressed.pdf').catch(console.error);
Python
import os
import requests

def compress_pdf(input_path: str, output_path: str) -> None:
    with open(input_path, "rb") as f:
        res = requests.post(
            "https://www.forgelab.africa/api/pdf/compress",
            headers={"X-API-Key": os.environ["FORGELAB_API_KEY"]},
            files={"file": f},
            timeout=60,  # fail instead of hanging if the API does not respond
        )
    res.raise_for_status()  # stop before writing anything on a non-2xx response
    with open(output_path, "wb") as out:
        out.write(res.content)
    print(f"Compressed PDF saved to {output_path}")

compress_pdf("report.pdf", "report-compressed.pdf")
Pitfall Guide
- Hardcoding API Keys: Embedding X-API-Key directly in source code or frontend bundles exposes credentials. Always inject via environment variables or secret managers (AWS Secrets Manager, HashiCorp Vault, GitHub Secrets).
- Ignoring Gateway Payload Limits: Reverse proxies (Nginx, AWS ALB, Cloudflare) often default to 1-10 MB request limits. Configure client_max_body_size or equivalent to match your largest expected PDF.
- Blocking I/O on Large Files: Loading entire files into memory before upload (fs.readFileSync, open().read()) causes OOM errors on files >50 MB. Use streaming APIs (createReadStream, chunked uploads) to maintain constant memory usage.
- Missing Error & Status Validation: Assuming success without checking HTTP status codes leads to corrupted output files. Always validate res.ok (Node) or call raise_for_status() (Python) before writing to disk.
- No Retry/Backoff Logic: Transient network drops or API rate limits (429) will break naive implementations. Implement exponential backoff with jitter for retries, and respect Retry-After headers when present.
- Skipping File Type Validation: Uploading non-PDF files triggers silent failures or API errors. Validate MIME type (application/pdf) and file magic bytes client-side before initiating the request.
- Neglecting Output Integrity Checks: The API may return a valid HTTP 200 but deliver a truncated or corrupted stream. Verify the output file's PDF header (%PDF-) and run a lightweight validation (e.g., qpdf --check or PyPDF2) before downstream processing.
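The retry pitfall above can be sketched as follows. This is a minimal illustration, not Forgelab's documented behavior: the set of retryable status codes, the delay parameters, and the helper names backoff_delay and call_with_retry are all assumptions. The wrapper only needs a response object with status_code and headers attributes, which is what requests returns.

```python
import random
import time
from typing import Callable, Optional

# Assumption: statuses worth retrying; adjust to what the API actually returns.
RETRYABLE_STATUSES = {429, 502, 503, 504}


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  retry_after: Optional[str] = None) -> float:
    """Seconds to wait after failed attempt number `attempt` (0-based)."""
    # Prefer a numeric Retry-After header when the server sent one.
    if retry_after is not None:
        try:
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # the HTTP-date form of Retry-After is not handled in this sketch
    # Full jitter: uniform random delay in [0, min(cap, base * 2**attempt)].
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def call_with_retry(send: Callable[[], object], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep):
    """Call send() until it yields a non-retryable response or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        last = attempt == max_attempts - 1
        try:
            response = send()
        except OSError:
            # Network-level failures; requests' exceptions derive from OSError.
            if last:
                raise
            sleep(backoff_delay(attempt))
            continue
        if response.status_code not in RETRYABLE_STATUSES or last:
            return response
        sleep(backoff_delay(attempt,
                            retry_after=response.headers.get("Retry-After")))
```

With requests, send would be a small function that opens the PDF and posts it, so that every attempt reads the file from the beginning. The caller still checks the final response, for example with raise_for_status(), before writing anything to disk.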
Deliverables
- Integration Blueprint: Architecture diagram showing client streaming → API gateway → Forgelab compression engine → binary return flow, including timeout, retry, and fallback configurations.
- Pre-Flight Checklist: Environment variable setup, payload size validation, MIME verification, error handling coverage, and serverless compatibility verification steps.
- Configuration Templates: Ready-to-use .env examples, Nginx/AWS API Gateway payload limit overrides, and exponential backoff retry wrappers for Node.js and Python.
- Pricing & Scaling Reference: Tiered usage mapping (Free: 5 calls/mo to Business: 10,000 calls/mo) with cost-per-compression calculations to optimize batch processing vs. real-time triggers.
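As an illustration of what the pre-flight checklist might look like in code, the sketch below checks the API key, the payload size, and the PDF magic bytes before any upload is attempted. The 50 MB limit and the function names are assumptions; set the limit to whatever your gateway actually allows.

```python
import os
from pathlib import Path
from typing import List, Mapping

# Assumption: replace with the real request-size limit of your proxy or gateway.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def looks_like_pdf(path) -> bool:
    """True if the file starts with the PDF magic bytes '%PDF-'."""
    with open(path, "rb") as f:
        return f.read(5) == b"%PDF-"


def preflight(input_path, env: Mapping[str, str] = os.environ,
              max_bytes: int = MAX_UPLOAD_BYTES) -> List[str]:
    """Return a list of problems; an empty list means the upload can proceed."""
    problems = []
    if not env.get("FORGELAB_API_KEY"):
        problems.append("FORGELAB_API_KEY is not set")
    path = Path(input_path)
    if not path.is_file():
        problems.append(f"{path} does not exist")
        return problems  # the remaining checks need a readable file
    size = path.stat().st_size
    if size > max_bytes:
        problems.append(f"{path} is {size} bytes, over the {max_bytes}-byte limit")
    if not looks_like_pdf(path):
        problems.append(f"{path} does not start with %PDF-")
    return problems
```

The same looks_like_pdf helper can be pointed at the compressed output as a cheap first integrity check, ahead of a fuller validation such as qpdf --check.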