TypeScript · 2026-05-06 · 36 min read

I shipped a free ATS preview inside my paid AI tool. Here's the engineering write-up.

By Giovanni Sizino Ennes

Current Situation Analysis

Most resume optimization tools operate on a reactive or siloed model. Users typically upload a CV to a third-party SaaS (e.g., Jobscan at $49/mo) for a one-shot diagnostic that is completely disconnected from the actual application workflow. The critical failure mode is timing: applicants only discover their CV broke parsing after receiving a rejection email. By then, downstream AI-generated assets (cover letters, fit scores, interview prep) are already built on corrupted or truncated text. Traditional methods fail because they treat ATS compliance as a post-hoc validation step rather than a foundational input gate, forcing users to either pay for external subscriptions or gamble on parse integrity before spending AI tokens.

WOW Moment: Key Findings

| Approach | Client Latency | Bundle Impact | Privacy & Security |
| --- | --- | --- | --- |
| Traditional SaaS ATS (e.g., Jobscan) | 2–5s (server round-trip) | N/A (external) | Low (CV bytes sent to third-party) |
| Server-Side AI Pre-Scan | 1–3s (API overhead) | N/A (backend only) | Medium (bytes transit to backend) |
| In-Product Client-Side Preview (Vantage) | <200ms (pure JS) | +11 kB minified / +4.4 kB gzipped | High (bytes never leave browser) |

Key Findings:

  • Zero-Token Gate: The preview executes before any AI token consumption, preventing wasted credits on unparseable CVs.
  • Vendor-Specific Signal Mapping: Five major ATS vendors (Workday, Greenhouse, Lever, Taleo, iCIMS) are covered using pure client-side heuristics without external API calls.
  • Bundle Efficiency: Client-side parsing adds only 4.4 kB gzipped to the main chunk, avoiding the ~300 kB overhead of pdfjs-dist by strategically deferring PDF handling to the paid flow.
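The zero-token gate reduces to a simple pre-flight check before the paid flow runs. The sketch below is a minimal illustration; the `Finding` shape and `gateBeforeTokens` name are hypothetical, not the production API:

```typescript
type Severity = 'ERROR' | 'WARN' | 'PASS';

interface Finding {
  vendor: string;
  severity: Severity;
  message: string;
}

// Runs entirely client-side, before any AI token is spent.
// Errors block the paid flow; warnings are surfaced but do not block.
function gateBeforeTokens(findings: Finding[]): { proceed: boolean; blockers: Finding[] } {
  const blockers = findings.filter((f) => f.severity === 'ERROR');
  return { proceed: blockers.length === 0, blockers };
}

const findings: Finding[] = [
  { vendor: 'Workday', severity: 'ERROR', message: 'Multi-column layout detected' },
  { vendor: 'Greenhouse', severity: 'WARN', message: 'Emoji will be stripped' },
];

console.log(gateBeforeTokens(findings).proceed); // false — fix the CV before spending tokens
```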

Core Solution

The architecture is built around four strict constraints: run before token expenditure, add zero new dependencies, remain deletable in 5 lines, and stay entirely client-side for privacy.

1. Signal Computation & Vendor Heuristics

The lint engine ports pure functions from the open-source CV Mirror project. It computes structural signals directly from plain text without I/O:

export interface Signals {
  wordCount: number;
  multiColumnRatio: number;
  wordsPerKB: number;
  hasHeaderFooterLikeText: boolean;
  hasEmoji: boolean;
  hasSmartQuotes: boolean;
}

export function computeSignals(text: string, fileSize: number): Signals {
  const lines = text.split('\n');
  const nonEmpty = lines.filter((l) => l.trim().length > 0);
  const wordCount = (text.match(/\b\w+\b/g) || []).length;

  // Multi-column heuristic: lines with a 5+ space gap
  const multiColumnLines = nonEmpty.filter((l) => /\S {5,}\S/.test(l)).length;
  const multiColumnRatio = nonEmpty.length > 0 ? multiColumnLines / nonEmpty.length : 0;

  const wordsPerKB = fileSize > 0 ? wordCount / (fileSize / 1024) : 0;
  const hasHeaderFooterLikeText = /^\s*page \d+( of \d+)?\s*$/im.test(text);
  const hasEmoji = /[\u{1F300}-\u{1FAFF}\u{2600}-\u{27BF}]/u.test(text);
  const hasSmartQuotes = /[‘’“”]/.test(text);
  // ... etc

  return { wordCount, multiColumnRatio, wordsPerKB, hasHeaderFooterLikeText, hasEmoji, hasSmartQuotes };
}

These signals feed into five vendor rule sets:

  • Workday: Flags multiColumnRatio > 15% as ERROR.
  • Greenhouse: Flags hasEmoji as WARN (strips codepoints, losing surrounding context).
  • Lever: Flags missing standard section headers as ERROR (parser uses headers to delimit sections).
  • Taleo: Flags ISO-style dates as WARN (prefers Month-Year format).
  • iCIMS: Flags multiColumnRatio > 20% as ERROR.

Every rule cites public vendor documentation, maintaining transparency and auditability.
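A data-driven rule table keeps each vendor heuristic auditable and easy to extend. The sketch below assumes a simplified `Signals` shape and uses the thresholds from the list above; the `VendorRule` and `evaluate` names are illustrative, not the production code:

```typescript
interface Signals {
  multiColumnRatio: number;
  hasEmoji: boolean;
}

type Severity = 'ERROR' | 'WARN';

interface VendorRule {
  vendor: string;
  severity: Severity;
  message: string;
  test: (s: Signals) => boolean;
}

// Thresholds mirror the documented vendor behaviours listed above.
const rules: VendorRule[] = [
  { vendor: 'Workday', severity: 'ERROR', message: 'Multi-column layout above 15%', test: (s) => s.multiColumnRatio > 0.15 },
  { vendor: 'Greenhouse', severity: 'WARN', message: 'Emoji are stripped, losing surrounding context', test: (s) => s.hasEmoji },
  { vendor: 'iCIMS', severity: 'ERROR', message: 'Multi-column layout above 20%', test: (s) => s.multiColumnRatio > 0.2 },
];

function evaluate(signals: Signals) {
  return rules
    .filter((r) => r.test(signals))
    .map(({ vendor, severity, message }) => ({ vendor, severity, message }));
}

const found = evaluate({ multiColumnRatio: 0.18, hasEmoji: true });
// Workday fires (0.18 > 0.15), Greenhouse fires, iCIMS does not (0.18 ≤ 0.20)
```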

2. Strategic PDF Exclusion

PDF is the most common CV format, but pdfjs-dist adds ~300 kB minified—roughly a third of the main bundle. The trade-off: DOCX and TXT are supported inline using mammoth (already bundled for the paid flow) and File.text(). PDF uploads trigger a defer message: "Upload a DOCX version for the instant preview — your full Vantage analysis still works with PDFs." This preserves bundle size while keeping the preview instant.

3. Dashboard Integration & Rollback Safety

The feature is 100% additive. Only two surgical edits were made to Dashboard.tsx:

{/* === ATS scanner (additive, free, client-side). Removing this and the
     import line restores the previous behaviour entirely. === */}
{cvFile && <AtsScannerSection cvFile={cvFile} />}
{/* === END ATS scanner === */}

Rollback requires git revert <hash> and deleting two new files. Existing tests, types, services, contexts, and routes remain untouched.

4. Bundle & QA Validation

  • Main chunk: 1,154.70 kB → 1,165.85 kB (+11 kB minified)
  • Gzipped: 308.20 kB → 312.56 kB (+4.4 kB gzipped)
  • New dependencies: Zero
  • mammoth is lazy-imported only on DOCX upload, ensuring cost is paid only by active users.
  • Pre-ship audit caught two UI inconsistencies: a passCount that disagreed with warning states, and inherited text tone conflicting with icon color. Both were resolved via deterministic isClean flags and strict errors === 0 && warns === 0 logic.
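The deterministic state derivation that resolved both audit findings can be sketched as follows (the `ScanSummary` and `deriveUiState` names are hypothetical):

```typescript
interface ScanSummary {
  errors: number;
  warns: number;
  passes: number;
  total: number;
}

// A CV is "clean" only when there are zero errors AND zero warnings —
// never derive the headline from the pass count alone.
function deriveUiState(s: ScanSummary) {
  const isClean = s.errors === 0 && s.warns === 0;
  return {
    isClean,
    headline: isClean ? `${s.total}/${s.total}` : `${s.passes}/${s.total}`,
    // Tone is set explicitly per state, never inherited from the parent.
    tone: s.errors > 0 ? 'error' : s.warns > 0 ? 'warn' : 'pass',
  };
}

console.log(deriveUiState({ errors: 0, warns: 2, passes: 3, total: 5 }));
// { isClean: false, headline: '3/5', tone: 'warn' }
```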

Pitfall Guide

  1. PDF Parsing Bundle Bloat: Importing pdfjs-dist client-side adds ~300 kB, severely impacting TTI. Best Practice: Defer PDF handling to server-side or paid flows; use mammoth/File.text() for instant, lightweight previews.
  2. Metric/UI State Misalignment: Counting "passes" without accounting for warnings creates contradictory UI states (e.g., "5/5" headline with only 3 green pills). Best Practice: Align pass logic strictly with errors === 0 && warns === 0 and derive UI classes deterministically.
  3. Ignoring Vendor-Specific Heuristics: Treating all ATS parsers as identical causes false positives/negatives. Best Practice: Map raw signals to specific vendor rules (e.g., Workday multi-column >15%, Greenhouse emoji stripping) using documented parsing behaviors.
  4. Tight Coupling with Paid Token Flow: Running previews after token consumption wastes user credits on broken CVs. Best Practice: Gate the preview before token expenditure, keep it purely client-side, and treat it as a foundational input validation step.
  5. Irreversible Feature Integration: Adding features that touch core contexts/routes makes rollback painful and risky. Best Practice: Use additive-only components with clearly fenced comments and isolated imports to enable instant git revert + file deletion without touching existing architecture.

Deliverables

  • 📐 Blueprint: Client-Side ATS Preview Architecture & Vendor Heuristic Mapping (PDF) — Details the signal computation pipeline, lazy-loading strategy, and zero-dependency integration pattern.
  • ✅ Checklist: Pre-Shipping Audit & Rollback Verification — Validates bundle impact thresholds, metric/UI alignment, vendor rule coverage, and git revert safety procedures.
  • ⚙️ Configuration Templates: Signal Computation Config & Vendor Rule Mapping Table — JSON/TS templates for extending heuristic thresholds, adding new vendor parsers, and configuring pass/warn/error states without modifying core logic.