ilding a production-ready JSON validation pipeline requires separating syntax validation from business logic validation. The goal is to catch structural violations before they reach domain parsers, while maintaining performance and developer ergonomics.
Step 1: Implement a Strict Parser Wrapper
Native JSON.parse() provides minimal error context. Wrapping it with a structured error handler and optional pre-sanitization layer improves observability and recovery.
import { z } from 'zod';
export class JSONValidationError extends Error {
constructor(
message: string,
public readonly rawInput: string,
public readonly position: number
) {
super(message);
this.name = 'JSONValidationError';
}
}
export function parseStrictJSON<T>(input: string, schema?: z.ZodType<T>): T {
let parsed: unknown;
try {
parsed = JSON.parse(input);
} catch (error) {
if (error instanceof SyntaxError) {
const positionMatch = error.message.match(/position\s+(\d+)/i);
const position = positionMatch ? parseInt(positionMatch[1], 10) : -1;
throw new JSONValidationError(
`Invalid JSON structure: ${error.message}`,
input,
position
);
}
throw error;
}
if (schema) {
const result = schema.safeParse(parsed);
if (!result.success) {
throw new JSONValidationError(
`Schema validation failed: ${result.error.message}`,
input,
-1
);
}
return result.data;
}
return parsed as T;
}
Architecture Rationale:
- Separating syntax parsing from schema validation allows you to catch RFC 7159 violations before business rules are evaluated.
- The custom error class preserves the raw payload and approximate failure position, which is critical for logging and alerting in distributed systems.
- Optional Zod integration provides a migration path from strict syntax checking to full contract validation without changing the calling signature.
Step 2: Pre-Validation Sanitization for Common Bleed-Over
When ingesting payloads from legacy systems or developer-generated configs, you can apply a deterministic sanitization pass before parsing. This handles predictable language-specific leaks without compromising spec compliance.
export function sanitizeJSONPayload(raw: string): string {
return raw
.replace(/\/\/.*$/gm, '') // Strip line comments
.replace(/\/\*[\s\S]*?\*\//g, '') // Strip block comments
.replace(/,\s*([}\]])/g, '$1') // Remove trailing commas
.replace(/(?<=:\s*)'([^']*)'/g, '"$1"') // Convert single-quoted values
.replace(/(?<={\s*)(\w+)(?=\s*:)/g, '"$1"') // Quote bare keys
.replace(/\bTrue\b/gi, 'true')
.replace(/\bFalse\b/gi, 'false')
.replace(/\bNone\b/gi, 'null')
.replace(/\bNaN\b/gi, 'null')
.replace(/\bundefined\b/gi, 'null');
}
Why this approach:
- Regex-based sanitization is deterministic and fast. It avoids the overhead of full AST parsing for simple syntax normalization.
- Each replacement targets a specific RFC 7159 violation. The order matters: comments are stripped first to prevent them from interfering with comma or quote detection.
- This is a defensive layer, not a replacement for strict validation. Sanitized payloads should still pass through
parseStrictJSON.
Step 3: Boundary Enforcement in API Gateways
In production, validation must occur at the network edge. Middleware should reject malformed payloads before they consume application resources.
import { Request, Response, NextFunction } from 'express';
export function jsonValidationMiddleware(req: Request, res: Response, next: NextFunction) {
if (req.headers['content-type']?.includes('application/json') && req.body) {
try {
parseStrictJSON(JSON.stringify(req.body));
} catch (error) {
if (error instanceof JSONValidationError) {
return res.status(400).json({
error: 'INVALID_JSON_PAYLOAD',
message: error.message,
hint: 'Ensure strict RFC 7159 compliance. No comments, trailing commas, or single quotes.'
});
}
}
}
next();
}
Design Decision:
- Validating
req.body after Express parses it catches cases where the framework's built-in parser silently fails or truncates.
- Returning structured error responses with actionable hints reduces support tickets and accelerates client-side fixes.
- This middleware can be conditionally applied to public-facing routes while bypassing internal service-to-service communication where payloads are guaranteed.
Pitfall Guide
1. Trailing Comma Tolerance
Explanation: JavaScript and Python allow trailing commas in object and array literals to simplify version control diffs. JSON parsers reject them because the grammar expects a value or closing delimiter immediately after a comma.
Fix: Strip trailing commas during serialization or apply a pre-parse sanitizer. Configure linters to flag trailing commas in .json files.
2. Single-Quote Leakage
Explanation: Developers accustomed to JavaScript or Python string syntax often wrap JSON keys or values in single quotes. The JSON specification requires double quotes for all strings, including keys.
Fix: Enforce double-quote rules in editor settings and CI pipelines. Use automated formatting tools that normalize quote syntax before committing configuration files.
3. Bare Identifier Keys
Explanation: In JavaScript, { username: "admin" } is valid. In JSON, keys must be strings. Unquoted keys cause the parser to interpret them as undefined variables, triggering a token error.
Fix: Always serialize keys as strings. When manually writing JSON, wrap every key in double quotes. Use code generation tools to eliminate manual key formatting.
Explanation: YAML, TOML, and JavaScript support comments. JSON explicitly excludes them. Douglas Crockford removed comment support to prevent configuration files from executing directives or embedding executable logic.
Fix: Remove all comments before parsing. If documentation is required, maintain a separate .md file or use JSONC/JSON5 for developer-facing configs, then compile to strict JSON for production.
5. Case-Sensitive Literals
Explanation: Python uses True, False, None. JavaScript uses true, false, null, undefined. JSON mandates lowercase true, false, null. Mismatched casing causes immediate parse failures.
Fix: Normalize literals during serialization. When consuming third-party payloads, apply a deterministic normalization pass or reject non-compliant data with clear error messaging.
6. Non-Standard Numeric Values
Explanation: NaN, Infinity, and -Infinity are valid JavaScript numbers but are not part of the JSON number type. undefined has no JSON representation. Serializing these values results in null or dropped fields.
Fix: Replace non-numeric values with null or explicit sentinel values (e.g., -1, 0, or a dedicated status field). Validate numeric ranges before serialization.
7. Control Character Escaping
Explanation: JSON strings must escape control characters (\n, \t, \r, \b, \f) and forward slashes if desired. Unescaped control characters break the parser and can introduce injection vulnerabilities.
Fix: Use native JSON.stringify() for serialization, which handles escaping automatically. Never concatenate strings to build JSON payloads manually.
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|
| Public API ingestion | Strict wrapper + boundary middleware | Prevents resource exhaustion from malformed payloads | Low (middleware overhead) |
| Internal service configs | JSONC/JSON5 with compile step | Enables developer comments while enforcing production compliance | Medium (build step) |
| High-throughput data pipelines | Streaming JSON parser + schema validation | Reduces memory footprint and catches errors early | Low (streaming efficiency) |
| Legacy system integration | Pre-sanitization + strict validation | Bridges language-specific syntax gaps safely | Low (regex pass) |
| Developer tooling | Editor linting + pre-commit hooks | Catches violations before code reaches CI | Negligible (local tooling) |
Configuration Template
// json-validation.config.ts
import { z } from 'zod';
import { parseStrictJSON, sanitizeJSONPayload } from './strict-json-parser';
export const telemetrySchema = z.object({
deviceId: z.string().uuid(),
timestamp: z.number().int().positive(),
metrics: z.record(z.string(), z.number()),
tags: z.array(z.string()).optional()
});
export function validateTelemetryPayload(raw: string) {
const sanitized = sanitizeJSONPayload(raw);
return parseStrictJSON(sanitized, telemetrySchema);
}
// eslint.config.mjs (JSON syntax enforcement)
import jsonPlugin from '@eslint/json';
export default [
{
files: ['**/*.json'],
plugins: { json: jsonPlugin },
rules: {
'json/no-trailing-commas': 'error',
'json/no-single-quotes': 'error',
'json/require-quoted-keys': 'error',
'json/valid-json-number': 'error'
}
}
];
Quick Start Guide
- Install dependencies:
npm install zod @eslint/json
- Replace native parsing: Swap
JSON.parse(input) with parseStrictJSON(input, optionalSchema) across your codebase.
- Add linting rules: Integrate the ESLint JSON plugin configuration into your project to catch syntax violations during development.
- Deploy boundary middleware: Attach the validation middleware to your API router to reject non-compliant payloads before they reach business logic.
- Verify with test cases: Run payloads containing trailing commas, single quotes, comments, and Python-style literals through the pipeline to confirm deterministic rejection or sanitization.
Implementing strict JSON compliance at the system boundary eliminates guesswork, reduces runtime failures, and enforces a clear contract between producers and consumers. Treat JSON as a specification, not a suggestion, and your data pipelines will scale with predictable reliability.