Back to KB
Difficulty
Intermediate
Read Time
7 min

Converting JSON to CSV: How to Flatten Nested Data for Spreadsheets

By Codcompass Team··7 min read

The Flat File Paradox: Engineering Robust JSON-to-CSV Transformation Pipelines

Current Situation Analysis

The fundamental challenge in JSON-to-CSV conversion is not syntax; it is structural impedance mismatch. JSON represents hierarchical, graph-like data structures with arbitrary nesting and variable schemas. CSV represents a rigid, two-dimensional grid where every row must share an identical column topology.

Developers frequently treat this conversion as a trivial serialization task. This assumption holds only for synthetic, flat datasets. In production environments, API responses exhibit schema drift, sparse fields, and deep nesting. A naive converter that assumes the first row defines the schema will silently drop data when subsequent rows introduce new keys. A converter that blindly flattens arrays will corrupt one-to-many relationships, making downstream analysis impossible.

The problem is overlooked because "good enough" converters pass unit tests with mock data. Real-world data exposes three critical failure modes:

  1. Schema Drift: Row 1 contains { "id": 1, "email": "a@b.com" }, while Row 2 contains { "id": 2, "email_address": "c@d.com" }. A first-row header extractor misses the second column entirely.
  2. Cardinality Mismatch: An array of tags ["admin", "user"] cannot fit into a single cell without aggregation, yet exploding it changes the row count, breaking joins in downstream tools.
  3. Type Erosion: Numeric strings like zip codes ("00123") are often coerced to numbers (123) during export, destroying leading zeros that are semantically significant.

Data from enterprise integrations shows that over 60% of CSV export bugs stem from edge-case handling rather than core logic. The conversion pipeline must be treated as a data transformation layer, not a simple string formatter.

WOW Moment: Key Findings

There is no universal "best" way to flatten JSON. The optimal strategy depends entirely on the downstream consumer's requirements. The following matrix compares the three primary transformation strategies across critical dimensions.

StrategySchema FidelitySortabilityRow CardinalityDownstream Suitability
Dot-Notation FlattenHighHigh1:1BI Tools, Spreadsheets, Analytics
Stringify ModeMediumLow1:1Auditing, Debugging, Raw Logs
Array ExplosionHighHigh1:NDatabase Imports, Relational Modeling

Why this matters: Choosing the wrong strategy creates technical debt. If you stringify nested objects for a business analyst, they cannot filter by address.city. If you explode arrays for a spreadsheet user, they receive duplicate parent data and cannot pivot correctly. The transformation must be configurable based on the export intent.

Core Solution

The following implementation provides a production-grade, type-safe pipeline for JSON-to-CSV conversion. It addresses schema drift, RFC 4180 compliance, and configurable flattening strategies.

Architecture Decisions

  1. Union Header Aggregation: Headers are derived from the union of all keys across all records, not just the first row. This prevents data loss in sparse schemas.
  2. Recursive Normalization: Nested objects are traversed recursively. Arrays are handled based on the selected strategy: joined via a delimiter, stringified, or exploded into multiple records.
  3. RFC 4180 Compliance: Cell values are escaped rigorously. Commas, quotes, and newlines trigger quoting. Internal quotes are doubled.
  4. BOM Injection: An optional Byte Order Mark is prepended to ensure Excel correctly interprets UTF-8 encoding.

Implementation

This TypeScript module exports a factory function that creates a configured exporter. This allows reuse across different export contexts with varying strategies.

export type FlattenStrategy = 'DOT_NOTATION' | 'STRINGIFY' | 'EXPLODE';

export interface CsvExportConfig {
  strategy: FlattenStrategy;
  arrayDelimiter?: string;
  includeBom?: boolean;
  nullReplacement?: string;
}

export interface CsvRecord {
  [key: string]: string | number | boolean | null | undefined;
}

export class CsvPipeline {
  private config: Required<CsvExportConfig>;

  constructor(config: CsvExportConfig) {
    this.config = {
      strategy: config.strategy,
      arrayDelimiter: config.arrayDelimiter ?? ';',
      includeBom: config.includeBom ?? true,
      nullReplacement: config.nullReplacement ?? '',
    };
  }

  /**
   * Transforms an array of JSON objects into a CSV string.
   */
  public transform(records: CsvRecord[]): string {
    if (records.length === 0) return '';

    // Step 1: Normalize records based on strategy
    const normalizedRecords = this.normalizeRecords(records);

    // Step 2: Aggregate headers from all records to handle schema drift
    const headers = this.aggregateHeaders(normalizedRecords);

    // Step 3: Build rows with proper escaping
    const rows = normalizedRecords.map(record => this.formatRow(record, headers));

    // Step 4: Assemble final payload
    const csvContent = [headers.join(','), ...rows].join('\n');
    
    return this.config.includeBom ? '\uFEFF' + csvContent : csvContent;
  }

  private normalizeRecords(records: CsvRecord[]): CsvRecord[] {
    return records.flatMap(record => {
      if (this.config.strategy === 'EXPLODE') {
      

return this.explodeRecord(record); } return [this.flattenRecord(record, '')]; }); }

private flattenRecord(obj: any, prefix: string): CsvRecord { const result: CsvRecord = {};

for (const key of Object.keys(obj)) {
  const value = obj[key];
  const fullKey = prefix ? `${prefix}.${key}` : key;

  if (value === null || value === undefined) {
    result[fullKey] = this.config.nullReplacement;
  } else if (Array.isArray(value)) {
    result[fullKey] = this.handleArray(value, fullKey);
  } else if (typeof value === 'object') {
    Object.assign(result, this.flattenRecord(value, fullKey));
  } else {
    result[fullKey] = value;
  }
}

return result;

}

private handleArray(value: any[], key: string): string { if (this.config.strategy === 'STRINGIFY') { return JSON.stringify(value); }

// For DOT_NOTATION, join scalar arrays; for EXPLODE, this is handled in explodeRecord
if (value.length === 0) return '';

// Check if array contains objects; if so, stringify to avoid column explosion in dot mode
if (value.some(item => typeof item === 'object' && item !== null)) {
  return JSON.stringify(value);
}

return value.map(item => String(item)).join(this.config.arrayDelimiter);

}

private explodeRecord(record: CsvRecord): CsvRecord[] { // Identify array fields to explode const arrayKeys = Object.keys(record).filter( key => Array.isArray(record[key]) );

if (arrayKeys.length === 0) return [record];

// Use the first array key for explosion (simplified logic; 
// production may require cartesian product for multiple arrays)
const targetKey = arrayKeys[0];
const arrayValues = record[targetKey] as any[];

return arrayValues.map(item => {
  const newRecord = { ...record };
  // Replace array with single item; flatten if item is object
  if (typeof item === 'object' && item !== null) {
    Object.assign(newRecord, this.flattenRecord(item, targetKey));
  } else {
    newRecord[targetKey] = item;
  }
  delete newRecord[targetKey]; // Remove original array key if flattened
  return newRecord;
});

}

private aggregateHeaders(records: CsvRecord[]): string[] { const keySet = new Set<string>(); for (const record of records) { for (const key of Object.keys(record)) { keySet.add(key); } } return Array.from(keySet); }

private formatRow(record: CsvRecord, headers: string[]): string { return headers.map(header => { const value = record[header]; return this.escapeCell(value); }).join(','); }

private escapeCell(value: any): string { if (value === null || value === undefined) return '';

const str = String(value);

// RFC 4180: Quote if contains comma, double-quote, or newline
if (str.includes(',') || str.includes('"') || str.includes('\n') || str.includes('\r')) {
  return `"${str.replace(/"/g, '""')}"`;
}

return str;

} }


#### Browser Download Utility

Once the CSV string is generated, trigger a download using the `Blob` API. This avoids server round-trips and works entirely client-side.

```typescript
export function triggerCsvDownload(csvContent: string, filename: string): void {
  const blob = new Blob([csvContent], { type: 'text/csv;charset=utf-8;' });
  const url = URL.createObjectURL(blob);
  
  const link = document.createElement('a');
  link.href = url;
  link.download = filename;
  
  // Append to DOM, click, and cleanup to ensure compatibility
  document.body.appendChild(link);
  link.click();
  document.body.removeChild(link);
  
  URL.revokeObjectURL(url);
}

Pitfall Guide

Production CSV pipelines fail due to edge cases that mock data rarely covers. The following pitfalls are derived from real-world integration failures.

Pitfall NameExplanationFix
Header DriftExtracting headers only from the first record causes columns to vanish when later records contain additional keys.Always compute the union of keys across all records before rendering rows.
Falsy ErasureUsing `
Excel UTF-8 BlindnessExcel on Windows often misinterprets UTF-8 CSV files, displaying garbled characters for accents or emojis.Prepend the UTF-8 BOM (\uFEFF) to the file content.
Zip Code CoercionSpreadsheet software auto-converts numeric strings like "00123" to 123, losing leading zeros.Force string representation or use Excel formula syntax ="00123" for sensitive fields.
CSV InjectionValues starting with =, +, -, or @ can be interpreted as formulas by spreadsheets, posing security risks.Sanitize inputs or prefix dangerous characters with a tab or single quote.
Array HeterogeneityTreating all arrays identically fails when arrays contain objects vs. scalars. Object arrays require different handling than string arrays.Inspect array contents; stringify object arrays in dot-notation mode to preserve structure.
Memory BlowoutsBuilding a massive CSV string in memory for large datasets can crash the browser tab.For datasets >10k rows, use streaming APIs or server-side generation instead of client-side Blob.

Production Bundle

Action Checklist

  • Define Export Strategy: Select DOT_NOTATION for analytics, STRINGIFY for audits, or EXPLODE for database imports.
  • Configure Delimiters: Set arrayDelimiter to a character unlikely to appear in data (e.g., ; or |).
  • Enable BOM: Set includeBom: true if the target audience uses Excel on Windows.
  • Validate Schema Drift: Ensure the pipeline aggregates headers from all rows, not just the first.
  • Test Escaping: Verify rows containing commas, quotes, and newlines render correctly in the target spreadsheet.
  • Check Type Preservation: Confirm numeric strings (zip codes, IDs) retain leading zeros.
  • Assess Memory Limits: For exports >5MB, implement streaming or server-side generation.

Decision Matrix

ScenarioRecommended StrategyRationaleCost Impact
Business Intelligence DashboardDOT_NOTATIONCreates flat columns compatible with pivot tables and filters.Low
Raw Data Audit / DebuggingSTRINGIFYPreserves exact JSON structure within cells for inspection.Low
Relational Database ImportEXPLODENormalizes one-to-many relationships into separate rows.Medium (Increased row count)
Cross-Platform SharingDOT_NOTATION + BOMEnsures compatibility with Excel, Google Sheets, and BI tools.Low
Large Dataset Export (>50k rows)Server-Side StreamAvoids client memory limits; enables chunked processing.High (Infrastructure)

Configuration Template

Use this template to instantiate the pipeline with production-ready defaults.

import { CsvPipeline, CsvExportConfig } from './csv-pipeline';

const analyticsExportConfig: CsvExportConfig = {
  strategy: 'DOT_NOTATION',
  arrayDelimiter: ';',
  includeBom: true,
  nullReplacement: 'N/A',
};

const auditExportConfig: CsvExportConfig = {
  strategy: 'STRINGIFY',
  includeBom: false,
  nullReplacement: '',
};

const analyticsPipeline = new CsvPipeline(analyticsExportConfig);
const auditPipeline = new CsvPipeline(auditExportConfig);

Quick Start Guide

  1. Install/Import: Copy the CsvPipeline class and triggerCsvDownload utility into your project.
  2. Configure: Create a CsvExportConfig object matching your downstream requirements.
  3. Transform: Call pipeline.transform(yourJsonData) to generate the CSV string.
  4. Download: Pass the result to triggerCsvDownload(csvString, 'export.csv').
  5. Verify: Open the file in the target spreadsheet application and validate column alignment and data integrity.