Developer ToolsMay 22, 2026

JSON and CSV: Converting Between Formats for Analysis and Import

JSON and CSV serve different purposes — and converting between them correctly requires handling nested objects, arrays, special characters, and encoding. This guide covers the conversion mechanics, common failure points, and the tools that handle edge cases automatically.

Olivia Bennett
Olivia Bennett · Full-stack Developer
Full-stack developer focused on developer tooling and web performance. Writes about the formats, patterns, and shortcuts devs reach for every day.
jsoncsvdataconversionparsingetlspreadsheet

JSON and CSV are the two most common data interchange formats in software development. APIs return JSON. Spreadsheets import CSV. Databases export both. When you need to move data between systems — from an API response to a spreadsheet, from a database export to a data pipeline, from a form submission to a reporting tool — you almost always need to convert between these formats.

Converting between JSON and CSV is conceptually simple and practically tricky. The formats have fundamentally different shapes: JSON is hierarchical, CSV is flat. The conversion always involves a mapping decision — and if you do not make that decision explicitly, tools will make it for you in ways that may silently lose data.

This guide covers how both formats work, how conversion works in both directions, what can go wrong, and how to handle the edge cases that break naive converters.


JSON: Hierarchical, Schema-Optional

JSON (JavaScript Object Notation) represents data as a hierarchy of objects, arrays, strings, numbers, booleans, and nulls:

[
  {
    "id": "u_123",
    "name": "Alice Chen",
    "email": "alice@example.com",
    "role": "admin",
    "profile": {
      "timezone": "UTC+8",
      "language": "en"
    },
    "tags": ["beta", "enterprise"]
  },
  {
    "id": "u_456",
    "name": "Bob Kowalski",
    "email": "bob@example.com",
    "role": "viewer",
    "profile": {
      "timezone": "UTC+1",
      "language": "pl"
    },
    "tags": ["enterprise"]
  }
]

JSON arrays of objects — like the example above — are the natural input for CSV conversion. Each object maps to a row, each key maps to a column.


CSV: Flat, Column-Ordered

CSV (Comma-Separated Values) represents tabular data as rows of values separated by delimiters:

id,name,email,role,timezone,language,tags
u_123,Alice Chen,alice@example.com,admin,UTC+8,en,"beta,enterprise"
u_456,Bob Kowalski,bob@example.com,viewer,UTC+1,pl,enterprise

CSV is defined by RFC 4180, though implementations vary significantly. The rules that matter:

  1. Each record is on a separate line (CRLF line endings in the spec, LF in practice)
  2. The first record may be a header row (usually is)
  3. Fields are separated by commas (or semicolons in European locales, tabs in TSV)
  4. Fields containing commas, double quotes, or line breaks must be enclosed in double quotes
  5. A double quote inside a quoted field is escaped by doubling it: "She said ""hello"""

JSON to CSV: The Flattening Problem

Converting JSON to CSV requires mapping a hierarchy to a table. For simple objects (one level deep, no arrays), this is straightforward: each key becomes a column header, each value becomes a cell.

The problem is nested objects and arrays.

Nested objects

{
  "id": "u_123",
  "profile": {
    "timezone": "UTC+8",
    "language": "en"
  }
}

Option 1: Flatten with dot notation

id,profile.timezone,profile.language
u_123,UTC+8,en

The nested object becomes columns with dot-separated names. This is the most common approach — readable, reversible, and compatible with most spreadsheet tools.

Option 2: JSON-encode the nested object

id,profile
u_123,"{""timezone"":""UTC+8"",""language"":""en""}"

The nested object is serialised as a JSON string in the cell. Preserves structure but the cell is unreadable in a spreadsheet and requires parsing on import.

Option 3: Ignore nested objects

Some converters simply omit nested objects. This silently loses data — a dangerous default.

Arrays

{
  "id": "u_123",
  "tags": ["beta", "enterprise"]
}

Option 1: JSON-encode the array

id,tags
u_123,"[""beta"",""enterprise""]"

Option 2: Comma-join the values

id,tags
u_123,"beta,enterprise"

The joined value in a quoted cell is readable in a spreadsheet but loses the distinction between a multi-value field and a string that contains a comma.

Option 3: Expand to multiple columns

id,tags[0],tags[1]
u_123,beta,enterprise

Works if all arrays have the same length. Breaks down with variable-length arrays.

Option 4: Expand to multiple rows

id,tag
u_123,beta
u_123,enterprise

Relational normalisation — each tag gets its own row. Requires the consumer to understand the relationship and group by ID.

None of these options is universally correct. The right choice depends on what the target system expects. Most CSV converters default to option 1 or 2. Use the JSON to CSV tool to see exactly how conversion is applied and adjust the output to match your requirements.


CSV to JSON: The Typing Problem

Converting CSV to JSON is simpler in one direction — everything starts as a string, and you always get an array of objects. But CSV gives you no type information, so every value arrives as a string:

id,age,active,score
1,25,true,9.5

Naive conversion:

[
  {
    "id": "1",
    "age": "25",
    "active": "true",
    "score": "9.5"
  }
]

Everything is a string. Code that expects user.age > 18 will fail, because "25" > 18 is unreliable in JavaScript (coercion-dependent) and throws a type error in strongly-typed languages.

Type inference solves this by inspecting values and converting where unambiguous:

[
  {
    "id": 1,
    "age": 25,
    "active": true,
    "score": 9.5
  }
]

Type inference rules that are usually safe:

  • If all values in a column parse as integers, convert to integer
  • If all values parse as floats, convert to float
  • If all values are true or false (case-insensitive), convert to boolean
  • If a value is empty, convert to null (not "")
  • Everything else stays as string

Use the CSV to JSON tool to inspect how type inference is applied to your specific data before piping it into production code.


Handling Special Characters

Both formats have characters that require escaping:

In CSV

Commas in values — the value must be quoted:

name,address
"Smith, John","123 Main St, Springfield"

Quotes in values — doubled inside a quoted field:

name,quote
Alice,"She said ""Hello, world!"""

Newlines in values — the value must be quoted (the record spans multiple lines):

id,description
1,"Line one
Line two
Line three"

This is valid CSV but breaks parsers that assume one record per line. If you control the data, sanitise newlines before generating CSV.

Special characters to watch:

Character Problem Fix
, Delimiter confusion Quote the field
" Unescaped quote Double it: ""
\n, \r\n Multi-line record Quote the field
; Some locales use semicolons as delimiters Know your delimiter
BOM () Excel adds UTF-8 BOM, some parsers choke Strip or handle BOM

In JSON

JSON has its own escaping requirements inside strings:

Character Escaped form
" \"
\ \\
Newline \n
Tab \t
Unicode \uXXXX

When embedding CSV content as a JSON string value, both layers of escaping apply:

{
  "csv": "name,quote\nAlice,\"She said \\\"hello\\\"\""
}

Delimiter Variants

CSV uses commas by default, but real-world files use several delimiters:

Format Delimiter When you see it
CSV , Most tools, English locales
DSV / semicolon-separated ; European locales (comma is decimal separator)
TSV \t (tab) Database exports, log files
Pipe-separated | Legacy systems, when values contain commas

Excel in European locales (French, German, Dutch) generates semicolon-separated files when saving as CSV, because the comma is the decimal separator in those locales. If you receive a "CSV" file that looks wrong in a comma-based parser, try semicolon.


Encoding: UTF-8 and the Excel Problem

Modern CSV should be UTF-8. But Excel has a decades-long habit of:

  1. Not adding a UTF-8 BOM when saving CSV from non-Latin input
  2. Adding a UTF-8 BOM (\xEF\xBB\xBF) when explicitly saving as "UTF-8 CSV"
  3. Defaulting to the system encoding (CP-1252 on Windows, Mac Roman on older macOS) when saving from legacy workflows

The result: CSV files with non-Latin characters (Chinese, Arabic, Cyrillic, accented European characters) that look correct in Excel but display as garbage when read with a UTF-8 parser — or vice versa.

Diagnosis: open the file in a hex editor or run file -i yourfile.csv (Linux/macOS). Look for the byte sequence EF BB BF at the start (UTF-8 BOM) or check the declared encoding.

Fix on read:

// Node.js — strip BOM if present
const content = fs.readFileSync('data.csv', 'utf8').replace(/^/, '');
# Python — use utf-8-sig to auto-strip BOM
with open('data.csv', encoding='utf-8-sig') as f:
    reader = csv.DictReader(f)

Fix on write (for Excel compatibility):

// Add BOM so Excel opens the file correctly
const csvWithBom = '' + csvContent;
fs.writeFileSync('output.csv', csvWithBom, 'utf8');

Large Files: Streaming vs. Loading

For small files (< 50 MB), loading the entire content into memory and converting is fine. For large files, streaming is necessary.

Streaming CSV to JSON in Node.js

const fs = require('fs');
const csv = require('csv-parser');  // npm install csv-parser

const results = [];

fs.createReadStream('large-file.csv')
  .pipe(csv())
  .on('data', (row) => {
    results.push(row);
    // Or process row-by-row without accumulating all rows:
    // processRow(row);
  })
  .on('end', () => {
    console.log(`Processed ${results.length} rows`);
  });

Streaming CSV to JSON in Python

import csv
import json

with open('large-file.csv', encoding='utf-8-sig') as csvfile:
    reader = csv.DictReader(csvfile)
    with open('output.ndjson', 'w') as outfile:
        for row in reader:
            # Write one JSON object per line (NDJSON/JSON Lines format)
            outfile.write(json.dumps(row) + '\n')

NDJSON (Newline-Delimited JSON) is a streaming-friendly alternative to JSON arrays — each line is a complete JSON object, so you can process files line-by-line without parsing the entire file.


Common Conversion Failures

Empty arrays produce no columns. If the first object in your JSON array has an empty tags: [], the CSV may have no tags column. Later objects with non-empty arrays have no column to write to.

Inconsistent schemas. If object 1 has {"a": 1, "b": 2} and object 2 has {"a": 1, "c": 3}, the CSV needs columns for both b and c. Some converters only use the keys from the first object and silently drop c on row 2.

Null vs missing. {"name": null} and {} (missing key) produce the same empty cell in CSV. On the return trip, both come back as empty strings. You lose the distinction.

Numbers stored as strings in Excel. Excel applies number formatting to cells that look like numbers. A value like "001" (padded ID) becomes 1 when Excel opens the CSV. Prefix with a single quote in Excel, or encode the value with a non-numeric character.

Date formatting. CSV has no native date type. 2026-05-22 is a string — if you write it as 22/05/2026 in one file and 05/22/2026 in another, any parser that tries to infer dates will fail silently on one or both.

Use the CSV to JSON and JSON to CSV tools to validate conversion with your real data before writing any code — it is much faster to catch these issues interactively than to debug them after the fact.


Programmatic Conversion Reference

JavaScript / Node.js

// JSON array → CSV string
function jsonToCsv(data) {
  if (!data.length) return '';
  const headers = Object.keys(data[0]);
  const rows = data.map(row =>
    headers.map(h => {
      const val = row[h] ?? '';
      const str = typeof val === 'object' ? JSON.stringify(val) : String(val);
      return str.includes(',') || str.includes('"') || str.includes('\n')
        ? `"${str.replace(/"/g, '""')}"` : str;
    }).join(',')
  );
  return [headers.join(','), ...rows].join('\n');
}

// CSV string → JSON array
function csvToJson(csv) {
  const lines = csv.trim().split('\n');
  const headers = lines[0].split(',');
  return lines.slice(1).map(line => {
    const values = line.split(',');
    return Object.fromEntries(headers.map((h, i) => [h.trim(), values[i]?.trim() ?? '']));
  });
}

Python

import csv, json, io

def json_to_csv(data: list[dict]) -> str:
    if not data: return ''
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=data[0].keys())
    writer.writeheader()
    writer.writerows(data)
    return output.getvalue()

def csv_to_json(csv_content: str) -> list[dict]:
    reader = csv.DictReader(io.StringIO(csv_content))
    return list(reader)

The Python csv module handles all RFC 4180 quoting and escaping automatically. For production code, prefer the standard library over manual string splitting.

← All guides