# Implementing JSON to CSV Conversion: From Data Structure to Table Format
I was recently working on a data export feature that converts API JSON responses to CSV for users to download. I expected it to be simple string concatenation, but I ran into quite a few pitfalls: quote escaping, special characters, encoding issues. So I decided to document the details.
## CSV Format Specification (RFC 4180)
Before writing code, let’s understand the CSV rules:
- Field Delimiter: usually a comma `,`, but can also be a tab `\t`, semicolon `;`, or pipe `|`
- Quote Wrapping: if a field contains the delimiter, a newline, or a double quote, it must be wrapped in double quotes
- Quote Escaping: a double quote inside a field is escaped by doubling it: `""`
- Optional Header: the first row may contain the field names
For example:
```csv
name,description,price
iPhone,"15 Pro, 256GB",7999
MacBook,"Pro 14"", long battery",14999
```
Notice that the description field in the second row contains a comma (so it is quoted), and the inch mark in the third row is escaped as `""`.
## Core Algorithm for JSON to CSV

### Step 1: Extract All Column Names
In a JSON array, each object may contain different fields, so we need to merge all key names:
```typescript
const rows = [
  { name: 'Alice', age: 25 },
  { name: 'Bob', age: 30, city: 'New York' }
]

// Get all columns
const columns = Array.from(new Set(rows.flatMap(Object.keys)))
// ['name', 'age', 'city']
```
The `flatMap` + `Set` combination ensures column names are unique and include every field.
### Step 2: Process Each Row
Iterate through each row, retrieve values by column order, and handle missing fields:
```typescript
let csv = ''

rows.forEach(row => {
  const values = columns.map(col => {
    const val = row[col]
    // Missing field becomes an empty string
    if (val === null || val === undefined) {
      return ''
    }
    // Nested objects are serialized to a JSON string
    if (typeof val === 'object') {
      return JSON.stringify(val)
    }
    return String(val)
  })
  csv += values.join(',') + '\n'
})
```
One detail here: nested objects are serialized to JSON strings to avoid structure loss.
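That per-cell logic can be pulled into a small standalone helper. Here is a sketch (the name `toCell` is mine, not from the code above):

```typescript
// Sketch of the per-cell conversion described above.
// `toCell` is a hypothetical helper name, not part of the original code.
function toCell(val: unknown): string {
  if (val === null || val === undefined) return ''        // missing field -> empty string
  if (typeof val === 'object') return JSON.stringify(val) // nested structure -> JSON text
  return String(val)                                      // primitives -> plain text
}

console.log(toCell(undefined))            // ''
console.log(toCell({ city: 'New York' })) // '{"city":"New York"}'
console.log(toCell(42))                   // '42'
```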
### Step 3: Quote Handling (Most Critical)
CSV quoting rules seem simple, but implementation requires handling multiple cases:
```typescript
function quoteField(str: string, delimiter: string): string {
  // No special characters: return as-is
  if (!str.includes(delimiter) && !str.includes('\n') && !str.includes('"')) {
    return str
  }
  // Contains special characters: wrap in quotes
  // and escape embedded " as ""
  return `"${str.replace(/"/g, '""')}"`
}
```
Test some edge cases:
```typescript
quoteField('hello', ',')        // 'hello' (no special chars)
quoteField('hello,world', ',')  // '"hello,world"' (contains delimiter)
quoteField('say "hi"', ',')     // '"say ""hi"""' (contains quote)
quoteField('line1\nline2', ',') // '"line1\nline2"' (contains newline)
```
### Step 4: Encoding Handling

When a CSV is opened in Excel, non-ASCII text (such as Chinese) may appear garbled. The fix is to prepend a BOM (Byte Order Mark):
```typescript
const blob = new Blob(['\uFEFF' + csvContent], {
  type: 'text/csv;charset=utf-8'
})
```
`'\uFEFF'` is the BOM character; prepending it tells Excel the file is UTF-8 encoded.
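As a quick sanity check, the BOM character U+FEFF encodes to the three bytes `EF BB BF` in UTF-8 (`TextEncoder` is available in both browsers and Node.js):

```typescript
const bom = new TextEncoder().encode('\uFEFF')
console.log(Array.from(bom).map(b => b.toString(16))) // ['ef', 'bb', 'bf']

// Prepending it leaves it as the first code unit of the string:
const csvContent = 'name,city\nAlice,Beijing\n'
const withBom = '\uFEFF' + csvContent
console.log(withBom.charCodeAt(0) === 0xfeff) // true
```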
## Complete Implementation
Combining the above logic, the complete conversion function:
```typescript
function jsonToCsv(
  data: any,  // accepts an array or a single object
  options: {
    delimiter?: string
    includeHeader?: boolean
    quoteFields?: boolean
  } = {}
): string {
  const {
    delimiter = ',',
    includeHeader = true,
    quoteFields = true
  } = options

  // Normalize to an array
  const rows = Array.isArray(data) ? data : [data]
  if (rows.length === 0) return ''

  // Extract all column names
  const columns = Array.from(new Set(rows.flatMap(Object.keys)))

  let csv = ''

  // Header row
  if (includeHeader) {
    csv += columns.map(col => quoteField(col, delimiter, quoteFields)).join(delimiter) + '\n'
  }

  // Data rows
  rows.forEach(row => {
    const values = columns.map(col => {
      const val = row[col]
      if (val === null || val === undefined) {
        return quoteField('', delimiter, quoteFields)
      }
      // Dates first: typeof reports 'object' for a Date, and the generic
      // JSON.stringify branch would produce a *quoted* ISO string
      if (val instanceof Date) {
        return quoteField(val.toISOString(), delimiter, quoteFields)
      }
      if (typeof val === 'object') {
        return quoteField(JSON.stringify(val), delimiter, quoteFields)
      }
      return quoteField(String(val), delimiter, quoteFields)
    })
    csv += values.join(delimiter) + '\n'
  })

  return csv
}

function quoteField(
  str: string,
  delimiter: string,
  shouldQuote: boolean
): string {
  if (!shouldQuote) return str
  // Wrap when the field contains the delimiter, a newline, or a quote
  if (str.includes(delimiter) || str.includes('\n') || str.includes('"')) {
    return `"${str.replace(/"/g, '""')}"`
  }
  return str
}
```
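To see how the `delimiter` option interacts with quoting, here is `quoteField` (reproduced from above) exercised with a semicolon delimiter:

```typescript
function quoteField(str: string, delimiter: string, shouldQuote: boolean): string {
  if (!shouldQuote) return str
  if (str.includes(delimiter) || str.includes('\n') || str.includes('"')) {
    return `"${str.replace(/"/g, '""')}"`
  }
  return str
}

// With a semicolon delimiter, commas no longer trigger quoting...
console.log(quoteField('hello,world', ';', true)) // 'hello,world'
// ...but semicolons do:
console.log(quoteField('a;b', ';', true))         // '"a;b"'
// quoteFields: false skips quoting entirely (the caller takes responsibility):
console.log(quoteField('a;b', ';', false))        // 'a;b'
```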
## Performance Optimization: Handling Large Files
When JSON arrays contain tens of thousands of records, string concatenation performance degrades. Optimization approaches:
### 1. Use Arrays Instead of String Concatenation
```typescript
const lines: string[] = []

if (includeHeader) {
  // Note: columns.map(quote) would pass the array index as quote's
  // second argument, so use an explicit arrow function
  lines.push(columns.map(col => quote(col)).join(delimiter))
}

rows.forEach(row => {
  // ?? '' keeps missing fields from becoming the string 'undefined'
  lines.push(columns.map(col => quote(String(row[col] ?? ''))).join(delimiter))
})

return lines.join('\n')
```
Pushing to an array and joining once avoids repeatedly building ever-longer intermediate strings, which is typically much faster for large outputs.
### 2. Chunked Processing
For very large files (100MB+), consider chunked generation:
```typescript
async function jsonToCsvChunked(
  data: any[],
  chunkSize: number = 10000
): Promise<string> {
  const chunks: string[] = []

  for (let i = 0; i < data.length; i += chunkSize) {
    const chunk = data.slice(i, i + chunkSize)
    chunks.push(jsonToCsv(chunk, { includeHeader: i === 0 }))
    // Yield to the event loop so the UI stays responsive
    await new Promise(resolve => setTimeout(resolve, 0))
  }

  return chunks.join('')
}
```
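One further option (an assumption on my part, not from the original code): since the `Blob` constructor accepts an array of string parts, the chunks can be handed to it directly, skipping the final `join('')` and the giant intermediate string:

```typescript
// Sketch: pass CSV chunks straight to Blob instead of joining them first.
// In practice `chunks` would be filled by a chunked converter;
// here it holds small sample data.
const chunks: string[] = ['name,age\n', 'Alice,25\n', 'Bob,30\n']
const blob = new Blob(['\uFEFF', ...chunks], { type: 'text/csv;charset=utf-8' })
console.log(blob.size) // 28: 3-byte BOM + 25 bytes of CSV text
```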
### 3. Web Worker Background Processing
Put conversion logic in a Web Worker to avoid UI freezing:
```typescript
// worker.ts
self.onmessage = (e) => {
  const csv = jsonToCsv(e.data)
  self.postMessage(csv)
}
```

```typescript
// main.tsx
const worker = new Worker('worker.ts')
worker.postMessage(largeJsonArray)
worker.onmessage = (e) => {
  downloadCsv(e.data)
}
```
## Edge Cases

### 1. Empty Array

```typescript
jsonToCsv([]) // Returns an empty string
```
### 2. Single Object (Non-Array)

```typescript
jsonToCsv({ name: 'Alice', age: 25 })
// name,age
// Alice,25
```

A non-array input is automatically wrapped in an array before processing.
### 3. Field Value Is an Array

```typescript
const data = [
  { name: 'Alice', tags: ['frontend', 'React'] }
]
// name,tags
// Alice,"[""frontend"",""React""]"
```

The array is serialized to a JSON string, and the embedded quotes are escaped.
### 4. Field Value Is a Date Object

```typescript
const data = [
  { name: 'Alice', birthday: new Date('1990-01-01') }
]
// name,birthday
// Alice,1990-01-01T00:00:00.000Z
```

Because `typeof` reports `'object'` for a `Date`, the generic `JSON.stringify` branch would emit a quoted ISO string; checking `val instanceof Date` first and calling `toISOString()` yields the bare ISO-8601 timestamp shown above.
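Worth spelling out: `typeof` reports `'object'` for a `Date`, so a generic `JSON.stringify` produces a quoted ISO string, while an `instanceof Date` check avoids that. A minimal sketch (the helper name `dateAwareCell` is mine):

```typescript
const d = new Date('1990-01-01')
console.log(typeof d)          // 'object'
console.log(JSON.stringify(d)) // '"1990-01-01T00:00:00.000Z"' (note the extra quotes)

// Checking instanceof Date before the generic object branch
// returns the bare ISO string instead:
function dateAwareCell(val: unknown): string {
  if (val instanceof Date) return val.toISOString()
  if (typeof val === 'object' && val !== null) return JSON.stringify(val)
  return String(val)
}
console.log(dateAwareCell(d)) // '1990-01-01T00:00:00.000Z'
```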
## Final Result
Based on the above approach, I built an online tool: JSON to CSV Converter
Key features:
- Custom delimiters (comma, tab, semicolon, pipe)
- Optional header row
- Optional field quoting
- Chinese support (UTF-8 BOM)
- One-click CSV download
The algorithm itself isn’t complex, but handling all the details requires considering many edge cases. Hope this helps!
Related Tools: JSON Formatter | CSV to JSON | JSON Diff Checker