JSON Compression: From stringify to Network Transmission Optimization

Recently, while optimizing API performance, I noticed the JSON responses were surprisingly large. Just a few records, yet the response was dozens of KB. Looking closer, it was all line breaks and indentation spaces. I built a JSON compression tool and documented the implementation details.

The Core: One Line of Code

const compressed = JSON.stringify(JSON.parse(input))

When JSON.stringify receives no third argument (the space parameter), it emits compact output: no line breaks, no indentation, no extra spaces.
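
For example (the object here is just an illustration):

const data = { name: "Alice", tags: ["x", "y"] }

JSON.stringify(data, null, 2)  // formatted: line breaks plus 2-space indentation
JSON.stringify(data)           // compact: '{"name":"Alice","tags":["x","y"]}'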

But building a usable tool requires handling more details.

Character Encoding Trap: Blob vs string.length

The first pitfall in calculating compression ratio is string length.

const json = '{"name":"张三"}'
console.log(json.length)  // 13 (UTF-16 code units)
console.log(new Blob([json]).size)  // 17 (UTF-8 bytes)

JavaScript’s string.length returns UTF-16 code units, not bytes. Chinese characters take 3 bytes in UTF-8, so string.length underestimates actual size.

The correct approach uses Blob to calculate UTF-8 bytes:

function calculateBytes(str: string): number {
  return new Blob([str]).size
}

// Compression ratio calculation
const originalBytes = new Blob([input]).size
const compressedBytes = new Blob([compressed]).size
const savedPercentage = ((1 - compressedBytes / originalBytes) * 100).toFixed(1)
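
If Blob is unavailable (some non-browser runtimes), TextEncoder gives the same UTF-8 byte count and is also standard in Node.js; a sketch (the function name is mine):

function calculateBytesViaEncoder(str: string): number {
  // TextEncoder always encodes to UTF-8
  return new TextEncoder().encode(str).length
}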

Error Position: From Character to Line/Column

When JSON.parse fails, the SyntaxError it throws reports a character position that means nothing to users:

Unexpected token } in JSON at position 45

Convert it to line and column:

function getLineAndColumn(input: string, position: number) {
  const lines = input.substring(0, position).split('\n')
  return {
    line: lines.length,
    column: lines[lines.length - 1].length + 1
  }
}

Take the substring before the error position, split by newlines. Line number = array length, column number = last line’s character count + 1.
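
The position still has to be pulled out of the engine's error message first. A sketch, assuming a V8-style "at position N" message (other engines word their errors differently, so the regex here is an assumption):

function describeParseError(input: string, e: SyntaxError): string {
  const match = /position (\d+)/.exec(e.message)  // V8-style message format
  if (!match) return e.message  // unrecognized format: pass the message through
  const { line, column } = getLineAndColumn(input, Number(match[1]))
  return `${e.message} (line ${line}, column ${column})`
}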

Why Not Just Use Regex to Remove Whitespace?

You might think: why not just regex replace?

// Wrong approach
const compressed = input
  .replace(/\s+/g, '')  // Remove all whitespace
  .replace(/,\s*}/g, '}')  // Handle trailing commas
  .replace(/,\s*]/g, ']')

Several problems with this:

  1. Spaces inside strings get deleted: {"text": "Hello World"} becomes {"text":"HelloWorld"}
  2. Escape handling is complex: in the source text, the \n in "Line1\nLine2" is two characters (a backslash and an n), not a real newline, and telling string content from structure means tracking escapes like \" correctly
  3. Endless edge cases: Unicode whitespace, zero-width spaces, non-breaking spaces…
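
Problem 1 alone is disqualifying:

const broken = '{"text": "Hello World"}'.replace(/\s+/g, '')
console.log(broken)  // prints {"text":"HelloWorld"}: the value itself was corrupted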

The correct approach is parse + stringify, letting the engine handle all edge cases:

type JsonResult = { data?: string; error?: string }

function compressJson(input: string): JsonResult {
  try {
    const parsed = JSON.parse(input)
    const compressed = JSON.stringify(parsed)
    return { data: compressed }
  } catch (e) {
    return { error: `JSON parse error: ${(e as Error).message}` }
  }
}
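
Usage:

const result = compressJson('{ "a": 1 }')
if (result.error) {
  console.error(result.error)
} else {
  console.log(result.data)  // {"a":1}
}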

Compression Effectiveness Analysis

When does compression work best?

High Effectiveness Scenarios

  • Formatted JSON: 2 or 4 space indentation, 30%-50% compression
  • JSON with comments (requires preprocessing to remove)
  • Deeply nested structures: Each level's indentation accumulates (a quick demonstration follows this list)
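
To see the accumulation, build a ten-level-deep object (illustrative numbers only):

// Build {"a":{"a":{...}}} ten levels deep, then compare output sizes
let nested: unknown = 1
for (let i = 0; i < 10; i++) nested = { a: nested }

const pretty = JSON.stringify(nested, null, 2)
const compact = JSON.stringify(nested)
console.log(pretty.length, compact.length)  // the formatted version is several times larger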

Limited Effectiveness Scenarios

  • Already compressed JSON: No further compression possible
  • Large string content: Strings themselves can’t be compressed

Real-world measurements:

Data Type                | Original Size | Compressed | Saved
-------------------------|---------------|------------|------
Formatted config file    | 15.2 KB       | 8.1 KB     | 46.7%
API response (formatted) | 42.3 KB       | 24.6 KB    | 41.8%
Already compressed JSON  | 12.4 KB       | 12.4 KB    | 0%

Performance Optimization: Large Files

Once a JSON payload reaches several megabytes, JSON.parse and JSON.stringify can block the main thread long enough to freeze the UI. Solutions:

Web Worker Background Processing

// worker.ts
self.onmessage = (e: MessageEvent<string>) => {
  try {
    const parsed = JSON.parse(e.data)
    const compressed = JSON.stringify(parsed)
    self.postMessage({ success: true, data: compressed })
  } catch (err) {  // renamed so it doesn't shadow the event parameter
    self.postMessage({ success: false, error: (err as Error).message })
  }
}

// main.tsx
// Bundlers such as Vite and webpack 5 resolve worker files via new URL(...)
const worker = new Worker(new URL('./worker.ts', import.meta.url))
worker.postMessage(largeJson)
worker.onmessage = (e) => {
  if (e.data.success) {
    setOutput(e.data.data)
  }
}

Streaming Processing (Very Large Files)

For 100MB+ JSON, consider streaming parsing (requires libraries like stream-json).
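
A minimal Node.js sketch, assuming the stream-json package (API as documented in its README); note that the Assembler still materializes the full value in memory, so for truly huge files you would stream items individually with helpers like StreamArray instead:

import { createReadStream, createWriteStream } from 'node:fs'
import { parser } from 'stream-json'
import Assembler from 'stream-json/Assembler'

// Parse incrementally so the raw multi-hundred-MB source string never sits in memory at once
const pipeline = createReadStream('big.json').pipe(parser())
const asm = Assembler.connectTo(pipeline)
asm.on('done', (asm) => {
  // asm.current holds the assembled value; write it back out compactly
  createWriteStream('big.min.json').end(JSON.stringify(asm.current))
})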

Practical Applications

1. Network Transmission Optimization

Compact JSON is smaller on the wire and transfers faster, and it stacks with HTTP-level gzip compression.
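
Modern browsers can even measure the gzipped size client-side with the standard CompressionStream API; a sketch:

// Returns the gzip-compressed byte size of a string
async function gzipSize(str: string): Promise<number> {
  const stream = new Blob([str]).stream().pipeThrough(new CompressionStream('gzip'))
  return (await new Response(stream).blob()).size
}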

2. LocalStorage Storage

Browsers typically cap LocalStorage at 5-10 MB per origin, so compact JSON lets you store more.

3. Database Storage

JSON columns in databases such as MongoDB and PostgreSQL take less space when the stored text is compact.

4. Configuration File Publishing

Use formatted JSON in development for readability, compress for production deployment.
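
A build step along these lines is enough (the file paths are hypothetical):

// compress-config.ts: read the readable dev config, emit a compact copy for production
import { readFileSync, writeFileSync } from 'node:fs'

const pretty = readFileSync('config.json', 'utf8')
writeFileSync('dist/config.json', JSON.stringify(JSON.parse(pretty)))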

Complete Implementation

Based on these ideas, I built: JSON Compress

Features:

  • One-click JSON compression
  • Automatic compression ratio calculation
  • Error location with line/column
  • Large file support

The implementation isn’t complex, but getting the details right takes effort. Hope this helps.


Related: JSON Formatter | JSON Validator