SVG Optimizer Implementation: From Regex to Performance
Working on an icon library project, I noticed exported SVG files were often 50KB+. Running them through an optimizer shrank them by about 60%, but manually uploading and downloading files every time was tedious. So I built my own SVG optimizer. Here’s how it works.
Why Are SVG Files So Large?#
A typical unoptimized SVG:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" width="24" height="24">
  <!-- User icon -->
  <metadata>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <rdf:Description rdf:about="">
        <dc:title>User Icon</dc:title>
      </rdf:Description>
    </rdf:RDF>
  </metadata>
  <title>User Icon</title>
  <desc>A simple user icon</desc>
  <g fill="none" stroke="currentColor" stroke-width="2">
    <path d="M20 21v-2a4 4 0 0 0-4-4H8a4 4 0 0 0-4 4v2"></path>
    <circle cx="12" cy="7" r="4"></circle>
  </g>
</svg>
This file is 700+ bytes, but only the svg root, the g wrapper, and the path and circle inside it affect rendering. The bloat includes:
- XML declaration and DOCTYPE - Browsers don’t need them
- metadata - Editor garbage
- title and desc - No rendering impact (though screen readers and tooltips use them)
- Comments - Leftover from development
- Whitespace - Newlines and indentation
After optimization, it shrinks to ~200 bytes - a 70% reduction.
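To check size claims like these on your own files, a small helper is enough. The sketch below is not part of the original tool: `byteSize` and `reductionPercent` are hypothetical names, and it assumes a runtime that provides the standard `TextEncoder` (browsers and recent Node.js).

```typescript
// Hypothetical helpers for measuring savings; names are illustrative.

// UTF-8 size in bytes, which is what ends up on disk or on the wire.
function byteSize(text: string): number {
  return new TextEncoder().encode(text).length
}

// Percentage saved, rounded to the nearest whole number.
function reductionPercent(originalBytes: number, optimizedBytes: number): number {
  if (originalBytes <= 0) return 0
  return Math.round((1 - optimizedBytes / originalBytes) * 100)
}

console.log(reductionPercent(700, 200)) // 71
```

Measuring bytes rather than `string.length` matters once the SVG contains non-ASCII text, since those characters take more than one byte in UTF-8.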
Regex Implementation#
1. Remove DOCTYPE#
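function removeDoctype(svg: string): string {
  return svg.replace(/<!DOCTYPE[^>]*>/gi, '')
}
[^>]* matches every character up to the next >, so this assumes the DOCTYPE has no internal subset (a [...] block that may itself contain >). The gi flags mean global and case-insensitive.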
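The bloat list above also names the XML declaration, which none of the functions in this article remove. A possible companion to `removeDoctype` is sketched here; `removeXmlDeclaration` is a hypothetical name, and the sketch assumes the file is UTF-8 (the XML default), since dropping a declaration that names another encoding would change how the file is decoded.

```typescript
// Hypothetical companion to removeDoctype; not part of the original code.
function removeXmlDeclaration(svg: string): string {
  // The `\s` after "xml" leaves other processing instructions,
  // such as <?xml-stylesheet ...?>, untouched.
  return svg.replace(/<\?xml\s[^>]*\?>/i, '')
}

function removeDoctype(svg: string): string {
  return svg.replace(/<!DOCTYPE[^>]*>/gi, '')
}

const header =
  '<?xml version="1.0" encoding="UTF-8"?>' +
  '<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">' +
  '<svg xmlns="http://www.w3.org/2000/svg"></svg>'

console.log(removeDoctype(removeXmlDeclaration(header)))
// <svg xmlns="http://www.w3.org/2000/svg"></svg>
```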
2. Remove Comments#
function removeComments(svg: string): string {
  return svg.replace(/<!--[\s\S]*?-->/g, '')
}
The key is [\s\S]*?:
- [\s\S] matches all characters including newlines
- *? is non-greedy to avoid matching across comments
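The difference between the greedy and non-greedy quantifier is easiest to see on an input with two comments. This is a small illustrative comparison; the input string is made up for the example.

```typescript
const twoComments = '<!-- first --><path d="M0 0"/><!-- second -->'

// Greedy: runs from the first "<!--" to the LAST "-->", taking the path with it.
const greedy = twoComments.replace(/<!--[\s\S]*-->/g, '')

// Non-greedy: each match stops at the nearest "-->".
const lazy = twoComments.replace(/<!--[\s\S]*?-->/g, '')

console.log(JSON.stringify(greedy)) // ""
console.log(JSON.stringify(lazy))   // "<path d=\"M0 0\"/>"
```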
3. Remove Metadata#
function removeMetadata(svg: string): string {
  svg = svg.replace(/<metadata[\s\S]*?<\/metadata>/gi, '')
  svg = svg.replace(/<title[\s\S]*?<\/title>/gi, '')
  svg = svg.replace(/<desc[\s\S]*?<\/desc>/gi, '')
  return svg
}
Gotcha: Tag names might be uppercase, so the i flag is required. Also, <title> is optional in SVG but required in HTML - don’t confuse them.
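Because title and desc give an SVG its accessible name and description, some projects may prefer to keep them. One way to allow that is an opt-out per element. This is a sketch under that assumption: `stripMetadata`, `keepTitle`, and `keepDesc` are hypothetical names, not options of the original tool.

```typescript
// Hypothetical variant of removeMetadata with opt-outs; option names are illustrative.
interface MetadataOptions {
  keepTitle?: boolean
  keepDesc?: boolean
}

function stripMetadata(svg: string, options: MetadataOptions = {}): string {
  let result = svg.replace(/<metadata[\s\S]*?<\/metadata>/gi, '')
  if (!options.keepTitle) result = result.replace(/<title[\s\S]*?<\/title>/gi, '')
  if (!options.keepDesc) result = result.replace(/<desc[\s\S]*?<\/desc>/gi, '')
  return result
}

const icon =
  '<svg><metadata>x</metadata><title>User Icon</title>' +
  '<desc>A simple user icon</desc><path d="M0 0"/></svg>'

console.log(stripMetadata(icon))
// <svg><path d="M0 0"/></svg>
console.log(stripMetadata(icon, { keepTitle: true }))
// <svg><title>User Icon</title><path d="M0 0"/></svg>
```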
4. Remove Empty Attributes#
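function removeEmptyAttrs(svg: string): string {
  // Match the attribute name too, so that id="" is removed as a whole
  return svg.replace(/\s+[\w:.-]+\s*=\s*(?:""|'')/g, '')
}
[\w:.-]+ matches the attribute name and (?:""|'') matches both quote styles, ="" and =''. The leading \s+ requires whitespace before the name, so only whole attributes are removed and nothing is deleted from the middle of a longer token.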
5. Collapse Whitespace#
function collapseWhitespace(svg: string): string {
  // Multiple spaces to one
  svg = svg.replace(/\s+/g, ' ')
  // Remove whitespace between tags
  svg = svg.replace(/>\s+</g, '><')
  // Remove whitespace before >
  svg = svg.replace(/\s+>/g, '>')
  return svg.trim()
}
Order matters:
- Compress all consecutive whitespace first
- Remove whitespace between tags (e.g., </path> <circle> → </path><circle>)
- Remove whitespace before > (e.g., <path d="M10 20" > → <path d="M10 20">)
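The three passes can be traced one at a time. The helper below is a hypothetical debugging aid (`collapseSteps` is not part of the original code) that returns the string after each pass for a made-up input.

```typescript
// Returns the string after each of the three passes so the effect of the ordering is visible.
function collapseSteps(svg: string): string[] {
  const collapsed = svg.replace(/\s+/g, ' ')
  const betweenTags = collapsed.replace(/>\s+</g, '><')
  const beforeClose = betweenTags.replace(/\s+>/g, '>')
  return [collapsed, betweenTags, beforeClose.trim()]
}

const pretty = `<svg>
  <path d="M10 20" >
  </path>
  <text>Hello   world</text>
</svg>`

for (const step of collapseSteps(pretty)) console.log(step)
// <svg> <path d="M10 20" > </path> <text>Hello world</text> </svg>
// <svg><path d="M10 20" ></path><text>Hello world</text></svg>
// <svg><path d="M10 20"></path><text>Hello world</text></svg>
```

Note that the first pass also collapses whitespace inside text content ("Hello   world" becomes "Hello world"). That usually renders the same, but it is worth checking for SVGs that rely on xml:space="preserve" or preformatted text.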
Complete Implementation#
interface OptimizationOptions {
  removeDoctype: boolean
  removeComments: boolean
  removeMetadata: boolean
  removeEmptyAttrs: boolean
  collapseWhitespace: boolean
}

function optimizeSvg(svg: string, options: OptimizationOptions): string {
  let result = svg
  if (options.removeDoctype) {
    result = result.replace(/<!DOCTYPE[^>]*>/gi, '')
  }
  if (options.removeComments) {
    result = result.replace(/<!--[\s\S]*?-->/g, '')
  }
  if (options.removeMetadata) {
    result = result.replace(/<metadata[\s\S]*?<\/metadata>/gi, '')
    result = result.replace(/<title[\s\S]*?<\/title>/gi, '')
    result = result.replace(/<desc[\s\S]*?<\/desc>/gi, '')
  }
  if (options.removeEmptyAttrs) {
    // Match the attribute name too, so that id="" is removed as a whole
    result = result.replace(/\s+[\w:.-]+\s*=\s*(?:""|'')/g, '')
  }
  if (options.collapseWhitespace) {
    // All three whitespace passes belong to the same option
    result = result.replace(/\s+/g, ' ')
    result = result.replace(/>\s+</g, '><')
    result = result.replace(/\s+>/g, '>')
  }
  return result.trim()
}
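An alternative way to structure the same pipeline is a table of steps keyed by option name, which keeps each pattern in one place and makes defaults easy to supply. This is a sketch, not the author's implementation: `STEPS`, `DEFAULT_OPTIONS`, and `optimizeWithDefaults` are hypothetical names, and the empty-attribute pattern assumes the intent is to drop the attribute name together with its empty value.

```typescript
interface OptimizationOptions {
  removeDoctype: boolean
  removeComments: boolean
  removeMetadata: boolean
  removeEmptyAttrs: boolean
  collapseWhitespace: boolean
}

// Each row: the option that enables it, the pattern, and the replacement.
// Rows run top to bottom, so the whitespace passes keep their required order.
const STEPS: ReadonlyArray<[keyof OptimizationOptions, RegExp, string]> = [
  ['removeDoctype', /<!DOCTYPE[^>]*>/gi, ''],
  ['removeComments', /<!--[\s\S]*?-->/g, ''],
  ['removeMetadata', /<metadata[\s\S]*?<\/metadata>/gi, ''],
  ['removeMetadata', /<title[\s\S]*?<\/title>/gi, ''],
  ['removeMetadata', /<desc[\s\S]*?<\/desc>/gi, ''],
  ['removeEmptyAttrs', /\s+[\w:.-]+\s*=\s*(?:""|'')/g, ''],
  ['collapseWhitespace', /\s+/g, ' '],
  ['collapseWhitespace', />\s+</g, '><'],
  ['collapseWhitespace', /\s+>/g, '>'],
]

// Assumed default: everything on.
const DEFAULT_OPTIONS: OptimizationOptions = {
  removeDoctype: true,
  removeComments: true,
  removeMetadata: true,
  removeEmptyAttrs: true,
  collapseWhitespace: true,
}

function optimizeWithDefaults(svg: string, overrides: Partial<OptimizationOptions> = {}): string {
  const options = { ...DEFAULT_OPTIONS, ...overrides }
  return STEPS.reduce(
    (acc, [flag, pattern, replacement]) => (options[flag] ? acc.replace(pattern, replacement) : acc),
    svg,
  ).trim()
}

const rawIcon = '<!DOCTYPE svg>\n<svg id="">\n  <!-- c -->\n  <title>T</title>\n  <path d="M0 0" ></path>\n</svg>'

console.log(optimizeWithDefaults(rawIcon))
// <svg><path d="M0 0"></path></svg>
console.log(optimizeWithDefaults(rawIcon, { removeComments: false }))
// <svg><!-- c --><path d="M0 0"></path></svg>
```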
Performance Optimization#
1. Avoid Repeated Regex Compilation#
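A regex literal inside a function is evaluated again on every call. Engines generally cache the compiled pattern, so the saving is modest, but hoisting the patterns into shared constants avoids the repeated allocation and keeps them in one place: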
const REGEX = {
  doctype: /<!DOCTYPE[^>]*>/gi,
  comment: /<!--[\s\S]*?-->/g,
  metadata: /<metadata[\s\S]*?<\/metadata>/gi,
  title: /<title[\s\S]*?<\/title>/gi,
  desc: /<desc[\s\S]*?<\/desc>/gi,
  // Includes the attribute name, so id="" is removed as a whole
  emptyAttr: /\s+[\w:.-]+\s*=\s*(?:""|'')/g,
  whitespace: /\s+/g,
  betweenTags: />\s+</g,
  beforeClose: /\s+>/g
}

function optimizeSvg(svg: string, options: OptimizationOptions): string {
  let result = svg
  if (options.removeDoctype) result = result.replace(REGEX.doctype, '')
  // ... other replacements
  return result.trim()
}
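One caveat when sharing regex constants: a regex with the g flag carries state in `lastIndex`. `String.prototype.replace` resets it before scanning, so the replacements above are safe, but reusing the same constant with `test()` is not. A minimal demonstration, with `COMMENT` and `sampleSvg` as names made up for the example:

```typescript
const COMMENT = /<!--[\s\S]*?-->/g
const sampleSvg = '<svg><!-- note --></svg>'

// Pitfall: test() on a /g regex resumes from the lastIndex left by the previous call.
const firstTest = COMMENT.test(sampleSvg)  // true
const secondTest = COMMENT.test(sampleSvg) // false: the search resumed after the comment

// replace() is unaffected: with a global regex it resets lastIndex to 0 before scanning.
COMMENT.lastIndex = 10
const cleaned = sampleSvg.replace(COMMENT, '')

console.log(firstTest, secondTest, cleaned) // true false <svg></svg>
```

If a presence check is needed alongside the replacements, a separate non-global pattern avoids the problem.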
2. Large File Handling#
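For large SVGs (maps, charts), processing on the main thread blocks the UI:
// Use Web Worker for async processing
const worker = new Worker('svg-optimizer-worker.js')

function optimizeAsync(svg: string, options: OptimizationOptions): Promise<string> {
  return new Promise((resolve, reject) => {
    // Attach handlers before posting; this simple version handles one request at a time
    worker.onmessage = (e) => resolve(e.data.result)
    worker.onerror = (e) => reject(new Error(e.message))
    worker.postMessage({ svg, options })
  })
}

// svg-optimizer-worker.js (optimizeSvg must be defined or imported in this file)
self.onmessage = (e) => {
  const result = optimizeSvg(e.data.svg, e.data.options)
  self.postMessage({ result })
}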
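Because `optimizeAsync` assigns `worker.onmessage` on every call, two overlapping calls would share one handler and could resolve with each other's results. One common remedy is to tag each request with an id and match replies against a map of pending promises. The sketch below assumes the worker echoes the id back, which the worker shown above does not do; `createOptimizerClient`, `OptimizeRequest`, and `OptimizeReply` are hypothetical names.

```typescript
// Hypothetical request/reply shapes; the worker side would need to echo `id` back.
interface OptimizeRequest { id: number; svg: string }
interface OptimizeReply { id: number; result: string }

function createOptimizerClient(post: (request: OptimizeRequest) => void) {
  let nextId = 0
  const pending = new Map<number, (result: string) => void>()

  return {
    // Sends one request and resolves when the reply with the same id arrives.
    optimize(svg: string): Promise<string> {
      return new Promise((resolve) => {
        const id = nextId++
        pending.set(id, resolve)
        post({ id, svg })
      })
    },
    // Call this from worker.onmessage with e.data.
    handleReply(reply: OptimizeReply): void {
      const resolve = pending.get(reply.id)
      if (!resolve) return // unknown or already-settled id
      pending.delete(reply.id)
      resolve(reply.result)
    },
    // Number of requests still waiting for a reply.
    pendingCount(): number {
      return pending.size
    },
  }
}

// Wiring with a real worker would look roughly like:
//   const client = createOptimizerClient((request) => worker.postMessage(request))
//   worker.onmessage = (e) => client.handleReply(e.data)
```

Keeping the transport behind a `post` callback also makes the client testable without a real Worker.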
3. Streaming Processing#
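For huge files (>10MB), process in chunks and yield between them. Each chunk has to end on a tag boundary, otherwise a tag split across two chunks is missed by the patterns; a comment or metadata block that straddles a boundary is still left in place:
async function optimizeLargeSvg(svg: string, options: OptimizationOptions): Promise<string> {
  const CHUNK_SIZE = 1024 * 1024 // about 1MB of characters
  const chunks: string[] = []
  let start = 0
  while (start < svg.length) {
    let end = Math.min(start + CHUNK_SIZE, svg.length)
    if (end < svg.length) {
      // Cut just after the last '>' so no tag is split; if there is none, take the rest
      const cut = svg.lastIndexOf('>', end)
      end = cut >= start ? cut + 1 : svg.length
    }
    chunks.push(optimizeSvg(svg.slice(start, end), options))
    start = end
    // Yield to main thread
    await new Promise(resolve => setTimeout(resolve, 0))
  }
  return chunks.join('')
}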
Edge Cases#
1. CDATA Sections#
SVG may contain CDATA blocks with <!-- strings:
<script><![CDATA[
// This <!-- is not a comment
var x = "<!-- not a comment -->";
]]></script>
Simple regex would delete them incorrectly:
// Wrong: deletes CDATA content too
svg.replace(/<!--[\s\S]*?-->/g, '')
Correct approach: extract CDATA first, optimize, then restore:
function preserveCdata(svg: string): { svg: string, cdatas: string[] } {
  const cdatas: string[] = []
  const result = svg.replace(/<!\[CDATA\[[\s\S]*?\]\]>/g, (match) => {
    cdatas.push(match)
    return `__CDATA_${cdatas.length - 1}__`
  })
  return { svg: result, cdatas }
}

function restoreCdata(svg: string, cdatas: string[]): string {
  return cdatas.reduce((result, cdata, i) => {
    // A replacer function inserts the CDATA literally, even if it contains $& or $1
    return result.replace(`__CDATA_${i}__`, () => cdata)
  }, svg)
}
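The extract, optimize, restore sequence can also be wrapped in a single helper so callers cannot forget the restore step. This is a sketch rather than the original design: `withProtected` and the `__KEEP_n__` placeholder are hypothetical, and it assumes the placeholder text does not already occur in the document.

```typescript
// Masks every match of `pattern`, runs `transform` on the masked text, then puts the matches back.
function withProtected(
  svg: string,
  pattern: RegExp,
  transform: (masked: string) => string,
): string {
  const saved: string[] = []
  const masked = svg.replace(pattern, (match) => {
    saved.push(match)
    return `__KEEP_${saved.length - 1}__`
  })
  // The replacer function inserts the saved text literally, so "$&" inside a script survives.
  return transform(masked).replace(
    /__KEEP_(\d+)__/g,
    (placeholder, index) => saved[Number(index)] ?? placeholder,
  )
}

const CDATA_PATTERN = /<!\[CDATA\[[\s\S]*?\]\]>/g
const withScript =
  '<svg><!-- drop --><script><![CDATA[var x = "<!-- keep -->"; var y = "$&";]]></script></svg>'

const safeResult = withProtected(withScript, CDATA_PATTERN, (masked) =>
  masked.replace(/<!--[\s\S]*?-->/g, ''),
)

console.log(safeResult)
// <svg><script><![CDATA[var x = "<!-- keep -->"; var y = "$&";]]></script></svg>
```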
2. Inline Styles#
CSS inside <style> tags may contain special characters:
<style>
.icon { fill: red; }
/* comment */
</style>
Preserve <style> content, only remove comments:
function optimizeStyles(svg: string): string {
  // ([^>]*) keeps attributes such as type="text/css" on the opening tag
  return svg.replace(/<style([^>]*)>([\s\S]*?)<\/style>/gi, (match, attrs, css) => {
    const optimized = css.replace(/\/\*[\s\S]*?\*\//g, '')
    return `<style${attrs}>${optimized}</style>`
  })
}
3. XML Entities#
SVG may contain XML entities like &lt; &gt; &amp;:
<text>&lt;script&gt;alert('XSS')&lt;/script&gt;</text>
Don’t decode them during optimization:
// Wrong: turns &lt; into <
result.replace(/&lt;/g, '<') // Don't do this!
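To see why decoding is dangerous, compare the escaped text with its decoded form. In the sketch below, `countScriptElements` is a hypothetical helper used only to make the difference visible; it is a crude pattern check, not a parser.

```typescript
// Escaped markup inside a <text> element: harmless text as far as the parser is concerned.
const escapedText = "<text>&lt;script&gt;alert('XSS')&lt;/script&gt;</text>"

// Crude check for opening <script> tags in the markup.
function countScriptElements(svg: string): number {
  return (svg.match(/<script\b/gi) ?? []).length
}

// What a misguided "cleanup" pass would produce.
const decodedText = escapedText.replace(/&lt;/g, '<').replace(/&gt;/g, '>')

console.log(countScriptElements(escapedText)) // 0
console.log(countScriptElements(decodedText)) // 1
```

The optimization passes in this article only touch markup delimiters and whitespace, so entities pass through unchanged.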
Real Results#
Based on this implementation, I built: SVG Optimizer
Test results:
| SVG Type | Original | Optimized | Reduction |
|---|---|---|---|
| Simple icon | 1.2 KB | 0.4 KB | 66% |
| Complex chart | 15 KB | 8 KB | 47% |
| Map vector | 120 KB | 85 KB | 29% |
Simple icons benefit most. Complex SVGs have limited optimization potential since path data dominates.
Advanced Optimization#
Regex only handles surface optimization. Deeper optimization requires parsing the SVG structure:
- Path simplification: M10 20 L30 40 → M10 20 30 40 (coordinate pairs after M are treated as implicit L commands)
- Merge paths: Adjacent paths with identical styling can be combined
- Remove hidden elements: Elements with display="none" can be deleted, provided nothing references or toggles them
- Simplify transforms: <g transform="translate(10, 20)"> can merge into children
These require professional tools like SVGO or custom XML tree parsing.
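As a small taste of the first item, the implicit-lineto rule can be sketched with a deliberately conservative regex: it only drops an L (or l) that directly follows a single M (or m) coordinate pair in the same absolute or relative mode. `dropImplicitLineto` is a hypothetical name, and this is nowhere near a full path optimizer.

```typescript
// A number in SVG path data: optional sign, digits with optional fraction, optional exponent.
const NUM = String.raw`[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?`

// M/m, exactly one coordinate pair, then an explicit L/l.
const MOVE_THEN_LINE = new RegExp(`([Mm])\\s*(${NUM})[\\s,]*(${NUM})\\s*([Ll])\\s*`, 'g')

function dropImplicitLineto(d: string): string {
  return d.replace(MOVE_THEN_LINE, (match: string, move: string, x: string, y: string, line: string) => {
    // Pairs after M are implicit L, pairs after m are implicit l; mixed modes must stay explicit.
    const sameMode = (move === 'M') === (line === 'L')
    return sameMode ? `${move}${x} ${y} ` : match
  })
}

console.log(dropImplicitLineto('M10 20 L30 40')) // M10 20 30 40
console.log(dropImplicitLineto('m10 20 l5 5'))   // m10 20 5 5
console.log(dropImplicitLineto('M10 20 l5 5'))   // M10 20 l5 5 (unchanged)
```

Anything beyond this, such as merging repeated commands or handling arcs, needs a real path tokenizer, which is where a dedicated tool earns its keep.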