Extracting Color Palettes from Images: A Practical Guide to Color Quantization#

I was building a design tool recently that needed to extract dominant colors from user-uploaded images. After diving into color quantization algorithms, I found it more interesting than expected.

The Core Problem#

A 1920×1080 image has about 2 million pixels, each with its own RGB color. Counting colors directly could yield hundreds of thousands of unique values, which is not very useful.

What we need: Find the 8-10 most representative colors from hundreds of thousands. This is the color quantization problem.

The Naive Approach: Color Bucket Counting#

I started with the simplest method:

function extractColors(imageData: ImageData, numColors: number = 8) {
  const { data } = imageData
  const colorMap = new Map<string, { r: number; g: number; b: number; count: number }>()
  
  // Iterate all pixels
  for (let i = 0; i < data.length; i += 4) {
    const r = data[i]
    const g = data[i + 1]
    const b = data[i + 2]
    const key = `${r},${g},${b}`
    
    if (colorMap.has(key)) {
      colorMap.get(key)!.count++
    } else {
      colorMap.set(key, { r, g, b, count: 1 })
    }
  }
  
  // Sort by count, take top N
  return Array.from(colorMap.values())
    .sort((a, b) => b.count - a.count)
    .slice(0, numColors)
}

This has a fatal flaw: exact-match counting treats every tiny variation as a distinct color.

Consider a sky photo with 100 shades of blue:

  • rgb(100, 150, 255) - 1200 pixels
  • rgb(101, 149, 254) - 980 pixels
  • rgb(99, 151, 255) - 850 pixels

Visually identical, but counted separately. The extracted 8 colors might all be blue variants, lacking diversity.

Improved: Color Bucketing#

The solution is to group similar colors. The simplest approach is to reduce color precision:

const r = Math.floor(data[i] / 32) * 32      // 0-255 → 0, 32, 64, ..., 224
const g = Math.floor(data[i + 1] / 32) * 32
const b = Math.floor(data[i + 2] / 32) * 32

Now rgb(100, 150, 255) and rgb(101, 149, 254) both map to rgb(96, 128, 224).

Why divide by 32? RGB has 8 bits per channel (0-255). Dividing by 32 and rounding down gives 8 levels per channel, resulting in 8×8×8 = 512 color buckets. That is coarse enough to merge similar colors while still preserving detail. Math.floor matters here: Math.round would push values of 240 and above to 256, which is outside the valid channel range and breaks the HEX conversion.
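The bucket size does not have to be 32. As a minimal sketch of the trade-off, the helpers below (`quantizeChannel` and `bucketCount` are hypothetical names, not part of the tool) assume the bucket size divides 256 evenly. The sketch floors rather than rounds, so 255 lands in the top bucket (224) instead of spilling over to 256.

```typescript
// Hypothetical helpers for illustration; bucketSize is assumed to divide 256 evenly.
function quantizeChannel(value: number, bucketSize: number = 32): number {
  // Snap a 0-255 channel value down to the lower edge of its bucket
  return Math.floor(value / bucketSize) * bucketSize
}

function bucketCount(bucketSize: number = 32): number {
  const levels = 256 / bucketSize   // levels per channel
  return levels * levels * levels   // one axis each for R, G and B
}

console.log(quantizeChannel(100), quantizeChannel(150), quantizeChannel(255)) // 96 128 224
console.log(bucketCount(32)) // 512
console.log(bucketCount(16)) // 4096
```

Smaller buckets keep more nuance but merge fewer near-duplicates, so 32 is a reasonable starting point rather than a fixed rule.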

function extractColors(imageData: ImageData, numColors: number = 8) {
  const { data, width, height } = imageData
  const colorMap = new Map<string, { r: number; g: number; b: number; count: number }>()
  
  // Sampling step: large images don't need every pixel
  const step = Math.max(1, Math.floor(width * height / 10000))
  
  for (let i = 0; i < data.length; i += 4 * step) {
    // Color bucketing
    const r = Math.floor(data[i] / 32) * 32
    const g = Math.floor(data[i + 1] / 32) * 32
    const b = Math.floor(data[i + 2] / 32) * 32
    const key = `${r},${g},${b}`
    
    if (colorMap.has(key)) {
      colorMap.get(key)!.count++
    } else {
      colorMap.set(key, { r, g, b, count: 1 })
    }
  }
  
  return Array.from(colorMap.values())
    .sort((a, b) => b.count - a.count)
    .slice(0, numColors)
    .map(c => ({
      hex: rgbToHex(c.r, c.g, c.b),
      r: c.r,
      g: c.g,
      b: c.b,
      count: c.count
    }))
}

function rgbToHex(r: number, g: number, b: number): string {
  return '#' + [r, g, b].map(x => x.toString(16).padStart(2, '0')).join('')
}

Performance: Sampling Strategy#

Iterating 2 million pixels is slow. We don’t need exact counts—sampling works fine:

const step = Math.max(1, Math.floor(width * height / 10000))
for (let i = 0; i < data.length; i += 4 * step) {
  // Process only 1/step of pixels
}

For a 1920×1080 image (about 2 megapixels), step = 207, so the loop samples roughly 10,000 pixels. That is about 200× less work with negligible quality loss.
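To check the arithmetic for a given image size, the step calculation can be pulled into a small function. This is only a sketch: `sampleStep` and `sampledPixelCount` are hypothetical names, and the 10,000-sample target is the same constant used in the loop above.

```typescript
// Hypothetical helper: pixel stride needed to hit a target sample count.
function sampleStep(width: number, height: number, targetSamples: number = 10000): number {
  return Math.max(1, Math.floor((width * height) / targetSamples))
}

// Number of pixels visited by a loop that advances `step` pixels at a time
function sampledPixelCount(width: number, height: number, step: number): number {
  return Math.ceil((width * height) / step)
}

const hdStep = sampleStep(1920, 1080)
console.log(hdStep)                                // 207
console.log(sampledPixelCount(1920, 1080, hdStep)) // 10018
console.log(sampleStep(100, 100))                  // 1 (small images are read in full)
```

One caveat with a fixed stride: if the step happens to be a multiple of the image width, every sample comes from the same column, so it may be worth nudging the step by one in that case.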

Advanced: Median Cut#

Color bucketing has a problem: fixed bucket sizes. If an image is mostly blue, blue buckets overflow while others sit empty.

A smarter approach is Median Cut:

  1. Put all pixels in a color cube
  2. Find the longest edge (the R/G/B channel with the widest range of values)
  3. Split at the median into two cubes
  4. Repeat until you have N cubes
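  5. Each cube’s centroid is a dominant color

type RGB = { r: number; g: number; b: number }

function averageColor(pixels: RGB[]): RGB {
  const sum = pixels.reduce(
    (acc, p) => ({ r: acc.r + p.r, g: acc.g + p.g, b: acc.b + p.b }),
    { r: 0, g: 0, b: 0 }
  )
  const n = pixels.length
  return { r: Math.round(sum.r / n), g: Math.round(sum.g / n), b: Math.round(sum.b / n) }
}

function medianCut(pixels: RGB[], numColors: number): RGB[] {
  if (pixels.length === 0) return []
  // Stop when one color is requested or the cube cannot be split further
  if (numColors <= 1 || pixels.length === 1) return [averageColor(pixels)]

  // Find the channel with the widest range (the longest edge of the cube)
  let maxChannel: 'r' | 'g' | 'b' = 'r'
  let maxRange = -1
  for (const c of ['r', 'g', 'b'] as const) {
    let lo = 255
    let hi = 0
    for (const p of pixels) {
      if (p[c] < lo) lo = p[c]
      if (p[c] > hi) hi = p[c]
    }
    if (hi - lo > maxRange) { maxRange = hi - lo; maxChannel = c }
  }

  // Sort a copy by that channel and split at the median
  const sorted = [...pixels].sort((a, b) => a[maxChannel] - b[maxChannel])
  const mid = Math.floor(sorted.length / 2)

  // Split the color budget so odd counts still add up to numColors
  const leftCount = Math.ceil(numColors / 2)
  const left = medianCut(sorted.slice(0, mid), leftCount)
  const right = medianCut(sorted.slice(mid), numColors - leftCount)

  return [...left, ...right]
}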

Median Cut adaptively splits color space—blue-heavy regions get finer splits, preserving color diversity.
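Median Cut works on a list of pixels rather than on raw ImageData, so the RGBA byte buffer has to be converted first. A minimal sketch of that adapter, assuming the same `{ r, g, b }` pixel shape used above (`toPixelList` is a hypothetical name):

```typescript
type RGB = { r: number; g: number; b: number }

// Hypothetical adapter: turns an RGBA byte buffer (the layout of ImageData.data)
// into a pixel list, optionally sampling every `step`-th pixel.
function toPixelList(data: Uint8ClampedArray, step: number = 1): RGB[] {
  const pixels: RGB[] = []
  for (let i = 0; i + 3 < data.length; i += 4 * step) {
    if (data[i + 3] === 0) continue // skip fully transparent pixels
    pixels.push({ r: data[i], g: data[i + 1], b: data[i + 2] })
  }
  return pixels
}

// Two opaque pixels (red, blue) followed by one fully transparent pixel
const sample = new Uint8ClampedArray([255, 0, 0, 255, 0, 0, 255, 255, 9, 9, 9, 0])
console.log(toPixelList(sample)) // [ { r: 255, g: 0, b: 0 }, { r: 0, g: 0, b: 255 } ]
```

In the browser, the result of `toPixelList(imageData.data)` could then be passed to the median cut function together with the desired number of colors.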

Canvas Implementation#

In the browser, use Canvas to read image pixels:

function processImage(file: File) {
  const reader = new FileReader()
  reader.onload = () => {
    const img = new Image()
    img.onload = () => {
      const canvas = document.createElement('canvas')
      const ctx = canvas.getContext('2d')
      if (!ctx) return  // getContext can return null
      
      // Downscale for performance
      const maxSize = 200
      const scale = Math.min(maxSize / img.width, maxSize / img.height, 1)
      canvas.width = Math.max(1, Math.round(img.width * scale))
      canvas.height = Math.max(1, Math.round(img.height * scale))
      
      ctx.drawImage(img, 0, 0, canvas.width, canvas.height)
      const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height)
      
      const colors = extractColors(imageData, 8)
      console.log(colors)
    }
    img.src = reader.result as string  // readAsDataURL yields a string
  }
  reader.readAsDataURL(file)
}

Key points:

  1. Downscaling: Resize to max 200px to reduce pixel count
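  2. CORS: For cross-origin images, set img.crossOrigin = 'anonymous' before assigning src, and make sure the server sends CORS headers. Otherwise the canvas is tainted and getImageData throws a SecurityError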
  3. FileReader: Convert File to DataURL for Image
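The downscaling step in the first key point can be isolated into a pure function, which makes the size math easy to verify outside the browser. This is a sketch only: `fitWithin` is a hypothetical helper, and the 200px limit is the same value used in processImage.

```typescript
// Hypothetical helper: scales dimensions to fit inside a maxSize box without upscaling.
function fitWithin(
  width: number,
  height: number,
  maxSize: number = 200
): { width: number; height: number } {
  const scale = Math.min(maxSize / width, maxSize / height, 1)
  return {
    // Canvas dimensions are integers, and never allow a zero-sized canvas
    width: Math.max(1, Math.round(width * scale)),
    height: Math.max(1, Math.round(height * scale)),
  }
}

console.log(fitWithin(4000, 3000)) // { width: 200, height: 150 }
console.log(fitWithin(120, 80))    // { width: 120, height: 80 } (already small enough)
```

For a 4000×3000 photo this cuts the pixel count from 12,000,000 to 30,000 before any color counting happens.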

The Result#

Based on these ideas, I built: Color Palette Extractor

Features:

  • Drag-and-drop image upload
  • Extracts 8 dominant colors
  • Shows HEX and RGB values
  • Click to copy color codes

The algorithm is simple but works well for most images. For more precision, consider k-means clustering or Octree algorithms—though implementation complexity increases significantly.

Lessons Learned#

1. Handling Transparent Pixels#

PNGs may have an alpha channel. Skip fully transparent pixels so their invisible RGB values don't skew the counts:

for (let i = 0; i < data.length; i += 4) {
  const alpha = data[i + 3]
  if (alpha === 0) continue  // Skip transparent pixels
  
  // Process color...
}

2. HEX Format Padding#

Zero-padding matters for HEX:

function rgbToHex(r: number, g: number, b: number): string {
  return '#' + [r, g, b]
    .map(x => x.toString(16).padStart(2, '0'))  // padStart is crucial
    .join('')
}

rgbToHex(0, 0, 0)  // '#000000', not '#000'

3. Sorting by Luminance#

Sort the extracted colors by brightness (brightest first) for easier use:

function getLuminance(r: number, g: number, b: number): number {
  return 0.299 * r + 0.587 * g + 0.114 * b
}

colors.sort((a, b) => getLuminance(b.r, b.g, b.b) - getLuminance(a.r, a.g, a.b))
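Array.prototype.sort reorders the array in place, which can be surprising if the frequency-ordered palette is still needed elsewhere. A small sketch of a non-mutating variant, using the same weights as above (these are the Rec. 601 luma coefficients); `Swatch`, `luma` and `sortByLuminance` are hypothetical names:

```typescript
type Swatch = { r: number; g: number; b: number }

// Same weighted sum as getLuminance, taking a color object instead of three numbers
function luma(c: Swatch): number {
  return 0.299 * c.r + 0.587 * c.g + 0.114 * c.b
}

// Hypothetical helper: returns a new array, brightest first, leaving the input untouched
function sortByLuminance<T extends Swatch>(colors: T[]): T[] {
  return [...colors].sort((a, b) => luma(b) - luma(a))
}

const palette = [
  { r: 0, g: 0, b: 255 },     // blue
  { r: 255, g: 255, b: 255 }, // white
  { r: 0, g: 255, b: 0 },     // green
  { r: 0, g: 0, b: 0 },       // black
]
console.log(sortByLuminance(palette).map(c => `${c.r},${c.g},${c.b}`))
// [ '255,255,255', '0,255,0', '0,0,255', '0,0,0' ]
```

Green sorts above blue because the green channel carries the largest weight in the formula.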
