UUID Generator Algorithms: From v1 to v4 Implementation

While building a distributed system recently, I needed globally unique IDs. I started with auto-increment database IDs, then hit a wall with sharding: different databases would generate conflicting IDs. After some research, I went with UUIDs. Here’s what I learned.

What is UUID?

UUID (Universally Unique Identifier) is a 128-bit unique identifier, typically shown as a 36-character string:

550e8400-e29b-41d4-a716-446655440000

The format is 8-4-4-4-12 hex digits, separated by hyphens. The key feature: duplicates are possible in theory but vanishingly unlikely in practice. More precisely, generating 1 billion v4 UUIDs carries a collision probability of about 9.4 × 10^-20, which is less than 1 in 10^19.
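As a quick illustration of the 8-4-4-4-12 layout, here is a minimal sketch of a format check. The name `isUuid` is my own, and the pattern only checks the shape (hex digits and hyphen positions), not the version or variant bits.

```javascript
// Matches the canonical 8-4-4-4-12 hex layout, case-insensitively.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i

// Hypothetical helper: true when the input is a string in canonical UUID form.
function isUuid(value) {
  return typeof value === 'string' && UUID_PATTERN.test(value)
}

console.log(isUuid('550e8400-e29b-41d4-a716-446655440000')) // true
console.log(isUuid('550e8400e29b41d4a716446655440000'))     // false (no hyphens)
```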

The 5 UUID Versions

RFC 4122 defines 5 UUID versions, each with different use cases (its successor, RFC 9562, adds v6, v7, and v8, which are not covered here):

Version | Generation Method       | Characteristics            | Use Case
v1      | Timestamp + MAC address | Time-based, traceable      | IDs that carry their creation time
v2      | DCE Security            | Rarely used                | DCE environments
v3      | MD5 hash + namespace    | Deterministic              | Same input = same ID
v4      | Random numbers          | Simple, efficient          | Most common
v5      | SHA-1 hash + namespace  | Deterministic, more secure | Replaces v3

In practice, v4 is by far the most widely used, followed by v1 and v5.

UUID v4 Implementation

v4 is the simplest version: just random numbers. But not any 32 random hex characters will do; the format has strict requirements:

xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx
  • The 13th hex digit (hyphens not counted) must be 4 (version number)
  • The 17th hex digit (y) can only be 8, 9, a, or b (variant identifier)

JavaScript implementation:

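function uuidv4() {
  // Math.random() is not cryptographically secure; fine for demos, not for secrets
  return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, (c) => {
    const r = (Math.random() * 16) | 0 // random nibble, 0-15
    const v = c === 'x' ? r : (r & 0x3) | 0x8 // y is forced into 8-b
    return v.toString(16)
  })
}

console.log(uuidv4())
// Example output (different on every call): 550e8400-e29b-41d4-a716-446655440000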

Bitwise Operations Explained

(Math.random() * 16) | 0 generates a random integer from 0-15. | 0 is a fast truncation trick; for non-negative values in the 32-bit range, like these, it is equivalent to Math.floor().

(r & 0x3) | 0x8 ensures the y position is in the range 8-11 (binary 10xx):

  • r & 0x3: takes the lower 2 bits of r (0-3)
  • | 0x8: sets bit 3, the highest bit of the nibble (adds 8)

So y can only be 8, 9, 10, 11, corresponding to hex 8, 9, a, b.
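To check the masking logic, this small sketch runs every possible value of r through the same expression and collects the resulting digits. The helper name `variantDigit` is only for illustration.

```javascript
// Same expression as in uuidv4(): keep the low 2 bits, then force the top two bits to 10.
function variantDigit(r) {
  return ((r & 0x3) | 0x8).toString(16)
}

// Try all 16 possible nibbles and collect the distinct results.
const digits = new Set()
for (let r = 0; r < 16; r++) {
  digits.add(variantDigit(r))
}

console.log([...digits].sort().join(', ')) // 8, 9, a, b
```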

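Native Browser API

Modern browsers provide a native method:

// Simplest approach
const uuid = crypto.randomUUID()
console.log(uuid)
// Example output (different on every call): 550e8400-e29b-41d4-a716-446655440000

crypto.randomUUID() is part of the Web Crypto API. It draws from a cryptographically secure random number generator (typically the OS’s CSPRNG), so its randomness is stronger than Math.random(), and it was also faster in my tests. In browsers it is only exposed in secure contexts (HTTPS or localhost).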
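For runtimes that lack crypto.randomUUID() but do provide crypto.getRandomValues(), the same version and variant rules can be applied to 16 random bytes. This is a minimal sketch, not a polyfill: `uuidFromBytes` and `secureUuidv4` are names I made up, and `secureUuidv4` assumes a global `crypto` object is available.

```javascript
// Format 16 bytes as a v4 UUID, overwriting the version and variant bits.
function uuidFromBytes(bytes) {
  const b = Uint8Array.from(bytes) // copy so the caller's array is untouched
  if (b.length !== 16) throw new RangeError('expected 16 bytes')
  b[6] = (b[6] & 0x0f) | 0x40 // high nibble of byte 6 = version 4
  b[8] = (b[8] & 0x3f) | 0x80 // top two bits of byte 8 = variant 10
  const hex = Array.from(b, (x) => x.toString(16).padStart(2, '0')).join('')
  return `${hex.slice(0, 8)}-${hex.slice(8, 12)}-${hex.slice(12, 16)}-${hex.slice(16, 20)}-${hex.slice(20)}`
}

// Assumption: a global Web Crypto object exists (browsers, recent Node.js).
function secureUuidv4() {
  return uuidFromBytes(crypto.getRandomValues(new Uint8Array(16)))
}
```

Keeping the formatting separate from the randomness source makes the bit manipulation easy to test with fixed input.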

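UUID v1 Implementation

v1 is generated from a timestamp and the MAC address, so every ID carries its creation time. Note that the low bits of the timestamp come first, so the string form does not sort chronologically. The structure: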

time_low (32 bits) | time_mid (16 bits) | time_hi_and_version (16 bits) |
clock_seq_hi_and_reserved (8 bits) | clock_seq_low (8 bits) |
node (48 bits)

Timestamp Part

UUID v1 counts 100-nanosecond intervals since 1582-10-15 00:00:00 UTC (the Gregorian calendar reform):

// 100-ns intervals between 1582-10-15 and the Unix epoch (1970-01-01)
const GREGORIAN_EPOCH = 122192928000000000n

function getUuidV1Timestamp() {
  // Date.now() returns a Number, so convert it before mixing with BigInt
  const now = BigInt(Date.now()) * 10000n + GREGORIAN_EPOCH
  return now
}
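Going the other way, the timestamp can be read back out of a v1 UUID by reassembling the three time fields. A minimal sketch, where `v1TimestampToDate` is a hypothetical helper name and the input is assumed to be a well-formed v1 string:

```javascript
// 100-ns intervals between 1582-10-15 and the Unix epoch (1970-01-01).
const GREGORIAN_OFFSET = 122192928000000000n

function v1TimestampToDate(uuid) {
  const [timeLow, timeMid, timeHiAndVersion] = uuid.split('-')
  // Drop the leading version digit, then rebuild the 60-bit value: hi | mid | low.
  const timeHi = timeHiAndVersion.slice(1)
  const ticks = BigInt('0x' + timeHi + timeMid + timeLow)
  // Convert 100-ns ticks since 1582 to milliseconds since 1970.
  const unixMs = (ticks - GREGORIAN_OFFSET) / 10000n
  return new Date(Number(unixMs))
}

// The DNS namespace UUID happens to be a v1 UUID, so it has a creation time.
console.log(v1TimestampToDate('6ba7b810-9dad-11d1-80b4-00c04fd430c8').toISOString())
```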

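Simplified Implementation

A complete v1 implementation needs MAC address access, which is impossible in browsers. A simplified version substitutes random bits for the node (and for the clock sequence):

function uuidv1() {
  // 60-bit count of 100-ns intervals since 1582-10-15 (Date.now() only gives ms resolution)
  const ts = BigInt(Date.now()) * 10000n + 122192928000000000n
  const hex = (value, digits) => value.toString(16).padStart(digits, '0')

  const timeLow = hex(ts & 0xffffffffn, 8)
  const timeMid = hex((ts >> 32n) & 0xffffn, 4)
  const timeHi = hex((ts >> 48n) & 0x0fffn, 3) // the version digit 1 is prepended below

  // Random 14-bit clock sequence with the variant bits (10) on top
  const clockSeq = hex(0x8000 | ((Math.random() * 0x4000) | 0), 4)

  // Random 48-bit node instead of a MAC address; the multicast bit marks it as not a real MAC
  const nodeHigh = hex(0x0100 | ((Math.random() * 0x10000) | 0), 4)
  const nodeLow = hex(Math.floor(Math.random() * 0x100000000), 8)

  return `${timeLow}-${timeMid}-1${timeHi}-${clockSeq}-${nodeHigh}${nodeLow}`
}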

v1 Problems

v1 has two main issues:

  1. Privacy leak: MAC address exposes device information
  2. Predictability: Knowing the timestamp allows inferring nearby UUIDs

So v1 isn’t suitable for security-sensitive scenarios.
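The privacy concern is easy to demonstrate, because the node field is simply the last 12 hex digits of the string. A rough sketch follows (the helper name `extractNode` is mine); the multicast-bit check relies on the RFC 4122 convention that randomly generated node IDs set the lowest bit of the first octet.

```javascript
function extractNode(uuid) {
  const nodeHex = uuid.split('-')[4]
  // Format the 48-bit node as a colon-separated MAC-style address.
  const mac = nodeHex.match(/.{2}/g).join(':')
  // Lowest bit of the first octet: set for random node IDs, clear for real NIC addresses.
  const looksRandom = (parseInt(nodeHex.slice(0, 2), 16) & 0x01) === 1
  return { mac, looksRandom }
}

const node = extractNode('6ba7b810-9dad-11d1-80b4-00c04fd430c8')
console.log(node.mac)         // 00:c0:4f:d4:30:c8
console.log(node.looksRandom) // false
```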

UUID v3/v5: Deterministic Generation

v3 and v5 are generated from a namespace and a name, so the same input always produces the same output. The difference is the hash algorithm:

  • v3: MD5 (128 bits; all of them are used, apart from the version and variant bits)
  • v5: SHA-1 (160 bits, truncated to 128 bits)

import { createHash } from 'crypto'

function uuidv5(namespace, name) {
  // Hash the namespace as its 16 raw bytes, not as the 36-character string
  const namespaceBytes = Buffer.from(namespace.replace(/-/g, ''), 'hex')
  const hash = createHash('sha1')
    .update(namespaceBytes)
    .update(name, 'utf8')
    .digest()

  // Set version and variant
  hash[6] = (hash[6] & 0x0f) | 0x50 // Version 5
  hash[8] = (hash[8] & 0x3f) | 0x80 // Variant

  // SHA-1 produces 20 bytes; only the first 16 (32 hex digits) are used
  const hex = hash.toString('hex')
  return `${hex.slice(0,8)}-${hex.slice(8,12)}-${hex.slice(12,16)}-${hex.slice(16,20)}-${hex.slice(20,32)}`
}

// Example: generate a fixed UUID for a domain name
const namespace = '6ba7b810-9dad-11d1-80b4-00c04fd430c8' // DNS namespace
console.log(uuidv5(namespace, 'example.com'))
// Same output every time
Use cases:

  • Generate fixed IDs for same URLs
  • Data deduplication (same content = same ID)
  • Idempotency in distributed systems

Collision Probability Analysis

How big is UUID’s 128-bit space? About 3.4 × 10^38 possible values. A v4 UUID spends 6 of those bits on the version and variant, leaving 122 random bits, or about 5.3 × 10^36 values.

According to the birthday paradox, the probability of at least one collision after generating n v4 UUIDs is:

p(n) ≈ 1 - e^(-n²/(2×2^122))

Plugging in some values:

UUID Count | Collision Probability
1 million  | ≈ 9.4 × 10^-26
1 billion  | ≈ 9.4 × 10^-20
1 trillion | ≈ 9.4 × 10^-14

To reach a 50% collision probability, you need to generate about 2.71 × 10^18 UUIDs. At 1 billion per second, that would take about 86 years.
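The formula is simple enough to evaluate directly. Here is a small sketch, with function names of my own, that computes the collision probability for a given count and inverts the formula to find the count for a target probability:

```javascript
const RANDOM_BITS = 122 // 128 bits minus 4 version bits and 2 variant bits
const SPACE = 2 ** RANDOM_BITS

// Birthday approximation: p(n) ≈ 1 - e^(-n² / (2 × 2^122)).
function collisionProbability(n) {
  // expm1 keeps precision when the exponent is tiny.
  return -Math.expm1(-(n * n) / (2 * SPACE))
}

// Inverse: how many UUIDs until the collision probability reaches p.
function countForProbability(p) {
  return Math.sqrt(-2 * SPACE * Math.log1p(-p))
}

console.log(collisionProbability(1e9).toExponential(1)) // 9.4e-20
console.log(countForProbability(0.5).toExponential(2))  // 2.71e+18
```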

Performance Comparison

Generating 1 million UUIDs in Node.js:

// Test code
const start = performance.now()
for (let i = 0; i < 1000000; i++) {
  crypto.randomUUID()
}
console.log(`crypto.randomUUID: ${(performance.now() - start).toFixed(2)}ms`)

const start2 = performance.now()
for (let i = 0; i < 1000000; i++) {
  uuidv4()
}
console.log(`Custom uuidv4: ${(performance.now() - start2).toFixed(2)}ms`)

Results on my machine:

  • crypto.randomUUID(): ~200ms
  • Custom uuidv4(): ~400ms

The native API is about twice as fast, but both produce a million IDs in well under a second, so the difference rarely matters in practice.

Real-World Pitfalls

1. Database Primary Key

Using UUID as primary key:

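-- Problem: random UUIDs arrive in no particular order, causing frequent B+ tree page splits
CREATE TABLE users (
  id CHAR(36) PRIMARY KEY,  -- UUID
  name VARCHAR(100)
);

Solutions:

  • Use a time-based UUID stored with the timestamp’s high bits first (canonical v1 leads with the low bits, so it needs reordering; UUID v7 from RFC 9562 is time-ordered by design)
  • Or use ULID (a separate 128-bit ID format with a timestamp prefix)
  • Or use the Snowflake algorithm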
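One caveat on the v1 option: in its canonical form a v1 UUID puts the low timestamp bits first, so it does not sort by creation time as-is. A common workaround is to move the time fields into high-to-low order before storing, similar in spirit to MySQL 8’s UUID_TO_BIN(uuid, 1). A minimal sketch with a made-up helper name:

```javascript
// Reorder a v1 UUID so the most significant timestamp bits come first.
// Output is 32 hex digits: time_hi_and_version, time_mid, time_low, then the rest.
function toSortableV1(uuid) {
  const [timeLow, timeMid, timeHiAndVersion, clockSeq, node] = uuid.toLowerCase().split('-')
  return timeHiAndVersion + timeMid + timeLow + clockSeq + node
}

// Two UUIDs one 100-ns tick apart; the later one has the smaller time_low.
const earlier = 'ffffffff-1dd2-11b2-8000-000000000000'
const later = '00000000-1dd3-11b2-8000-000000000000'

console.log(earlier < later)                             // false: canonical strings sort the wrong way
console.log(toSortableV1(earlier) < toSortableV1(later)) // true
```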

2. Index Efficiency

A UUID stored as a 36-character string makes for bulky indexes:

// Optimization: Store as binary
const uuid = '550e8400-e29b-41d4-a716-446655440000'
const binary = Buffer.from(uuid.replace(/-/g, ''), 'hex')  // 16 bytes

MySQL can store this in a BINARY(16) column, cutting each key from at least 36 bytes to 16, which more than halves the key size.
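Reading the value back requires the reverse conversion. The sketch below shows both directions using plain Uint8Array so it also works outside Node.js; the function names are illustrative, not from any library.

```javascript
// Convert a canonical UUID string to its 16 raw bytes.
function uuidToBytes(uuid) {
  const hex = uuid.replace(/-/g, '')
  if (!/^[0-9a-f]{32}$/i.test(hex)) throw new TypeError('not a UUID: ' + uuid)
  return Uint8Array.from(hex.match(/.{2}/g), (pair) => parseInt(pair, 16))
}

// Convert 16 raw bytes back to the canonical 8-4-4-4-12 string.
function bytesToUuid(bytes) {
  const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('')
  return [hex.slice(0, 8), hex.slice(8, 12), hex.slice(12, 16), hex.slice(16, 20), hex.slice(20)].join('-')
}

const stored = uuidToBytes('550e8400-e29b-41d4-a716-446655440000')
console.log(stored.length)       // 16
console.log(bytesToUuid(stored)) // 550e8400-e29b-41d4-a716-446655440000
```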

3. Frontend Performance

Generating UUIDs in one large synchronous batch can block the main thread:

// Risky: 100k IDs in one synchronous loop can cause a visible stall on slow devices
const uuids = []
for (let i = 0; i < 100000; i++) {
  uuids.push(crypto.randomUUID())
}

// Right: Process in batches
async function generateBatch(count, batchSize = 1000) {
  const results = []
  for (let i = 0; i < count; i += batchSize) {
    await new Promise(resolve => setTimeout(resolve, 0))
    for (let j = 0; j < batchSize && i + j < count; j++) {
      results.push(crypto.randomUUID())
    }
  }
  return results
}

Online Tool

For easy testing, I built an online UUID Generator supporting:

  • v1 and v4 version switching
  • Batch generation (up to 100)
  • Format options (uppercase, no hyphens)

The core code is under 50 lines, but handling version differences and format options took some time.
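To give a feel for the format options, here is a rough sketch of how such a formatter might look. This is not the tool’s actual code; `formatUuid` and its option names are hypothetical.

```javascript
// Apply display options to a canonical UUID string.
function formatUuid(uuid, { uppercase = false, hyphens = true } = {}) {
  const base = hyphens ? uuid : uuid.replace(/-/g, '')
  return uppercase ? base.toUpperCase() : base.toLowerCase()
}

const sample = '550e8400-e29b-41d4-a716-446655440000'
console.log(formatUuid(sample, { uppercase: true })) // 550E8400-E29B-41D4-A716-446655440000
console.log(formatUuid(sample, { hyphens: false }))  // 550e8400e29b41d4a716446655440000
```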

