From MD5 to SHA-512: Hash Algorithm Principles and Browser Implementation#

While building a file verification tool recently, I revisited hash algorithms. Many developers know terms like MD5 and SHA, but few understand the underlying principles and security implications. This article covers everything from algorithm theory to browser implementation.

The Essence of Hash: One-Way Function#

A hash function has three critical properties:

  1. Deterministic: Same input always produces same output
  2. One-way: Cannot reverse-engineer original data from hash
  3. Collision-resistant: Computationally infeasible to find two inputs with same hash
// Conceptual signature of a hash function
hash(input: string) → fixedLengthHexString
// Output length is fixed regardless of input size (SHA-256 shown: always 64 hex characters)
hash("hello") → "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"
// The input below is abbreviated, so treat its digest as illustrative
hash("hello world this is a very long string...")
            → "a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e"

Hashing is NOT encryption. Encryption is reversible by anyone holding the key; a hash has no key and no inverse. Confusing the two is a common beginner mistake.
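The properties listed above can be observed directly. The following is a minimal sketch that assumes a Node.js environment (it uses the built-in `crypto` module rather than the browser API); `sha256Hex` and `differingBits` are illustrative helper names, not part of any library.

```typescript
import { createHash } from 'crypto'

// Hex-encoded SHA-256 of a UTF-8 string (Node.js built-in, synchronous)
function sha256Hex(input: string): string {
  return createHash('sha256').update(input, 'utf8').digest('hex')
}

// Count the bit positions at which two equal-length hex digests differ
function differingBits(hexA: string, hexB: string): number {
  let count = 0
  for (let i = 0; i < hexA.length; i++) {
    // XOR the two 4-bit nibbles, then count the set bits
    let x = parseInt(hexA.charAt(i), 16) ^ parseInt(hexB.charAt(i), 16)
    while (x) {
      count += x & 1
      x >>= 1
    }
  }
  return count
}

const first = sha256Hex('hello')
const second = sha256Hex('hello')
const nearby = sha256Hex('hellp') // one character changed

console.log(first === second)             // deterministic: true
console.log(first.length, nearby.length)  // fixed length: 64 64
console.log(differingBits(first, nearby)) // typically around half of the 256 bits
```

A one-character change flips roughly half of the output bits, which is why a digest gives no usable hint about how close two inputs are.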

MD5: Classic but Broken#

MD5 (Message-Digest Algorithm 5) was born in 1991, producing 128-bit hash values (32 hex characters).

The core algorithm has four steps:

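// Simplified MD5 flow (paddingTo512Bits, processBlock and toHex are placeholders)
function md5(input: string): string {
  // 1. Pad the message to a multiple of 512 bits (64 bytes)
  const padded = paddingTo512Bits(input)

  // 2. Initialize four 32-bit chaining variables
  let [a, b, c, d] = [0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476]

  // 3. Process the message in 512-bit (64-byte) blocks
  for (let i = 0; i < padded.length; i += 64) {
    // The trailing semicolon is required here: without it, the destructuring
    // assignment below would be parsed as an index into the slice() result
    const block = padded.slice(i, i + 64);
    // 4 rounds of compression, 16 steps each
    [a, b, c, d] = processBlock(block, a, b, c, d)
  }

  // 4. Concatenate the four words as hex (each word serialized little-endian)
  return toHex(a) + toHex(b) + toHex(c) + toHex(d)
}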
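The padding step is the easiest part of the flow above to make concrete. The sketch below follows the padding rule MD5 defines: append a single 1 bit, then zeros, then the original length in bits as a 64-bit little-endian integer, so the total is a multiple of 64 bytes. The name `md5Pad` is hypothetical and it covers step 1 only, not the compression rounds.

```typescript
// Hypothetical helper: pads a message the way MD5 expects before block processing
function md5Pad(message: Uint8Array): Uint8Array {
  // Room for the 0x80 marker (1 byte) and the length field (8 bytes),
  // rounded up to the next multiple of 64 bytes (512 bits)
  const paddedLength = Math.ceil((message.length + 9) / 64) * 64
  const padded = new Uint8Array(paddedLength) // zero-filled by default
  padded.set(message)
  padded[message.length] = 0x80 // a single 1 bit followed by zeros

  // Original length in bits as a 64-bit little-endian integer,
  // written as two 32-bit halves (low word first)
  const view = new DataView(padded.buffer)
  view.setUint32(paddedLength - 8, (message.length << 3) >>> 0, true)
  view.setUint32(paddedLength - 4, Math.floor(message.length / 0x20000000), true)
  return padded
}

console.log(md5Pad(new TextEncoder().encode('hello')).length) // 64
console.log(md5Pad(new Uint8Array(56)).length)                // 128
```

A 56-byte message spills into a second block because the marker byte and the 8-byte length field no longer fit in the first one.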

Why MD5 is Insecure:

In 2004, Wang Xiaoyun’s team discovered collision attacks against MD5. This means attackers can construct two different files with identical MD5 hashes.

# Illustration of an MD5 collision (file names and digest value are placeholders)
File A: evil.exe MD5 = "5d41402abc4b2a76b9719d911017c592"
File B: good.exe MD5 = "5d41402abc4b2a76b9719d911017c592"
# Two different files, one MD5 digest: the hash can no longer tell them apart

Real-world impact: forged digital signatures and malware disguised as legitimate files. MD5 should only be used for non-security integrity checks such as detecting accidental corruption, never for password storage or digital signatures.

SHA Family: Secure Hash Standard#

SHA (Secure Hash Algorithm) is published by NIST with multiple versions:

Algorithm | Output Length | Status          | Use Cases
SHA-0     | 160-bit       | Deprecated      | -
SHA-1     | 160-bit       | Not Recommended | Git (transitioning)
SHA-256   | 256-bit       | Secure          | Bitcoin, SSL Certificates
SHA-512   | 512-bit       | Secure          | High-security needs
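Because each hex character encodes 4 bits, the output lengths above translate directly into string lengths, which gives a cheap sanity check before comparing digests. The sketch below is illustrative: `DIGEST_BITS`, `hexLength` and `looksLikeDigest` are hypothetical names, and the MD5 entry uses the 128-bit size stated earlier in the article.

```typescript
type DigestAlgorithm = 'MD5' | 'SHA-1' | 'SHA-256' | 'SHA-512'

// Output size in bits for each algorithm discussed in this article
const DIGEST_BITS: Record<DigestAlgorithm, number> = {
  'MD5': 128,
  'SHA-1': 160,
  'SHA-256': 256,
  'SHA-512': 512,
}

// One hex character carries 4 bits
function hexLength(algorithm: DigestAlgorithm): number {
  return DIGEST_BITS[algorithm] / 4
}

// Checks only the shape of the string (length and hex alphabet), not its correctness
function looksLikeDigest(algorithm: DigestAlgorithm, value: string): boolean {
  const pattern = new RegExp(`^[0-9a-fA-F]{${hexLength(algorithm)}}$`)
  return pattern.test(value)
}

console.log(hexLength('SHA-1'))   // 40
console.log(hexLength('SHA-512')) // 128
```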

How SHA-256 Works#

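SHA-256 is built on the Merkle-Damgård construction (the sponge construction belongs to SHA-3): the padded message is split into 512-bit blocks, and each block passes through a compression function with 64 rounds. In the browser, the Web Crypto API exposes it directly: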

async function sha256(input: string): Promise<string> {
  const encoder = new TextEncoder()
  const data = encoder.encode(input)

  // Native Web Crypto API
  const hashBuffer = await crypto.subtle.digest('SHA-256', data)

  // Convert to hex string
  const hashArray = Array.from(new Uint8Array(hashBuffer))
  return hashArray.map(b => b.toString(16).padStart(2, '0')).join('')
}

// Usage
await sha256("hello")
// "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"

Web Crypto API Advantages:

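  1. Performance: Native browser implementation, typically much faster than a pure-JavaScript implementation (the exact speedup depends on browser and input size)
  2. Security: Relies on the browser's vetted native cryptographic library rather than hand-written JavaScript
  3. Async: Returns a Promise, so the call does not have to block the main thread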
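One practical caveat: browsers expose `crypto.subtle` only in secure contexts (HTTPS or localhost), so it is worth checking before calling `digest`. The sketch below is one possible check; `CryptoScope` and `webCryptoStatus` are hypothetical names, and the scope is passed in as a parameter so the function can be exercised outside a browser.

```typescript
// Minimal structural type so the check can run without a real browser global
interface CryptoScope {
  isSecureContext?: boolean
  crypto?: { subtle?: unknown }
}

type WebCryptoStatus = 'available' | 'insecure-context' | 'unsupported'

function webCryptoStatus(scope: CryptoScope): WebCryptoStatus {
  if (scope.crypto?.subtle) return 'available'
  // On a plain-HTTP page the crypto object exists but subtle is undefined
  if (scope.isSecureContext === false) return 'insecure-context'
  return 'unsupported'
}

// In a page or worker, the global scope can be passed directly:
// webCryptoStatus(globalThis)
```

Distinguishing the two failure modes lets a tool show "serve this page over HTTPS" instead of a generic "browser not supported" message.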

Browser Implementation: Parallel Four-Algorithm Computation#

In practice, we compute MD5, SHA-1, SHA-256, and SHA-512 simultaneously:

import { useCallback, useState } from 'react'

type HashType = 'md5' | 'sha1' | 'sha256' | 'sha512'

export function useHashGenerator() {
  const [results, setResults] = useState<Record<HashType, string>>({
    md5: '',
    sha1: '',
    sha256: '',
    sha512: '',
  })

  const computeHashes = useCallback(async (input: string) => {
    const encoder = new TextEncoder()
    const data = encoder.encode(input)

    // Web Crypto API for all three SHA algorithms
    const [sha1Buffer, sha256Buffer, sha512Buffer] = await Promise.all([
      crypto.subtle.digest('SHA-1', data),
      crypto.subtle.digest('SHA-256', data),
      crypto.subtle.digest('SHA-512', data),
    ])

    // MD5 is not available in the Web Crypto API, so it needs a third-party
    // library (e.g., crypto-js); computeMD5 is assumed to be a helper wrapping one
    const md5 = await computeMD5(input)

    setResults({
      md5,
      sha1: bufferToHex(sha1Buffer),
      sha256: bufferToHex(sha256Buffer),
      sha512: bufferToHex(sha512Buffer),
    })
  }, [])

  return { results, computeHashes }
}

function bufferToHex(buffer: ArrayBuffer): string {
  return Array.from(new Uint8Array(buffer))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('')
}

Performance Optimization:

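  1. Promise.all: The three SHA digests are started together instead of being awaited one by one
  2. Web Worker: Hash large files in a background thread
  3. Streaming: Process large files in chunks to avoid exhausting memory. Note that crypto.subtle.digest takes the whole input at once, so chunked hashing needs a library with an incremental (update/finalize) API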
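Chunked hashing relies on an incremental interface: feed the data piece by piece, then finalize once. `crypto.subtle.digest` has no such interface, so the sketch below uses Node's built-in `createHash` purely to illustrate the pattern; in the browser the same shape would come from a third-party library. `sha256OfChunks` and `splitIntoChunks` are hypothetical helper names.

```typescript
import { createHash } from 'crypto'

// Hash a list of chunks by feeding them to one hasher in order
function sha256OfChunks(chunks: Uint8Array[]): string {
  const hasher = createHash('sha256')
  for (const chunk of chunks) {
    hasher.update(chunk) // only the current chunk needs to be in memory
  }
  return hasher.digest('hex')
}

// Split a buffer into fixed-size views (a stand-in for reading a file in pieces)
function splitIntoChunks(data: Uint8Array, chunkSize: number): Uint8Array[] {
  const chunks: Uint8Array[] = []
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    chunks.push(data.subarray(offset, offset + chunkSize))
  }
  return chunks
}

const payload = new TextEncoder().encode('hello')
const chunked = sha256OfChunks(splitIntoChunks(payload, 2))
const oneShot = createHash('sha256').update(payload).digest('hex')
console.log(chunked === oneShot) // true: chunk boundaries do not affect the digest
```

In a real tool the chunks would come from slicing a File and would be discarded after each update, so memory use stays bounded by the chunk size.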

Practical Applications#

1. Password Storage#

Wrong Approach (plaintext or simple hash):

// ❌ Never do this
database.save({
  username: "alice",
  password: md5(password)  // Vulnerable to rainbow table attacks
})

Correct Approach (salt + slow hash):

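import { promisify } from 'util'
import { scrypt, randomBytes, timingSafeEqual } from 'crypto'

const scryptAsync = promisify(scrypt)

async function hashPassword(password: string): Promise<string> {
  // A fresh random salt per password defeats precomputed (rainbow) tables
  const salt = randomBytes(16).toString('hex')
  const derivedKey = await scryptAsync(password, salt, 64) as Buffer
  return `${salt}:${derivedKey.toString('hex')}`
}

async function verifyPassword(password: string, stored: string): Promise<boolean> {
  const [salt, hash] = stored.split(':')
  if (!salt || !hash) return false
  const expected = Buffer.from(hash, 'hex')
  if (expected.length !== 64) return false
  const derivedKey = await scryptAsync(password, salt, 64) as Buffer
  // Constant-time comparison: a plain === can leak how many leading characters matched
  return timingSafeEqual(derivedKey, expected)
}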

Slow-hash algorithms like scrypt, bcrypt, and Argon2 are specifically designed for password storage with adjustable computational cost to resist brute-force attacks.
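The "adjustable computational cost" is an explicit parameter in practice. The sketch below passes scrypt's cost options (`N`, `r`, `p`, `maxmem`) through Node's `scryptSync` and stores the cost next to the salt so it can be raised later without invalidating old records. The `scrypt$N$salt$key` record layout and the function names are assumptions made for illustration, not a standard format.

```typescript
import { randomBytes, scryptSync, timingSafeEqual } from 'crypto'

const KEY_LENGTH = 64

// Derive a key with an explicit cost; maxmem leaves headroom above the
// roughly 128 * N * r bytes that scrypt needs
function deriveKey(password: string, salt: Buffer, cost: number): Buffer {
  return scryptSync(password, salt, KEY_LENGTH, {
    N: cost, // CPU/memory cost, must be a power of two
    r: 8,    // block size
    p: 1,    // parallelization
    maxmem: 256 * cost * 8,
  })
}

// Hypothetical record layout: scrypt$<N>$<salt hex>$<derived key hex>
function hashWithCost(password: string, cost: number): string {
  const salt = randomBytes(16)
  const key = deriveKey(password, salt, cost)
  return ['scrypt', String(cost), salt.toString('hex'), key.toString('hex')].join('$')
}

function verifyWithStoredCost(password: string, record: string): boolean {
  const [scheme, costText, saltHex, keyHex] = record.split('$')
  if (scheme !== 'scrypt' || !costText || !saltHex || !keyHex) return false
  const expected = Buffer.from(keyHex, 'hex')
  // Reject malformed records before comparing anything
  if (expected.length !== KEY_LENGTH) return false
  const actual = deriveKey(password, Buffer.from(saltHex, 'hex'), Number(costText))
  return timingSafeEqual(actual, expected)
}
```

Doubling `N` roughly doubles both the time and the memory an attacker must spend per guess, which is the lever a fast hash like MD5 or SHA-256 does not offer.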

2. File Verification#

async function verifyFileIntegrity(file: File, expectedSha256: string): Promise<boolean> {
  const buffer = await file.arrayBuffer()
  const hashBuffer = await crypto.subtle.digest('SHA-256', buffer)
  const actual = bufferToHex(hashBuffer)
  // Published checksums may be uppercase or carry stray whitespace
  return actual === expectedSha256.trim().toLowerCase()
}

// Use case: downloading an Ubuntu ISO
// The publisher lists the SHA-256 checksum; the user computes it locally and compares
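Published checksums usually arrive as a listing in the `sha256sum` output format: the hex digest, a space, a mode character (space or `*`), then the file name. The sketch below extracts the expected digest for one file from such a listing; `parseChecksumLine` and `expectedDigestFor` are hypothetical names, and the file name in the usage lines is a placeholder.

```typescript
interface ChecksumEntry {
  digest: string
  filename: string
}

// Parse one line such as "<64 hex chars> *example.iso"
function parseChecksumLine(line: string): ChecksumEntry | null {
  const match = /^([0-9a-fA-F]{64}) [ *](.+)$/.exec(line.trim())
  if (!match) return null
  return { digest: match[1].toLowerCase(), filename: match[2] }
}

// Find the digest listed for a given file name, or null if it is not listed
function expectedDigestFor(listing: string, filename: string): string | null {
  for (const line of listing.split('\n')) {
    const entry = parseChecksumLine(line)
    if (entry && entry.filename === filename) return entry.digest
  }
  return null
}

const listing = [
  '2CF24DBA5FB0A30E26E83B2AC5B9E29E1B161E5C1FA7425E73043362938B9824 *example.iso',
  'not a checksum line',
].join('\n')
console.log(expectedDigestFor(listing, 'example.iso') !== null) // true
console.log(expectedDigestFor(listing, 'missing.iso'))          // null
```

The returned digest is already lowercased, so it can be compared directly against the output of a hex-encoding helper such as `bufferToHex`.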

3. Data Deduplication#

class DeduplicationCache {
  private cache = new Map<string, Blob>()

  async store(data: Blob): Promise<string> {
    const buffer = await data.arrayBuffer()
    const hashBuffer = await crypto.subtle.digest('SHA-256', buffer)
    const hash = bufferToHex(hashBuffer)

    if (!this.cache.has(hash)) {
      this.cache.set(hash, data)
    }
    return hash
  }

  get(hash: string): Blob | undefined {
    return this.cache.get(hash)
  }
}

Cloud storage services such as Dropbox and Google Drive are commonly cited as using this technique for instant uploads: compute the file's hash on the client and skip the upload if the server already holds that content.

Common Questions#

Q1: Are MD5 and SHA-1 Still Usable?#

Limited use cases:

  • Non-security file integrity checks (e.g., detecting accidental corruption of a download; this does not protect against deliberate tampering)
  • Git commit IDs (SHA-1, but Git community is migrating to SHA-256)

Never use for:

  • Password storage
  • Digital signatures
  • SSL certificates

Q2: SHA-256 vs SHA-512?#

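On 64-bit CPUs without dedicated SHA instructions, SHA-512 is often somewhat faster than SHA-256 because it operates on 64-bit words; on CPUs with SHA-256 hardware acceleration the order can reverse. SHA-256 provides sufficient security with a shorter output, so it is the recommended default for general use.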
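Since the relative speed depends on the hardware, measuring is more reliable than assuming. The sketch below is a rough timing harness, assuming Node.js and its built-in `crypto` module (so it measures Node's native implementation, not a browser's); `timeDigest` is a hypothetical helper and the numbers it prints will vary between machines and runs.

```typescript
import { createHash } from 'crypto'
import { performance } from 'perf_hooks'

// Average milliseconds per digest of `data` over `rounds` repetitions
function timeDigest(algorithm: 'sha256' | 'sha512', data: Uint8Array, rounds: number): number {
  const start = performance.now()
  for (let i = 0; i < rounds; i++) {
    createHash(algorithm).update(data).digest()
  }
  return (performance.now() - start) / rounds
}

const sample = new Uint8Array(1024 * 1024) // 1 MiB of zeros
console.log('SHA-256 ms per MiB:', timeDigest('sha256', sample, 20))
console.log('SHA-512 ms per MiB:', timeDigest('sha512', sample, 20))
```

For a file verification tool the difference is rarely decisive; disk or network throughput tends to dominate well before the hash does.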

Q3: Why No MD5 in Web Crypto API?#

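MD5 is not among the algorithms defined by the Web Crypto specification, and its collision resistance is broken, so browsers do not ship it. SHA-1 is also considered broken for collision resistance but remains available through crypto.subtle.digest, which is why the hook above can use it.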

Complete Tool Implementation#

Based on these principles, I built an online hash generator: Hash Generator

Features:

  • Supports MD5, SHA-1, SHA-256, SHA-512
  • Real-time computation
  • One-click copy
  • Large text support (chunked processing)

The core code is under 100 lines, but understanding the principles behind it lets you use hashes with far more confidence.

