NPM Dependency Analysis: From package.json to Dependency Tree Visualization#

I inherited a legacy project recently with 800MB of node_modules and 200+ dependencies in package.json. Every npm install took forever, and CI builds were painfully slow. I decided to build a tool to analyze the dependency situation, and here’s what I learned.

Semantic Versioning: The ^ and ~ Traps#

Version numbers in package.json look simple, but there are pitfalls:

{
  "dependencies": {
    "lodash": "^4.17.21",    // Allows 4.17.21 and later 4.x releases, never 5.0.0
    "axios": "~0.27.2",      // Allows 0.27.2 and later 0.27.x patches, never 0.28.0
    "react": "18.2.0"        // Exact version
  }
}
  • ^ (caret): Allows minor and patch updates; ^4.17.21 allows >=4.17.21 <5.0.0
  • ~ (tilde): Allows patch updates only; ~0.27.2 allows >=0.27.2 <0.28.0
  • No prefix: Exact version, suitable for critical dependencies

Parsing version ranges:

function parseVersionRange(version: string) {
  // `max` is an exclusive upper bound in both branches
  if (version.startsWith('^')) {
    const [major, minor, patch] = version.slice(1).split('.').map(Number)
    // The caret pins the leftmost non-zero component
    let max: string
    if (major > 0) max = `${major + 1}.0.0`        // ^x.y.z allows up to (x+1).0.0
    else if (minor > 0) max = `0.${minor + 1}.0`   // ^0.x.y allows up to 0.(x+1).0
    else max = `0.0.${patch + 1}`                  // ^0.0.z allows only 0.0.z itself
    return { type: 'caret', min: version.slice(1), max }
  }
  
  if (version.startsWith('~')) {
    const [major, minor] = version.slice(1).split('.').map(Number)
    return {
      type: 'tilde',
      min: version.slice(1),
      max: `${major}.${minor + 1}.0`
    }
  }
  
  return { type: 'exact', version }
}

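Note the special case for ^0.x.y: the semver spec treats 0.x versions as unstable (anything may change between releases), so npm's caret only pins the leftmost non-zero component. ^0.2.3 allows >=0.2.3 <0.3.0, and ^0.0.3 allows >=0.0.3 <0.0.4, which is 0.0.3 alone.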
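To make the bounds concrete, here is a minimal sketch that turns a caret or tilde range into a half-open interval and tests a version against it. `rangeBounds` and `inRange` are hypothetical helper names, and the sketch assumes plain `x.y.z` versions with no pre-release tags; in real code the `semver` package covers far more cases.

```typescript
type Triple = [number, number, number]

const toTriple = (v: string): Triple => {
  const [major = 0, minor = 0, patch = 0] = v.split('.').map(Number)
  return [major, minor, patch]
}

// Compare two triples field by field; a negative result means a < b
const cmp = (a: Triple, b: Triple): number =>
  a[0] - b[0] || a[1] - b[1] || a[2] - b[2]

// Half-open interval [min, max) for "^x.y.z" or "~x.y.z" (assumed input shapes)
function rangeBounds(range: string): { min: Triple; max: Triple } {
  const min = toTriple(range.slice(1))
  const [major, minor, patch] = min
  if (range[0] === '~') return { min, max: [major, minor + 1, 0] }
  // Caret: the leftmost non-zero field is the one that must not change
  if (major > 0) return { min, max: [major + 1, 0, 0] }
  if (minor > 0) return { min, max: [0, minor + 1, 0] }
  return { min, max: [0, 0, patch + 1] }
}

function inRange(version: string, range: string): boolean {
  const { min, max } = rangeBounds(range)
  const v = toTriple(version)
  return cmp(v, min) >= 0 && cmp(v, max) < 0
}

console.log(inRange('4.18.0', '^4.17.21')) // true
console.log(inRange('5.0.0', '^4.17.21'))  // false
console.log(inRange('0.3.0', '^0.2.3'))    // false
console.log(inRange('0.27.9', '~0.27.2'))  // true
```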

Version Parsing: Handling Various Formats#

Real-world version strings are diverse:

{
  "react": "18.2.0",
  "webpack": "5.x",
  "eslint": ">=8.0.0 <9.0.0",
  "typescript": "4.9.x || 5.0.x",
  "node": ">=16",
  "some-git-pkg": "github:user/repo#v1.0.0",
  "private-pkg": "git+ssh://git@company.com/pkg.git#develop"
}

Parsing logic needs to handle these cases:

interface VersionInfo {
  type: 'npm' | 'git' | 'file' | 'link' | 'range'
  value: string
  raw: string
}

function parseDependencyVersion(version: string): VersionInfo {
  // Git URL
  if (version.startsWith('git+') || version.includes('github:')) {
    const match = version.match(/(?:github:|git\+.*?\/\/).*?\/([^#]+)(?:#(.+))?$/)
    return {
      type: 'git',
      value: match?.[2] || 'latest',
      raw: version
    }
  }
  
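  // file: or link: (kept as separate types so callers can tell a copy from a symlink)
  if (version.startsWith('file:')) {
    return { type: 'file', value: version, raw: version }
  }
  if (version.startsWith('link:')) {
    return { type: 'link', value: version, raw: version }
  }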
  
  // Complex range
  if (version.includes('||') || version.includes(' ')) {
    return { type: 'range', value: version, raw: version }
  }
  
  // npm package
  return { type: 'npm', value: version, raw: version }
}
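Formats like "5.x" pass through the parser above untouched, so a later step still has to turn them into bounds. A rough sketch of that step, assuming only x-ranges and partial versions need expanding; `expandXRange` is a hypothetical name, and `max` is an exclusive upper bound:

```typescript
// Expand "5.x", "4.9.x", "5" or "*" into explicit bounds.
// Returns null for anything that is not an x-range (full versions, comparators, URLs).
function expandXRange(range: string) {
  const m = /^(\d+|[xX*])(?:\.(\d+|[xX*]))?(?:\.(\d+|[xX*]))?$/.exec(range.trim())
  if (!m) return null

  // A missing component counts as a wildcard: "5" means "5.x.x"
  const wild = (p?: string) => p === undefined || /^[xX*]$/.test(p)
  const [, major, minor, patch] = m

  if (wild(major)) return { min: '0.0.0', max: null } // "*" matches every version
  if (wild(minor)) return { min: `${major}.0.0`, max: `${Number(major) + 1}.0.0` }
  if (wild(patch)) return { min: `${major}.${minor}.0`, max: `${major}.${Number(minor) + 1}.0` }
  return null // fully specified version, nothing to expand
}

console.log(expandXRange('5.x'))    // { min: '5.0.0', max: '6.0.0' }
console.log(expandXRange('4.9.x'))  // { min: '4.9.0', max: '4.10.0' }
console.log(expandXRange('18.2.0')) // null
```

Compound ranges such as "4.9.x || 5.0.x" could be handled by splitting on "||" and expanding each part.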

Dependency Tree Traversal: BFS to Avoid Stack Overflow#

Parsing package.json is just the first step. The real dependency tree lives in node_modules. An iterative BFS (breadth-first search) with an explicit queue walks it without the stack overflow risk of deep recursion:

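import { existsSync } from 'node:fs'
import { readFile } from 'node:fs/promises'
import { dirname, join } from 'node:path'

interface DepNode {
  name: string
  version: string
  depth: number
  dependencies: DepNode[]
}

async function readPackageJson(path: string) {
  return JSON.parse(await readFile(path, 'utf8'))
}

// npm hoists most packages, so check each ancestor's node_modules the way Node's resolver does
function findPackageJson(name: string, fromDir: string): string | null {
  for (let dir = fromDir; ; dir = dirname(dir)) {
    const candidate = join(dir, 'node_modules', name, 'package.json')
    if (existsSync(candidate)) return candidate
    if (dirname(dir) === dir) return null  // reached the filesystem root
  }
}

async function buildDependencyTree(pkgPath: string): Promise<DepNode> {
  const rootPkg = await readPackageJson(pkgPath)
  const root: DepNode = {
    name: rootPkg.name,
    version: rootPkg.version,
    depth: 0,
    dependencies: []
  }
  
  // Each queue entry carries its parsed manifest: DepNode.dependencies holds child
  // nodes, while the names to look up come from package.json
  const queue: { node: DepNode; path: string; pkg: any }[] = [
    { node: root, path: pkgPath, pkg: rootPkg }
  ]
  const visited = new Set<string>()
  
  while (queue.length > 0) {
    const { node, path, pkg } = queue.shift()!
    
    for (const name of Object.keys(pkg.dependencies || {})) {
      const depPath = findPackageJson(name, dirname(path))
      if (!depPath) continue
      
      const depPkg = await readPackageJson(depPath)
      const key = `${name}@${depPkg.version}`
      
      // Expand each name@version once: this deduplicates shared packages and stops cycles
      if (visited.has(key)) continue
      visited.add(key)
      
      const depNode: DepNode = {
        name,
        version: depPkg.version,
        depth: node.depth + 1,
        dependencies: []
      }
      
      node.dependencies.push(depNode)
      queue.push({ node: depNode, path: depPath, pkg: depPkg })
    }
  }
  
  return root
}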
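Once the tree exists, the simplest visualization is indented text. The sketch below renders any node with the `name`, `version` and `dependencies` fields used above; `TreeNode` and `renderTree` are hypothetical names, and the box-drawing layout is just one possible choice. It uses an explicit stack for the same reason the traversal uses a queue.

```typescript
interface TreeNode {
  name: string
  version: string
  dependencies: TreeNode[]
}

function renderTree(root: TreeNode): string {
  const lines = [`${root.name}@${root.version}`]
  // Each stack entry carries the prefix that goes in front of its line
  const stack: { node: TreeNode; prefix: string; last: boolean }[] = []

  const pushChildren = (node: TreeNode, prefix: string) => {
    // Push in reverse so children pop in their original order
    for (let i = node.dependencies.length - 1; i >= 0; i--) {
      const last = i === node.dependencies.length - 1
      stack.push({ node: node.dependencies[i], prefix, last })
    }
  }

  pushChildren(root, '')
  while (stack.length > 0) {
    const { node, prefix, last } = stack.pop()!
    lines.push(`${prefix}${last ? '└── ' : '├── '}${node.name}@${node.version}`)
    // Below a last child the column stays empty; otherwise the vertical bar continues
    pushChildren(node, prefix + (last ? '    ' : '│   '))
  }
  return lines.join('\n')
}

const demo: TreeNode = {
  name: 'app', version: '1.0.0', dependencies: [
    { name: 'a', version: '1.0.0', dependencies: [
      { name: 'c', version: '2.0.0', dependencies: [] }
    ] },
    { name: 'b', version: '3.0.0', dependencies: [] }
  ]
}

console.log(renderTree(demo))
// app@1.0.0
// ├── a@1.0.0
// │   └── c@2.0.0
// └── b@3.0.0
```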

Circular Dependency Detection#

A circular dependency sends a naive traversal into an infinite loop, so cycles must be detected:

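// A -> B -> C -> A  circular
function detectCircular(deps: Record<string, string[]>, start: string): string[] | null {
  const done = new Set<string>()  // fully explored and known to be cycle-free
  const path: string[] = []       // nodes on the current DFS path
  
  function dfs(node: string): string[] | null {
    // Only a node that is still on the current path closes a cycle
    const cycleStart = path.indexOf(node)
    if (cycleStart !== -1) {
      return [...path.slice(cycleStart), node]
    }
    
    // Reached again through another branch (A -> B -> D, A -> C -> D): shared, not circular
    if (done.has(node)) return null
    
    path.push(node)
    
    for (const dep of deps[node] || []) {
      const cycle = dfs(dep)
      if (cycle) return cycle
    }
    
    path.pop()
    done.add(node)
    return null
  }
  
  return dfs(start)
}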
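The recursive DFS above can itself hit the call-stack limit on a very deep graph. A possible iterative variant, sketched under the same input shape (an adjacency map of package name to dependency names); `findCycleIterative` is a hypothetical name:

```typescript
function findCycleIterative(deps: Record<string, string[]>, start: string): string[] | null {
  const done = new Set<string>()    // fully explored nodes
  const onPath = new Set<string>()  // fast membership test for the current path
  const path: string[] = [start]    // current DFS path, in order
  // Each frame remembers which child of its node to visit next
  const stack: { node: string; next: number }[] = [{ node: start, next: 0 }]
  onPath.add(start)

  while (stack.length > 0) {
    const frame = stack[stack.length - 1]
    const children = deps[frame.node] || []

    if (frame.next >= children.length) {
      // All children explored: leave the node and mark it finished
      stack.pop()
      path.pop()
      onPath.delete(frame.node)
      done.add(frame.node)
      continue
    }

    const child = children[frame.next++]
    if (onPath.has(child)) return [...path.slice(path.indexOf(child)), child]
    if (done.has(child)) continue  // shared dependency, already cleared

    stack.push({ node: child, next: 0 })
    path.push(child)
    onPath.add(child)
  }
  return null
}

console.log(findCycleIterative({ A: ['B'], B: ['C'], C: ['A'] }, 'A'))
// [ 'A', 'B', 'C', 'A' ]
console.log(findCycleIterative({ A: ['B', 'C'], B: ['D'], C: ['D'], D: [] }, 'A'))
// null
```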

Latest Version Query: npm Registry API#

Query latest versions using the npm registry API:

async function getLatestVersion(packageName: string): Promise<string> {
  const response = await fetch(`https://registry.npmjs.org/${packageName}`)
  
  if (!response.ok) {
    throw new Error(`Package ${packageName} not found`)
  }
  
  const data = await response.json()
  return data['dist-tags'].latest
}

// Batch query to avoid too many concurrent requests
async function batchGetLatestVersions(packageNames: string[]): Promise<Record<string, string>> {
  const results: Record<string, string> = {}
  const batchSize = 10
  
  for (let i = 0; i < packageNames.length; i += batchSize) {
    const batch = packageNames.slice(i, i + batchSize)
    const versions = await Promise.all(
      batch.map(name => getLatestVersion(name).catch(() => null))
    )
    
    batch.forEach((name, idx) => {
      // Failed lookups resolve to null (see the catch above) and are skipped
      const version = versions[idx]
      if (version) results[name] = version
    })
    
    // Avoid npm registry rate limiting
    if (i + batchSize < packageNames.length) {
      await new Promise(resolve => setTimeout(resolve, 100))
    }
  }
  
  return results
}
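Fixed batches wait for the slowest request in each group before the next group starts. One alternative is a small worker pool that keeps a constant number of requests in flight. This is a sketch, not a tuned implementation; `mapWithConcurrency` is a hypothetical helper, and the fake lookup in the demo stands in for a real registry call.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results come back in input order.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length)
  let next = 0

  async function lane(): Promise<void> {
    while (next < items.length) {
      // Claiming an index is safe: nothing else runs between the check and the increment
      const index = next++
      results[index] = await worker(items[index])
    }
  }

  const lanes = Array.from({ length: Math.min(limit, items.length) }, () => lane())
  await Promise.all(lanes)
  return results
}

// Demo with a fake lookup that resolves after a short delay
const fakeLookup = (name: string): Promise<string> =>
  new Promise(resolve => setTimeout(() => resolve(`${name}@1.0.0`), 5))

mapWithConcurrency(['lodash', 'axios', 'react'], 2, fakeLookup).then(found => {
  console.log(found) // [ 'lodash@1.0.0', 'axios@1.0.0', 'react@1.0.0' ]
})
```

With the functions above, the call might look like `mapWithConcurrency(names, 10, name => getLatestVersion(name).catch(() => null))`.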

Version Comparison Algorithm#

Don’t use string comparison for version numbers; as strings, '10.0.0' sorts before '9.0.0':

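function compareVersions(a: string, b: string): number {
  // Separate "1.2.3" from the pre-release tag (and drop "+build" metadata) first;
  // otherwise "1.0.1-alpha" would parse as [1, 0, NaN]
  const split = (v: string) => {
    const [core, ...pre] = v.split('+')[0].split('-')
    return { parts: core.split('.').map(Number), pre: pre.join('-') }
  }
  const { parts: partsA, pre: preA } = split(a)
  const { parts: partsB, pre: preB } = split(b)
  
  for (let i = 0; i < Math.max(partsA.length, partsB.length); i++) {
    const numA = partsA[i] || 0
    const numB = partsB[i] || 0
    
    if (numA > numB) return 1
    if (numA < numB) return -1
  }
  
  // Handle pre-release versions: 1.0.0-alpha < 1.0.0
  if (preA && !preB) return -1
  if (!preA && preB) return 1
  
  // Both pre-release: compare dot-separated identifiers left to right
  const idsA = preA.split('.')
  const idsB = preB.split('.')
  for (let i = 0; i < Math.max(idsA.length, idsB.length); i++) {
    if (idsA[i] === undefined) return -1            // fewer identifiers sorts first
    if (idsB[i] === undefined) return 1
    if (idsA[i] === idsB[i]) continue
    const isNumA = /^\d+$/.test(idsA[i])
    const isNumB = /^\d+$/.test(idsB[i])
    if (isNumA && isNumB) return Number(idsA[i]) > Number(idsB[i]) ? 1 : -1
    if (isNumA !== isNumB) return isNumA ? -1 : 1   // numeric sorts before alphanumeric
    return idsA[i] < idsB[i] ? -1 : 1               // plain ASCII order
  }
  
  return 0
}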
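A short demonstration of why the numeric comparison matters. The version list is made up for illustration, and the inline comparator is a stripped-down stand-in that assumes plain `x.y.z` strings:

```typescript
const versions = ['2.10.0', '2.9.0', '10.0.0', '9.0.0']

// Lexicographic: compares character by character, so '1' < '9' puts 10.0.0 first
const byString = [...versions].sort()

// Numeric: compare major, then minor, then patch as numbers
const byNumber = [...versions].sort((a, b) => {
  const pa = a.split('.').map(Number)
  const pb = b.split('.').map(Number)
  return pa[0] - pb[0] || pa[1] - pb[1] || pa[2] - pb[2]
})

console.log(byString) // [ '10.0.0', '2.10.0', '2.9.0', '9.0.0' ]
console.log(byNumber) // [ '2.9.0', '2.10.0', '9.0.0', '10.0.0' ]
```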

Outdated Dependency Detection#

Check whether the latest version still satisfies the current version range:

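import semver from 'semver'

type UpdateType = 'major' | 'minor' | 'patch' | 'none'

// semver.diff can also return 'premajor', 'preminor', 'prepatch' or 'prerelease';
// fold those into the three buckets this report uses
function toUpdateType(from: string, to: string): UpdateType {
  const diff = semver.diff(from, to)
  if (!diff) return 'none'
  if (diff.includes('major')) return 'major'
  if (diff.includes('minor')) return 'minor'
  return 'patch'
}

function checkOutdated(current: string, latest: string): {
  outdated: boolean
  type: UpdateType
} {
  // Exact version: semver.valid returns null for ranges such as "^1.2.3" or "5.x"
  if (semver.valid(current)) {
    const outdated = semver.lt(current, latest)
    return { outdated, type: outdated ? toUpdateType(current, latest) : 'none' }
  }
  
  // Git URLs, file: paths and dist-tags are not semver ranges, so there is nothing to compare
  if (!semver.validRange(current)) return { outdated: false, type: 'none' }
  
  // Range: outdated only if the latest release falls outside it
  if (!semver.satisfies(latest, current)) {
    const min = semver.minVersion(current)
    return { outdated: true, type: min ? toUpdateType(min.version, latest) : 'major' }
  }
  
  return { outdated: false, type: 'none' }
}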

Security Vulnerability Detection#

npm audit queries the npm advisory database:

interface Advisory {
  id: number
  module_name: string
  vulnerable_versions: string
  patched_versions: string
  severity: 'low' | 'moderate' | 'high' | 'critical'
  title: string
  url: string
}

async function auditDependencies(deps: Record<string, string>): Promise<Advisory[]> {
  const response = await fetch('https://registry.npmjs.org/-/npm/v1/security/audits', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      dependencies: deps,
      requires: deps
    })
  })
  
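  // Fail loudly on HTTP errors instead of trying to parse an error page as JSON
  if (!response.ok) {
    throw new Error(`Audit request failed: HTTP ${response.status}`)
  }
  
  const data = await response.json()
  return Object.values(data.advisories || {}) as Advisory[]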
}
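A raw advisory list is hard to act on, so a report usually groups it by severity and decides whether the build should fail. A minimal sketch under that assumption; `summarizeAdvisories` and the `failAt` threshold are hypothetical, loosely modeled on the idea behind `npm audit --audit-level`:

```typescript
type Severity = 'low' | 'moderate' | 'high' | 'critical'

// Ordered from least to most severe
const SEVERITY_ORDER: Severity[] = ['low', 'moderate', 'high', 'critical']

function summarizeAdvisories(advisories: { severity: Severity }[], failAt: Severity) {
  const counts: Record<Severity, number> = { low: 0, moderate: 0, high: 0, critical: 0 }
  for (const advisory of advisories) counts[advisory.severity]++

  // Fail if any advisory is at or above the threshold
  const threshold = SEVERITY_ORDER.indexOf(failAt)
  const shouldFail = SEVERITY_ORDER.some((level, i) => i >= threshold && counts[level] > 0)
  return { counts, shouldFail }
}

const report = summarizeAdvisories(
  [{ severity: 'low' }, { severity: 'high' }, { severity: 'low' }],
  'high'
)
console.log(report.counts)     // { low: 2, moderate: 0, high: 1, critical: 0 }
console.log(report.shouldFail) // true
```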

Performance Optimization: Parallel + Cache#

When analyzing large projects, performance is critical:

1. Parallel Queries#

async function analyzeDependencies(pkgJson: string) {
  const pkg = JSON.parse(pkgJson)
  const allDeps = { ...pkg.dependencies, ...pkg.devDependencies }
  
  // Run both lookups concurrently: neither depends on the other's result
  const [latestVersions, advisories] = await Promise.all([
    batchGetLatestVersions(Object.keys(allDeps)),
    auditDependencies(allDeps)
  ])
  
  return { latestVersions, advisories }
}

2. Result Caching#

const versionCache = new Map<string, { version: string; timestamp: number }>()
const CACHE_TTL = 1000 * 60 * 60  // 1 hour

async function getCachedLatestVersion(name: string): Promise<string> {
  const cached = versionCache.get(name)
  
  if (cached && Date.now() - cached.timestamp < CACHE_TTL) {
    return cached.version
  }
  
  const version = await getLatestVersion(name)
  versionCache.set(name, { version, timestamp: Date.now() })
  
  return version
}
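One gap in a value cache like the one above: two calls for the same package that start before the first response arrives both hit the network. Caching the promise instead of the resolved value closes that gap. A sketch, with `createVersionCache` as a hypothetical factory and an injectable `fetchVersion` and clock so it can be exercised without the registry:

```typescript
function createVersionCache(
  fetchVersion: (name: string) => Promise<string>,
  ttlMs: number,
  now: () => number = Date.now
) {
  const cache = new Map<string, { promise: Promise<string>; timestamp: number }>()

  return function get(name: string): Promise<string> {
    const hit = cache.get(name)
    if (hit && now() - hit.timestamp < ttlMs) return hit.promise

    const promise: Promise<string> = fetchVersion(name).catch(err => {
      // Do not keep failures around; the next call should retry
      if (cache.get(name)?.promise === promise) cache.delete(name)
      throw err
    })
    cache.set(name, { promise, timestamp: now() })
    return promise
  }
}

// Demo with a counting fake instead of a network call
let lookups = 0
const getVersion = createVersionCache(async () => {
  lookups++
  return '4.17.21'
}, 60_000)

Promise.all([getVersion('lodash'), getVersion('lodash')]).then(() => {
  console.log(lookups) // 1
})
```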

Real-World Pitfalls#

1. peerDependencies Conflicts#

{
  "react-select": {
    "peerDependencies": {
      "react": "^16.8.0 || ^17.0.0"
    }
  },
  "your-project": {
    "dependencies": {
      "react": "^18.0.0",  // Conflict!
      "react-select": "^5.0.0"
    }
  }
}

Detecting peer dependency conflicts:

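import { readFileSync } from 'node:fs'
import { join } from 'node:path'

function checkPeerDeps(
  pkg: PackageJson,
  installed: Record<string, string>
): string[] {
  const conflicts: string[] = []
  
  for (const name of Object.keys(pkg.dependencies || {})) {
    // Read the manifest from disk: require() treats a path without a leading "./" as a package name
    const pkgPath = join(process.cwd(), 'node_modules', name, 'package.json')
    const peerDeps = JSON.parse(readFileSync(pkgPath, 'utf8')).peerDependencies || {}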
    
    for (const [peerName, peerRange] of Object.entries(peerDeps)) {
      const installedVersion = installed[peerName]
      
      if (!installedVersion) {
        conflicts.push(`${name} requires ${peerName}@${peerRange}, but not installed`)
        continue
      }
      
      if (!semver.satisfies(installedVersion, peerRange as string)) {
        conflicts.push(`${name} requires ${peerName}@${peerRange}, found ${installedVersion}`)
      }
    }
  }
  
  return conflicts
}

2. Phantom Dependencies#

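Code uses import 'some-pkg', but the package is not declared in package.json. This works only because npm hoists transitive dependencies into the top-level node_modules; it breaks under stricter layouts such as pnpm’s, or when the package that pulled some-pkg in drops or upgrades it.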
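A rough way to spot phantom dependencies is to collect the package names a source file imports and subtract what package.json declares. The sketch below is an approximation: it uses a regex rather than a real parser, assumes Node built-ins are imported with the `node:` prefix, and `findPhantomDeps` and `packageNameOf` are hypothetical names.

```typescript
// "lodash/get" -> "lodash", "@scope/pkg/x" -> "@scope/pkg"; relative and node: imports -> null
function packageNameOf(specifier: string): string | null {
  if (specifier.startsWith('.') || specifier.startsWith('/') || specifier.startsWith('node:')) {
    return null
  }
  const parts = specifier.split('/')
  return specifier.startsWith('@') ? parts.slice(0, 2).join('/') : parts[0]
}

function findPhantomDeps(source: string, declared: string[]): string[] {
  const known = new Set(declared)
  const phantom = new Set<string>()
  // Matches `from 'x'`, `import 'x'`, `import('x')` and `require('x')`
  const pattern = /(?:from\s+|import\s*\(?\s*|require\s*\(\s*)['"]([^'"]+)['"]/g

  let match: RegExpExecArray | null
  while ((match = pattern.exec(source)) !== null) {
    const name = packageNameOf(match[1])
    if (name && !known.has(name)) phantom.add(name)
  }
  return Array.from(phantom).sort()
}

const sampleSource = `
import React from 'react'
import get from 'lodash/get'
import { helper } from './local'
import 'node:fs'
const babel = require('@babel/core/lib/index')
`

console.log(findPhantomDeps(sampleSource, ['react']))
// [ '@babel/core', 'lodash' ]
```

In a real tool the declared list would combine dependencies, devDependencies and peerDependencies.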

3. Lock File Inconsistency#

When the versions in package-lock.json no longer match the ranges in package.json, installs become unpredictable:

function checkLockConsistency(pkg: PackageJson, lock: PackageLock): string[] {
  const issues: string[] = []
  
  for (const [name, version] of Object.entries(pkg.dependencies || {})) {
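    // lockfileVersion 2 and 3 key entries by install path under "packages"; version 1 uses "dependencies"
    const lockVersion =
      lock.packages?.[`node_modules/${name}`]?.version ?? lock.dependencies?.[name]?.version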
    
    if (!lockVersion) {
      issues.push(`${name} in package.json but not in lock file`)
      continue
    }
    
    if (!semver.satisfies(lockVersion, version)) {
      issues.push(`${name}: package.json wants ${version}, lock has ${lockVersion}`)
    }
  }
  
  return issues
}

The Result#

Based on these ideas, I built: NPM Dependency Analyzer

Features:

  • Upload package.json or paste content
  • Parse dependencies and devDependencies
  • Show current vs latest version comparison
  • Detect outdated dependencies and potential issues

The implementation isn’t complex, but getting the details right takes effort. Hope this helps.

