NPM Dependency Analysis: From package.json to Dependency Tree Visualization
I inherited a legacy project recently with 800MB of node_modules and 200+ dependencies in package.json. Every npm install took forever, and CI builds were painfully slow. I decided to build a tool to analyze the dependency situation, and here’s what I learned.
Semantic Versioning: The ^ and ~ Traps
Version numbers in package.json look simple, but there are pitfalls:
{
  "dependencies": {
    "lodash": "^4.17.21", // Allows >=4.17.21 <5.0.0, so never 5.0.0
    "axios": "~0.27.2", // Allows >=0.27.2 <0.28.0, so never 0.28.0
    "react": "18.2.0" // Exact version
  }
}
- ^ (caret): Allows minor and patch updates, ^4.17.21 allows >=4.17.21 <5.0.0
- ~ (tilde): Only allows patch updates, ~0.27.2 allows >=0.27.2 <0.28.0
- No prefix: Exact lock, suitable for critical dependencies
Parsing version ranges:
function parseVersionRange(version: string) {
  if (version.startsWith('^')) {
    const [major, minor, patch] = version.slice(1).split('.').map(Number)
    // Caret keeps the left-most non-zero component fixed; max is exclusive
    let max = `${major + 1}.0.0` // ^x.y.z allows up to (x+1).0.0
    if (major === 0) {
      max = minor === 0
        ? `0.0.${patch + 1}` // ^0.0.z allows only z itself
        : `0.${minor + 1}.0` // ^0.y.z allows up to 0.(y+1).0
    }
    return { type: 'caret', min: version.slice(1), max }
  }
  if (version.startsWith('~')) {
    const [major, minor] = version.slice(1).split('.').map(Number)
    return {
      type: 'tilde',
      min: version.slice(1),
      max: `${major}.${minor + 1}.0` // exclusive upper bound
    }
  }
  return { type: 'exact', version }
}
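Note the special case for ^0.x.y: the semver spec treats 0.x versions as unstable initial development, so a caret range never changes the left-most non-zero component. That means ^0.2.3 only allows >=0.2.3 <0.3.0, and ^0.0.3 only allows 0.0.3 itself (>=0.0.3 <0.0.4).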
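To sanity check these bounds, here is a minimal sketch that tests whether a candidate version falls inside a caret or tilde range. The helper names (toTriple, upperBound, allows) are my own for illustration, and the sketch assumes plain x.y.z versions with no pre-release tags.

```typescript
type Triple = [number, number, number]

// Parse "x.y.z" into numbers; pre-release tags are out of scope for this sketch
function toTriple(version: string): Triple {
  const [major = 0, minor = 0, patch = 0] = version.split('.').map(Number)
  return [major, minor, patch]
}

// Negative if a < b, zero if equal, positive if a > b
function compareTriples(a: Triple, b: Triple): number {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i]
  }
  return 0
}

// Exclusive upper bound for a caret (^) or tilde (~) range
function upperBound(range: string): Triple {
  const [major, minor, patch] = toTriple(range.slice(1))
  if (range.startsWith('~')) return [major, minor + 1, 0]
  if (major > 0) return [major + 1, 0, 0]
  if (minor > 0) return [0, minor + 1, 0]
  return [0, 0, patch + 1]
}

// True when candidate falls inside [min, upperBound)
function allows(range: string, candidate: string): boolean {
  const min = toTriple(range.slice(1))
  const v = toTriple(candidate)
  return compareTriples(v, min) >= 0 && compareTriples(v, upperBound(range)) < 0
}
```

With this, allows('^4.17.21', '4.18.0') is true while allows('^0.2.3', '0.3.0') and allows('^0.0.3', '0.0.4') are both false.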
Version Parsing: Handling Various Formats
Real-world version strings are diverse:
{
  "react": "18.2.0",
  "webpack": "5.x",
  "eslint": ">=8.0.0 <9.0.0",
  "typescript": "4.9.x || 5.0.x",
  "node": ">=16",
  "some-git-pkg": "github:user/repo#v1.0.0",
  "private-pkg": "git+ssh://git@company.com/pkg.git#develop"
}
Parsing logic needs to handle these cases:
interface VersionInfo {
  type: 'npm' | 'git' | 'file' | 'link' | 'range'
  value: string
  raw: string
}

function parseDependencyVersion(version: string): VersionInfo {
  // Git URL or GitHub shorthand; the part after # is the branch, tag, or commit
  if (version.startsWith('git+') || version.startsWith('github:')) {
    const match = version.match(/(?:github:|git\+.*?\/\/).*?\/([^#]+)(?:#(.+))?$/)
    return {
      type: 'git',
      value: match?.[2] || 'HEAD', // no #ref: npm uses the default branch
      raw: version
    }
  }
  // file: or link: (both point at a local path, but keep them distinguishable)
  if (version.startsWith('file:') || version.startsWith('link:')) {
    const type = version.startsWith('file:') ? 'file' : 'link'
    return { type, value: version, raw: version }
  }
  // Complex range: alternatives (||), several comparators, or a leading operator
  if (version.includes('||') || version.includes(' ') || /^[<>=]/.test(version)) {
    return { type: 'range', value: version, raw: version }
  }
  // Plain npm version or simple range (18.2.0, ^4.17.21, 5.x)
  return { type: 'npm', value: version, raw: version }
}
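The parser above leaves x-ranges such as 5.x or 4.9.x in the npm bucket without saying what they allow. As a rough sketch, they can be expanded into explicit bounds the same way as caret and tilde ranges. expandXRange and the Bounds shape are hypothetical names for illustration, and the sketch only covers the x, X, and * wildcards plus partial versions like 5.

```typescript
interface Bounds {
  min: string // inclusive lower bound
  max: string | null // exclusive upper bound, null when unbounded
}

// Expand an x-range such as "5.x", "4.9.x", or "*" into explicit bounds.
// Returns null for anything that is not an x-range (exact versions, ^, ~, >=, ...).
function expandXRange(range: string): Bounds | null {
  const parts = range.trim().split('.')
  if (parts.length > 3) return null
  const isWildcard = (p: string) => p === 'x' || p === 'X' || p === '*'

  // Collect the leading numeric components
  const fixed: number[] = []
  let i = 0
  while (i < parts.length && /^\d+$/.test(parts[i])) {
    fixed.push(Number(parts[i]))
    i++
  }
  // Everything after the numeric prefix must be a wildcard
  if (!parts.slice(i).every(isWildcard)) return null
  if (fixed.length === 3) return null // exact version, not an x-range

  const [major, minor] = fixed
  if (major === undefined) return { min: '0.0.0', max: null } // "*" or "x"
  if (minor === undefined) return { min: `${major}.0.0`, max: `${major + 1}.0.0` }
  return { min: `${major}.${minor}.0`, max: `${major}.${minor + 1}.0` }
}
```

For example, expandXRange('5.x') gives min 5.0.0 and max 6.0.0, while expandXRange('18.2.0') and expandXRange('>=16') both return null.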
Dependency Tree Traversal: BFS to Avoid Stack Overflow
Parsing package.json is just the first step. The real dependency tree lives in node_modules. Use BFS (Breadth-First Search) to avoid deep recursion stack overflow:
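import { existsSync } from 'node:fs'
import { readFile } from 'node:fs/promises'
import { dirname, join } from 'node:path'

interface DepNode {
  name: string
  version: string
  depth: number
  dependencies: DepNode[]
}

interface PackageJson {
  name: string
  version: string
  dependencies?: Record<string, string>
}

// Minimal reader for a package.json file
async function readPackageJson(path: string): Promise<PackageJson> {
  return JSON.parse(await readFile(path, 'utf8'))
}

// Look in node_modules next to the package, then in each parent directory,
// because npm hoists most transitive dependencies to the top level
function findPackageJson(fromDir: string, name: string): string | null {
  for (let dir = fromDir; ; dir = dirname(dir)) {
    const candidate = join(dir, 'node_modules', name, 'package.json')
    if (existsSync(candidate)) return candidate
    if (dirname(dir) === dir) return null // reached the filesystem root
  }
}

async function buildDependencyTree(pkgPath: string): Promise<DepNode> {
  const rootPkg = await readPackageJson(pkgPath)
  const root: DepNode = {
    name: rootPkg.name,
    version: rootPkg.version,
    depth: 0,
    dependencies: []
  }
  // Carry the parsed package.json in the queue: the declared dependency map
  // lives there, while node.dependencies holds the child nodes being built
  const queue: { node: DepNode; path: string; pkg: PackageJson }[] =
    [{ node: root, path: pkgPath, pkg: rootPkg }]
  const visited = new Set<string>()
  while (queue.length > 0) {
    const { node, path, pkg } = queue.shift()!
    for (const name of Object.keys(pkg.dependencies || {})) {
      const depPath = findPackageJson(dirname(path), name)
      if (!depPath) continue
      const depPkg = await readPackageJson(depPath)
      const key = `${name}@${depPkg.version}`
      // Skip anything already seen: this breaks cycles and lists each name@version once
      if (visited.has(key)) continue
      visited.add(key)
      const depNode: DepNode = {
        name,
        version: depPkg.version,
        depth: node.depth + 1,
        dependencies: []
      }
      node.dependencies.push(depNode)
      queue.push({ node: depNode, path: depPath, pkg: depPkg })
    }
  }
  return root
}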
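Once the tree exists, the simplest visualization is plain text. Below is a minimal sketch that turns a DepNode tree into indented lines with box-drawing connectors. renderTree is a hypothetical helper written for this post's DepNode shape, not the renderer of any particular tool, and it uses an explicit stack for the same reason the traversal uses a queue.

```typescript
interface DepNode {
  name: string
  version: string
  depth: number
  dependencies: DepNode[]
}

// Render a dependency tree as text lines, one package per line
function renderTree(root: DepNode): string[] {
  const lines: string[] = [`${root.name}@${root.version}`]
  // Explicit stack instead of recursion, so deep trees cannot overflow the call stack
  const stack: { node: DepNode; prefix: string; isLast: boolean }[] = []

  const pushChildren = (node: DepNode, prefix: string): void => {
    const children = node.dependencies
    // Push in reverse so the first child is popped (and printed) first
    for (let i = children.length - 1; i >= 0; i--) {
      stack.push({ node: children[i], prefix, isLast: i === children.length - 1 })
    }
  }

  pushChildren(root, '')
  while (stack.length > 0) {
    const { node, prefix, isLast } = stack.pop()!
    lines.push(`${prefix}${isLast ? '└── ' : '├── '}${node.name}@${node.version}`)
    // Children of a last item are indented with spaces, others continue the vertical bar
    pushChildren(node, prefix + (isLast ? '    ' : '│   '))
  }
  return lines
}
```

Joining the result with newlines gives output in the style of npm ls, which is usually enough to spot unexpectedly deep branches.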
Circular Dependency Detection
Circular dependencies send a naive traversal into an infinite loop, so they must be detected:
// A -> B -> C -> A circular
function detectCircular(deps: Record<string, string[]>, start: string): string[] | null {
  const visited = new Set<string>() // every node the search has entered
  const path: string[] = [] // nodes on the current DFS path
  function dfs(node: string): string[] | null {
    // A cycle exists only if the node is already on the current path
    const cycleStart = path.indexOf(node)
    if (cycleStart !== -1) {
      return [...path.slice(cycleStart), node]
    }
    // Reached again through another branch (a shared dependency): not a cycle
    if (visited.has(node)) return null
    visited.add(node)
    path.push(node)
    for (const dep of deps[node] || []) {
      const cycle = dfs(dep)
      if (cycle) return cycle
    }
    path.pop()
    return null
  }
  return dfs(start)
}
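detectCircular only explores what is reachable from one start node, and it recurses. As an alternative for checking a whole graph at once, here is a sketch based on Kahn's algorithm (a different technique from the DFS above): repeatedly resolve packages whose dependencies are all resolved, and whatever is left over sits on a cycle or depends on one. findUnresolvable is a hypothetical name.

```typescript
// Return the packages that sit on a cycle or depend on one (sorted by name).
// An empty result means the graph has no circular dependencies.
function findUnresolvable(deps: Record<string, string[]>): string[] {
  // Collect every node, including ones that only appear as a dependency
  const nodes = new Set<string>(Object.keys(deps))
  for (const list of Object.values(deps)) {
    for (const dep of list) nodes.add(dep)
  }

  // remaining: how many of a package's dependencies are still unresolved
  // dependents: reverse edges, from a dependency to the packages that need it
  const remaining = new Map<string, number>()
  const dependents = new Map<string, string[]>()
  for (const node of nodes) {
    remaining.set(node, 0)
    dependents.set(node, [])
  }
  for (const [pkg, list] of Object.entries(deps)) {
    for (const dep of new Set(list)) {
      remaining.set(pkg, remaining.get(pkg)! + 1)
      dependents.get(dep)!.push(pkg)
    }
  }

  // Start with packages that have no dependencies and work upwards
  const ready = [...nodes].filter(node => remaining.get(node) === 0)
  for (let head = 0; head < ready.length; head++) {
    for (const parent of dependents.get(ready[head])!) {
      const left = remaining.get(parent)! - 1
      remaining.set(parent, left)
      if (left === 0) ready.push(parent)
    }
  }
  return [...nodes].filter(node => remaining.get(node)! > 0).sort()
}
```

A diamond (A depends on B and C, both depend on D) comes back empty, while A -> B -> C -> A reports all three packages.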
Latest Version Query: npm Registry API
Query latest versions using the npm registry API:
async function getLatestVersion(packageName: string): Promise<string> {
  // Ask for the abbreviated metadata format: much smaller, still includes dist-tags
  const response = await fetch(`https://registry.npmjs.org/${packageName}`, {
    headers: { Accept: 'application/vnd.npm.install-v1+json' }
  })
  if (!response.ok) {
    // Not only 404: the registry can also answer with 429 or 5xx
    throw new Error(`Registry returned ${response.status} for ${packageName}`)
  }
  const data = await response.json()
  return data['dist-tags'].latest
}

// Batch query to avoid too many concurrent requests
async function batchGetLatestVersions(packageNames: string[]): Promise<Record<string, string>> {
  const results: Record<string, string> = {}
  const batchSize = 10
  for (let i = 0; i < packageNames.length; i += batchSize) {
    const batch = packageNames.slice(i, i + batchSize)
    // A failed lookup resolves to null so one bad package does not sink the batch
    const versions = await Promise.all(
      batch.map(name => getLatestVersion(name).catch(() => null))
    )
    batch.forEach((name, idx) => {
      const version = versions[idx]
      if (version) results[name] = version
    })
    // Avoid npm registry rate limiting
    if (i + batchSize < packageNames.length) {
      await new Promise(resolve => setTimeout(resolve, 100))
    }
  }
  return results
}
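The registry response is untyped JSON, and reading data['dist-tags'].latest directly throws if the field is missing. A more defensive extraction could look like the sketch below. extractDistTag and the RegistryMetadata shape are hypothetical, and the only thing assumed about the response is the dist-tags map already used above.

```typescript
// Shape assumed from the code above: the metadata carries a "dist-tags" map
interface RegistryMetadata {
  'dist-tags'?: Record<string, unknown>
}

// Return the version behind a dist-tag, or null if the response lacks it
function extractDistTag(data: unknown, tag: string = 'latest'): string | null {
  if (typeof data !== 'object' || data === null) return null
  const tags = (data as RegistryMetadata)['dist-tags']
  if (typeof tags !== 'object' || tags === null) return null
  const version = tags[tag]
  return typeof version === 'string' ? version : null
}
```

Returning null instead of throwing fits the batch function's existing convention, where a failed lookup is simply left out of the results.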
Version Comparison Algorithm
Don’t use string comparison for version numbers:
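function compareVersions(a: string, b: string): number {
  // Drop build metadata (+...), then separate "1.2.3" from the pre-release tag
  const parse = (v: string) => {
    const [core, ...rest] = v.split('+')[0].split('-')
    return { parts: core.split('.').map(Number), pre: rest.join('-') }
  }
  const va = parse(a)
  const vb = parse(b)
  for (let i = 0; i < Math.max(va.parts.length, vb.parts.length); i++) {
    const numA = va.parts[i] || 0
    const numB = vb.parts[i] || 0
    if (numA > numB) return 1
    if (numA < numB) return -1
  }
  // Handle pre-release versions: 1.0.0-alpha < 1.0.0
  if (va.pre && !vb.pre) return -1
  if (!va.pre && vb.pre) return 1
  // Compare dot-separated identifiers: numeric ones as numbers and below
  // alphanumeric ones; when all shared identifiers match, the shorter list is lower
  const idsA = va.pre ? va.pre.split('.') : []
  const idsB = vb.pre ? vb.pre.split('.') : []
  for (let i = 0; i < Math.max(idsA.length, idsB.length); i++) {
    if (idsA[i] === undefined) return -1
    if (idsB[i] === undefined) return 1
    if (idsA[i] === idsB[i]) continue
    const isNumA = /^\d+$/.test(idsA[i])
    const isNumB = /^\d+$/.test(idsB[i])
    if (isNumA && isNumB) return Number(idsA[i]) > Number(idsB[i]) ? 1 : -1
    if (isNumA !== isNumB) return isNumA ? -1 : 1
    return idsA[i] > idsB[i] ? 1 : -1
  }
  return 0
}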
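To see why string comparison is a trap, here is a small self-contained demonstration. It sorts the same list twice: once with the default (character by character) order and once with a numeric comparator. byNumericParts is a hypothetical, stripped-down comparator for plain x.y.z versions only.

```typescript
// Compare two plain x.y.z versions numerically, component by component
function byNumericParts(a: string, b: string): number {
  const partsA = a.split('.').map(Number)
  const partsB = b.split('.').map(Number)
  for (let i = 0; i < Math.max(partsA.length, partsB.length); i++) {
    const diff = (partsA[i] || 0) - (partsB[i] || 0)
    if (diff !== 0) return diff
  }
  return 0
}

const versions = ['1.10.0', '1.9.0', '1.2.0']

// Default sort compares character by character, so "1.10.0" lands before "1.2.0"
const lexicalOrder = [...versions].sort()
// → ['1.10.0', '1.2.0', '1.9.0']

// Numeric comparison restores the expected order
const numericOrder = [...versions].sort(byNumericParts)
// → ['1.2.0', '1.9.0', '1.10.0']
```

The same mistake hides in expressions like latest > current on raw strings, which is why every comparison in the analyzer should go through a version-aware function.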
Outdated Dependency Detection
Check whether the latest version still satisfies the current version range:
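import semver from 'semver'

type UpdateType = 'major' | 'minor' | 'patch' | 'none'

// semver.diff also returns values like 'premajor' or 'prerelease';
// fold them into the three basic levels
function toUpdateType(diff: string | null): UpdateType {
  if (!diff) return 'none'
  if (diff.endsWith('major')) return 'major'
  if (diff.endsWith('minor')) return 'minor'
  return 'patch'
}

function checkOutdated(current: string, latest: string): {
  outdated: boolean
  type: UpdateType
} {
  // Exact version
  if (semver.valid(current)) {
    const outdated = semver.lt(current, latest)
    return { outdated, type: outdated ? toUpdateType(semver.diff(current, latest)) : 'none' }
  }
  // Git URLs, file: paths, and dist-tags are not semver ranges
  if (!semver.validRange(current)) return { outdated: false, type: 'none' }
  const min = semver.minVersion(current)
  // Outdated when latest falls outside the range and above its lowest version
  if (min && !semver.satisfies(latest, current) && semver.gt(latest, min)) {
    return { outdated: true, type: toUpdateType(semver.diff(min.version, latest)) }
  }
  return { outdated: false, type: 'none' }
}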
Security Vulnerability Detection
npm audit queries the npm advisory database:
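import semver from 'semver'

interface Advisory {
  id: number
  module_name: string
  vulnerable_versions: string
  patched_versions: string
  severity: 'low' | 'moderate' | 'high' | 'critical'
  title: string
  url: string
}

async function auditDependencies(deps: Record<string, string>): Promise<Advisory[]> {
  // This legacy endpoint expects resolved versions under "dependencies" and the
  // declared ranges under "requires". Without a lock file, approximate each
  // resolved version with the lowest version its range allows.
  const resolved: Record<string, { version: string }> = {}
  for (const [name, range] of Object.entries(deps)) {
    const min = semver.validRange(range) ? semver.minVersion(range) : null
    if (min) resolved[name] = { version: min.version }
  }
  const response = await fetch('https://registry.npmjs.org/-/npm/v1/security/audits', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      name: 'audit-target', // placeholder project name
      version: '0.0.0',
      requires: deps,
      dependencies: resolved
    })
  })
  if (!response.ok) {
    throw new Error(`Audit request failed with status ${response.status}`)
  }
  const data = await response.json()
  return Object.values(data.advisories || {})
}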
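A raw list of advisories is hard to act on, so the next step is usually to summarize it. The sketch below counts advisories per severity and filters by a threshold. countBySeverity, atLeast, and the trimmed AdvisoryLike shape are hypothetical helpers built on the Advisory fields above.

```typescript
type Severity = 'low' | 'moderate' | 'high' | 'critical'

// Only the fields this sketch needs from the Advisory interface
interface AdvisoryLike {
  module_name: string
  severity: Severity
}

const SEVERITY_ORDER: Severity[] = ['low', 'moderate', 'high', 'critical']

// Count advisories per severity level
function countBySeverity(advisories: AdvisoryLike[]): Record<Severity, number> {
  const counts: Record<Severity, number> = { low: 0, moderate: 0, high: 0, critical: 0 }
  for (const advisory of advisories) counts[advisory.severity]++
  return counts
}

// Keep advisories at or above a threshold, most severe first
function atLeast(advisories: AdvisoryLike[], threshold: Severity): AdvisoryLike[] {
  const rank = (s: Severity) => SEVERITY_ORDER.indexOf(s)
  return advisories
    .filter(a => rank(a.severity) >= rank(threshold))
    .sort((a, b) => rank(b.severity) - rank(a.severity))
}
```

A CI job could, for instance, fail only when atLeast(advisories, 'high') is non-empty and just print the counts otherwise.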
Performance Optimization: Parallel + Cache
When analyzing large projects, performance is critical:
1. Parallel Queries
async function analyzeDependencies(pkgJson: string) {
  const pkg = JSON.parse(pkgJson)
  const allDeps = { ...pkg.dependencies, ...pkg.devDependencies }
  // Start both requests before awaiting, so the version lookups and
  // the vulnerability check actually run in parallel
  const [latestVersions, advisories] = await Promise.all([
    batchGetLatestVersions(Object.keys(allDeps)),
    auditDependencies(allDeps)
  ])
  return { latestVersions, advisories }
}
2. Result Caching
const versionCache = new Map<string, { version: string; timestamp: number }>()
const CACHE_TTL = 1000 * 60 * 60 // 1 hour

async function getCachedLatestVersion(name: string): Promise<string> {
  const cached = versionCache.get(name)
  if (cached && Date.now() - cached.timestamp < CACHE_TTL) {
    return cached.version
  }
  const version = await getLatestVersion(name)
  versionCache.set(name, { version, timestamp: Date.now() })
  return version
}
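The cache above never removes stale entries and is awkward to test because it reads the clock directly. One possible refinement, sketched here under my own naming (TtlCache is hypothetical), is a small generic class with an injectable clock that evicts entries when they expire.

```typescript
// A small TTL cache; `now` is injectable so expiry can be tested without waiting
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>()
  private ttlMs: number
  private now: () => number

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs
    this.now = now
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key)
    if (!entry) return undefined
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key) // evict stale entries so the map does not grow forever
      return undefined
    }
    return entry.value
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs })
  }

  get size(): number {
    return this.entries.size
  }
}
```

In the analyzer this would be created once, for example new TtlCache<string>(60 * 60 * 1000), and consulted before each registry lookup.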
Real-World Pitfalls
1. peerDependencies Conflicts
{
  "react-select": {
    "peerDependencies": {
      "react": "^16.8.0 || ^17.0.0"
    }
  },
  "your-project": {
    "dependencies": {
      "react": "^18.0.0", // Conflict!
      "react-select": "^5.0.0"
    }
  }
}
Detecting peer dependency conflicts:
import { readFileSync } from 'node:fs'
import { join } from 'node:path'
import semver from 'semver'

interface PackageJson {
  dependencies?: Record<string, string>
}

function checkPeerDeps(
  pkg: PackageJson,
  installed: Record<string, string>
): string[] {
  const conflicts: string[] = []
  for (const name of Object.keys(pkg.dependencies || {})) {
    // Read the file directly: require('node_modules/...') would be resolved
    // as a package name rather than as a path
    const pkgPath = join(process.cwd(), 'node_modules', name, 'package.json')
    const peerDeps = JSON.parse(readFileSync(pkgPath, 'utf8')).peerDependencies || {}
    for (const [peerName, peerRange] of Object.entries(peerDeps)) {
      const installedVersion = installed[peerName]
      if (!installedVersion) {
        conflicts.push(`${name} requires ${peerName}@${peerRange}, but not installed`)
        continue
      }
      if (!semver.satisfies(installedVersion, peerRange as string)) {
        conflicts.push(`${name} requires ${peerName}@${peerRange}, found ${installedVersion}`)
      }
    }
  }
  return conflicts
}
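2. Phantom Dependencies
Code uses import 'some-pkg', but the package is not declared in package.json. This works only because a flattened node_modules happens to hoist it as another package's transitive dependency. It breaks under stricter layouts such as pnpm's, or as soon as the package that pulled it in drops or upgrades that dependency.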
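Detecting phantom dependencies comes down to comparing what the code imports with what package.json declares. Here is a minimal sketch of that comparison. packageNameOf and findPhantomDeps are hypothetical helpers, collecting the import specifiers from source files is left out, and built-in modules imported without the node: prefix (fs, path) would need to be filtered by the caller.

```typescript
// Map an import specifier to the package it belongs to, or null for
// relative paths, absolute paths, and node: built-ins
function packageNameOf(specifier: string): string | null {
  if (specifier.startsWith('.') || specifier.startsWith('/') || specifier.startsWith('node:')) {
    return null
  }
  const parts = specifier.split('/')
  // Scoped packages keep two segments: @scope/name
  if (specifier.startsWith('@')) {
    return parts.length >= 2 ? `${parts[0]}/${parts[1]}` : null
  }
  return parts[0]
}

// Report packages that are imported but never declared (sorted, without duplicates)
function findPhantomDeps(specifiers: string[], declared: Record<string, string>): string[] {
  const phantom = new Set<string>()
  for (const specifier of specifiers) {
    const name = packageNameOf(specifier)
    if (name && !Object.prototype.hasOwnProperty.call(declared, name)) {
      phantom.add(name)
    }
  }
  return [...phantom].sort()
}
```

The declared map should merge dependencies and devDependencies (and peerDependencies for libraries), otherwise test-only imports show up as false positives.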
3. Lock File Inconsistency
When the versions in package-lock.json no longer match the ranges in package.json, installs become unpredictable:
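// Minimal shapes for this check; which lock section exists depends on lockfileVersion
interface PackageJson {
  dependencies?: Record<string, string>
}
interface PackageLock {
  packages?: Record<string, { version?: string }> // lockfileVersion 2 and 3
  dependencies?: Record<string, { version: string }> // lockfileVersion 1 and 2
}

function checkLockConsistency(pkg: PackageJson, lock: PackageLock): string[] {
  const issues: string[] = []
  for (const [name, version] of Object.entries(pkg.dependencies || {})) {
    const lockVersion =
      lock.packages?.[`node_modules/${name}`]?.version ?? lock.dependencies?.[name]?.version
    if (!lockVersion) {
      issues.push(`${name} in package.json but not in lock file`)
      continue
    }
    // Git URLs and file: paths are not semver ranges, so skip the range check
    if (!semver.validRange(version)) continue
    if (!semver.satisfies(lockVersion, version)) {
      issues.push(`${name}: package.json wants ${version}, lock has ${lockVersion}`)
    }
  }
  return issues
}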
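The lock file is also a cheap place to find out why node_modules is so large: the same package installed in several versions. The sketch below assumes a lock file with a packages map (lockfileVersion 2 or 3), whose keys are install paths such as node_modules/a/node_modules/b. findDuplicateVersions is a hypothetical helper.

```typescript
// Entries of the "packages" map in package-lock.json (lockfileVersion 2 and 3)
type LockPackages = Record<string, { version?: string }>

// Group installed versions by package name and keep names with more than one version
function findDuplicateVersions(packages: LockPackages): Record<string, string[]> {
  const versionsByName = new Map<string, Set<string>>()
  const marker = 'node_modules/'
  for (const [path, entry] of Object.entries(packages)) {
    // The package name is whatever follows the last "node_modules/" in the path;
    // the root entry ("") and workspace folders have no marker and are skipped
    const index = path.lastIndexOf(marker)
    if (index === -1 || !entry.version) continue
    const name = path.slice(index + marker.length)
    if (!versionsByName.has(name)) versionsByName.set(name, new Set<string>())
    versionsByName.get(name)!.add(entry.version)
  }
  const duplicates: Record<string, string[]> = {}
  for (const [name, versions] of versionsByName) {
    // Versions are sorted as plain strings here, which is enough for display
    if (versions.size > 1) duplicates[name] = [...versions].sort()
  }
  return duplicates
}
```

Packages that show up here are the first candidates for npm dedupe or for aligning the ranges of the dependencies that pull them in.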
The Result
Based on these ideas, I built: NPM Dependency Analyzer
Features:
- Upload package.json or paste content
- Parse dependencies and devDependencies
- Show current vs latest version comparison
- Detect outdated dependencies and potential issues
The implementation isn’t complex, but getting the details right takes effort. Hope this helps.