Deep Dive into Linux ps Command: From Process States to Performance Monitoring
For a Linux sysadmin, ps is one of the most frequently used tools. But most people only know ps aux without understanding how it works underneath. Let’s dive deep into this command.
The Core: Reading /proc Filesystem#
ps has no dedicated system call for listing processes. Instead, it gathers its data by reading the /proc virtual filesystem:
# ps essentially reads these files
ls /proc/1234/
# cmdline comm cwd exe fd maps stat status ...
Each process has a directory under /proc named by its PID, containing various files:
- cmdline: Command-line arguments (null-separated)
- comm: Process name
- stat: Process status (machine-readable)
- status: Process status (human-readable)
- fd/: Directory of open file descriptors
- exe: Symlink to the executable
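To make the file layout above concrete, here is a minimal sketch that builds one ps-like line from those files. The function name proc_summary and its optional second argument (an alternative root directory, handy for experimenting on a copy) are illustrative assumptions, not part of ps itself.

```shell
# proc_summary PID [ROOT]: print "PID STATE NAME ARGS" using only /proc files.
# ROOT defaults to /proc; it is a hypothetical knob for pointing at a copy.
proc_summary() {
    dir="${2:-/proc}/$1"
    [ -d "$dir" ] || return 1
    name=$(cat "$dir/comm")
    # status contains a line such as "State:\tS (sleeping)"
    state=$(awk '/^State:/ {print $2}' "$dir/status")
    # cmdline is NUL-separated; show it with spaces instead
    args=$(tr '\0' ' ' < "$dir/cmdline" | sed 's/ $//')
    printf '%s %s %s %s\n' "$1" "$state" "$name" "$args"
}

# Describe the current shell, if /proc is available
if [ -d "/proc/$$" ]; then proc_summary "$$"; fi
```

Kernel threads have an empty cmdline, so the last column is blank for them.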
Understanding Every Column in ps aux#
ps aux
# USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
# root 1 0.0 0.1 169424 11200 ? Ss May08 0:05 /sbin/init
Key Fields Explained#
VSZ (Virtual Memory Size)
- Process virtual memory size (KB)
- Includes heap, stack, shared libraries, and memory that is mapped but has never been touched
- Usually large, but doesn’t represent actual usage
RSS (Resident Set Size)
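- Physical memory currently resident in RAM (KB)
- Excludes swapped-out pages
- The closest figure to real memory consumption, although pages shared with other processes are counted in full for each of them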
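Both numbers can also be read straight from /proc. The sketch below pulls the VmSize and VmRSS lines out of a status file; the function name mem_from_status is made up for illustration, and the values should closely match what ps prints as VSZ and RSS.

```shell
# mem_from_status FILE: print "VSZ_KB RSS_KB" from a /proc/<pid>/status file.
# Kernel threads have no Vm* lines, so they report "0 0".
mem_from_status() {
    awk '/^VmSize:/ {vsz = $2} /^VmRSS:/ {rss = $2} END {print vsz + 0, rss + 0}' "$1"
}

# /proc/self resolves to the awk process that opens the file
if [ -r /proc/self/status ]; then mem_from_status /proc/self/status; fi
```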
STAT (Process State)
- R: Running (executing or ready)
- S: Sleeping (interruptible, waiting for event)
- D: Disk sleep (uninterruptible, usually waiting for I/O)
- Z: Zombie (terminated but not reaped by parent)
- T: Stopped (paused)
State modifiers:
- +: Foreground process group
- s: Session leader
- l: Multi-threaded process
- <: High-priority process
- N: Low-priority process
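A STAT value such as Ss or Sl+ is simply a state letter followed by modifiers. As a rough sketch, the hypothetical helper describe_stat below decodes only the letters covered in this article; real ps output can contain others.

```shell
# describe_stat STAT: turn a STAT string such as "Ss+" into words.
describe_stat() {
    stat=$1
    # The first character is the main state
    case "$stat" in
        R*) out="running" ;;
        S*) out="sleeping" ;;
        D*) out="uninterruptible sleep" ;;
        Z*) out="zombie" ;;
        T*) out="stopped" ;;
        *)  out="unknown" ;;
    esac
    # Every remaining character is a modifier
    rest=${stat#?}
    while [ -n "$rest" ]; do
        flag=${rest%"${rest#?}"}   # first character of $rest
        rest=${rest#?}
        case "$flag" in
            +)   out="$out, foreground process group" ;;
            s)   out="$out, session leader" ;;
            l)   out="$out, multi-threaded" ;;
            "<") out="$out, high priority" ;;
            N)   out="$out, low priority" ;;
        esac
    done
    printf '%s\n' "$out"
}

describe_stat "Ss"   # prints: sleeping, session leader
```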
The %CPU Calculation Pitfall#
ps calculates CPU usage with:
%CPU = (Total CPU time / Elapsed time since process start) * 100
Here’s the catch: ps aux shows the average CPU usage since process start, not real-time!
A process that runs 1 second of CPU then sleeps for 1 hour will show very low %CPU.
For real-time CPU usage, use top or pidstat.
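The same average can be reproduced by hand from /proc/<pid>/stat and /proc/uptime. This is a sketch of the formula above, not the exact code inside ps; the helper names avg_cpu and pid_avg_cpu are made up for illustration.

```shell
# avg_cpu UTIME STIME STARTTIME UPTIME HZ
#   UTIME, STIME, STARTTIME are in clock ticks (fields 14, 15, 22 of stat);
#   UPTIME is seconds since boot; HZ is clock ticks per second.
avg_cpu() {
    awk -v u="$1" -v s="$2" -v st="$3" -v up="$4" -v hz="$5" 'BEGIN {
        cpu = (u + s) / hz        # CPU seconds consumed so far
        elapsed = up - st / hz    # wall-clock seconds since the process started
        if (elapsed <= 0) { print "0.0"; exit }
        printf "%.1f\n", 100 * cpu / elapsed
    }'
}

# pid_avg_cpu PID: feed live values from /proc into avg_cpu
pid_avg_cpu() {
    read -r uptime idle < /proc/uptime
    line=$(cat "/proc/$1/stat")
    # Drop "pid (comm) " first, since comm may contain spaces
    set -- ${line##*) }
    avg_cpu "${12}" "${13}" "${20}" "$uptime" "$(getconf CLK_TCK)"
}

# 1 second of CPU over roughly 1 hour of lifetime, at 100 ticks per second
avg_cpu 100 0 0 3601 100   # prints: 0.0
```

On a live system, pid_avg_cpu "$$" reports the average for the current shell.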
Practical Cases: Finding High CPU Processes#
Case 1: Find Top CPU Consumers#
# --sort=-%cpu sorts by CPU descending
ps aux --sort=-%cpu | head -10
# Output
# USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
# mysql 10234 78.5 15.2 4523124 1258291 ? Sl May08 123:45 /usr/sbin/mysqld
Case 2: View Process Threads#
# -L shows threads, LWP is thread ID
ps -Lp 10234
# PID LWP TTY TIME CMD
# 10234 10234 ? 00:00:05 mysqld
# 10234 10235 ? 00:00:12 mysqld
# 10234 10236 ? 00:00:08 mysqld
LWP (Light Weight Process) is the thread ID. In Linux, threads are essentially lightweight processes.
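Each thread also appears as a directory under /proc/<pid>/task, named by its thread ID. A minimal sketch that lists them is shown below; list_threads and its optional root argument are illustrative names.

```shell
# list_threads PID [ROOT]: print "TID NAME" for every thread of PID.
# ROOT defaults to /proc.
list_threads() {
    root=${2:-/proc}
    for t in "$root/$1"/task/*; do
        [ -d "$t" ] || continue
        printf '%s %s\n' "${t##*/}" "$(cat "$t/comm")"
    done
}

# Threads of the current shell (usually just one)
if [ -d "/proc/$$/task" ]; then list_threads "$$"; fi
```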
Case 3: View Process Tree#
# f (BSD syntax) draws parent-child relationships as a tree, like --forest
ps auxf
# Or use pstree
pstree -p 10234
Case 4: Find Zombie Processes#
# Find processes with state Z
ps aux | awk '$8 ~ /Z/ {print}'
# Output
# user 12345 0.0 0.0 0 0 pts/0 Z+ 10:23 0:00 [python] <defunct>
Zombie processes show <defunct> in the COMMAND column.
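Since a zombie can only be reaped by its parent, it helps to print the parent PID alongside it. The sketch below uses custom ps columns plus a small awk filter; the zombies function name is an assumption for illustration.

```shell
# zombies: read "PID PPID STAT COMMAND" rows on stdin and keep only zombies.
zombies() {
    awk 'NR > 1 && $3 ~ /^Z/ {print "zombie", $1, "parent", $2, $4}'
}

ps -eo pid,ppid,stat,comm | zombies
```

Anchoring the pattern with ^Z matches the state letter itself instead of any Z appearing in the column.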
ps vs top vs htop#
| Tool | Feature | Use Case |
|---|---|---|
| ps | Snapshot, one-time query | Process info lookup, scripting |
| top | Real-time refresh, interactive | Live monitoring, dynamic observation |
| htop | Colorful UI, mouse support | User-friendly live monitoring |
Performance difference:
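- ps aux scans all processes once and exits, typically in ~10-50 ms depending on how many processes exist
- top refreshes continuously (every 3 seconds by default), so it keeps consuming CPU while it runs
- htop uses more resources than top (color rendering, more calculations)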
Advanced Techniques#
1. Custom Output Format#
# -o specifies columns to display
ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem
# PID PPID CMD %MEM %CPU
# 1234 1 /usr/sbin/mysqld 15.2 78.5
# 5678 1 /usr/bin/dockerd 8.3 12.4
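Custom columns are also what makes ps convenient in scripts. Giving a column an empty header (rss=, comm=) suppresses the header line, so the output can be piped directly into awk. The sketch below totals RSS per command name; rss_by_command is an illustrative helper.

```shell
# rss_by_command: read "RSS_KB COMMAND" rows on stdin, print totals per command,
# largest first.
rss_by_command() {
    awk '{ rss = $1; $1 = ""; sub(/^ +/, ""); total[$0] += rss }
         END { for (name in total) print total[name], name }' | sort -rn
}

# Separate -o options keep each empty header unambiguous
ps -e -o rss= -o comm= | rss_by_command | head -5
```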
2. View Process Open Files#
# All files opened by process 1234
ls -l /proc/1234/fd/
# Or use lsof
lsof -p 1234
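Every entry in fd/ is a symlink to the file, socket, or pipe it refers to, so a short loop over readlink gives a compact view. This is a sketch; fd_targets and its optional root argument are made-up names.

```shell
# fd_targets PID [ROOT]: print "FD -> TARGET" for each open descriptor of PID.
# ROOT defaults to /proc. Reading another user's fd/ directory requires root.
fd_targets() {
    root=${2:-/proc}
    for fd in "$root/$1"/fd/*; do
        [ -L "$fd" ] || continue
        printf '%s -> %s\n' "${fd##*/}" "$(readlink "$fd")"
    done
}

# Descriptors of the current shell
if [ -d "/proc/$$/fd" ]; then fd_targets "$$"; fi
```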
3. View Process Environment Variables#
# Environment variables at process start
cat /proc/1234/environ | tr '\0' '\n'
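When only one variable matters, the same NUL-to-newline trick can be combined with sed. A minimal sketch follows, with proc_env as a hypothetical helper name; it assumes the value contains no embedded newlines.

```shell
# proc_env PID NAME [ROOT]: print the value NAME had when PID started.
# ROOT defaults to /proc.
proc_env() {
    root=${3:-/proc}
    tr '\0' '\n' < "$root/$1/environ" | sed -n "s/^$2=//p"
}

# PATH as seen by the current shell at startup
if [ -r "/proc/$$/environ" ]; then proc_env "$$" PATH; fi
```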
4. View Process Memory Maps#
cat /proc/1234/maps
# Output format
# Address range Perms Offset Dev Inode Path
# 00400000-0040b000 r-xp 00000000 08:01 262210 /usr/bin/ps
# 0060a000-0060b000 r--p 0000a000 08:01 262210 /usr/bin/ps
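The address range is hexadecimal, so the size of each mapping is just end minus start. The sketch below converts the two sample lines above into sizes in KB; map_sizes is an illustrative name, and very high addresses may overflow the arithmetic in some shells.

```shell
# map_sizes: read /proc/<pid>/maps lines on stdin, print "SIZE KB PERMS PATH".
map_sizes() {
    while read -r range perms offset dev inode path; do
        start=${range%-*}
        end=${range#*-}
        kb=$(( (0x$end - 0x$start) / 1024 ))
        printf '%s KB %s %s\n' "$kb" "$perms" "${path:-[anonymous]}"
    done
}

map_sizes <<'EOF'
00400000-0040b000 r-xp 00000000 08:01 262210 /usr/bin/ps
0060a000-0060b000 r--p 0000a000 08:01 262210 /usr/bin/ps
EOF
# prints:
# 44 KB r-xp /usr/bin/ps
# 4 KB r--p /usr/bin/ps
```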
Common Pitfalls#
1. Zombie Processes Can’t Be Killed#
kill -9 12345 # Doesn't work on zombies
Zombie processes are already terminated. kill -9 has no effect. Correct approach:
- Find parent: ps -ef | grep 12345
- Restart or fix the parent to call wait() and reap the child
2. VSZ ≠ Actual Memory Usage#
ps aux | grep mysql
# VSZ 4523124 (4.3GB)
# RSS 1258291 (1.2GB) <- This is real usage
3. D-State Processes Are Uninterruptible#
ps aux | awk '$8 ~ /D/'
# Processes in D state are usually waiting for NFS or disk I/O
# kill -9 won't take effect until the I/O completes and the process leaves D state
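To get a hint of what a stuck process is waiting on, ps can print the wchan column (the kernel function the process is sleeping in). This is a sketch: d_state is a made-up helper name, and on many systems wchan shows only "-" unless you run it as root.

```shell
# d_state: keep the header row plus any row whose STAT starts with D.
d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

ps -eo pid,stat,wchan:32,comm | d_state
```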
Web Implementation: Browser-Based Process Monitor#
Browsers can’t access /proc directly, but can proxy through an API:
// Backend API: /api/processes
import fs from 'node:fs'

export async function GET() {
  const processes = []
  // Read all numeric directories under /proc (one per process)
  const pids = fs.readdirSync('/proc').filter(d => /^\d+$/.test(d))
  for (const pid of pids) {
    try {
      const stat = fs.readFileSync(`/proc/${pid}/stat`, 'utf-8')
      const comm = fs.readFileSync(`/proc/${pid}/comm`, 'utf-8').trim()
      // The comm field in stat is wrapped in parentheses and may contain
      // spaces, so split only the part after the last ')'
      const parts = stat.slice(stat.lastIndexOf(')') + 2).split(' ')
      const utime = parseInt(parts[11], 10) // User mode time (clock ticks)
      const stime = parseInt(parts[12], 10) // Kernel mode time (clock ticks)
      processes.push({
        pid: parseInt(pid, 10),
        name: comm,
        utime: utime,
        stime: stime,
        state: parts[0] // Process state
      })
    } catch (e) {
      // Process may have exited between readdir and read
    }
  }
  return Response.json(processes)
}
Frontend display:
import { useEffect, useState } from 'react'

function ProcessList() {
  const [processes, setProcesses] = useState([])
  useEffect(() => {
    // Poll the backend once per second
    const interval = setInterval(async () => {
      const res = await fetch('/api/processes')
      const data = await res.json()
      setProcesses(data)
    }, 1000)
    // Stop polling when the component unmounts
    return () => clearInterval(interval)
  }, [])
  return (
    <table>
      <thead>
        <tr>
          <th>PID</th>
          <th>Name</th>
          <th>State</th>
          <th>CPU Time (ticks)</th>
        </tr>
      </thead>
      <tbody>
        {processes.map(p => (
          <tr key={p.pid}>
            <td>{p.pid}</td>
            <td>{p.name}</td>
            <td>{p.state}</td>
            <td>{p.utime + p.stime}</td>
          </tr>
        ))}
      </tbody>
    </table>
  )
}
Summary#
The ps command seems simple but contains core knowledge of Linux process management:
- Data source: /proc virtual filesystem
- Key fields: VSZ (virtual), RSS (real), STAT (state)
- Performance metrics: %CPU is average, not real-time
- Advanced usage: Custom formats, sorting, thread inspection
- Common pitfalls: Zombies can’t be killed, D-state is uninterruptible
Mastering ps is fundamental to Linux performance troubleshooting.