Linux wc Command: More Than Just Counting Lines

When writing code, you often need quick statistics about a file: how many lines it has, how many words, how many bytes. wc handles all of this.

Basic Usage

wc stands for Word Count, but it does much more:

# Count lines, words, and bytes
wc file.txt
# Output:   12    48   256 file.txt
#          lines words bytes

# Individual counts
wc -l file.txt   # Lines only: 12
wc -w file.txt   # Words only: 48
wc -c file.txt   # Bytes only: 256
wc -m file.txt   # Characters (multibyte-aware, depends on the locale)
wc -L file.txt   # Longest line length (not in POSIX; available in GNU wc)
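
To make the counts above concrete, here is a minimal sketch that builds a tiny file with known contents and reads each count back. The sample text and variable names are made up for illustration; the `tr -d ' '` step is only there because some wc implementations pad their output with spaces.

```shell
# Build a small sample file with known contents (hypothetical data).
sample=$(mktemp)
printf 'one two\nthree four five\n' > "$sample"

# Reading from stdin keeps the filename out of the output.
# tr strips the space padding that some wc implementations add.
lines=$(wc -l < "$sample" | tr -d ' ')
words=$(wc -w < "$sample" | tr -d ' ')
bytes=$(wc -c < "$sample" | tr -d ' ')

echo "lines=$lines words=$words bytes=$bytes"
# prints: lines=2 words=5 bytes=24
rm -f "$sample"
```

The byte count includes the two newline characters: 8 bytes for the first line plus 16 for the second.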

Real-World Use Cases

1. Count Code Lines

# Total lines across all .js files under the current directory
find . -name "*.js" -exec wc -l {} + | tail -1

# Lines per file, largest first (the "total" row sorts to the top)
find . -name "*.ts" -exec wc -l {} + | sort -rn

# Exclude node_modules (no -print here, or find would also list every filename)
find . -path ./node_modules -prune -o -name "*.py" -exec wc -l {} +
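
The `tail -1` approach relies on wc printing a single "total" row, which may not hold when there is only one matching file or when find splits a very long file list across several wc invocations. One possible workaround, sketched below, is to concatenate the files and count once. The function name `count_code_lines` is hypothetical.

```shell
# count_code_lines DIR EXT
# Prints the total number of lines in all *.EXT files under DIR,
# skipping any directory named node_modules. (Hypothetical helper.)
count_code_lines() {
    # cat merges every match into one stream, so wc prints exactly one
    # number even if find has to run cat several times.
    find "$1" -name node_modules -prune -o -type f -name "*.$2" -exec cat {} + \
        | wc -l | tr -d ' '
}

# Usage: count_code_lines . js
```

The trade-off is that the per-file breakdown is lost; keep the `wc -l {} +` form when you need that.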

2. Count Files

# Number of entries in current directory (files and directories, hidden ones excluded)
ls | wc -l

# Count specific file type
ls *.json | wc -l

# Recursive count
find . -type f | wc -l
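
Counting lines of `ls` or `find` output assumes one name per line, which breaks for filenames that contain a newline, and `ls` also skips hidden files. A more defensive sketch counts NUL terminators instead. It assumes a find that supports `-maxdepth` and `-print0` (GNU and BSD find both do), and `count_files` is a hypothetical name.

```shell
# count_files [DIR]
# Prints the number of regular files directly inside DIR (default: .),
# including hidden files, without descending into subdirectories.
count_files() {
    # -print0 ends each name with a NUL byte; tr keeps only those NULs,
    # and wc -c counts them, so odd characters in names cannot skew the result.
    find "${1:-.}" -maxdepth 1 -type f -print0 | tr -cd '\0' | wc -c | tr -d ' '
}

# Usage: count_files /var/log
```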

3. Pipe Usage

# Count lines that match, across all .js files
grep -r "TODO" --include="*.js" . | wc -l

# Count lines containing ERROR (grep -c counts by itself, no wc needed)
grep -c "ERROR" app.log

# Count command output
history | wc -l
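
Both `grep ... | wc -l` and `grep -c` count matching lines, not individual matches. When a line can contain the pattern more than once, a common approach is `grep -o`, which prints each match on its own line. The sketch below uses a made-up log file; `-o` is not in POSIX but is supported by GNU and BSD grep.

```shell
# Hypothetical log: the first line contains ERROR twice.
log=$(mktemp)
printf 'ERROR a ERROR b\nok\nERROR c\n' > "$log"

# Lines that contain at least one ERROR
matching_lines=$(grep -c 'ERROR' "$log")

# Individual occurrences: -o prints every match on its own line
occurrences=$(grep -o 'ERROR' "$log" | wc -l | tr -d ' ')

echo "lines=$matching_lines occurrences=$occurrences"
# prints: lines=2 occurrences=3
rm -f "$log"
```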

Advanced Tricks

Multiple Files with Total

# Multiple files with total
wc -l *.txt

# Output:
#   100 file1.txt
#   200 file2.txt
#   300 total

Using xargs

# Find files over 1000 lines (the "total" row is filtered out)
find . -name "*.md" -print0 | xargs -0 wc -l | awk '$1 > 1000 && $2 != "total"'

# CSV output (assumes filenames without spaces)
echo "filename,lines,words,bytes"
wc -lwc *.txt | awk '$4 != "total" {print $4","$1","$2","$3}'
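
The awk-based CSV one-liner splits on whitespace, so a filename containing spaces ends up in the wrong columns. A per-file loop avoids that at the cost of one wc process per file. This is a sketch under the assumption that filenames contain no commas or quotes (no CSV escaping is attempted); `wc_csv` is a hypothetical name.

```shell
# wc_csv FILE...
# Prints a CSV table with one row per regular file.
wc_csv() {
    echo "filename,lines,words,bytes"
    for f in "$@"; do
        [ -f "$f" ] || continue          # skip directories and missing files
        # Reading via stdin makes wc print only the three numbers.
        counts=$(wc -lwc < "$f")
        # $counts is deliberately unquoted so the shell splits it into three fields.
        printf '%s,%s,%s,%s\n' "$f" $counts
    done
}

# Usage: wc_csv *.txt > stats.csv
```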

Count Comments

# C/C++ comments
grep -E '^\s*(//|/\*|\*)' *.cpp | wc -l

# Python comments
grep -E '^\s*#' *.py | wc -l

# Empty lines
grep -c '^$' file.txt
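
The comment and blank-line counts can be combined with `wc -l` into a rough per-file breakdown. The sketch below does this for a Python file; it treats only lines starting with `#` as comments (docstrings count as code), and `py_line_stats` is a hypothetical name.

```shell
# py_line_stats FILE
# Prints total, comment, blank, and remaining ("code") line counts.
py_line_stats() {
    total=$(wc -l < "$1" | tr -d ' ')
    # grep -c exits with status 1 when nothing matches but still prints 0;
    # "|| true" keeps that from aborting scripts that run under set -e.
    comments=$(grep -c -E '^[[:space:]]*#' "$1" || true)
    blank=$(grep -c -E '^[[:space:]]*$' "$1" || true)
    echo "total=$total comments=$comments blank=$blank code=$((total - comments - blank))"
}

# Usage: py_line_stats script.py
```

Using `[[:space:]]` instead of `\s` keeps the patterns portable to grep implementations that do not support the `\s` shorthand.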

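How It Works

Conceptually, wc makes a single pass over the input:

// Simplified sketch, not the actual coreutils source
while ((n = fread(buf, 1, sizeof buf, file)) > 0) {
    bytes += n;
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == '\n') lines++;
        if (isspace((unsigned char)buf[i])) in_word = 0;
        else if (!in_word) { in_word = 1; words++; }
    }
}

  • Lines: Counts newline \n characters, so a final line with no trailing newline is not counted
  • Words: Counts runs of non-whitespace characters
  • Bytes: Adds up the bytes read; for a regular file with only -c requested, GNU wc can take the size from fstat() without reading the contents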
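
Because lines are counted by newline characters, input whose last line lacks a trailing newline reports one line fewer than a reader might expect. A short sketch of the effect, using `awk` as a record-based counter for comparison (variable names are illustrative):

```shell
# Two lines, both terminated by a newline
with_newline=$(printf 'a\nb\n' | wc -l | tr -d ' ')

# Same text, but the last line has no trailing newline
without_newline=$(printf 'a\nb' | wc -l | tr -d ' ')

# awk counts records, so the unterminated last line is included
awk_count=$(printf 'a\nb' | awk 'END { print NR }')

echo "$with_newline $without_newline $awk_count"
# prints: 2 1 2
```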

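Common Pitfalls

1. Line Endings

Windows uses \r\n, Unix uses \n. Line counts are unaffected because every line still ends in \n, but each \r adds one to the byte and character counts (-c, -m):

# Convert to Unix format first (the sed fallback assumes GNU sed)
dos2unix file.txt 2>/dev/null || sed -i 's/\r$//' file.txt
wc -c file.txt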
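
To see what carriage returns actually change, the following sketch compares counts for the same two-line text with and without `\r`. It works on a pipe rather than a file, using `tr -d '\r'` as a stand-in for the conversion step; note that `tr` removes every carriage return, not only those at line ends.

```shell
# Two lines with Windows-style \r\n endings
crlf_lines=$(printf 'a\r\nb\r\n' | wc -l | tr -d ' ')
crlf_bytes=$(printf 'a\r\nb\r\n' | wc -c | tr -d ' ')

# Same text after stripping the carriage returns
lf_lines=$(printf 'a\r\nb\r\n' | tr -d '\r' | wc -l | tr -d ' ')
lf_bytes=$(printf 'a\r\nb\r\n' | tr -d '\r' | wc -c | tr -d ' ')

echo "CRLF: $crlf_lines lines, $crlf_bytes bytes"
echo "LF:   $lf_lines lines, $lf_bytes bytes"
# prints:
# CRLF: 2 lines, 6 bytes
# LF:   2 lines, 4 bytes
```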

2. Large Files

wc is fast even for GB-sized files because it reads the input in a single pass. Two variations are handy for large inputs:

# Read from stdin to print only the number, without the filename
wc -l < largefile.txt

# Show progress (requires pv)
pv largefile.txt | wc -l

3. Encoding

-c counts bytes, -m counts characters. In a UTF-8 locale:

# printf avoids the trailing newline that echo would add to both counts
printf '中文' | wc -c   # Output: 6 (bytes)
printf '中文' | wc -m   # Output: 2 (characters)
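
The character count depends on the active locale, not on the file. As a sketch of that dependence, forcing the C locale makes `-m` fall back to counting bytes. The example assumes the script itself is saved as UTF-8, and it deliberately does not name a UTF-8 locale because the available names (`en_US.UTF-8`, `C.UTF-8`, and so on) differ between systems.

```shell
# Each of the two characters is encoded as 3 bytes in UTF-8.
byte_count=$(printf '中文' | wc -c | tr -d ' ')

# In the C locale every byte is treated as one character.
c_locale_chars=$(printf '中文' | LC_ALL=C wc -m | tr -d ' ')

echo "bytes=$byte_count chars_in_C_locale=$c_locale_chars"
# prints: bytes=6 chars_in_C_locale=6
```

If `wc -m` unexpectedly matches `wc -c` on multibyte text, checking the output of `locale` is a reasonable first step.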

For more advanced statistics:

  • cloc - Count lines by language
  • tokei - Faster code statistics
  • loc - Modern line counter
