Linux wc Command: More Than Just Counting Lines
When writing code, you often need quick file statistics. wc covers the common cases: counting lines, measuring the size of a codebase, and checking file sizes.
Basic Usage#
wc stands for Word Count, but it does much more:
# Count lines, words, and bytes
wc file.txt
# Output: 12 48 256 file.txt
# lines words bytes
# Individual counts
wc -l file.txt # Lines only: 12
wc -w file.txt # Words only: 48
wc -c file.txt # Bytes only: 256
wc -m file.txt # Characters (multibyte-aware, depends on the current locale)
wc -L file.txt # Longest line length (GNU extension, not in POSIX)
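To see the default counts on known input, the sketch below builds a small throwaway file (its contents are made up for illustration) and reads each count from stdin, so wc prints only the number. The awk step is an assumption about portability: some wc implementations pad the number with spaces.

```shell
#!/bin/sh
# Build a throwaway sample file; the contents are illustrative only.
sample=$(mktemp)
printf 'hello world\nsecond line here\n' > "$sample"

# Reading from stdin makes wc print just the number, with no filename.
# awk '{print $1}' drops the padding some wc implementations add.
lines=$(wc -l < "$sample" | awk '{print $1}')
words=$(wc -w < "$sample" | awk '{print $1}')
bytes=$(wc -c < "$sample" | awk '{print $1}')

echo "lines=$lines words=$words bytes=$bytes"   # lines=2 words=5 bytes=29
rm -f "$sample"
```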
Real-World Use Cases#
1. Count Code Lines#
# Total lines across all .js files under the current directory
find . -name "*.js" -exec wc -l {} + | tail -1
# Count by file, sorted
find . -name "*.ts" -exec wc -l {} + | sort -rn
# Exclude node_modules
find . -path ./node_modules -prune -o -name "*.py" -exec wc -l {} +
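On very large trees, find may split the file list across several wc invocations, each printing its own "total" row, so taking the last row can undercount. One way around this, sketched here with a hypothetical helper named total_lines, is to concatenate the files and count once.

```shell
#!/bin/sh
# total_lines DIR PATTERN
# Prints the combined line count of files under DIR whose names match
# PATTERN, skipping any node_modules directory.
# (total_lines is a hypothetical helper name used for illustration.)
total_lines() {
    find "$1" -name node_modules -prune -o -type f -name "$2" -exec cat {} + \
        | wc -l | awk '{print $1}'
}

# Example: total lines of JavaScript below the current directory.
total_lines . '*.js'
```

Because wc -l counts newline characters, the result equals the sum of the per-file counts even when a file lacks a trailing newline.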
2. Count Files#
# Number of entries in current directory (directories included, dotfiles excluded)
ls | wc -l
# Count specific file type
ls *.json | wc -l
# Recursive count
find . -type f | wc -l
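Counting lines of ls or find output miscounts when a filename itself contains a newline. A minimal sketch that avoids this, assuming a find that supports -print0 (GNU, BSD, and BusyBox versions do), counts NUL terminators instead of lines. The helper name count_files is hypothetical.

```shell
#!/bin/sh
# count_files [DIR]
# Prints the number of regular files under DIR (default: current directory),
# including dotfiles and names that contain newlines.
# (count_files is a hypothetical helper name; -print0 is assumed to exist.)
count_files() {
    # Each path ends in a NUL byte; keep only those bytes and count them.
    find "${1:-.}" -type f -print0 | tr -cd '\0' | wc -c | awk '{print $1}'
}

count_files .
```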
3. Pipe Usage#
# Count grep matches
grep -r "TODO" --include="*.js" . | wc -l
# Count lines containing ERROR (grep -c does the counting itself, no wc needed)
grep -c "ERROR" app.log
# Count command output
history | wc -l
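Both grep -c and grep | wc -l count matching lines, not individual matches. When a line can contain the pattern more than once, a sketch using grep -o (supported by GNU and BSD grep, though not required by POSIX) prints each match on its own line so wc can count occurrences. The sample text is made up for illustration.

```shell
#!/bin/sh
# Illustrative sample: three TODO markers spread over two of three lines.
sample=$(mktemp)
printf 'TODO one TODO two\nnothing here\nTODO three\n' > "$sample"

# Lines that contain at least one match.
matching_lines=$(grep -c 'TODO' "$sample")

# Individual matches: -o emits one line per match.
occurrences=$(grep -o 'TODO' "$sample" | wc -l | awk '{print $1}')

echo "matching lines: $matching_lines, occurrences: $occurrences"
# matching lines: 2, occurrences: 3
rm -f "$sample"
```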
Advanced Tricks#
Multiple Files with Total#
# Multiple files with total
wc -l *.txt
# Output:
# 100 file1.txt
# 200 file2.txt
# 300 total
Filtering and Formatting with awk#
# Find files over 1000 lines (skip the "total" row)
find . -name "*.md" -exec wc -l {} + | awk '$1 > 1000 && $2 != "total"'
# CSV output (assumes filenames without spaces)
echo "filename,lines,words,bytes"
wc -lwc *.txt | awk '$4 != "total" {print $4","$1","$2","$3}'
Count Comments#
# C/C++ comments
grep -E '^\s*(//|/\*|\*)' *.cpp | wc -l
# Python comments
grep -E '^\s*#' *.py | wc -l
# Empty lines
grep -c '^$' file.txt
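The individual counts above can be combined into a rough per-file breakdown. The sketch below does this for a Python-style file, treating any line whose first non-blank character is # as a comment; the helper name line_breakdown is hypothetical, and the file is assumed to end with a newline.

```shell
#!/bin/sh
# line_breakdown FILE
# Prints total, blank, comment, and remaining (code) line counts.
# (line_breakdown is a hypothetical helper name used for illustration.)
line_breakdown() {
    total=$(wc -l < "$1" | awk '{print $1}')
    # grep -c exits non-zero when the count is 0; "|| true" keeps going.
    blank=$(grep -c '^[[:space:]]*$' "$1" || true)
    comments=$(grep -c '^[[:space:]]*#' "$1" || true)
    echo "total=$total blank=$blank comments=$comments code=$((total - blank - comments))"
}

# Example on a made-up five-line file.
sample=$(mktemp)
printf '# header\nimport os\n\n    # indented comment\nprint(os.name)\n' > "$sample"
line_breakdown "$sample"   # total=5 blank=1 comments=2 code=2
rm -f "$sample"
```

This is only a heuristic: it does not recognize comments that follow code on the same line or # characters inside strings, which is where the dedicated tools listed later do better.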
How It Works#
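A simplified sketch of the core loop (real implementations add locale handling and fast paths):
// Simplified sketch: one pass over the file, one byte at a time
// (needs <stdio.h> and <ctype.h>)
int c, in_word = 0;
while ((c = getc(file)) != EOF) {
    bytes++;
    if (c == '\n') lines++;
    if (isspace(c)) in_word = 0;
    else if (!in_word) { in_word = 1; words++; }
}
- Lines: Counts newline (\n) characters
- Words: Counts whitespace-separated strings
- Bytes: Counts every byte read; when only -c is requested on a regular file, the size can come from fstat() instead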
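Counting newline characters rather than "lines" has a visible consequence: a final line without a trailing newline is not counted. A short check, using made-up input:

```shell
#!/bin/sh
# Two lines, both terminated by a newline.
with_newline=$(printf 'a\nb\n' | wc -l | awk '{print $1}')

# Same text, but the last line has no trailing newline.
without_newline=$(printf 'a\nb' | wc -l | awk '{print $1}')

echo "with trailing newline: $with_newline"        # 2
echo "without trailing newline: $without_newline"  # 1
```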
Common Pitfalls#
1. Line Endings#
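Windows uses \r\n, Unix uses \n. Line counts stay the same because every line still ends in \n, but each \r adds one byte to the -c and -m counts and leaves a stray character at the end of every line: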
# Convert to Unix format first
dos2unix file.txt 2>/dev/null || sed -i 's/\r$//' file.txt
wc -l file.txt
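To check what \r\n line endings actually change, the sketch below writes the same two lines in both formats (the file contents are made up) and compares the counts. It also strips carriage returns on the fly with tr, which leaves the original file untouched.

```shell
#!/bin/sh
unix_file=$(mktemp)
dos_file=$(mktemp)
printf 'one\ntwo\n' > "$unix_file"
printf 'one\r\ntwo\r\n' > "$dos_file"

unix_lines=$(wc -l < "$unix_file" | awk '{print $1}')
dos_lines=$(wc -l < "$dos_file" | awk '{print $1}')
unix_bytes=$(wc -c < "$unix_file" | awk '{print $1}')
dos_bytes=$(wc -c < "$dos_file" | awk '{print $1}')

# Remove carriage returns in the pipeline instead of editing the file.
stripped_bytes=$(tr -d '\r' < "$dos_file" | wc -c | awk '{print $1}')

echo "lines: unix=$unix_lines dos=$dos_lines"
# lines: unix=2 dos=2
echo "bytes: unix=$unix_bytes dos=$dos_bytes stripped=$stripped_bytes"
# bytes: unix=8 dos=10 stripped=8
rm -f "$unix_file" "$dos_file"
```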
2. Large Files#
wc is fast even for GB-sized files because it reads the file in a single pass. Two conveniences for large inputs:
# Read from stdin to print only the count, without the filename
wc -l < largefile.txt
# Show progress (requires pv)
pv largefile.txt | wc -l
3. Encoding#
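-c counts bytes, -m counts characters. With UTF-8 (printf avoids the trailing newline that echo would add to the count):
printf '%s' "中文" | wc -c # Output: 6 (bytes)
printf '%s' "中文" | wc -m # Output: 2 (characters, in a UTF-8 locale)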
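The -m count depends on the active locale, which is easy to miss in cron jobs or minimal containers that run under the C locale. A small sketch, assuming the script itself is saved as UTF-8; the locale name C.UTF-8 is also an assumption and may not be installed everywhere (check `locale -a`).

```shell
#!/bin/sh
# Under the C locale every byte is treated as one character,
# so -m reports the same number as -c.
c_locale_chars=$(printf '%s' '中文' | LC_ALL=C wc -m | awk '{print $1}')
echo "C locale: $c_locale_chars"   # C locale: 6

# With a UTF-8 locale the two characters are decoded as such.
# C.UTF-8 is an assumed locale name; substitute one from `locale -a`.
printf '%s' '中文' | LC_ALL=C.UTF-8 wc -m
```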
Related Tools#
For more advanced statistics:
- cloc: Count lines by language
- tokei: Faster code statistics
- loc: Modern line counter