sed Stream Editor: From Basic to Advanced Text Processing
When I started with Linux, I thought sed was just a simple find-and-replace tool: sed 's/old/new/g' file.txt. But as I gained experience, I realized sed is a Turing-complete stream editor capable of incredibly complex text transformations.
The Core Concept
sed’s working principle is simple: read input line by line, execute a series of editing commands on each line, output the result. This streaming approach lets sed process files of any size—no need to load the entire file into memory.
The most common command is s (substitute):
sed 's/pattern/replacement/flags' file.txt
But s is only one of roughly two dozen commands defined by POSIX (GNU sed adds a few more): delete (d), insert (i), append (a), print (p), branch (b), hold space operations (h, g, x), and more.
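To make a few of those commands concrete, here is a small sketch run against three invented lines. The function names are only for illustration, and the append example uses the POSIX two-line form of a, which should behave the same in GNU and BSD sed.

```shell
# Sample input, invented for illustration
sample() { printf 'alpha\nbeta\ngamma\n'; }

# d: delete line 2, print everything else
delete_second() { sample | sed '2d'; }

# p with -n: suppress the default output and print only line 2
print_second() { sample | sed -n '2p'; }

# a: append a new line after line 1 ("a\" followed by the text on the next line)
append_after_first() {
  sample | sed '1a\
inserted'
}

delete_second        # alpha, gamma
print_second         # beta
append_after_first   # alpha, inserted, beta, gamma
```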
Address Ranges: Precise Control
By default, sed applies commands to every line. But you can limit processing with addresses:
# Only process line 5
sed '5s/error/warning/' app.log
# Process lines 10-20
sed '10,20s/debug/info/' app.log
# Process from "function" match to end of file
sed '/function/,$s/var/let/g' code.js
# Only process lines containing "TODO"
sed '/TODO/s/^/\/\/ /' source.ts
Addresses can be line numbers, regex patterns, or combinations. This gives precise control over which lines get processed.
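Two more address forms are worth knowing: a range bounded by two regexes, and ! to negate an address. The sketch below runs them on made-up input; the BEGIN/END markers and function names are assumptions for the example.

```shell
# Invented input with a marked block in the middle
notes() { printf 'intro\nBEGIN\nbody one\nbody two\nEND\noutro\n'; }

# /BEGIN/,/END/ selects from the first line matching BEGIN
# through the next line matching END
between_markers() { notes | sed -n '/BEGIN/,/END/p'; }

# ! negates the address: delete every line outside the range
outside_removed() { notes | sed '/BEGIN/,/END/!d'; }

# $! means "every line except the last": add a trailing comma to those
comma_all_but_last() { notes | sed '$!s/$/,/'; }

between_markers      # BEGIN, body one, body two, END
outside_removed      # same four lines
comma_all_but_last   # every line gets a comma except "outro"
```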
Regex Backreferences
sed supports capture groups and backreferences—key for complex replacements:
# Swap first and last name
echo "John Smith" | sed 's/\(.*\) \(.*\)/\2 \1/' # Smith John
# Convert YYYY-MM-DD to DD/MM/YYYY
echo "2026-05-06" | sed 's/\([0-9]\{4\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\)/\3\/\2\/\1/'
# Output: 06/05/2026
# Add type annotations to function parameters (TypeScript)
sed 's/function \([a-zA-Z]*\)(\([a-zA-Z]*\))/function \1(\2: string)/' code.js
Backreferences use \1, \2, \3 for the 1st, 2nd, 3rd capture group. With extended regex (-E or -r), use () instead of \(\):
# Extended regex (cleaner syntax)
echo "2026-05-06" | sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})/\3\/\2\/\1/'
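Alongside numbered groups, the replacement can use & to stand for the entire matched text, which saves a capture group when you only want to wrap a match. A minimal sketch with invented input and illustrative function names:

```shell
# & in the replacement is the whole match: wrap every number in brackets
bracket_numbers() { sed -E 's/[0-9]+/[&]/g'; }

# Two groups reordered: turn key=value into "value <- key"
flip_pair() { sed -E 's/^([^=]+)=(.*)$/\2 <- \1/'; }

echo 'port 8080 and 443' | bracket_numbers   # port [8080] and [443]
echo 'host=example.com' | flip_pair          # example.com <- host
```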
Hold Space: Multi-line Processing
sed has two buffers: pattern space (current line) and hold space (temporary storage). Hold space enables cross-line operations.
Merge consecutive lines:
# Merge adjacent lines
seq 1 6 | sed 'N;s/\n/ /'
# Output:
# 1 2
# 3 4
# 5 6
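N appends the next input line to pattern space, separated by a newline, and s/\n/ / then replaces that newline with a space. This example needs only pattern space; hold space comes into play when a line has to be remembered across cycles.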
A more complex example: reverse all lines in a file:
# Reverse line order
sed -n '1!G;h;$p' file.txt
Workflow:
- 1!G: For all lines except the first, append hold space to pattern space
- h: Copy pattern space to hold space
- $p: On the last line, print pattern space
Once you understand pattern/hold space interaction, you can implement stacks, queues, and other data structures in sed.
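To see the two buffers at work, the sketch below traces the reversal on three lines and adds a second hold-space idiom: printing each matching line together with the one before it. The function names and sample lines are invented, and with_previous is a rough sketch rather than a replacement for grep -B1 (it prints an empty line if the very first line matches).

```shell
# Reverse line order. Trace for input a, b, c:
#   line 1: G skipped,            h -> hold = "a"
#   line 2: G -> "b\na",          h -> hold = "b\na"
#   line 3: G -> "c\nb\na",       $p prints it
reverse_lines() { sed -n '1!G;h;$p'; }

# Print each ERROR line preceded by the previous line.
# On a match: x brings the saved previous line into pattern space, p prints it,
# x swaps back, p prints the current line. h then saves the current line
# for the next cycle.
with_previous() { sed -n '/ERROR/{x;p;x;p;}; h'; }

printf 'a\nb\nc\n' | reverse_lines
# c
# b
# a

printf 'ok 1\nok 2\nERROR disk full\nok 3\n' | with_previous
# ok 2
# ERROR disk full
```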
Real-World Example: Log Processing
Extract all 404 error URLs from an Nginx log and count occurrences:
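# Extract 404 URLs (in the default combined format the status code follows
# the quoted request and is itself followed by the byte count, not a quote)
sed -n '/" 404 /p' nginx.log | \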
sed -E 's/.*"GET ([^ ]+).*/\1/' | \
sort | uniq -c | sort -rn | head -20
First sed filters lines with 404, second extracts the URL, then sort and uniq count them.
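Here is the same idea as a self-contained sketch you can run without a real log. The four log lines are invented but follow Nginx's default combined format, where the status code sits right after the closing quote of the request, so the filter anchors on `" 404 `. The filter and the extraction are folded into one sed call, and the final sed only trims the padding that uniq -c adds.

```shell
# Invented sample in Nginx "combined" format
sample_log() {
  cat <<'EOF'
127.0.0.1 - - [06/May/2026:10:00:00 +0000] "GET /missing.html HTTP/1.1" 404 153 "-" "curl/8.0"
127.0.0.1 - - [06/May/2026:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "curl/8.0"
127.0.0.1 - - [06/May/2026:10:00:02 +0000] "GET /missing.html HTTP/1.1" 404 153 "-" "curl/8.0"
127.0.0.1 - - [06/May/2026:10:00:03 +0000] "GET /old/page HTTP/1.1" 404 153 "-" "curl/8.0"
EOF
}

# Address selects 404 lines; s///p rewrites each one to just the URL and prints it
count_404_urls() {
  sed -nE '/" 404 /s/.*"GET ([^ ]+).*/\1/p' |
    sort | uniq -c | sort -rn | sed 's/^ *//'
}

sample_log | count_404_urls
# 2 /missing.html
# 1 /old/page
```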
Batch replace domains in a project:
# Replace domain in all JS files
find ./src -name "*.js" -exec sed -i 's/old\.example\.com/new.example.com/g' {} +
# BSD/macOS sed requires a backup suffix after -i (empty string = no backup)
find ./src -name "*.js" -exec sed -i '' 's/old\.example\.com/new.example.com/g' {} +
-i means in-place editing—modify the file directly instead of printing to stdout.
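Because in-place edits are destructive, it can be worth keeping a backup while you test a command. The sketch below uses the attached-suffix form -i.bak, which both GNU and BSD sed accept; the temporary directory and the app.js contents are made up for the example.

```shell
# Scratch directory and file, invented for the demo
workdir=$(mktemp -d)
printf 'fetch("https://old.example.com/api")\n' > "$workdir/app.js"

# -i.bak edits app.js in place and keeps the original as app.js.bak
sed -i.bak 's/old\.example\.com/new.example.com/g' "$workdir/app.js"

edited=$(cat "$workdir/app.js")
backup=$(cat "$workdir/app.js.bak")
printf '%s\n%s\n' "$edited" "$backup"
# fetch("https://new.example.com/api")
# fetch("https://old.example.com/api")

rm -rf "$workdir"
```

Once the result looks right, the .bak files can be deleted or the suffix dropped.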
Performance Tips for Large Files
For multi-GB log files, sed’s streaming model pays off because the file never has to fit in memory. A few habits keep it fast:
- Limit address ranges: Reduce regex match attempts
- Quit early: Exit after processing target content
# Read only first 1000 lines, then quit
sed '1000q' huge.log
# Quit immediately after matching "END"
sed '/END/q' data.txt
- Use awk for complex logic: If your sed command has 10+ semicolons, awk might be clearer.
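The first two tips combine naturally: print only a line range and quit as soon as the range ends, so sed never reads the rest of the input. A minimal sketch, using seq as a stand-in for a large file (the function name is illustrative):

```shell
# Print lines 10-12, then stop reading right after line 12
slice_and_quit() { sed -n '10,12p;12q'; }

seq 1 1000 | slice_and_quit
# 10
# 11
# 12
```

Without the trailing 12q, sed would still scan every remaining line just to print nothing.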
Common Pitfalls
1. Delimiter Choice
The delimiter of the s command is customizable; it doesn’t have to be /:
# Replace file paths (avoid escaping slashes)
sed 's|/usr/local|/opt/local|' config.txt
# Use # as delimiter
sed 's#old#new#g' file.txt
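The same trick works for regex addresses, with one difference: in an address the custom delimiter has to be introduced with a backslash. A short sketch on invented paths:

```shell
# Invented input
paths() { printf '/usr/local/bin/tool\n/opt/bin/tool\n'; }

# Address with a custom delimiter: \|regex|
only_usr_local() { paths | sed -n '\|^/usr/local|p'; }

# The address and the substitution can both avoid escaped slashes
move_prefix() { paths | sed '\|^/usr/local|s|/usr/local|/opt/local|'; }

only_usr_local   # /usr/local/bin/tool
move_prefix      # /opt/local/bin/tool, then /opt/bin/tool unchanged
```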
2. Greedy Matching
sed’s quantifiers are always greedy, and neither basic nor extended regex offers a lazy form such as .*?, so a pattern can match more than intended:
# Wrong: greedy .* runs from the first < to the last >
echo '<div>content</div>' | sed 's/<.*>/replacement/'
# Output: replacement (the whole line is replaced)
# Correct: a negated character class stops each match at the first >
echo '<div>content</div>' | sed 's/<[^>]*>/replacement/g'
# Output: replacementcontentreplacement
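The negated-class idiom carries over to other delimiters too. The sketch below strips tags and pulls out the first quoted value from an invented line, with a greedy version alongside for contrast; the function names are only illustrative.

```shell
# Remove every tag, keep the text between them
strip_tags() { sed 's/<[^>]*>//g'; }

# Greedy: the leading .* swallows as much as it can, so the LAST quoted value wins
greedy_quoted() { sed 's/.*"\(.*\)".*/\1/'; }

# Negated classes pin the match to the FIRST quoted value
first_quoted() { sed 's/^[^"]*"\([^"]*\)".*/\1/'; }

echo '<p>Hello <b>world</b></p>' | strip_tags   # Hello world
echo 'name="alpha" id="beta"' | greedy_quoted   # beta
echo 'name="alpha" id="beta"' | first_quoted    # alpha
```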
3. Command Sequencing
Multiple -e options execute in order:
# Replace error first, then warning
sed -e 's/error/err/g' -e 's/warning/warn/g' log.txt
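# Or use semicolons (note: commands that take text or a file name, such as a, i, c, r, and w, consume the rest of the line, so they cannot be followed by ;)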
sed 's/error/err/g; s/warning/warn/g' log.txt
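Order matters because each command sees the output of the one before it. A tiny sketch with an invented word shows how swapping two substitutions changes the result:

```shell
# cat -> dog, then that dog -> bird
forward_order() { sed -e 's/cat/dog/' -e 's/dog/bird/'; }

# dog -> bird finds nothing first, then cat -> dog
reverse_order() { sed -e 's/dog/bird/' -e 's/cat/dog/'; }

echo 'cat' | forward_order   # bird
echo 'cat' | reverse_order   # dog
```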
Summary
sed looks simple but is remarkably capable. Master address ranges, backreferences, and hold space, and many transformations that would otherwise need a short script fit on one line. Once the logic grows complex, awk or Python is usually the better choice, but on servers and in CI/CD pipelines sed is fast, almost always installed, and easy to drop into a pipeline.