# Linux tee Command Deep Dive: Pipe Splitting and Dual-Output Implementation
Ever hit this problem? You pipe data through a chain of commands, but you can’t see the intermediate results. Like `cat access.log | grep 404 | wc -l`: you only get the count, not the matching lines. That’s where tee comes in.
## The Core: Data Stream Forking
The name “tee” comes from plumbing — the T-junction fitting. Its job is simple: read from stdin, write to both stdout and a file.
The implementation is straightforward:
```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

void tee_impl(const char *filename, int append) {
    char buf[4096];
    int flags = O_WRONLY | O_CREAT | (append ? O_APPEND : O_TRUNC);
    int fd = open(filename, flags, 0644);
    if (fd < 0) {
        perror("open");  // real tee reports the error instead of silently failing
        return;
    }

    ssize_t n;
    while ((n = read(STDIN_FILENO, buf, sizeof(buf))) > 0) {
        write(STDOUT_FILENO, buf, n);  // output to screen
        write(fd, buf, n);             // write to file
    }
    close(fd);
}
```
The key: two write calls — one to stdout, one to the file descriptor. That’s tee in a nutshell: data duplication and distribution.
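A quick way to confirm the dual write from the shell (the file name here is arbitrary):

```shell
# stdout and copy.txt receive the same bytes
printf 'alpha\nbeta\n' | tee copy.txt
# prints alpha / beta; the file now holds the same two lines
cat copy.txt
```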
## Practical Scenarios

### Scenario 1: Debugging Pipeline Intermediate Results
```shell
# Only see the final count
cat access.log | grep 404 | wc -l

# See matching lines AND count
cat access.log | grep 404 | tee matches.txt | wc -l
# Screen shows count, file saves matching lines

# Or display directly on terminal
cat access.log | grep 404 | tee /dev/tty | wc -l
```
`/dev/tty` is the current terminal device; writing there displays on screen.
### Scenario 2: Build Logs with Simultaneous Monitoring

```shell
make 2>&1 | tee build.log
```

`2>&1` redirects stderr to stdout, so error messages get captured by tee too. Real-time output on screen, full log saved to file.
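The same pattern can be checked without a real build by faking a command that writes to both streams (the file name is arbitrary):

```shell
# One line on stdout, one on stderr; after 2>&1 both land in the log
{ echo "compiling main.c"; echo "warning: deprecated" >&2; } 2>&1 | tee build.log > /dev/null
grep -c "" build.log   # counts lines: both were captured
```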
### Scenario 3: Append Mode

```shell
# Overwrites each run
echo "run 1" | tee output.txt

# Appends each run
echo "run 2" | tee -a output.txt
```
The `-a` flag maps to `O_APPEND`, appending instead of truncating.
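A minimal check of the difference (the file name is arbitrary):

```shell
echo "run 1" | tee output.txt > /dev/null     # truncates: file has 1 line
echo "run 2" | tee -a output.txt > /dev/null  # appends: file now has 2 lines
wc -l < output.txt
```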
### Scenario 4: Writing to Multiple Files

```shell
echo "critical config" | tee config.prod.yaml config.staging.yaml config.dev.yaml
```
tee accepts multiple file arguments and writes every chunk to each of them: one command, identical copies.
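Each listed file gets an identical copy, which is easy to verify (file names arbitrary):

```shell
echo "critical config" | tee f1.yaml f2.yaml > /dev/null
cmp -s f1.yaml f2.yaml && echo "identical"
```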
## Advanced: Process Substitution and Stream Splitting
tee’s real power emerges with process substitution:
```shell
# Save the full listing and a filtered subset at the same time
df -h | tee >(grep "/$" > root_disks.txt) > all_disks.txt
```
`>()` is process substitution: bash swaps it for a pipe-backed path (such as `/dev/fd/63`), so tee treats it like any other file argument and the data reaches both grep and the raw output file.
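You can watch bash perform the substitution (requires bash; the exact path varies by system):

```bash
# bash swaps >(true) for a pipe-backed path before echo runs
echo >(true)
# prints something like /dev/fd/63
```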
```shell
# Real-time monitoring + error log + warning log
tail -f app.log | tee >(grep ERROR >> errors.txt) >(grep WARN >> warns.txt) | cat
```
One command monitors the log stream, extracts errors, extracts warnings — all at once.
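A self-contained version with finite input shows the three-way split actually landing (file names arbitrary; the sleep is a crude way to let the asynchronous substitutions finish):

```bash
printf 'ERROR boom\nWARN odd\nINFO fine\n' \
  | tee >(grep ERROR > errors.txt) >(grep WARN > warns.txt) > /dev/null
sleep 0.2   # process substitutions run asynchronously
cat errors.txt   # ERROR boom
cat warns.txt    # WARN odd
```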
## Parameter Breakdown

| Flag | Purpose | Use Case |
|---|---|---|
| `-a` | Append mode | Log accumulation, cumulative results |
| `-i` | Ignore SIGINT | Prevent Ctrl+C from interrupting writes |
| `-p` | Diagnose write errors | Detect disk full, permission issues |
The `-i` flag works by ignoring the interrupt signal:

```c
#include <signal.h>

signal(SIGINT, SIG_IGN);  // Ignore Ctrl+C
```

With SIGINT ignored, tee keeps copying data even when Ctrl+C hits the rest of the pipeline, so the output file doesn't end mid-write.
## Web Implementation: Browser-Side Stream Splitting
Implementing similar functionality in JavaScript:
```typescript
class TeeStream {
  private outputs: WritableStream[] = [];

  addOutput(stream: WritableStream) {
    this.outputs.push(stream);
  }

  async pipe(input: ReadableStream) {
    const reader = input.getReader();
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // Write to all output streams simultaneously
      await Promise.all(
        this.outputs.map(stream => {
          const writer = stream.getWriter();
          return writer.write(value).then(() => writer.releaseLock());
        })
      );
    }
  }
}

// Usage example
const tee = new TeeStream();
tee.addOutput(new WritableStream({
  write: chunk => console.log(new TextDecoder().decode(chunk))
}));
tee.addOutput(fileWritableStream);
await tee.pipe(fetchResponse.body);
```
The Web Streams API design philosophy mirrors Unix pipes; the spec even includes a built-in `ReadableStream.prototype.tee()` for two-way splits.
## Performance Considerations

tee’s overhead comes from two write system calls per chunk. For typical usage, this is negligible. Real concerns:

- Disk I/O bottleneck: file output speed may become the limiting factor
- Buffer size: default 4KB; increase for high-throughput scenarios
- Signal handling overhead: the `-i` flag adds signal processing logic
Benchmark: Processing 1GB data streams, tee adds ~2-3% CPU overhead. The bottleneck is disk I/O.
## Common Pitfalls

### 1. File Permission Issues

```shell
ls /root | tee /root/output.txt
# Permission denied
```
tee opens the output file with the current user's permissions. Put sudo on tee itself so the file write is elevated:

```shell
ls /root | sudo tee /root/output.txt
```
### 2. Pipe Buffer Limits

Linux pipes default to a 64KB buffer. If the downstream process is slow, the upstream writer blocks:

```shell
# Slow downstream blocks upstream
tail -f large.log | tee >(sleep 1; cat) | cat
```
The `buffer` utility (a separate package on most distros) inserts a larger userspace buffer between the stages:

```shell
tail -f large.log | buffer -s 1MB | tee output.txt
```
### 3. Signal Propagation

By default, Ctrl+C terminates the entire pipeline, so tee might be interrupted mid-write, leaving incomplete data. Use `-i`:

```shell
long_running_command | tee -i output.txt
```
## Wrapping Up
tee looks simple, but it’s the backbone of data flow management. From low-level read/write syscalls to process substitution tricks, mastering tee makes pipeline operations effortless.
Key takeaways:
- Core function: data duplication, one read → multiple writes
- Process substitution enables multi-way splitting
- `-a` for append, `-i` for signal protection are the common flags
- Web Streams API provides browser-side equivalent capabilities
Next time you need “both this AND that” in a pipeline, just tee it.
Related: Linux xargs Command | Linux grep Command