> When cat writes to stdout, it doesn't block waiting for grep to process that data.
It will certainly do that if the buffer is full.
> prevents the implicit blocking
No, that's exactly the case of implicit blocking mentioned above.
Does anyone else find this article rather AI-ish? The extreme verbosity and repetitiveness, the use of dashes, and "The limitation isn't conceptual—it's syntactic" are notable artifacts.
> Does anyone else find this article rather AI-ish?
After reading the whole thing, yes! Specifically, it feels incoherent in the way AI text often is. It starts by praising Unix pipes for their simple design and the explicit tradeoffs they make, then proceeds to explain how we could and should make the complete opposite set of tradeoffs.
That would explain the strangeness of the recent spherical cows article from the same site, as well.
Also the headings are just sprinkled at intervals and don't really fit the text.
If anything, the pre-pipe style of
prog1 -input input_file -output tmp1_file
prog2 -input tmp1_file -output tmp2_file && del tmp1_file
prog3 -input tmp2_file -output tmp1_file && del tmp2_file
...
progN -input tmpX_file -output output_file && del tmpX_file

is more in line with the author's claimed benefits of the pipes than the piped style itself. The process isolation is absolute: they are separated not just in space, but in time as well, entirely!

File management suddenly becomes an issue. If an old tmp1_file remains from a previous run and prog1 then fails, you get "old" output. Pipes avoid file management entirely.
> It will certainly do that if the buffer is full.
You can consider that an OS/resource-specific limitation, rather than a limitation of the concept.
Nah. Having built-in automatic backpressure is one of the most underappreciated things about UNIX pipes.
Fully agree. This is still a representation of the available resources.
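To make the implicit blocking concrete, here is a minimal sketch (assuming a Linux shell, where the default pipe buffer is 64 KiB): yes fills the buffer almost immediately and then sits blocked in write(2) until the reader finally wakes up.

# 'yes' writes until the pipe buffer is full, then blocks;
# only when 'head' starts reading five seconds later does it resume,
# and it exits on SIGPIPE once 'head' closes the pipe
yes | { sleep 5; head -n 3; }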
> This cross-language composition remains remarkably rare in modern development, where we typically force everything into a single language ecosystem and its assumptions.
I think IPC via HTTP, gRPC, Kafka, files, etc allows language decoupling pretty well. Intra-process communication is primarily single-language, though you can generally call from language X into C-language libs. Cross-process, I don't see where the assertion comes from.
Something like Kafka should be part of the core operating system. Its API has been stable for years (decade+?) now.
Isn't dbus pretty much that (not that it's particularly good)?
Wouldn't passing comms through a C ABI still be placing everything into a single language? Or am I conflating communication protocol with 'language'? My parser/combinator/interpreter senses are tingling.
I'm a big fan of how PowerShell passes objects.
But without a common runtime the closest you could really get to that in Unix would be to pass JSON or XML about, and have every program have a "pipe" mode that accepted that as input.
Which seems like an awful lot of work and unlikely to get the kind of buy in you'd need to make it work widely.
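A few tools already hint at what such a "pipe mode" could look like; assuming a Linux box with iproute2 and jq installed, structured data can already flow over an ordinary pipe:

# 'ip -json' emits a JSON array of interface objects on stdout;
# 'jq' consumes it and extracts one field per interface
ip -json addr | jq -r '.[].ifname'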
Unix pipelines got something right by being syntactic sugar for chaining pure function application. It's easy to get excited when you don't understand this.
For instance sqrt(sin(cos(theta))) can be notated < theta | cos | sin | sqrt.
Pipeline syntax implemented in functional languages expands into chained function invocation.
Everything follows from that: what we know about combining functions applies to pipes.
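As a contrived but runnable illustration of that correspondence, each awk stage below applies one pure function to the single number flowing through, so the pipeline reads exactly as the chained application sqrt(sin(cos(theta))):

theta=0.5
# each stage is one pure function applied to its stdin
echo "$theta" | awk '{ print cos($1) }' \
              | awk '{ print sin($1) }' \
              | awk '{ print sqrt($1) }'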
> When cat writes to stdout, it doesn't block waiting for grep to process that data.
That says nothing more than that nested function invocations admit non-strict evaluation strategies: the argument of a function need not be reduced to a value before it is passed, and the receiving function can begin a calculation that depends on that value before fully obtaining it.
When you expand the actual data dependencies into a tree, it's obvious to see what can be done in parallel.
This looks and reads like AI slop.
Also viewing Unix pipes as some special class of file descriptor because your Intro to OS professor didn't teach you anything more sophisticated than shell pipe syntax is kinda dumb.
File descriptor-based IPC has none of the restrictions discussed in this article. They're not restricted to text (and the author does point this out), they're not restricted to linear topologies, they work perfectly fine in parallel environments (I have no idea what this section is talking about), and in Unix-land processes and threads are identically "heavy" (Windows is different).
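For example, process substitution (bash/zsh) wires ordinary file descriptors into a non-linear graph; in this sketch the two input files are placeholders, and two independent producers feed a single consumer:

# two producers, one consumer: the shell exposes each pipeline as a
# /dev/fd/* path that 'comm' opens like any other file
comm -12 <(sort words_a.txt) <(sort words_b.txt)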
I find Paul's take on simplicity (and complexity) very illuminating.
> The lack of fan-out makes it awkward to express combinations where one sender feeds many receivers. In 1970, avoiding garbage collection was a practical necessity, but today garbage collection is available in most programming workflows and fan-out could be implemented much more easily through message copying rather than consumption.
Fan-out has precisely zero dependency on GC. For example, ‘tee’ has been around for decades and it copies I/O streams just fine.
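A fan-out sketch along those lines, using tee plus bash process substitution (the output filenames are arbitrary), with no garbage collector in sight:

# one sender, two receivers: tee copies every line of the stream to both consumers
seq 1 100 | tee >(grep 7 > sevens.txt) >(grep 5 > fives.txt) > /dev/null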
There has been some effort to build fan-out shells too, with a discussion on HN earlier this month about one called dgsh: https://news.ycombinator.com/item?id=45425298
Edit: I agree with other comments that this feels like AI slop
"It's limited to unstructured text" requires ignoring ASCII unit and record separators. The people who came up with this stuff weren't dumb.