Hi everyone. Posting on HN for the first time. I'd like to share Rill - a toolkit for composable channel-based concurrency that makes it easy to build concurrent programs from simple, reusable parts.
Example of what it looks like:
    // Convert a slice into a channel
    ids := rill.FromSlice([]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, nil)

    // Read users from API with concurrency=3
    users := rill.Map(ids, 3, func(id int) (*User, error) {
        return api.GetUser(ctx, id)
    })

    // Process users with concurrency=2
    err := rill.ForEach(users, 2, func(u *User) error {
        if !u.IsActive {
            u.IsActive = true
            return api.SaveUser(ctx, u)
        }
        return nil
    })

    // Handle errors
    fmt.Println("Error:", err)

Key features:
- Makes concurrent code composable and clean
- Works for both simple cases and complex pipelines
- Built-in batching support
- Order preservation when needed
- Centralized error handling
- Zero dependencies

The library grew from solving real concurrency challenges in production. Happy to discuss the design decisions or answer any questions.

I'm curious - what other technologies/libraries/APIs did you draw on for inspiration, or would you say are similar to Rill?
Is there an underlying assumption that the channels are containers and not streams?
No, it's the opposite - the library treats channels as streams, processing items as they arrive without needing to know the total size in advance. This is why it can handle infinite streams and large datasets that don't fit in memory.
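For instance, an unbounded producer works the same way as a slice-backed one. A minimal sketch, assuming rill.Generate behaves as in the example quoted later in this thread; handleTick is a hypothetical func(int) error consumer:

    // An unbounded stream of ticks; nothing ever buffers the whole stream.
    ticks := rill.Generate(func(send func(int), sendErr func(error)) {
        for i := 0; ctx.Err() == nil; i++ {
            send(i)
        }
    })

    // Items are consumed as they are produced, with concurrency=4.
    err := rill.ForEach(ticks, 4, handleTick)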
Very intuitive API. Thanks!
What sort of environment do you need to be in to have to compose concurrency like this instead of relying on native Go's scaling?
Batching is a pattern I've had to build manually in the past to push large amounts of analytics data to a database. I'd push individual events to be logged, map-reduce those into batches, and then perform insert-on-duplicate-update queries against the database; otherwise the volume of incoming events was enough to saturate the connection pool, making the app inoperable.

Even optimizing to the point where an app instance knew it had already run the insert-on-duplicate-update for a specific unique index (by storing that in a hash map) and from there on out only ran updates to increment the occurrence count of that event was enough to find significant performance gains as well.
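As a point of comparison, here is a rough sketch of that event pipeline using Rill's built-in batching. It assumes rill.FromChan and rill.Batch work as described in the project README; Event, eventsChan, and saveEventBatch are hypothetical names for illustration:

    // Wrap the incoming event channel (no error source here, hence nil).
    events := rill.FromChan(eventsChan, nil)

    // Group into batches of up to 100, flushing at least once per second.
    batches := rill.Batch(events, 100, time.Second)

    // Each batch becomes a single bulk upsert query; concurrency=4 bounds
    // the number of simultaneous database connections.
    err := rill.ForEach(batches, 4, func(batch []Event) error {
        return saveEventBatch(ctx, batch)
    })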
Looks good, similar to https://github.com/sourcegraph/conc which we've been using for a while. Will give this a look.
There are also libraries like https://github.com/Jeffail/tunny or https://pkg.go.dev/go.uber.org/goleak or https://github.com/fatih/semgroup to help deal with concurrency limits and goroutine lifecycle management.
As the author of https://github.com/ahmetb/go-linq, it's hard to find adoption for libraries offering "syntactic sugar" in Go, as the language culture discourages those kinds of abstractions in favor of keeping the code straightforward.
Love the idea, some weirdness though:
> Here's a practical example: finding the first occurrence of a specific string among 1000 large files hosted online. Downloading all files at once would consume too much memory, processing them sequentially would be too slow, and traditional concurrency patterns do not preserve the order of files, making it challenging to find the first match.
But this example will process ALL items, it won't break when a batch of 5 finds something?
It will. Otherwise the example wouldn't make sense. There's one important detail I haven't clarified enough in that part of the readme.
For proper pipeline termination the context has to be cancelled. So it should have been like:

    func main() {
        ctx, cancel := context.WithCancel(context.Background())
        defer cancel()

        urls := rill.Generate(func(send func(string), sendErr func(error)) {
            for i := 0; i < 1000 && ctx.Err() == nil; i++ {
                send(fmt.Sprintf("https://example.com/file-%d.txt", i))
            }
        })
        ...
    }
One of the reasons I've omitted context cancellation in this and some other examples is because everything's happening inside the main function. I'll probably add cancellations to avoid confusion.

Nice! I do a lot of concurrency work with DAGs in https://github.com/purpleidea/mgmt/ and I would love to swap out some of those concurrency runners with a lib if possible.
I was wondering if this could be it... Any thoughts in that direction, please let me know!
I think that using iterators in the public API would have been better than channels.
Rill might look like it tries to be a replacement for iterators, but that's not the case. It's a concurrency library; that's why it's based on channels.
I disagree that using channels is necessary for concurrency. Consider the following iterator-based signature for your Map function:
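(The snippet itself did not survive extraction; a plausible shape for such a signature, sketched here as a hypothetical reconstruction using Go 1.23's iter package, would be:)

    func Map[A, B any](in iter.Seq2[A, error], n int, f func(A) (B, error)) iter.Seq2[B, error]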
That makes Scala look easy.
This is great. I am working on a robotics application and this seems like a better abstraction than alternatives such as local messaging servers. How do you deal with something like back pressure or not keeping up with incoming data?
The lib is based on channels and inherits the channel behavior in terms of backpressure. Simply put, if no one reads on one side of the pipeline, it's not possible to write anything on the other side. Still, it's possible to add buffering at arbitrary points in the pipeline using the rill.Buffer function.
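To illustrate, a small sketch of inserting a buffer between stages; it assumes rill.Buffer takes a channel and a size as described in the README, and sensorIDs, fetchReading, and processReading are hypothetical names:

    // Fetch readings with concurrency=8.
    readings := rill.Map(sensorIDs, 8, fetchReading)

    // Allow up to 1024 in-flight items so a momentarily slow consumer
    // doesn't immediately stall the producers.
    buffered := rill.Buffer(readings, 1024)

    err := rill.ForEach(buffered, 1, processReading)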
I think it is time to face the fact that CSP-style channels are a bad idea if you don't also have the occam semantics for channel scope. (I know there are other people here who understood that sentence.)

The problem in golang, in particular, is that channel cleanup is a mess. In occam, channels come and go simply by existing as variables in SEQ or PAR blocks. Occam is very static, though, so the equivalent of goroutines are all allocated at build time. (occam-pi attempted to resolve this, IIRC.)
https://en.m.wikipedia.org/wiki/Occam_(programming_language)
Some of the patterns in this library are reminiscent of constructs that occam has as primitives, such as the for-each block, although the occam one must have the number of blocks known at build time.
The fact that so many people in golang reach for mutexes constantly is a sign that things are not all well there.
Channels are just another synchronization primitive in your toolbox. They do make some things much simpler, but there's no reason to reach for one if a mutex does the job.
The usage of mutexes doesn't make channels "bad" for the same reason that usage of atomics doesn't make mutexes bad.
This is a real problem in Go: it's very easy to have bugs when working with channels, the way it handles errors, etc.
If you write comprehensive unit tests, it is not easy to have bugs in golang. Especially as things change over time. A library like this isn't going to protect you from having bugs.
TIL: HN doesn't like writing tests. The downvotes on this are hilarious. "Job security" ¯\_(ツ)_/¯.
Hard disagree on this. Large production apps that use channels have very subtle bugs that cause all kinds of annoying issues that only come up under load in prod. I have been using Go for ten years and still pick it as my language of choice for most projects; however, I stay away from channels, and especially any complex use of them, unless it's 100% required to scale the application. Even then, you can most of the time come up with a better solution by re-architecting that part of the application. For pet projects, go crazy with them though.
What are you disagreeing with, exactly? Are you trying to argue against testing? Are you trying to argue that using a library protects you from bugs somehow?
You stay away from something you don't understand after 10 years of working with it? What kind of logic is that? Channels aren't magic.
Subtle bugs in what? Have you considered that maybe you have bugs because you aren't writing tests?
If you aren't unit testing that stuff, then how are you able to fix/change things and know it is resolved?
My experience is that I built a binary that had to run perfectly on 30,000+ servers across 7 data centers. It was full of concurrency. Without a litany of automated tests, there is no way that I would have trusted this to work... and it worked perfectly. The entire deployment cycle was fully automated. If it passed CI, I knew that it would work.
It wasn't easy, it took a lot of effort to build that level of testing. But it was also totally bug free in production, even over years of use and development.
I think you're getting downvoted for the unsupported assertion that "If you write comprehensive unit tests, it is not easy to have bugs in golang." Probably because you made that assertion in the context of a discussion of channels, widely believed to have underlying concurrency semantics that are subtle and easy to misunderstand, making "write comprehensive unit tests" seem like a strategy that's apt to let real-world problems slip through (because a programmer's belief that their tests are "comprehensive" is likely to be mistaken).
Go makes it easier to write concurrent code, but it's a serious chore to iron out all of the kinks in more complex tasks. I've missed some weird stuff over the years.
I don't blame Go. It's an inherently difficult problem space. As a result, testing isn't a trivial job either. I wish it was.
You're getting downvoted because you're essentially arguing that a language abstraction which is a known source of bugs can be solved simply by writing better code, which misses the point of the OP.
They're also suggesting a method of testing which almost certainly doesn't offer sufficient assurance that it will uncover all possible bugs under most circumstances. When I've got concurrency in an application, I'll use unit tests here and there, but mostly I want assurance that the entire system behaves as expected. It's too much complexity to rely on unit tests.
Very true. As the author of several multithreaded applications, I concur that unit testing thread interactions is hard and seldom exhaustive.
Hi! Looks great. I might use this to fan out/in my RSS reader HTTP calls.
How would I implement a timeout, in case an HTTP call takes too long?
You might be interested in something that has been designed specifically for this problem. I created a state machine library for Go, on top of which I mapped retry[1] and some other patterns. And funnily enough, one of the first applications I implemented with it is an RSS reader[2].
[1] https://pkg.go.dev/git.sr.ht/~mariusor/ssm#example-Retry
[2] https://git.sr.ht/~mariusor/frankenjack/tree/master/item/sou...
For now, the library is context-agnostic by design. For HTTP timeouts, you'd use Go's standard approaches: either set the HTTP client timeout or pass a context with timeout to each request. Please let me know more about your use case - I'll let you know if Rill isn't a good fit.
Based on the examples and documentation, rill doesn't manage context for you. You'd simply set the client timeout or give each http call a timeout context.
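As a sketch of the per-request approach, reusing the api.GetUser call from the README example above, a timeout context can be created inside the worker function (the 5-second deadline is an arbitrary placeholder):

    users := rill.Map(ids, 3, func(id int) (*User, error) {
        // Give each request its own deadline.
        reqCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
        defer cancel()
        return api.GetUser(reqCtx, id)
    })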
Any plans to add context support?

I am thinking about it. To be honest, the current design works fine for my use cases: simply put, the function that defines a pipeline should have context.WithCancel() and defer cancel() calls.

I need feedback on this. What kind of built-in context support would work for you? Do you need something like errgroup's ability to automatically cancel the context on the first error?
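For readers unfamiliar with that errgroup behavior, a minimal illustration using golang.org/x/sync/errgroup (userIDs is a hypothetical plain slice; Go 1.22+ per-iteration loop variables assumed):

    g, gctx := errgroup.WithContext(ctx)
    for _, id := range userIDs {
        g.Go(func() error {
            // gctx is cancelled as soon as any goroutine returns an error.
            _, err := api.GetUser(gctx, id)
            return err
        })
    }
    err := g.Wait() // returns the first non-nil error, if any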
The API looks really nice and intuitive! What motivated you to build this?
Very handy!
It looks great. What are the other existing tools, and how does this compare to them?
Sourcegraph Conc is broadly similar in providing pool helpers, but doesn't provide the same fine-grained batching options: https://github.com/sourcegraph/conc
Uber CFF does code generation, and has more of a focus on readability and complex dependency chains: https://github.com/uber-go/cff
The batching concept is a cool idea and could be useful in the right context. That said, this feels like a JavaScript engineer's take on Go. Abstractions like Map and ForEach don't align with Go's emphasis on simplicity and explicitness. The lack of context.Context handling also seems like an oversight, especially when considering concurrency.
Judging by the praise, I'm probably in the minority, but as a code reviewer, I’d much rather see straightforward loops, channels, and Go's native constructs over something like Rill.
I don’t agree with your comment about Map and ForEach, just by virtue of the fact that sync.Map exists in Go’s standard library.
But your point about the lack of contexts is definitely a deal breaker for me personally too.
> Map and ForEach don't align with Go's emphasis on simplicity and explicitness
I've never paid my bills with Go, but `Map` and `ForEach` don't seem all that different from `for _, u := range Users` to me. Yes, the former is "functional", but only mildly.
In that case there's no particular reason to use them. As far as Go's philosophy goes.
touché
If you were to build a library like `rill` in the Go-way, what would your Batch API usage look like?
:= for assigning a variable sounds and looks weird to me.
It's the walrus operator. Pascal and Python use it as well. You get used to it pretty quickly.
Pascal: https://www.freepascal.org/docs-html/ref/refse104.html#x224-... Python: https://docs.python.org/3/whatsnew/3.8.html#assignment-expre...
It seems like something introduced as an April 1st joke. A clean implementation of a programming language wouldn't have that.
That ship has long sailed. ALGOL 1958 used := and Pascal popularized it.
https://en.m.wikipedia.org/wiki/Assignment_(computer_science...
I think they scanned the code two characters at a time, and one character is not enough to distinguish <= and >=, which is why assignment is := or =:.