Show HN: Rill – Composable concurrency toolkit for Go

(github.com)

208 points | by destel 7 months ago ago

127 comments

destel 7 months ago

Hi everyone. Posting on HN for the first time. I'd like to share Rill - a toolkit for composable channel-based concurrency, that makes it easy to build concurrent programs from simple, reusable parts

Example of what it looks like:

    // Convert a slice into a channel
    ids := rill.FromSlice([]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, nil)


    // Read users from API with concurrency=3
    users := rill.Map(ids, 3, func(id int) (*User, error) {
        return api.GetUser(ctx, id)
    })

    // Process users with concurrency=2
    err := rill.ForEach(users, 2, func(u *User) error {
        if !u.IsActive {
            u.IsActive = true
            return api.SaveUser(ctx, u)
        }
        return nil
    })

    // Handle errors
    fmt.Println("Error:", err)

Key features:

  - Makes concurrent code composable and clean
  - Works for both simple cases and complex pipelines
  - Built-in batching support
  - Order preservation when needed
  - Centralized error handling
  - Zero dependencies

The library grew from solving real concurrency challenges in production. Happy to discuss the design decisions or answer any questions.

[-]

slantedview 7 months ago

I'm curious - what other technologies/libraries/APIs, including in other languages, did you draw on for inspiration, or would you say are similar to Rill?

[-]

destel 7 months ago

The short answer would be: I kept writing a code that spawns goroutines, that read from a channel, do some processing and write results to another channel. Add some wait/err groups to this and we'll get a lot of boilerplate repeated all over the place. I viewed this as "channel transformations" and wanted to abstract it away. When generics came out it became technically possible.

Surprisingly, part of my inspiration came from scala (which I haven't touched since 2014). Back then Scala had transformable streams and the "Try" typethen.

akshayshah 7 months ago

The channel-focused approach to stream processing reminds me of Heka [0]. It was a contemporary of Samza and Heron, and it was fairly prominent in the early Go ecosystem (maybe 10 years ago). As I recall it, quite foggily and quite a long while later, one of the final nails in Heka's coffin was that channel throughput didn't scale well. Do you have benchmarks for Rill, or is it not intended for high-throughput use cases?

[0]: https://github.com/mozilla-services/heka

[-]

destel 7 months ago

I have some benchmarks in the project's wiki on Github. I can confirm your point: Rill's main bottleneck is channel operations, the library itself adds negligible overhead on top. Of course, for multi-stage pipelines the number of channel operations grows.

To summarize, I believe Rill's performance would be fine for any problem solvable with Go channels. I've used it for a wide variety of use cases - while I'm not sure I can call them high-throughput, I've had pipelines transferring hundreds of GBs of data with no performance issues.

minus7 7 months ago

Hey,

I found your library a few weeks ago when I was annoyed by nothing like this being built into the standard library. It’s been a breeze to use so far.

A neat trick I found to gauge bottlenecks in pipelines is using Buffers between steps and running a goroutine that periodically prints `len(buffered)/cap(buffered)`

[-]

destel 7 months ago

Thank you very much for the feedback. I thought about something similar some time ago. Buffer of size one, then measure the average time each item spends in the buffer. But for debugging your approach is simpler and more practical.

limit499karma 7 months ago

Is there an underlying assumption that the channels are containers and not streams?

[-]

destel 7 months ago

No, it's the opposite - the library treats channels as streams, processing items as they arrive without needing to know the total size in advance. This is why it can handle infinite streams and large datasets that don't fit in memory.

fillskills 7 months ago

Very intuitive API. Thanks!

Groxx 7 months ago

Hmmm. Some stuff to like, but I do feel like this should have a big, noticeable cautionary note that it does not wait if you end early (e.g. via Err). Any pending actions continue running and draining in background goroutines, and are potentially VERY delayed if internal/core.Delay is ever exposed or your funcs sleep.

I've seen that kind of pattern lead to TONS of surprise race conditions in user-code, because everyone always thinks that "it returned" means "it's done". Which is reasonable, nothing else is measurable on the caller side - changing that won't be noticeable by callers and may violate their expectations and cause crashes.

[-]

destel 7 months ago

Thank you for the feedback. I agree with your point.

The current solution is to make pipeline stages context-aware (which is often happens automatically) and cancel the context before returning. This is the responsibility of the user and can lead to problems you described.

I haven't yet found a better solution to this. On the other hand, exact same thing happens if your function spawns a goroutine and returns. That goroutine would run until done, unless it's context aware.

Regarding the "Delay" and "infiniteBuffer" functions - these are part of the work on adding support for feedback loops to Rill. I haven't yet found a reliable and user friendly way to do it, so this work is on hold for now.

jbendotnet21 7 months ago

Looks good, similar to https://github.com/sourcegraph/conc which we've been using for a while. Will give this a look.

[-]

alpb 7 months ago

There are also libraries like https://github.com/Jeffail/tunny or https://pkg.go.dev/go.uber.org/goleak or https://github.com/fatih/semgroup to help deal with concurrency limits and goroutine lifecycle management.

As the author of https://github.com/ahmetb/go-linq, it's hard to find adoption for libraries offering "syntactic sugar" in Go, as the language culture discourages those kind of abstractions and keeping the code straightforward.

purpleidea 7 months ago

Nice! I do a lot of concurrency work with DAG's in https://github.com/purpleidea/mgmt/ and I would love to swap out some of those concurrency runners with a lib if possible.

I was wondering if this could be it... Any thoughts in that direction, please let me know!

[-]

destel 7 months ago

Thank you! I took a quick look at mgmt, it's quite a large and complex project. I'd need to better understand your DAG-based concurrency patterns to say if Rill would be a good fit. Could you share some examples of the concurrent runners you're thinking about? This would help me understand if/how Rill might be useful for your case.

[-]

purpleidea 7 months ago

So we have two concurrency "engines" in mgmt:

(1) One for running the function graph, and (2) one for running the resource graph.

(1) https://github.com/purpleidea/mgmt/tree/master/lang/funcs/da... (2) https://github.com/purpleidea/mgmt/tree/master/engine/graph

TBQH, I think I'm decent at concurrency, but I never considered it my passion or expertise and while both of those _work_ I do know that I have some concurrency bugs hanging out, and I'd love real professional help here.

In the odd chance you'd be willing to hack on things with me, I'd be particularly elated. There might be a possible symbiosis!

I'm @purpleidea on Mastodon, Matrix (where we have an #mgmtconfig room) and a few other places in case you'd like to chat more!

gregwebs 7 months ago

Thanks for sharing what is working for you in production. I made a somewhat similar library that also does batching (docs are sparse, but I am updating docs on my libraries this week) [1].

I would call this parallelism rather than concurrency.

The main issue I have with this library's implementation is how errors are handled. Errors are retrieved rather than assigned- but assignment is preferable because it gets verified by tools. In my library I used a channel for errors- that gives ultimate flexiblitiy- it can be converted to wait and collect them to a slice or to perform a cancellation on first error.

[1] https://github.com/gregwebs/go-parallel

[-]

destel 7 months ago

Thank you for the feedback. My design decision is of course a tradeoff. When multiple channels are exposed to the users (not encapsulated inside the lib), this forces them to use "select". And this is very error prone in my experience

[-]

gregwebs 7 months ago

I never needed to use select on an error channel for my use cases because at the point I operate on the error channel I want to block for completion. And I provide helpers for the desired behavior for the channel so I don't even directly receive from it. I see that some of Rill is designed to operate on continuous streams, and in that light the design decision makes sense. For my use cases though the stream always had an end.

tschellenbach 7 months ago

this is a real problem in go, very easy to have bugs when working with channels and the way it handles errors etc.

[-]

latchkey 7 months ago

If you write comprehensive unit tests, it is not easy to have bugs in golang. Especially as things change over time. A library like this isn't going to protect you from having bugs.

TIL: HN doesn't like writing tests. The downvotes on this are hilarious. "Job security" ¯\_(ツ)_/¯.

[-]

bogota 7 months ago

Hard disagree on this. Large production apps that use channels have very subtle bugs that cause all kinds of annoying issues that only come up under load in prod. I have been using go for ten years and still pick it as my language of choice for most projects however I stay away from channels and especially any complex use of them unless it’s 100% required to scale the application. Even then you can most of the time come up with a better solution by re architecting that part of the application. For pet projects go crazy with them though.

[-]

latchkey 7 months ago

What are you disagreeing with exactly, are you trying to argue against testing? Are you trying to argue that using a library protects you from bugs somehow?

You stay away from something you don't understand after 10 years of working with it? What kind of logic is that? Channels aren't magic.

Subtle bugs in what? Have you considered that maybe you have bugs because you aren't writing tests?

If you aren't unit testing that stuff, then how are you able to fix/change things and know it is resolved?

My experience is that I built a binary that had to run perfectly on 30,000+ servers across 7 data centers. It was full of concurrency. Without a litany of automated tests, there is no way that I would have trusted this to work... and it worked perfectly. The entire deployment cycle was fully automated. If it passed CI, I knew that it would work.

It wasn't easy, it took a lot of effort to build that level of testing. But it was also totally bug free in production, even over years of use and development.

[-]

edvinbesic 7 months ago

You asserted that bugs are hard if you write unit tests. The parent stated that some issues only occur under production load and a unit test will not catch it. Nowhere was it implied that unit tests are useless.

Perhaps a less defensive posture might invite more discussion.

[-]

latchkey 7 months ago

> The parent stated that some issues only occur under production load and a unit test will not catch it.

I can't think of a single production problem that can't be replicated with a unit test. If you're seeing a problem in production, you need to fix it. How do you fix it? You write a test that replicates the problem and then fix the code, which fixes the test.

[-]

zaptheimpaler 7 months ago

> If you write comprehensive unit tests, it is not easy to have bugs in golang.

First you claimed before that unit tests will catch your subtle concurrency bugs before they happen, and that's just not often the case. They are subtle, might involve many systems and weird edge cases and often don't get caught BEFORE they happen. Of course anyone can write a test to replicate the problem after seeing it in production and spending hours/days debugging it.

More importantly, "Write comprehensive tests" is technically a strategy to avoid any bug ever. You can tell C programmers not to segfault by writing comprehensive tests but that doesn't negate the point that the language makes it easy to write segfaults. "Write more tests" is not a rebuttal to saying C makes some classes of errors easy to write. Writing comprehensive tests often takes a lot of time, is often not prioritized by companies, and is especially hard with distributed systems, concurrency, mocks and complex applications. If we just said "git gud noob" in the face of error prone and difficult abstractions, we might as well all be using assembly.

[-]

latchkey 7 months ago

Why are you comparing golang to C?

[-]

ctvo 7 months ago

Your replies here have been less than useless. I clicked on your profile and saw you were "founder and CEO" of some company.

I guarantee you I won't be using your product. Just something to consider.

[-]

latchkey 7 months ago

Thank you for sharing your perspective. I genuinely appreciate honest feedback. My goal is always to add value to discussions, but it seems I’ve fallen short in this instance. If there’s a specific way I could clarify or improve my comments, I’d be grateful to hear it.

Regarding my company, I respect your decision, but I hope that if our paths cross again, I might have the opportunity to change your mind through actions that demonstrate the value we provide to our customers.

[-]

random_mutex 7 months ago

value := int16((any(0xcafe)).(int))

mplewis 7 months ago

Then you have not worked on complicated systems.

[-]

latchkey 7 months ago

Why would anyone even say this to someone? Totally rude.

[-]

noop2714 7 months ago

Reading through the thread, I can tell you have depth of experience.

Perhaps bringing it down a notch could help you connect with others’ perspectives as well.

[-]

latchkey 7 months ago

If people were actually sharing perspectives (like the guy above), that would be fantastic.

glzone1 7 months ago

The original comment was about how concurrency expands / makes it easier for there to be errors in go (which avoids LOTs of other errors just with compile time / type safety stuff).

"very easy to have bugs when working with channels and the way it handles errors etc"

If you've done some programming you'll find this to be true. You have to think a LOT harder if doing concurrency, and you generally have to do a lot more tests.

Go - WITHOUT that much testing, is often surprisingly error free compared to more dynamic languages just out of the box both language side and how a lot of development happens. Python by contrast, you can have errors in dependencies, in deployment environment (even if code is fine), based on platform differences (tz data on windows), and plenty of runtime messes.

Channels are not as default safe / simple after compile as a lot of other go.

Try programming without channels in go and this may become clearer.

tmoertel 7 months ago

I think you're getting downvoted for the unsupported assertion that "If you write comprehensive unit tests, it is not easy to have bugs in golang." Probably because you made that assertion in the context of a discussion of channels, widely believed to have underlying concurrency semantics that are subtle and easy to misunderstand, making "write comprehensive unit tests" seem like a strategy that's apt to let real-world problems slip through (because a programmer's belief that their tests are "comprehensive" is likely to be mistaken).

[-]

steve_adams_86 7 months ago

Go makes it easier to write concurrent code, but it's a serious chore to iron out all of the kinks in more complex tasks. I've missed some weird stuff over the years.

I don't blame Go. It's an inherently difficult problem space. As a result, testing isn't a trivial job either. I wish it was.

[-]

latchkey 7 months ago

It is not a chore, it is our job. This is what we do. We write code. Of course you've missed stuff, we all have. Tests help alleviate the missed stuff. Even better is that they protect us over time, especially when we want to refactor things or find bugs in production. How do you fix a production bug without breaking something else? You write tests so that you know your code works.

Again with the HN downvotes, hilarious. People really hate the truth.

[-]

tmoertel 7 months ago

I think what you're missing is that "you write tests so that you know your code works" doesn't actually work for some important classes of "works," security and concurrency (the subject of this HN discussion) being two prominent ones. That's because testing only shows that your code works for the cases you test. And, when it comes to security and concurrency, identifying the cases that matter is a hard enough problem that if you can figure out how to do it more reliably, that's publishable research.

Think about it: If you're writing code and don't realize that it can deadlock under certain conditions, how are you going to realize that you need to test for whether it deadlocks under those conditions? If you're writing code that interpolates one kind of string into another and don't realize that you may have created an XSS vulnerability, are you suddenly going to gain that insight when you're writing the tests?

[-]

latchkey 7 months ago

You run your code in production, you see it is deadlocking and you fix it. How do you fix it? You write a test the reproduces the deadlock, you fix the code and your test passes. Just like any other testing.

I'm not arguing that you magically predict everything that's going to happen. But, without those tests and the culture of writing tests and writing your code to be testable, you're screwed when you do have bugs in production.

[-]

tmoertel 7 months ago

What you wrote was "you write tests so that you know your code works," but what you seem to have meant is "you write tests so that you when you get burned in production by problems that your tests didn't anticipate, you can write more tests at that time to make sure those newly discovered problems don't burn you again."

That's nice, but it's far from "knowing your code works". When code works, it doesn't burn you in production.

[-]

latchkey 7 months ago

This nit pick is peak HN obstinacy.

How do you know your code is going to work in production before you got there? You wrote tests.

[-]

tmoertel 7 months ago

You're missing something important.

You know your code is going to work in production before you got there not because you wrote tests but because you thought about what it means for your code to work in production and then came up with a plan to generate the required confidence that it actually will work in production. And, for any nontrivial system, tests can only satisfy part of that plan.

The goal isn't to have well-tested code. The goal is to have code that you can easily be confident will work as intended. And testing is only good at establishing some kinds of that confidence. For the other kinds, such as confidence you're not going to launch a bunch of security disasters or concurrency landmines, you need to do something else: types, proofs, correctness by construction, model checking, and so on.

I wrote about this idea almost twenty years ago: https://blog.moertel.com/posts/2006-10-10-unit-testing-is-a-...

I elaborated in a later post: https://blog.moertel.com/posts/2012-04-15-test-like-youre-be...

That you seem to believe otherwise is probably why a lot of people are having trouble with your claim that that world doesn't need better concurrency abstractions, just more tests because "if you write comprehensive unit tests, it is not easy to have bugs in golang."

[-]

latchkey 7 months ago

Not a single person has spoken up to say that they write a lot of tests AND they have a lot of bugs.

All the negativity (downvotes) has come from people who are trying to argue that writing tests doesn't solve the problem of bugs. The same people who don't write a lot of tests AND have a lot of bugs. ¯\_(ツ)_/¯

I write a lot of tests and I don't have bugs. I have decades of experience and millions of lines of code, with this simple fact. I know it is true, at least for me.

I don't know what else to bike shed here other than the constant downvoting by people who somehow don't believe my claim. The loss in karma doesn't bother me, I know I'm right on this and it appears as though the only people who disagree with me are the same people who don't write tests (and have a lot of bugs).

golang is a relatively simple language. It is why I like it so much. Occasionally there are somewhat difficult things to reason about, but if you write golang code that is easily testable (and this requires thought and planning), then my experience is that even the "harder" channel/goroutine code can always be tested in one way or another.

[-]

tmoertel 7 months ago

> I know I'm right on this and it appears as though the only people who disagree with me are the same people who don't write tests (and have a lot of bugs).

For the record, I write lots of tests, and don't have bugs. I even wrote a testing framework. Nobody is arguing that writing tests is dumb. The pushback is on your insistence that writing tests is all you need:

> All the negativity (downvotes) has come from people who are trying to argue that writing tests doesn't solve the problem of bugs.

Writing tests doesn't solve the problem of security bugs. Writing tests doesn't solve the problem of concurrency bugs. Writing tests to prove your code is bug free in those cases is expensive and error prone.

People who care about these things know to go beyond testing when testing isn't enough. That's why things like model checkers exist.

Nobody is arguing that tests are dumb. The argument is that if writing tests is all you're doing to get the bugs out of your code, you probably aren't very effective at preventing certain classes of problems.

For instance: Show me the tests you'd write to prove your software doesn't have XSS vulnerabilities.

[-]

latchkey 7 months ago

> Show me the tests you'd write to prove your software doesn't have XSS vulnerabilities.

I'd have tests around the code that renders 3rd party user input and integration tests for the display of the data.

I've built some of the most heavily trafficked websites on the planet (porn), with user input (comments) and have never had an XSS issue.

[-]

tmoertel 7 months ago

Show me the tests.

If you can't show me the tests you'd use to prove you don't have XSS problems, it's hard to believe that your tests are effective at preventing XSS problems.

> I've built some of the most heavily trafficked websites on the planet (porn), with user input (comments) and have never had an XSS issue.

Right, because the gold standard for proof in the security field is "we never had [read: noticed] an issue."

[-]

latchkey 7 months ago

It was code written in 2009 and private, not open source and I of course didn't take it with me when I left the company. I ran it for 4 years and we never had a single security incident. We took it very seriously. Partly because our code (in Java) was a rewrite from some really buggy PHP, that did in fact have a bunch of holes in it (and no testing).

You're also being absurd. We started this talking about golang testing and it has somehow gone off the rails to me having to prove things to you about XSS? Come on, what is with the hostility? Is this how you treated people while working at Google?

[-]

tmoertel 7 months ago

I'm only asking you to show me how you'd write tests to detect XSS (or concurrency) problems. In Go or the language of your choice. You've claimed that writing tests as all you need. I'm asking you to show how it's done. Just in general. No need to share actual code you've written in the past.

steve_adams_86 7 months ago

> It is not a chore, it is our job. This is what we do.

I'm not sure how to meaningfully distinguish the two here. I'm saying it takes extra effort, not denying that it's my job. It's non-trivial, that's all I'm trying to say.

> How do you fix a production bug without breaking something else? You write tests so that you know your code works.

Of course, you're right. Sometimes writing the tests can be harder than writing the code, though.

hnlmorg 7 months ago

You’re getting downvoted because you’re essentially arguing that a language abstraction which is a known source of bugs can be solved simply by writing better code.

which misses the point of the OP.

[-]

steve_adams_86 7 months ago

They're also suggesting a method of testing which almost certainly doesn't offer sufficient assurance under most circumstances will uncover all possible bugs. When I've got concurrency in an application, I'll use unit tests here and there, but mostly I want assurance that the entire system behaves as expected. It's too much complexity to rely on unit tests.

[-]

hnlmorg 7 months ago

Very true. As an author of a several multithreaded applications, I concur that unit testing thread interactions is hard and seldom exhaustive.

[-]

latchkey 7 months ago

It is not exhaustive because you haven't taken the effort to do it. It isn't easy, you have to write you code in a way that can be tested. It takes planning and effort to do this, but it pays off with having applications that aren't full of bugs.

[-]

macintux 7 months ago

You sound like the people who argue that, despite decades of security vulnerabilities offering evidence otherwise, C is perfectly safe if you know what you’re doing and just put more effort into it.

Technically you may be right, but it’s not a helpful viewpoint. What the world needs are abstractions that are easier to understand and program correctly, not assertions that everyone else is doing it wrong and just needs to be smarter/work harder.

[-]

latchkey 7 months ago

That's absurd, I'm not arguing anything of the sort.

If you want an analogy, I'm arguing that a condom helps prevent STD's and unwanted pregnancies. It isn't perfect, but at least it is better than not wearing a condom at all. Nobody loves wearing a condom, but we do it cause it is the right thing to do.

[-]

dymk 7 months ago

You can't prove away chlamydia or an unwanted pregnancy, but you can provably eliminate whole classes of logic errors by having more powerful type systems.

[-]

latchkey 7 months ago

100%. Compilers for the win for sure. It was a big reason why I started to write Java code and move away from PHP. This lead me to co-found Java @ Apache.

hnlmorg 7 months ago

It’s not exhaustive because complex multi-threaded software has a plethora of hidden edge cases, many of which actually fall outside the traditional remit of a unit test.

This is where other forms of software testing come into play. Such as integration tests.

[-]

latchkey 7 months ago

wtf is a hidden edge case? Is that like flying a plane with blinders on? Come on...

You write tests to cover edge cases. If you miss one, you write more tests when you come across them. This isn't magic.

[-]

hnlmorg 7 months ago

> If you miss one, you write more tests when you come across them.

…before such point you have hard to find bugs in your software.

And that’s the crux of everyone’s argument against your “just write better code” fallacy ;)

> This isn't magic.

No it’s not. That’s why people disagree with your assessment that unit tests can catch every type of bug.

If unit tests really were that magical then we wouldn’t need for other methods of tests.

I mean, do you even know what a unit test is? It’s meant to be self-contained but the problem with multithreaded code is that you can start to introduce side effects that happen outside that functions scope.

I’ve got exactly that issue right now calling OS APIs from cgo. Some of those APIs (particularly with SDL) are very thread sensitive and you cannot unit test for that because the problem lies outside that functions scope. So the only way to test for that is with e2e or integration tests instead.

[-]

latchkey 7 months ago

I didn't say "just write better code", I said write more tests. Writing those tests will produce better code.

I also never said unit tests were the ONLY tests.

[-]

hnlmorg 7 months ago

But you were ONLY talking about unit tests for catching concurrency bugs; bugs which often fall outside the scope of a unit being tested.

Nobody disagrees with your point about the importance of unit tests. But the way you present that point is more than a little naive (and that’s the polite way of putting it).

lenkite 7 months ago

> If you write comprehensive unit tests...

latchkey 7 months ago

All this "complexity" can be unit tested, I've done it.

Trying to handwave and say that your code is too complex to be tested is very strange to me. Maybe take a step back and ask yourself why your code is too complicated to test. Maybe refactor it a bit to be less complicated or more easily testable.

dboreham 7 months ago

This is exactly the kind of bug that unit tests are poor at exposing.

kermatt 7 months ago

Name overlap with another Go based package: https://github.com/rilldata/rill

[-]

taffit 7 months ago

This also came to my mind when I heard `rill`, coming from https://www.rilldata.com/ .

dangoodmanUT 7 months ago

Love the idea, some weirdness though:

> Here's a practical example: finding the first occurrence of a specific string among 1000 large files hosted online. Downloading all files at once would consume too much memory, processing them sequentially would be too slow, and traditional concurrency patterns do not preserve the order of files, making it challenging to find the first match.

But this example will process ALL items, it won't break when a batch of 5 finds something?

[-]

destel 7 months ago

It will. Otherwise the example wouldn't make sense. There's one important detail I haven't clarified enough in that part of the readme.

For proper pipeline termination the context has to be cancelled. So it should have been be Like:

func main() { ctx, cancel := context.WithCancel(context.Background()) defer cancel()

    urls := rill.Generate(func(send func(string), sendErr func(error)) {
        for i := 0; i < 1000 && ctx.Err() == nil; i++ {
                send(fmt.Sprintf("https://example.com/file-%d.txt", i))
        }
    })
    ...

One of the reasons I've ommited context cancellation in this and some other examples is because everything's happening inside the main function. I'll probably add cancellations to avoid confusion.

destel 7 months ago

I've also just pushed few small changes to the readme that clarify this things.

7 months ago

[deleted]

lspears 7 months ago

This is great. I am working on a robotics application and this seems like a better abstraction than alternatives such local messaging servers. How do you deal with something like back pressure or not keeping up with incoming data?

[-]

destel 7 months ago

The lib is based on channels and inherits the channel behavior in terms of backpressure. Simply put if no-one reads on one side of the pipeline, it wouldn't be possible to write anything on the other side. Still, it's possible to add buffering at arbitrary points in the pipeline using the rill.Buffer function.

0x696C6961 7 months ago

I think that using iterators in the public API would have been better than channels.

[-]

destel 7 months ago

Rill might look like it tries to be a replacement for iterators, but it's not the case. It's a concurrency library, that's why it's based on channels

[-]

0x696C6961 7 months ago

I disagree that using channels is necessary for concurrency. Consider the following iterator based signature for your Map function:

    func Map[A, B any](in iter.Seq[Try[A]], n int, f func(A) (B, error)) iter.Seq[Try[B]]

[-]

jerf 7 months ago

Channels are many-to-many; many goroutines can write to it simultaneously, as well as read from it. This library is pitched right at where people are using this, so it's rather a key feature. An iterator is not. Even if you wrap an iterator around a channel you're still losing features that channels have, such as, channels can also participate in select calls, which has varying uses on the read and write sides, but are both useful for a variety of use cases that iterators are not generally used for.

They may not be "necessary" for concurrency but given the various primitives available they're the clear choice in this situation. They do everything an iterator does, plus some stuff. The only disadvantage is that their operations are relatively expensive, but as, again, we are already in a concurrency context, that is already a consideration in play.

[-]

0x696C6961 7 months ago

The ability to participate in select statements is a good call out. Thanks for taking the time to reply.

eweise 7 months ago

that makes Scala look easy.

[-]

0x696C6961 7 months ago

For context, here's the existing Map signature from the linked library:

    func Map[A, B any](in <-chan Try[A], n int, f func(A) (B, error)) <-chan Try[B]

Are you suggesting that this channel based signature is significantly easier to understand than the one I shared?

[-]

eweise 7 months ago

No, it was more of a general comment that once Go added support for generics, doing functional style programming starts to look as complex (or more actually) than languages that built support from the beginning.

[-]

0x696C6961 7 months ago

Which part of the signature are you struggling with? Maybe I can help you understand.

destel 7 months ago

[dead]

hu3 7 months ago

Hi! Looks great. I might use this to fan out/in my RSS reader HTTP calls.

How would I implement timeout? In case a HTTP call takes too long?

[-]

mariusor 7 months ago

You might be interested by something that has been designed specifically for this problem. I created a state machine library for Go on top of which I mapped retry[1] and some other patterns. And funnily enough one of the first applications I implemented is an RSS reader[2]

[1] https://pkg.go.dev/git.sr.ht/~mariusor/ssm#example-Retry

[2] https://git.sr.ht/~mariusor/frankenjack/tree/master/item/sou...

destel 7 months ago

For now, the library is context-agnostic by design. For HTTP timeouts, you'd use Go's standard approaches: either set the HTTP client timeout or pass a context with timeout to each request. Please let me know more about your use case - I'll let you know if Rill isn't a good fit.

lsaferite 7 months ago

Based on the examples and documentation, rill doesn't manage context for you. You'd simply set the client timeout or give each http call a timeout context.

linhns 7 months ago

Looks super neat! Just a small question: Why did you chose the name Try instead of Result?

[-]

destel 7 months ago

Thank you. I "stealed" the name from scala. They have the similar value+error type. Maybe in context it rill the better name could have been "Item"

qaq 7 months ago

any plans to add context support ?

[-]

destel 7 months ago

I am thinking on it. To be honest, the current design works fine for my use cases: simply put, the function that defines a pipeline should have context.WithCancel() and defer cancel() calls.

I need a feedback on this. What kind of builtin context support would work for you? Do you need something like errgroup's ability to automatically cancel the context on first error?

destel 7 months ago

I've just pushed few small changes to the readme that better explain context usage

lyxell 7 months ago

The API looks really nice and intuitive! What motivated you to build this?

[-]

destel 7 months ago

Thank you! There are two pieces of motivation here. The first one is removing boilerplate of spawning goroutines that read from one channel and write to another. Such code also needs wait/err group to properly close the output channel. I wanted to abstract this away as "channel transformation" with user controlled concurrency.

Another part is to provide solutions for some real problems I had. Most notably batching, error handling (in multi stage pipelines) and order preservation. I thought that they are generic enough to be the part of general purpose library.

7 months ago

[deleted]

icar 7 months ago

This reminds me of rxjs.

fidotron 7 months ago

I think it is time to face the fact CSP style channels are a bad idea if you don’t also have the occam semantics for channel scope. (I know there are other people here that understood that sentence).

The problem in golang is the channel cleanup, particularly. is a mess. In occam they come and go simply by existing as variables in seq or par blocks. Occam is very static though, so the equivalent to goroutines are all allocated at build. (Occam pi attempted to resolve this iirc).

https://en.m.wikipedia.org/wiki/Occam_(programming_language)

Some of the patterns in this library are reminiscent of constructs occam has as basic constructs, such as the for each block, although the occam one must have the number of blocks known at build time.

The fact so many people in golang reach for mutexes constantly is a sign things are not all well there.

[-]

dboreham 7 months ago

+1 for understanding the sentence.

0x696C6961 7 months ago

Channels are just another synchronization primitive in your toolbox. They do make some things much simpler, but there's no reason to reach for one if a mutex does the job.

The usage of mutexes doesn't make channels "bad" for the same reason that usage of atomics doesn't make mutexes bad.

pajeetz 7 months ago

what sort of environment do you need to be in to have to compose concurrency like this instead of relying on native go's scaling?

[-]

jerf 7 months ago

The same sort of environment in which one uses such abstractions like "functions" instead of relying on the language's native ability to run sequential instructions.

It's generally good for languages to provide relatively low-level functionality and let libraries be able to build on top of it, because as the programming language development world has now learned many times over, the hardest code to change is the code in the language and its standard library. It isn't the job of the language itself to provide every possible useful iteration on the base primitives it provides.

dougbarrett 7 months ago

Batching is a pattern I’ve had to manually build in the past to push large amounts of analytic data to a database. I’d push individual events to be logged, map reduce those in batches and then perform insert on duplicate update queries on the database, otherwise the threshold of incoming events was enough to saturate the connection pool making the app inoperable.

Even optimizing to where if an app instance new it ran the inert on update for a specific unique index by storing that in a hash map and only running updates from there on out to increase the count of occurrences of that event was enough to find significant performance gains as well.

c4pt0r 7 months ago

Very handy!

pezo1919 7 months ago

How does it compare to RxGo?

[-]

destel 7 months ago

I have to be honest - I haven't ever heard about it. Just checked and found it's very mature and popular, though it seems to have had no commits in the last 2 years.

After quick scanning, I'd highlight these 2 differences: - Rill is type safe and uses generics - Rill does not try to hide channels, it just provides primitives to transform them

noctane 7 months ago

It looks great. What are other existing tools? And how do they compare to them?

[-]

Scaevolus 7 months ago

Sourcegraph Conc is broadly similar in providing pool helpers, but doesn't provide the same fine grained batching options: https://github.com/sourcegraph/conc

Uber CFF does code generation, and has more of a focus on readability and complex dependency chains: https://github.com/uber-go/cff

[-]

noctane 7 months ago

Thanks for sharing!

izolate 7 months ago

The batching concept is a cool idea and could be useful in the right context. That said, this feels like a JavaScript engineer's take on Go. Abstractions like Map and ForEach don't align with Go's emphasis on simplicity and explicitness. The lack of context.Context handling also seems like an oversight, especially when considering concurrency.

Judging by the praise, I'm probably in the minority, but as a code reviewer, I’d much rather see straightforward loops, channels, and Go's native constructs over something like Rill.

[-]

ARandomerDude 7 months ago

> Map and ForEach don't align with Go's emphasis on simplicity and explicitness

I've never paid my bills with Go, but `Map` and `ForEach` don't seem all that different than `for _, u := range Users` to me. Yes, the former is "functional" but only mildly.

[-]

prisenco 7 months ago

In that case there's no particular reason to use them. As far as Go's philosophy goes.

[-]

ARandomerDude 7 months ago

touché

zendist 7 months ago

If you were to build a library like `rill` in the Go-way, what would your Batch API usage look like?

hnlmorg 7 months ago

I don’t agree with your comment about Map and ForEach, just by virtue of the fact that sync.Map exists in Go’s standard library.

But your point about the lack of contexts is definitely a deal breaker for me personally too.

[-]

akshayshah 7 months ago

The "map" under discussion here is very different from sync.Map. The discussion here is focused on the "map" primitive from functional programming - transforming a collection by applying a function to each element.

sync.Map is a concurrency-safe hash map. Same name, totally different thing.

7 months ago

[deleted]

fatih-erikli-cg 7 months ago

:= for assinging a variable sounds and looks weird to me

[-]

steve_adams_86 7 months ago

It's the walrus operator. Pascal and Python use it as well. You get used to it pretty quickly.

Pascal: https://www.freepascal.org/docs-html/ref/refse104.html#x224-... Python: https://docs.python.org/3/whatsnew/3.8.html#assignment-expre...

[-]

fatih-erikli-cg 7 months ago

It was something introduced like a 1 april joke. Clean implementation of any programming language won't have that.

[-]

hnlmorg 7 months ago

I actually think it’s more readable because it makes the distinction between assignment and equivalence very clear.

bugs originating from the similarity of == vs = has probably cost the industry millions over the last 3 decades.

kortex 7 months ago

That ship has long sailed. ALGOL 1958 used := and Pascal popularized it.

https://en.m.wikipedia.org/wiki/Assignment_(computer_science...

[-]

fatih-erikli-cg 7 months ago

I think they scan the code by two characters because one is not enough for <= and => so what is why assignment is := or =:. Probably + is ++ too.

linhns 7 months ago

I think it's good. 2 characters for declaration + assign.

indulona 7 months ago

it's just a shorthand for var foo int = 5 vs foo := 5 where the type is derived from the assigned value.

[-]

7 months ago

[deleted]