Async in Rust seems to be modelled after async in .NET.
What similarities are there?
I really liked the footnote implementation so I checked out his GitHub to see what kind of static site generator he was using, and it looks like he wrote his own? What a baller.
:-D The styling is adapted from https://edwardtufte.github.io/tufte-css/
I am definitely going to have to check that out for my own site!
It's also custom logic to embed the code snippets within the article, they're all working code that gets pulled in from rust files. Really stellar stuff.
Yeah! I have a whole laundry list of things I want to improve on my site now
In part two, the author explains trait objects in a way that is, I think, a little misleading.
They're right that trait objects are dynamically sized types, which means they can't be passed by value to functions, but wrong that they need to be boxed; they can instead be put behind a reference. Both of the following are valid types.
type DynFutureBox = Pin<Box<dyn Future<Output = ()>>>;
type DynFutureRef<'f> = Pin<&'f dyn Future<Output = ()>>;
You can see this in the Rust Playground here: https://play.rust-lang.org/?version=stable&mode=debug&editio...
Technically trait objects aren't entirely a thing at all. They're a concept that only makes sense in the context of a pointer (references are safe pointers, `Box`es are smart pointers). You can refer to something as a trait object, but the trait of the object is a property of the pointer, not of the object. So if you have some struct that implements a trait, you can cast a pointer to that struct into a pointer to a trait object, but the struct never stops being the struct; a trait object is just a different way of referring to it.
What about impl Trait then? In that case traits make sense without pointers.
To me traits are like a definition of capabilities. A way to duck type things.
"Trait objects" is lingo for `dyn Trait`. They are a distinct thing from just "trait". They allow virtual dispatch at runtime.
See https://doc.rust-lang.org/reference/types/trait-object.html (`dyn Trait`, runtime dynamic dispatch) vs https://doc.rust-lang.org/reference/types/impl-trait.html (`impl Trait`, compile-time monomorphization)
`impl Trait` is sort of like syntax sugar for generics (this is not the full story, for example TAIT/type_alias_impl_trait... but it's close enough). It's monomorphized just like generics are. If you have a method that takes an `impl Trait`, then a new copy of the method will be emitted by the compiler for each unique type you pass to that `impl Trait` parameter.
Traits conceptually are kind of like definitions of capabilities. So you're not really wrong about that; that understanding may even help you.
My only experience with Rust has been mostly synchronous, with little to show in terms of async. And I liked Rust. When it compiled, I was damn sure it was going to run. There is comfort in that: "Tell me what to fix". The clippy stuff etc. was great too.
I read the 3 parts of this website and 'Wowsa'... I'm definitely not going in that direction with Rust. I'll stick to dumb Go code, Swift or Python if I do async heavy stuff.
It's hard enough to write good code, I don't see the point of finding walls to smash my head into.
Think about it, if you write a lot of async code, chances are you have a ton of latency, waiting on I/O, disk, network etc. Using Rust for this in the first place isn't the wisest since performance isn't as important, most of your performance is wasted 'waiting' on things. Besides Rust wants purity and I/O is gritty little dirt.
Sorry my comment turned into a bit of a hot take, feel free to disagree. This async stuff, doesn't look fun at all.
>Think about it, if you write a lot of async code, chances are you have a ton of latency, waiting on I/O, disk, network etc. Using Rust for this in the first place isn't the wisest since performance isn't as important, most of your performance is wasted 'waiting' on things. Besides Rust wants purity and I/O is gritty little dirt.
But isn't most code going to perform some I/O at some point? Whether it's calling an API, writing to disk, or writing to a DB?
Having written a lot of asynchronous code in python and in rust, I’d take Rust any day. If it compiles, it works.
I also don’t think it’s hard to reason about in practice. Tutorials tend to get much deeper into the weeds than you typically need to go.
Performance is not only wall clock time. With high latency, I/O bound tasks, the cost will be often determined by how much memory you need. And in the cloud, you can’t pay for memory alone. The more memory you need, the more vcores you have to reserve. You might end up in a situation your cpus are mostly idling, but you can’t use fewer cores because of RAM needed to keep the state of thousands of concurrent sessions. In this case Rust async can bring a tremendous advantage over other languages, particularly Java and Go.
> you can’t use fewer cores because of RAM needed to keep the state of thousands of concurrent sessions. In this case Rust async can bring a tremendous advantage over other languages, particularly Java and Go.
Can you elaborate on that? What about green threads?
Performance isn't wasted. More waiting and less CPU time also means longer battery life / energy efficiency.
> My only experience with Rust has been synchronous
It is a shame that the dominance of the "async/await" paradigm has made us think in terms of "synchronous" or "async/await"
> Think about it, if you write a lot of async code, chances are you have a ton of latency, waiting on I/O, disk, network etc
Yes. For almost all code anyone writes, blocking code and threads are perfectly OK.
Asynchronous programming is more trouble, but if you're dealing with a lot of access to those high-latency resources, asynchronous code really shines.
The trouble is that "async/await" is a really bad fit for Rust. Every time you say `await`, invisible magic starts happening. (A state machine starts spinning, I believe, in Rust; I may be mistaken.)
"No invisible magic" was a promise that Rust made to us. What you say is what you mean, and what you mean is what you get.
No more, if you use async/await in Rust.
I really do not understand why people who are comfortable with "invisible magic" are not using a language with a garbage collector - that *really* useful invisible magic.
Asynchronous programming is the bee's knees. It lets you get so much more from your hardware. I learnt to do it implementing telephone switching systems on MS-DOS. We could run seven telephone lines on a 486, with DOS, in (?) about 1991.
Async/await has so poisoned the well in Rust that many Rust people do not understand there is more to asynchronous programming than that
I notice this as well; there is a false dichotomy of "async/await" or "blocking". I see this in embedded too. I think a lot of Rust embedded programmers learned on Embassy, and to them, not using async means blocking.
Is the alternative the more traditional spawning threads and using channels, or is there another paradigm? That's definitely something I'd be interested in learning more about.
I think they mean that there is more than one asynchronous paradigm. Actors is one alternative I can think of.
Where an actor has its own thread and communicates with a channel, right?
can you elaborate on this?
It's amusing that the Rust Playground lets you run a hundred threads. That's generous of them. There's a ceiling below 1000, though. The author points out, "On my Linux laptop I can spawn almost 19k threads before I hit this crash, but the Playground has tighter resource limits." Large servers can go higher than that.
The thread-based I/O example with the compute bound poll loop is kind of strange.
"Join" isn't really that useful when you have unrelated threads running finite tasks. Usually, you let the thread do its thing, finish, put results on a queue, and let the thread exit. Then it doesn't matter who finishes first.
Rust's join is actually optional. You don't have to join to close out a thread and reap its resources. It's not like zombies in Unix/Linux, where some resources are tied up until the parent process calls wait().
Loops where you join all the threads that are supposedly finished are usually troublesome. If somebody gets stuck, the joiner stalls. Clean exits from programs with lots of channels are troublesome. Channels with multiple senders don't close until all senders exit, which can be hard to arrange when something detects an error.
In Rust, the main thread is special. (I consider this unfortunate, but web people like it, because inside browsers, the main thread is very special.) If the main thread exits, all the other threads are silently killed.
> the Rust Playground lets you run a hundred threads
It's more that we don't do anything to prevent it, other than coarse process-wide memory / CPU time limits. IIRC, Rust-spawned threads on Linux use 2MiB of stack space by default, so that seems like a likely cap.
Note that the playground is only 2 cores and you are sharing with everyone else, so you aren't likely to really benefit.
Very fun to see your username outside of Stack Overflow, thanks for your work on having the playground!
Beyond the running costs of the machine itself, has the rust playground been any trouble, or has it mostly been smooth sailing after the initial setup?
> Note that the playground is only 2 cores and you are sharing with everyone else
This is amazing, I use it all the time with no performance issues so I expected it to be much beefier to support many simultaneous users.
How many users does it serve? (Monthly or daily users, and/or compilation jobs sent.) And what tricks are used to keep it working? (I suspect it can re-use already-compiled binaries of all supported dependencies and only needs to compile the user's code and link it, but are there other clever strategies?)
> How many users does it serve?
I don't really track users, but over the last 24 hours, there were 47.8k meaningful [1] requests taking a total of 28.2 hours. That ~0.5 requests per second number has been relatively consistent.
> re-use already compiled binaries of all supported dependencies and only need to compile the user's code and link it, but is there other clever strategies?
Yes, we pre-compile all the available dependencies [2] and that's about it.
> I use it all the time with no performance issues
That's good to hear! There's been a long-running bug where the playground binary loses track of the child Docker container (maybe?) and then the machine runs out of memory and the OOM killer often does more harm than good [3]. While trying to pin that down, I've recently caused the entire process to get into what appears to be a complete deadlock where no requests can be serviced at all. This tends to happen while I'm asleep so either I have no chance to debug it before it is auto-killed or the playground is unresponsive for 8+ hours.
[1]: compiling / executing code, running clippy/miri/rustfmt, expanding macros
[2]: https://github.com/rust-lang/rust-playground/blob/c4d00b90aa...
[3]: somehow it does something that ends up killing the network stack and then the machine is basically dead in the water. Very similar to what is reported in https://serverfault.com/q/1125634/119136
> Rust join is actually optional.
I was recently surprised to learn that returning from main() with background threads still running is more or less UB in C++, because those threads can race against static destructors: https://www.reddit.com/r/cpp/comments/1fu0y6n/when_a_backgro.... C doesn't have this issue, though, as far as I know?
Common C implementations (clang, gcc) have static destructors as an extension, though C codebases probably use this a lot less than C++ ones with static objects and destructor methods.
atexit enters the chat
> Loops where you join all the threads that are supposedly finished are usually troublesome. If somebody gets stuck, the joiner stalls. Clean exits from programs with lots of channels are troublesome. Channels with multiple senders don't close until all senders exit, which can be hard to arrange when something detects an error.
I wish join-with-timeout was a more common/supported operation.
> In Rust, the main thread is special. (I consider this unfortunate, but web people like it, because inside browsers, the main thread is very special.) If the main thread exits, all the other threads are silently killed.
Rust inherits this from `pthread_detach()`:

    The detached attribute merely determines the behavior of the
    system when the thread terminates; it does not prevent the thread
    from being terminated if the process terminates using exit(3) (or
    equivalently, if the main thread returns).
The main thread is special because that's how the runtime works on Unix. In particular, when "main" exits, the process exits. This is required by the C standard. It's also fundamentally built into how Unix processes work, as certain global variables, like argv and environ strings, typically are stored on the main thread's stack, so if the main thread is destroyed those references become invalid.
In principle Rust could have defined its environment to not make the main thread special, but then it would need some additional runtime magic on Unix systems, including having the main thread poll for all other threads to exit, which in turn would require it to add a layer of indirection to the system's threading runtime (e.g. wrapping pthreads) to be able to track all threads.
> In principle Rust could have defined its environment to not make the main thread special...
Not to mention they'd have to be very careful with what they do on the main thread after they start up the application's first thread (e.g. allocating memory via malloc() is out), since there are quite a few things that are not safe to do (like fork() that's not immediately followed by exec()) in a multi-threaded program. So even a "single-threaded" Rust program would become multi-threaded, and assume all those problems.
> Loops where you join all the threads that are supposedly finished are usually troublesome. If somebody gets stuck, the joiner stalls.
That makes sense if the main thread is actually doing any useful work, but when its only job is to spawn threads and wait for them to finish before exiting, then it's a pretty common idiom.
Thank you