Unsafe Rust is harder than C

(chadaustin.me)

120 points | by todsacerdoti 8 months ago

131 comments

wongarsu 8 months ago

Unsafe Rust is significantly harder than C, with complex rules that are not fully specified. But luckily you don't have to write a lot of it. And the bit you need can either come from battle-tested libraries or be contained to its own module where you can rigorously test and fuzz it. And the tooling available for that is pretty good. The article shows good examples of that.

There is some work on making unsafe rust safer, or at least easier to get right. But those are significant undertakings that will take years to pay off.

On the other hand safe rust is so much easier than correct C that I don't particularly mind the tradeoff.

WalterBright 8 months ago

> On the other hand safe rust is so much easier than correct C that I don't particularly mind the tradeoff.

D isn't Rust, but my personal style has gravitated towards the borrow checker rules anyway. It wouldn't be hard at all to do it in C, either.

C's biggest security failing, however, is the lack of array bounds checking. It's always the #1 cause of security bugs in the field, by a wide margin. None of the workarounds in Standard C are attractive. I do not understand why the C committee adds all these bits and pieces of new features, and does not fix that problem. It's not like it's hard:

https://www.digitalmars.com/articles/C-biggest-mistake.html

The solution has been out there for 15 years now.

Gibbon1 8 months ago

The standards people won't even add a standard buffer or slice to the language.

I kinda cry every time I see a new api with foo(void *src, int sz) signatures.
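For contrast, here is a minimal sketch of what a length-carrying slice looks like in Rust. The function names `checksum` and `byte_at` are made up for illustration; the point is only that `&[u8]` bundles the pointer and the length, so the callee cannot be handed a mismatched size.

```rust
/// Sums the bytes of a buffer. The slice carries its own length,
/// so there is no separate `sz` parameter to get wrong.
fn checksum(src: &[u8]) -> u32 {
    src.iter().map(|&b| u32::from(b)).sum()
}

/// Bounds-checked access: returns `None` instead of reading out of bounds.
fn byte_at(src: &[u8], index: usize) -> Option<u8> {
    src.get(index).copied()
}

fn main() {
    let data = [1u8, 2, 3];
    println!("checksum = {}", checksum(&data));
    println!("in range = {:?}", byte_at(&data, 2));
    println!("out of range = {:?}", byte_at(&data, 3));
}
```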

rurban 8 months ago

The standards people won't even add proper string support, besides their weak support for null-terminated buffers. Unicode strings have very special rules for searching, comparison and upper/lowercasing. And they are usually UTF-8 encoded.

kazinator 8 months ago

Safe Rust looks absolutely painful compared to other safe languages.

Correct C is not actually hard to write. It can be somewhat hard to convince oneself that it's right.

That's an important difference. Not all C has to be correct, for instance if it's experimental or exploratory code that'll be thrown away. If it just works on one machine, or on the particular inputs that you care about, or only under certain specific environmental conditions, that can be fine.

Correctness is more than just memory safety and avoiding overflows. There are still bugs in programs and languages that avoid those problems.

tptacek 8 months ago

When people talk about not needing to write much unsafe Rust, they mean that in any large-scale system, there are probably only a couple hotspots that are complicated enough to need the escape hatch from the Rust safety rules; you write the unsafe code, you give it an interface to the rest of the code, you carefully validate the unsafe code, and you're good to go.

They don't mean that you write unsafe Rust code for experimental or exploratory code that will be thrown away. You could do that, but it's not common.

More importantly: this approach of parsimoniously writing unsafe code in a larger-scale safe system does not work in C/C++. Memory safety issues can (and do) crop up all over your system, not just in the small "unsafe" parts you validate carefully.

steveklabnik 8 months ago

Writing unsafe Rust is enough of a pain in the butt (partially thanks to some syntactic salt that comes from the early days that probably should be removed but isn't being dealt with any time soon as far as I know) that it would be harder to prototype in unsafe Rust than safe Rust, IMHO.

Ygg2 8 months ago

I'm curious. What syntactic salt?

steveklabnik 8 months ago

Needing explicit deref for raw pointers: (*ptr).bar() instead of ptr.bar() as if ptr were a reference, or adding something like ptr~bar(), so you still know it’s a raw pointer deref but the syntax is nearly the same as for a reference.
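A small sketch of the difference being described. `Counter` and the two helper functions are hypothetical names for illustration; the only real point is that a method call through a reference auto-dereferences, while a raw pointer needs the explicit `(*ptr)` inside `unsafe`.

```rust
struct Counter {
    hits: u32,
}

impl Counter {
    fn bump(&mut self) -> u32 {
        self.hits += 1;
        self.hits
    }
}

fn bump_via_reference(c: &mut Counter) -> u32 {
    // References auto-dereference for method calls.
    c.bump()
}

/// # Safety
/// `ptr` must be non-null, aligned, and point to a live `Counter`
/// that nothing else is accessing during the call.
unsafe fn bump_via_raw_pointer(ptr: *mut Counter) -> u32 {
    // Raw pointers do not auto-dereference: the explicit `(*ptr)` is required.
    unsafe { (*ptr).bump() }
}

fn main() {
    let mut c = Counter { hits: 0 };
    println!("{}", bump_via_reference(&mut c));
    let ptr: *mut Counter = &mut c;
    // SAFETY: `ptr` was just derived from a live, exclusively borrowed `Counter`.
    println!("{}", unsafe { bump_via_raw_pointer(ptr) });
}
```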

kazinator 8 months ago

Write the unsafe code easily in C, and write the rest of it in an easy-to-use language, and you're good to go.

tptacek 8 months ago

That's a valid strategy, but it's harder to write "unsafe-safe" code in C than people think it is. We just did a podcast episode about this with results from the Android team:

https://securitycryptographywhatever.com/2024/10/15/a-little...

Even with careful rules and oversight and secure coding idioms and library exclusions and fuzzing, the rate of memory safety defects in C/C++ code is still pretty high.

hedgehog 8 months ago

I'm not sure I've met anyone with significant application security experience who would claim writing correct C is anything less than very challenging. It's not just memory safety, even basic features like iterable collections make code easier to write while at the same time avoiding potential bugs (vs C for loops in this case).

kazinator 8 months ago

Security is tricky because correct programs (including in C) can have security problems, like side channel attacks and whatnot.

You can write code which you think erases some security-sensitive information to zero bits after a calculation, but the compiler throws it away. The program is perfectly safe in that it doesn't crash on any bad input or miscalculate on any good input, but doesn't meet a security requirement.

The crux is that what the language standard calls "observable behavior", and behavior that can actually be observed (with security implications) are two different things.
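The same concern exists outside C. As a rough sketch in Rust (the thread's other language), a plain loop that zeroes a buffer nobody reads afterwards is a dead store the optimizer may drop, so wiping code typically goes through volatile writes. The function name `wipe` is an assumption for illustration, and this sketch does nothing about copies of the secret that may exist elsewhere (registers, spilled stack slots, moved values).

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

/// Overwrites `buf` with zeros using volatile writes, which the compiler
/// is not allowed to elide even if the buffer is never read again.
fn wipe(buf: &mut [u8]) {
    for byte in buf.iter_mut() {
        // SAFETY: `byte` is a valid, aligned, exclusive reference to a `u8`.
        unsafe { ptr::write_volatile(byte, 0) };
    }
    // Discourage the compiler from reordering later code before the wipe.
    compiler_fence(Ordering::SeqCst);
}

fn main() {
    let mut key = [0xAAu8; 32];
    wipe(&mut key);
    println!("all zero: {}", key.iter().all(|&b| b == 0));
}
```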

kazinator 8 months ago

All C is unsafe, because it's a property of most language features. E.g. this is unsafe:

  int main(void)
  {
    global_init();
    service_loop();
    global_cleanup();
  }

Some of these functions, defined in another translation unit, could require parameters, while their declarations are incorrect or missing from this translation unit. The program will still link, but the behavior is undefined.

"Unsafe" means that "no diagnostic is required when a rule is broken". When correctness depends on the programmer, such that the tooling silently accepts incorrectness, that is unsafe.

Almost all C is unsafe, but not all coding situations are equally risky of having a problem.

kazinator 8 months ago

Under this valid strategy we minimize writing C. When we do write C, it's easier than writing the unsafe Rust equivalent (if we are to believe the submitted article).

People working in non-Rust languages, who are writing just small amounts of low-level, unsafe parts in Rust, would be better off using C.

f1shy 8 months ago

That is what we currently do. The dirty low-level things in C, and glue it all together with a very high-level GC language.

steveklabnik 8 months ago

Safe Rust takes some time to learn, but once you do, is quite pleasant to write. It is true that it takes some time to get over that hump though. Google found that it was three to six months, but after that, there was no productivity penalty.

Rust offers tons of tools that are very useful for correctness outside of memory safety. I hope to never program in a language without sum types ever again, if possible.
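A minimal sketch of why sum types help with correctness. The `Fetch` enum and `describe` function are invented for illustration: each variant carries only the data that is valid in that state, and the `match` must handle every variant or the program does not compile.

```rust
/// Hypothetical result of a network fetch, modeled as a sum type.
enum Fetch {
    Ready(Vec<u8>),
    Pending { retry_after_ms: u64 },
    Failed(String),
}

fn describe(f: &Fetch) -> String {
    // Removing any arm below turns into a compile error, not a runtime bug.
    match f {
        Fetch::Ready(bytes) => format!("ready: {} bytes", bytes.len()),
        Fetch::Pending { retry_after_ms } => format!("pending: retry in {} ms", retry_after_ms),
        Fetch::Failed(reason) => format!("failed: {}", reason),
    }
}

fn main() {
    let states = [
        Fetch::Ready(vec![1, 2, 3]),
        Fetch::Pending { retry_after_ms: 250 },
        Fetch::Failed("timeout".to_string()),
    ];
    for s in &states {
        println!("{}", describe(s));
    }
}
```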

rc00 8 months ago

You keep referencing Google's clearly flawed and asymmetric marketing about Rust. It is really unsightly. It has been thoroughly debunked nearly everywhere it was reported but here is a 10 minute recap of some of the clear flaws: https://www.youtube.com/watch?v=vB3ACXSesGo

The summary is that the research that you reference was not done in good faith.

steveklabnik 8 months ago

I like prime but just because he says something doesn't mean it's true.

Repeating his claims in text form would be much more useful than linking to a video. I am not anti-video, but for example, I am out in public right now, and cannot watch it, and therefore can't meaningfully respond.

rc00 8 months ago

The video is worth a watch before responding. I'll try to summarize but it is unfair to the content. (I specifically opted for it over some others due to its brevity.) In all, the statement from Google lacks context and due diligence.

Some bullets:

* Experience levels in C++ and Rust are not equivalent. The experience gap between C++ and Rust developers can skew results, as C++ has been around much longer, affecting developer proficiency and code complexity.

* Comparison of codebases: legacy C++ vs. greenfield Rust. Comparing a mature C++ codebase with a new Rust project is like comparing apples to oranges; age and complexity can impact productivity metrics significantly.

* Measurement metrics can be misleading. The metrics used to determine productivity often lack depth, failing to account for feature complexity or development duration.

* Organizations have asymmetric composition of resources. The management style and communication within teams can heavily influence productivity, indicating that not all teams function under the same conditions.

Ygg2 8 months ago

Counter bullets:

* Experience gap between C++ and Rust should give edge to C++ as it has larger pool of candidates to pick from.

* Even in greenfield Rust vs greenfield C++ like Rust drivers in Android, Rust significantly reduced defect rate.

* Measurements in this study were done on teams of the same size, rewriting an existing service. Neither you nor Primeagen demonstrate how that would be misleading.

* See first point. While true how can you be sure C++ team isn't overall better than Rust? What if C++ is three senior engineers and Rust is three mediors?

Actual possible problems:

* There isn't yet enough data for language comparison. Too early to tell.

* Observer effect might have influenced the study.

steveklabnik 8 months ago

Yes, full agree with all of this.

gauge_field 8 months ago

Is there any other source that goes thoroughly over the keynote video? If it was "thoroughly debunked nearly everywhere", I could not find one myself. Also, I would not call this video thorough.

OtomotO 8 months ago

Aha, so you're saying that asymmetric marketing is bad (I agree) and as proof you put up some other asymmetric marketing, namely an influencer's video... Hard pass :)

Ygg2 8 months ago

> thoroughly debunked

Ok. Looks inside. First warning sign: Primeagen. Ok. What did he talk about:

> There is lies, damn lies and statistics.

> Go is goated, let's go.

Truly debunked.

metaltyphoon 8 months ago

:D I like Prime, but he’s creating a cult-like following, close to what Uncle Bob did a long time ago.

zamalek 8 months ago

Safe Rust takes some time to _unlearn._ By which I mean, there are (in my opinion) objectively bad practices that a linear language prevents you from doing - such as cyclic data graphs in memory. Rust is hard because you need to unlearn bad habits, and then learn how to solve those issues all over again.

I believe that's a good thing (in general, constructive constraints are good), but it is just an opinion.
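To make the constraint concrete, here is a sketch of one common way to express a parent/child cycle in safe Rust: strong `Rc` pointers in one direction and `Weak` pointers back, so the reference counts can still reach zero. `Node`, `new_node`, `link`, and `parent_name` are illustrative names, not anything from the article.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    name: String,
    // Back-pointer is weak so parent and child do not keep each other alive.
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn new_node(name: &str) -> Rc<Node> {
    Rc::new(Node {
        name: name.to_string(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

fn link(parent: &Rc<Node>, child: &Rc<Node>) {
    *child.parent.borrow_mut() = Rc::downgrade(parent);
    parent.children.borrow_mut().push(Rc::clone(child));
}

fn parent_name(node: &Node) -> Option<String> {
    // `upgrade` returns `None` if the parent has already been dropped.
    node.parent.borrow().upgrade().map(|p| p.name.clone())
}

fn main() {
    let root = new_node("root");
    let leaf = new_node("leaf");
    link(&root, &leaf);
    println!("leaf's parent: {:?}", parent_name(&leaf));
    println!("root has {} child(ren)", root.children.borrow().len());
}
```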

kazinator 8 months ago

"Cyclic datagraphs bad" is a perfect example of religious Rust brainwashing.

Cyclic structures in memory are not bad just because the one-memory-management-trick pony language you're currently infatuated with doesn't handle them within its safety paradigm.

Graph structures are entirely legitimate in computer science, and are handled nicely by garbage collection which is the gold standard in memory management.

Fully general lexical closures lead to cycles even when no variables are mutated. And even if no circular definitions are supported in the language.

We can use the Y combinator to create a cycle: a situation when a function's captured environment contains a binding whose value is that function itself.

kelnos 8 months ago

This has been my experience as well.

I was talking to a friend last night, and programming languages came up, including Rust. I remarked to him that I felt that Rust's biggest weakness was its learning curve. But once you're past that, you're golden.

stouset 8 months ago

> Safe Rust looks absolutely painful compared to other safe languages.

It really isn't. It's quite nice to use. The big hangup is that Rust "wants" you to structure your programs according to a certain philosophy. Unfortunately the compiler can't teach you that approach directly, but can only complain when you write programs that don't abide by it.

In my experience, following this design philosophy has benefits even outside of memory safety, and I now write code following these principles in every language.

> Correct C is not actually hard to write.

Quite literally all of the available evidence strongly disagrees with this conclusion.

> Correctness is more than just memory safety and avoiding overflows. There are still bugs in programs and languages that avoid those problems.

This is true, but C lacks tools to help with any of them. Fixing memory safety and overflows is in and of itself an enormous win, but Rust also has other tools that assist in program correctness (sum types being a huge one).

kazinator 8 months ago

> Quite literally all of the available evidence strongly disagrees with this conclusion.

Easy to write is not the same thing as easy to prove. It may be hard to show that some C program is correct, but when that is done, that doesn't change that it was easy to write (if that had been the case).

C is actually "unreasonably effective". You can't just look at the problems; the overall picture is that a lot of large and complicated, yet well-working systems have been achieved in C. They may not be correct, but are "correct enough".

stouset 8 months ago

The number of large C programs which have been repeatedly proven to be incorrect in practice is essentially 100%.

Still, C is unreasonably effective. A lot of large and complicated, yet mostly well-working (modulo regular CVEs) systems have been achieved in C. And that’s truly wonderful. I cut my teeth on C nearly 30 years ago and I still hold a soft spot in my heart for it.

But other languages these days are even more unreasonably effective. They need fewer lines of code to achieve the same results and they have fewer security vulnerabilities and logic bugs per line as well.

kazinator 8 months ago

Many of the ways that C programs are incorrect are abstract portability problems, that don't have a repro test case on the machines and compilers that people care about.

Rule of thumb: almost never fix some alleged problem without a repro test case.

stouset 8 months ago

Might I point you at the CVE database?

kazinator 8 months ago

May I point you to Wikipedia pages on cognitive biases, like Confirmation and Selection?

The CVE database lists only entries related to situations gone wrong. It doesn't list any information about working software that has no issues.

Also nothing will appear in the CVE database in relation to a language that nobody uses.

The database also has some garbage entries.

Overall the database is actually paltry in size in relation to the vast amounts of stuff out there written in C.

Also, correct C programs can have security issues, because some security issues depend on actual behaviors being observable which correspond to behavior that is not observable according to language standard. ISO C is mum about memory being observed, or side channel information being monitored, or timing of operations.

kelnos 8 months ago

> Correct C is not actually hard to write.

Ugh, not this again. Yes it is. It's hard. Veteran C programmers (I include myself in this category), even with compiler warnings, linters, and address sanitizers, still make mistakes all the time.

> Not all C has to be correct, for instance if it's experimental or exploratory code that'll be thrown away. If it just works on one machine, or on the particular inputs that you care about, or only under certain specific environmental conditions, that can be fine.

So what? That's not the kind of code we're talking about.

> Correctness is more than just memory safety and avoiding overflows. There are still bugs in programs and languages that avoid those problems.

That's either whataboutism or a straw man. No one is claiming that languages that help avoid common C memory-related bugs like out-of-bounds access, use-after-free, memory leaks, etc. will also somehow magically make all your logic errors go away too.

kazinator 8 months ago

> still make mistakes all the time

Even veteran eaters bite their tongue or lip once in a while. That doesn't mean eating is hard.

> No one is claiming

Only about 3 out of 4 Rust advocates.

> That's not the kind of code we're talking about.

We should talk about all code. Code that gets thrown away, rewritten, or improved from prototype to production. It is relevant to productivity. If I'm going to throw away some code to get to the code I want, it will be an extra waste of time if it is difficult to write.

camgunz 8 months ago

> Even veteran eaters bite their tongue or lip once in a while. That doesn't mean eating is hard.

Kinda. This would be more like if my mouth were stuffed with anti-tongue-bite technology and I'd trained for years using scientifically proven techniques to avoid biting my tongue, but inevitably I do anyway and then my identity and the identity of all of my friends get stolen.

I love C and actively dislike Rust, but to minimize the effort spent compensating for and consequences of C's shortcomings is very head in the sand.

gauge_field 8 months ago

I am not sure Rust advocates claim that it will "magically" fix logic errors. What I hear them say is put in more precise terms: a stronger type system, memory safety, and forcing more explicit design (e.g. no function overloading). Whether this is true or not is another discussion to have.

In my experience, it helps to have a stronger type system, regardless of which language you compare against, if you want to avoid logic errors.

kazinator 8 months ago

Not just logic errors but their problems in life, or something.

gauge_field 8 months ago

Ok. There are some problematic people in any community (although I would require more scientific data instead of personal anecdotes to be convinced, e.g. comparing stats related to this behaviour across different PLs with meaningful statistical metrics). I think this is more of a complex sociological question than some people make it out to be. What I see is people occasionally, and rarely, being aggressive. The rest are disagreeing in a principled/reasonable manner. Those who are Rust fanboys (or whatever you implied) are suppressed/downvoted etc.

f1shy 8 months ago

There are two big commandments of the modern programming religion:

- Type safety

- "Memory safety"

We have to understand it: it is religion. No science.

In 10 years maybe somebody will start realising that even with those two, software can still be very bad. And those two are not silver bullets... until then... patience.

signa11 8 months ago

yup, after some experience, memory safety becomes kinda easy to get right.

integer safety on the other hand is pretty hard, and non-trivial to get right unfortunately.

rowanG077 8 months ago

> Correct C is not actually hard to write.

I would advise you not to say such blatantly false things. You paint yourself either as a liar or as unqualified. C has been proven time and time again to be almost impossible to write correctly. Even top tier programmers like djb cannot write safe C.

kazinator 8 months ago

Nobody can write "safe C" because C is inherently not safe. See my other comment about it. Safe means something like "if the program breaks some semantic rule, it will be caught and diagnosed". It's not a statement about the program being correct or not, but the consequences of if it were to be incorrect. Even tiny, trivial programs have the potential to be incorrect without diagnosis.

> C has been proven time and time again to be almost impossible to write correctly.

Silly hyperbole.

rowanG077 8 months ago

Without writing safe C you cannot write correct C. If your C is unsafe it is incorrect by definition.

kazinator 8 months ago

That is obviously untrue by the definition of unsafe that I've been explicitly using.

I've been sticking with the concept of unsafe language, whereas you're talking about an unsafe program.

The two are related. An unsafe program misbehaves when given bad inputs. Its behavior is undefined.

An unsafe language is an unsafe program with regard to bad inputs, which consist of certain kinds of incorrect programs. (Those which are not diagnosed).

It's easy to write unsafe programs in unsafe languages. If we only focus on the program's functional requirements and none of them speak about what to do with bad inputs or in bad situations, we can end up with a program which is perfectly correct but unsafe. It will behave as specified on the correct inputs.

If you happen to be advocating Rust due to misunderstanding this kind of material, you're in it for the wrong reasons.

kelnos 8 months ago

It sounds like this boils down to one thing: C and Rust have different aliasing rules, with Rust's being much more restrictive.

You can't assume programming in unsafe Rust is like programming in C. Unsafe doesn't mean "do whatever you want"; it means you get to do things that the compiler won't check for you, but it's your responsibility to ensure you're still maintaining Rust's invariants.
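As a sketch of what "maintaining Rust's invariants" means in practice, here is the classic split-a-slice example. The compiler cannot see that the two halves are disjoint, so the code uses `unsafe`, and it is then on the programmer to guarantee that the two `&mut` slices never overlap. `split_halves` is an illustrative name; the standard library's `split_at_mut` already provides this safely.

```rust
use std::slice;

/// Splits a slice into two non-overlapping mutable halves.
fn split_halves(v: &mut [i32]) -> (&mut [i32], &mut [i32]) {
    let len = v.len();
    let mid = len / 2;
    let ptr = v.as_mut_ptr();
    // SAFETY: [0, mid) and [mid, len) are both in bounds of `v` and disjoint,
    // so the two mutable slices never alias. Returning overlapping ranges
    // here would compile fine and still be undefined behavior.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut data = [1, 2, 3, 4, 5];
    let (left, right) = split_halves(&mut data);
    left[0] = 10;
    right[0] = 30;
    println!("{:?}", data);
}
```

Running a test like this under Miri is one way to gain confidence that the aliasing rules are actually being respected.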

So yes, unsafe Rust is harder than C. I think that's fine, though. Most people will be able to get by without writing any unsafe Rust, and those who still need to use it will end up writing very little of it. And tools like MIRI exist to help increase your confidence that your unsafe Rust is correct.

steveklabnik 8 months ago

It's not so much "more restrictive" as it is "just plain different."

You have behaviors for all four quadrants: there's stuff that's okay in Rust and not okay in C, stuff that's okay in both, stuff that's not okay in both, and stuff that's not okay in Rust and is okay in C.

bloppe 8 months ago

Very curious to know an example of what's ok in Rust but not in C

whytevuhuni 8 months ago

Of the things you can do: integer overflow (panics in debug builds and wraps in release builds by default in Rust; signed overflow is undefined behavior in C), indexing out of bounds (panic in Rust, UB in C), unwinding stack frames (runs destructors in Rust, UB in C)

Of the things you can try to do, but will be prevented: use-after-free, double-free, using uninitialized variables, null references, modifying constants, data races
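A short sketch of the overflow and indexing behavior from the first list. It uses the explicit `wrapping_*`, `checked_*`, and `overflowing_*` methods plus `get`, so the result does not depend on build settings (a plain `+` that overflows panics in a default debug build and wraps in a default release build). The function names are illustrative.

```rust
/// Returns the wrapping, checked, and overflowing results of 250u8 + 10.
fn overflow_demo() -> (u8, Option<u8>, (u8, bool)) {
    let x: u8 = 250;
    (
        x.wrapping_add(10),    // defined two's-complement wraparound
        x.checked_add(10),     // `None` when the result does not fit
        x.overflowing_add(10), // wrapped value plus an overflow flag
    )
}

/// Bounds-checked lookup: `None` instead of an out-of-bounds read.
fn lookup(values: &[i32], index: usize) -> Option<i32> {
    values.get(index).copied()
}

fn main() {
    println!("{:?}", overflow_demo());
    println!("{:?}", i32::MAX.wrapping_add(1) == i32::MIN);
    println!("{:?}", lookup(&[10, 20, 30], 7));
}
```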

steveklabnik 8 months ago

The most straightforward one is that Rust doesn't have strict aliasing.
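A hedged illustration of that point: reading an `f32` through a `*const u32` is allowed in unsafe Rust as long as size, alignment, and validity requirements are met, because Rust has no type-based alias analysis. The analogous C expression, `*(uint32_t *)&f`, violates C's strict aliasing rule. The function name is made up, and in real code `f32::to_bits` is the idiomatic way to do this.

```rust
/// Reinterprets the bits of an `f32` by reading it through a `u32` pointer.
fn bits_via_pointer_cast(x: f32) -> u32 {
    let p = &x as *const f32 as *const u32;
    // SAFETY: `f32` and `u32` have the same size and alignment, `p` points
    // to a live, initialized value, and every bit pattern is a valid `u32`.
    unsafe { *p }
}

fn main() {
    let bits = bits_via_pointer_cast(1.0);
    println!("{:#010x}", bits);
    println!("matches to_bits: {}", bits == 1.0f32.to_bits());
}
```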

jsheard 8 months ago

> C and Rust have different aliasing rules, with Rust's being much more restrictive.

Rust is more restrictive by default, but C's restrict keyword lets you opt-in to strict aliasing semantics in exchange for unlocking compiler optimizations that would otherwise be unsound, so it's not like you're free from having to think about aliasing in high performance C code.

kelnos 8 months ago

Agreed, but in practice defaults do matter, and it's pretty rare that I see a C code base that actually makes use of 'restrict' etc. It's nice to see it sprinkled liberally throughout glibc's headers, but that's the exception, not the rule.

physicsguy 8 months ago

It’s in pretty much every simulation code base I’ve worked on (both academic + industry)

twoodfin 8 months ago

Coming from a position of extreme Rust ignorance & C++ brain poisoning:

> Here’s the issue: waiting_for_elements is a Vec<Waker>. The channel cannot know how many tasks are blocked, so we can’t use a fixed-size array. Using a Vec means we allocate memory every time we queue a waker. And that allocation is taken and released every time we have to wake.

Why isn't a structure that does amortized allocation an option here? I appreciate the design goal was "no allocations in steady-state", but that's what you'd expect if you were using C++'s std::vector: After a while the reserved space for the vector gets "big enough".

chadaustin 8 months ago

I'm sorry, I had a whole section on amortized allocation in a draft version of the post but I deleted it, thinking it was tangential. You're not the first person to ask: https://www.reddit.com/r/rust/comments/1gbqy6c/comment/ltonq...

And my response: https://www.reddit.com/r/rust/comments/1gbqy6c/comment/ltpv0...

One typical approach is double-buffering the allocation but it doesn't work here because you need to pull out the waker list to call `wake()` outside of the mutex. You could try to put the allocation back, but you have to acquire the lock again.
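A rough sketch of the pattern being described, not the article's actual code: `Shared`, `wake_all`, and `CountingWaker` are stand-ins invented here. Taking the list out with `mem::take` lets `wake()` run outside the mutex, but it leaves an empty `Vec` with no capacity behind, so the next queued waker has to allocate again.

```rust
use std::mem;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Wake, Waker};

/// Hypothetical waker that only counts how often it was woken.
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Stand-in for the channel's shared state.
struct Shared {
    waiting_for_elements: Mutex<Vec<Waker>>,
}

impl Shared {
    /// Wakes every queued task and returns how many were woken.
    fn wake_all(&self) -> usize {
        let wakers = {
            let mut guard = self.waiting_for_elements.lock().unwrap();
            // Moves the Vec (and its allocation) out, leaving `Vec::new()`.
            mem::take(&mut *guard)
        }; // lock is released here, before any `wake()` call
        let woken = wakers.len();
        for waker in wakers {
            waker.wake();
        }
        woken
    }
}

fn main() {
    let counter = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let shared = Shared { waiting_for_elements: Mutex::new(Vec::new()) };
    for _ in 0..3 {
        let waker = Waker::from(counter.clone());
        shared.waiting_for_elements.lock().unwrap().push(waker);
    }
    println!("woken: {}", shared.wake_all());
    println!(
        "capacity left behind: {}",
        shared.waiting_for_elements.lock().unwrap().capacity()
    );
}
```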

I had an implementation that kept a lock-free waker pool around https://docs.rs/wakerpool/latest/wakerpool/ but now you're paying for atomics too, and it felt like this was all a workaround for a deficiency in the language.

Intrusive lists are the "correct" data structure, so I kept pushing.

maxbrunsfeld 8 months ago

I had the same question. I would think that with rust’s Vec, no allocation would occur at steady state. Vec does not automatically resize when removing elements.

makapuf 8 months ago

Yup, you can even pre-allocate a given vec capacity to not start zero-sized.
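A minimal sketch of that behavior, with an arbitrary capacity of 64 chosen for illustration: a `Vec` created with `with_capacity` keeps its allocation across `push`/`clear` cycles as long as it never grows past the reserved size.

```rust
/// Fills and drains a pre-allocated Vec many times and reports the
/// capacity before and after, to show that no reallocation is needed.
fn steady_state_capacity() -> (usize, usize) {
    let mut queue: Vec<u64> = Vec::with_capacity(64);
    let before = queue.capacity();
    for round in 0..1000u64 {
        for i in 0..64u64 {
            queue.push(round * 64 + i); // never exceeds the reserved capacity
        }
        queue.clear(); // drops the elements, keeps the allocation
    }
    (before, queue.capacity())
}

fn main() {
    let (before, after) = steady_state_capacity();
    println!("capacity before: {}, after: {}", before, after);
}
```

This only holds while the same `Vec` stays in place; moving it out (for example with `std::mem::take`) leaves a fresh, zero-capacity `Vec` behind.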

Analemma_ 8 months ago

Yeah, I'm also not sure what the author is talking about, std::vec also does amortized allocation and never shrinks. There are a bevy of more complicated vector classes in Rust if you need different behaviors, but off the top of my head I don't know why std::vec wouldn't work here.

dgfitz 8 months ago

For me, it really is just so much syntax and mental overhead to track. Somehow it seems like the same goals could have been accomplished with less verbosity.

It almost feels like an engineering tool/language, by devs for devs, instead of by devs for “customers” where customer == “outside” devs, of which I am one.

Edit: typo

andrewmcwatters 8 months ago

I feel a sort of professional obligation to learn the cryptic unnecessary glyphs and runes that Rust and its ilk proliferate all for this important concept that is memory safety.

However, I would much rather that more languages like Go and Zig simply take off in popularity instead, and that we just reject the eyesore and cognitive mess that is Rust syntax. It’s a language which has no regard for beauty.

MarkSweep 8 months ago

As long as we're counting Go as an alternative to Rust, there are a whole host of other garbage collected languages that provide memory safety benefits compared to the status quo of C and C++. Languages like Ruby, Java, Python, C#, and PHP.

Personally I don’t think it makes sense to write all software in Rust. A GC makes it much easier to write certain patterns that are frustrating to write in Rust.

jcrites 8 months ago

For what it's worth, I disagree. I think it would be difficult to design a language more elegant than Rust while accomplishing the same goals. Maybe those goals aren't ones that everyone cares about, and that's OK.

It's a language that's definitely worth learning; it expands the mind in a way that languages like Go (or Java, or any GC language) will not. There is a great beauty and elegance to its design.

OtomotO 8 months ago

I would love a much simpler Rust...a Zig with a borrowchecker, a small stdlib (like Rust), some official package manager.

I agree that Rust has a lot of syntactic warts but the common stuff I find beautiful

fjasdyfs 8 months ago

I agree the syntax in Rust can be hairy however presenting Go as an alternative to Rust makes zero sense. They have completely different runtime characteristics and goals.

maxk42 8 months ago

Safe rust is harder than C.

Cyph0n 8 months ago

Safe C is harder than safe Rust.

nanolith 8 months ago

That used to be true, but now we have reasonable model checking tools for C. It's possible to write safer C without the cognitive load of Rust.

https://www.cprover.org/cbmc/

kelnos 8 months ago

So two things:

1. This is perhaps just my preference, and so is more subjective, but I don't want to have to pick and choose among various options for doing linting or static analysis or address sanitizing or model checking after the fact. I want compilation to fail if any of these invariants don't hold. Rust can do that; C never will be able to do that. (Sure, if I write unsafe Rust, I'm going to want to run MIRI on it, but if I can stick to safe Rust, the compiler should be sufficient.)

2. I'm probably not as well-versed in the topic as you are, but my understanding is that model checking tools like this cannot prove that every single program that a compiler will accept will also be free of the issues that the model checker is looking for. Again, the Rust compiler can do that. Yes, that does mean that the Rust compiler might reject some programs that would turn out to be safe and sound, but I'm ok with that trade off.

nanolith 8 months ago

1. Of course it can. My model checks on function contracts and invariants run for every function. If a function fails this contract or invariant, the overall build fails. Does it matter that this is a separate build step than the compilation? Not in practice. Think of the model check step as being a semantic analysis step that happens after the type check and syntax check from the compiler.

2. You're not interested in proving that every program that the compiler accepts will be free of issues. You're interested in whether YOUR program that you write is free of those issues. The Rust compiler CAN'T do this, because the Rust compiler is only looking at a SUBSET of the possible things that you can build model checks against. This is why Kani -- a model checker for Rust -- exists. You can model check unsafe code in Rust, as well as safe code in Rust against user assertions and function contracts that are otherwise not possible to check in vanilla Rust.

Model checking isn't just for C, but model checking, as a practical form of formal methods, brings the same and even better safety to C. In fact, with Kani, you can get similar safety in Rust as well.
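For a sense of what a model-checking harness looks like on the Rust side, here is a hedged sketch using Kani's `#[kani::proof]` attribute and `kani::any()`. The `midpoint` function and the property checked are invented for illustration. The harness is gated behind `cfg(kani)`, so a normal build ignores it and only the ordinary `main` runs; under Kani, the assertion would be checked for every possible pair of inputs rather than for sampled ones.

```rust
/// Hypothetical function under test: midpoint of two values without overflow.
fn midpoint(a: u32, b: u32) -> u32 {
    let (lo, hi) = if a <= b { (a, b) } else { (b, a) };
    lo + (hi - lo) / 2
}

// Only compiled when the code is run through Kani.
#[cfg(kani)]
#[kani::proof]
fn midpoint_stays_in_range() {
    // Non-deterministic inputs: the checker considers all u32 values.
    let a: u32 = kani::any();
    let b: u32 = kani::any();
    let m = midpoint(a, b);
    assert!(m >= a.min(b) && m <= a.max(b));
}

fn main() {
    println!("{}", midpoint(0, 10));
    println!("{}", midpoint(u32::MAX, u32::MAX - 2));
}
```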

If you like Rust, use it. But, as was the point of my comment, it is possible and practical to get similar safety in C.

Cyph0n 8 months ago

I am sure it is “possible”, but we are talking about practicality here.

Why doesn’t the Linux kernel embrace model checking instead of experimenting with Rust?

nanolith 8 months ago

It is quite practical. I'm actually planning a book on the subject.

The reason why some Rust enthusiasts have been experimenting with Rust in the Linux kernel is because they are passionate about Rust, and kernel maintainers are looking to find younger people. It's neither an endorsement of Rust nor an argument against model checking in C.

The reality is that this tooling isn't yet well known about. As it becomes better known, it will be adopted.

Cyph0n 8 months ago

Unless you have some insider info, I think you’re oversimplifying why Linux is giving Rust a chance.

Also, model checking - at least as you’re portraying it - sounds too good to be true. If it was anywhere close to the realm of practicality for large C codebases (without maintaining a model separate from code), we would be hearing its praises being sung by C devs all over.

nanolith 8 months ago

> Unless you have some insider info, I think you’re oversimplifying why Linux is giving Rust a chance.

Not really. Linus Torvalds has been quite open about this topic, and it has been covered extensively on LWN.

> Also, model checking - at least as you’re portraying it - sounds too good to be true.

> If it was anywhere close to the realm of practicality... we would be hearing its praises being sung by C devs all over.

Like any technology, model checking takes effort to use and learn. But, it does work quite well. Again, you're conflating the popularity of something with its effectiveness, which is a poor argument.

[-]

Cyph0n 8 months ago

In my opinion, vaguely handwaving that model checking makes C as effective as Rust invites nothing but skepticism from anyone who knows C and Rust.

Typically, if you’re advocating for a relatively unknown technology that you want others to adopt, the onus is on you to describe how it is better and to be upfront about its limitations. Good luck with your book!

[-]

nanolith 8 months ago

I appreciate your opinion. It's not possible to hold a nuanced conversation about model checking in general or in C in particular in the comments of HN. I'd need much more space. But, I can summarize.

CBMC's abstract machine can detect memory safety violations and integer-related UB. This includes things like use-after-free, buffer overruns, heap/stack corruption, thread races, fence post errors, integer conversion and promotion errors, and signed integer overflow. When it detects a violation, it provides a counter-example demonstrating this. It also allows the user to build custom assertions, which allows function contracts to be built and enforced.

Any function can be defined as the entry point for the model checker, which allows function-by-function analysis. Shadow methods can be substituted, which provides the abstraction necessary to model check entire code bases. A shadow method uses non-determinism to cover all possible inputs, outputs, and side-effects that the real function could perform. This abstraction also allows modeling of third party libraries, user/kernel code, and hardware.

So far, I've model checked code bases of around half a million lines of code. It will easily scale to cover a code base the size of the Linux kernel, as long as you understand how to use it. It's an engineering problem at this point. Tracking that the abstraction matches the implementation is actually not that hard and can be done by writing good function contracts which are verified by the model checker. If the shadow and the original code follow the same function contract, which includes all possible inputs, outputs, and side-effects, the abstraction can be substituted.

The biggest limitation really comes down to large recursive data structures, which is also a pain point for Rust for that matter. There are ways to deal with this, but that's probably the most significant place where any code base customization is required. It's possible to refactor this code to be just as fast in modern C, but in a way that is easier for the model checker to verify quickly.

It's impossible to convert the trillions of lines of code written in C to Rust or any other language without blowing the entire budget of the tech sector for 30 years. Rewrites are prohibitively expensive. Tooling and automation for this tooling is not nearly that expensive.

[-]

Cyph0n 8 months ago

So is it a form of property checking? That is, your model defines certain properties which your program is checked against.

Or is it some form of symbolic execution? This I doubt because I believe the performance is not there yet.

I will read up a bit more on CBMC.

[-]

nanolith 8 months ago

It's a form of abstract interpretation. The code is compiled to a constraint problem in an SMT solver.

There is definitely a performance impact here, which is why it is important to decompose the program and verify it function by function. This decomposition is sound as long as the model checking scenario covers the complete function contract. To improve performance further, I use abstraction in the form of shadow methods. These are sort of like mocks for the model checker. They provide the same function contract -- inputs, outputs, and side effects -- as the original function, but using built-in non-determinism provided by the model checker. This simplifies the overall SMT equation while maintaining an approximation of the overall program. By defining external function contracts, I can use the model checker to verify that both the original function and the shadow function follow the same contract, which keeps the two in sync. The shadow functions are used to replace functions called by the function under model check in order to isolate this function and simplify the overall SMT equation.

The tool provides the mechanism, but it has taken me six years of work and research to develop a practical way to scale it. The book will cover the tool, but it is documenting this "cheat sheet" that is the real purpose for it.

For what it's worth, I'm also considering an edition that covers Rust and Kani.
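Since Kani comes up here, a minimal sketch of what a contract-style proof harness might look like on the Rust side. The function `midpoint` and the harness name are hypothetical, made up for illustration; this is not the workflow described above, only the basic shape of it: non-deterministic inputs, an assumed precondition, and an asserted postcondition.

```rust
/// Hypothetical function under verification: the midpoint of a range,
/// written so that `lo + hi` is never computed (that sum could overflow a u32).
fn midpoint(lo: u32, hi: u32) -> u32 {
    lo + (hi - lo) / 2
}

// Proof harness, compiled only when the code is built under Kani.
// A plain `cargo build` skips it because of the cfg gate.
#[cfg(kani)]
#[kani::proof]
fn midpoint_stays_in_range() {
    // Non-deterministic inputs: the checker considers every u32 value.
    let lo: u32 = kani::any();
    let hi: u32 = kani::any();
    // Precondition half of the contract.
    kani::assume(lo <= hi);
    let mid = midpoint(lo, hi);
    // Postcondition half of the contract.
    assert!(lo <= mid && mid <= hi);
}

fn main() {
    // Ordinary spot check; the harness above is what covers all inputs.
    println!("{}", midpoint(10, 20));
}
```

Without Kani installed this builds and runs as a normal program; the harness only matters when the model checker is driving it.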

binary132 8 months ago

Now I’m curious

OtomotO 8 months ago

Great, now give me tooling of this millennium (no, I am not going back to vendoring everything manually, and I am not reinventing every basic data structure I wrote in university in every project I work on) and we have a deal!

Oh, also get rid of header files, they are archaic. And I want fearless concurrency... And sum types!

[-]

nanolith 8 months ago

If you want those things, you don't want C. Pick a reasonable higher level language you like that makes those decisions for you.

My comment was not to imply that somehow C is superior to X, Y, or Z, but rather to point out that the safety problem with C does have a practical solution.

[-]

OtomotO 8 months ago

Fair point, sorry (non-native speaker here, for what it's worth). Have a wonderful day!

Quothling 8 months ago

Zig is easier than Rust.

[-]

jsheard 8 months ago

Safe Zig is harder than safe Rust, but easier than safe C.

In conclusion, programming is a land of contrasts.

[-]

eikenberry 8 months ago

I've never heard the opinion that safe Zig is harder than safe Rust. Pretty much always the opposite, that Zig is easier than Rust all around.

[-]

binary132 8 months ago

I think the claim here is that it’s hard to write really safe zig

kelnos 8 months ago

> but easier than safe C

Well, everything is easier than something that doesn't exist.

tonetegeatinst 8 months ago

Doesn't GCC have flags that enforce memory safety? Also, don't fuzzers handle this issue?

[-]

kstrauser 8 months ago

It cannot possibly. You can catch some low hanging fruit, but asking a compiler to evaluate whether a specific chunk of code is memory safe is basically solving the halting problem.

[-]

jmkr 8 months ago

Can you explain what that low hanging fruit is (or refer me to docs), and also explain it being a decision problem a bit more thoroughly. I will accept that if you have to run a program to decide if it's memory safe then that fits the criteria, but from my understanding static analysis doesn't run the program, and a compiler is parsing and lexing anyway so it should be able to catch at least some things (the low hanging fruit)?

Since I have actually started using C, I've realized how easy it is to be lazy and not handle memory right, which makes Rust and maybe C++ seem more appealing. But while trying to figure out random segfaults, it seems like AddressSanitizer and Valgrind catch more than I would have assumed counts as low hanging fruit.

I guess I should look more into how Rust manages that safety or understand what memory safety is trying to accomplish more formally. I've taken GC for granted for years until I needed to care about memory.

[-]

steveklabnik 8 months ago

(Not your parent)

An example of low hanging fruit is -fwrapv. This flag takes a behavior that is undefined, signed overflow, and converts it to defined behavior, two's complement wrapping. That improves safety, but it does not prevent all errors. There are many flags like this, but they all tackle individual aspects of the problem, and even if you turn them all on, there are situations which aren't caught.
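To make "defined, but still possibly wrong" concrete, here is a rough Rust analogue. It is not what -fwrapv does inside GCC; it only shows two's complement wrapping as a defined result next to an API that reports the overflow instead. The helper names are made up for illustration.

```rust
/// Two's complement wrapping: always defined, but the result may still be
/// a logic bug if the caller expected the value to keep growing.
fn wrapping_increment(x: i32) -> i32 {
    x.wrapping_add(1)
}

/// Checked arithmetic: overflow is reported as `None` rather than wrapped.
fn checked_increment(x: i32) -> Option<i32> {
    x.checked_add(1)
}

fn main() {
    // Defined behavior: i32::MAX + 1 wraps around to i32::MIN.
    println!("{}", wrapping_increment(i32::MAX));
    // The same overflow, surfaced to the caller instead of hidden.
    println!("{:?}", checked_increment(i32::MAX));
}
```

Wrapping removes the undefined behavior, but a counter that silently jumps to a large negative number can still be an error, which is the "does not prevent all errors" part.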

[-]

jmkr 8 months ago

Thanks. Yeah, that makes sense for low hanging fruit. Going through the gcc flags, it does seem like a lot of tradeoffs have to be made, so you can't cover everything. From a quick look at how Rust compiles, it seems it does at least some of this checking at the MIR level. I'll have to read more about it.

nanolith 8 months ago

The compiler really doesn't have the opportunity to simulate every possibility of code. It's not just the matter of whether the function is safe, but every possible use of the function, which the compiler may not see when it is focused on a single compilation unit or a library.

If you want this level of safety, which is possible in C, then you need to use a model checker. Model checking C isn't as trivial as adding a flag to the compiler, but it can be done with about as much overhead as unit testing, if a reasonable idiomatic style is followed, and if the model checker is used well.

It is still a decision problem, and thus has similar limitations, but you can perform steps to ensure that you have some level of soundness with unwinding assertions and other techniques.

[-]

jmkr 8 months ago

Thanks that's helpful I'll take a look at model checking.

Filligree 8 months ago

They’re incomplete, and running msan significantly slows things down, to the point you’d be better off with Java.

lomase 8 months ago

Any kind of Rust is harder than safe C#.

_bin_ 8 months ago

Maybe this depends on the application type. I've written a lot of each, and maintaining anything involving concurrency is worlds easier in Rust, since those tend to be the most painful and time-consuming bugs to fix. It's got a learning curve, isn't a particularly easy language, and a big chunk of the "community" sucks, but most of the places I've applied it have been a net hassle savings.

rowanG077 8 months ago

Is that really the case? I would say writing bug-ridden C is easier than Rust in some cases. Writing working C is much harder than writing safe Rust.

Sytten 8 months ago

Most of the complexity comes from the fact that all types are moveable by default. That is why I prefer this proposal to make it opt-in [1]. Working with pin really sucks the joy out of you and it won't be some sugar syntax (the current &pin mut proposal) that will help that. Rust is a fine language most of the time, but sometimes like the author you fall in rabbit holes.

[1] https://smallcultfollowing.com/babysteps/blog/2024/10/14/ove...

melodyogonna 8 months ago

Rust is harder than C

[-]

OtomotO 8 months ago

To me C is much, much harder than Rust.

I've written both, albeit way more Rust.

[-]

melodyogonna 8 months ago

What makes C much much harder? What concepts exist in C and are not present in Rust?

[-]

OtomotO 8 months ago

It's not about concepts, it's not about language constructs but the overall work and mental load working with the languages.

For me it's header files, no package manager, doing pointer magic all the time (void pointers are... Brrr!), concurrency, ...

I could go on. I don't enjoy it, although I like simple and easy things.

C is simple, but not easy

[-]

uecker 8 months ago

I find the mental load in C very light and I rarely use a void pointer in C. I also like headers, they give a clean separation of implementation and interface and enable very fast compilation.

[-]

gauge_field 8 months ago

It all depends on what kind of project you work on.

I work on high-performance libraries. When I compare C and Rust code, void pointers are used a lot in BLAS libraries to dispatch different functions with different type information. These libraries also include reimplementations of threading libraries, which in Rust you would find as a function from a crate. Coupled with the lack of docs.rs, the header files, and C not having namespaces/modules (which matters especially for complex projects), the cognitive load I need is higher for C than for Rust.

A non-negligible number of these libraries are also hard to set up in non-Unix environments (mainly due to not having an OS-independent build system). One of my favorite things is looking at Rust CLI projects, checking them out, installing them with cargo install $(name_of_project), and having them work.

Also, a lot of other stuff is not available by default (e.g. for testing, you just use #[test] in Rust; for C you need third-party tooling).

The more complex the project becomes, the more of these needs you realize.

Some of these you might need or not; it all depends on your case. But for myself (someone also used to other, more modern tooling systems), these do matter to a non-negligible extent.

[-]

uecker 8 months ago

I have maintained a fairly complex computational project for about 10 years now... I do not see these problems and I would not want to do this in Rust (after studying it for a while). Mostly I like my extremely short compile times. The huge number of dependencies the cargo style of development tends to pull in and the instability of Rust would also still worry me a lot.

[-]

OtomotO 8 months ago

I never understood why people care about dependencies that much.

Either I write the code myself, or I pull in a dependency.

Compile times I agree with though.

[-]

uecker 8 months ago

Dependencies... I just leave this here:

https://internals.rust-lang.org/t/type-inference-breakage-in...

gauge_field 8 months ago

Some of these are personal choices. Some of them are not. E.g. look at the blis project: compilation on Windows is still a problem.

On the dependency issue, I would disagree. Firstly, a large number of dependencies (especially if you are working on a computational project, e.g. on a university cluster) is not really an issue, because in the projects I worked on with C++/Python bindings, we already used a lot of Python bindings and dependencies from the Python ecosystem. It is just the nature of experimental/numerical projects. Limiting the number of dependencies for numerical projects (e.g. simulation of a physics system) is very rare given how little academia cares about software development; they pull in whatever helps them (nothing wrong with that, since they have other things to care about than software quality).

Secondly, it just depends on the number of dependencies you pull in. You have the option not to include one and write it yourself, which is what C projects tend to do. It is a trade-off. Given how easy it is to manage other things with rustup (e.g. tooling versioning), I prefer this one.

I think this is more of a philosophical difference: modern tooling (where you use tools like cargo/pip with a simple declarative config API) vs make (where you do more of it yourself).

There can be some issues with dependencies, e.g. breaking changes. But these are not unsolvable problems. If you choose your dependencies correctly, I would prefer having to manage dependency versions to writing it myself (again, you can write it yourself if you want). Also, these issues are really rare.

I don't know about the instability of Rust itself or where you are getting this claim from. Rust promises backwards compatibility and uses tools like this to verify it: https://github.com/rust-lang/crater

Maybe you are talking about MSRV as a semver-breaking change. There has been a lot of discussion about this; you can read up on why the choice was made.

Regarding compilation time, you have to be more careful about what features you use (not spreading traits throughout your code) and use incremental compilation, see: https://matklad.github.io/2021/09/04/fast-rust-builds.html

Another one is the lack of features, e.g. generics (that is why I'd prefer using C++ over C). But that is one of the features of C, not a disadvantage.

[-]

gauge_field 8 months ago

Just to add, there are tools in C/C++ that address some of these issues. For instance, I use meson as much as I can, and it really makes things smoother. I feel like it should be used more in the ecosystem.

pornel 8 months ago

Unchecked shared mutability that causes data races and Undefined Behaviour is the pervasive default behavior in C, with no option to turn it off.

Safe Rust doesn't have this "feature".

This makes multi-threaded code in C very difficult to write correctly beyond simplest cases. It's even harder to ensure it's reliable when 3rd party code is involved. The C compiler has no idea what thread safety even is, so it can't help you (but it can backstab you with unexpected optimizations when you use regular types where atomics are required). It's up to you to understand thread-safety documentation of all code involved, if such documentation exists at all. It's even more of a pain to debug data races, because they can be impossible to reproduce when a debugger or printf slows down or accidentally synchronises the code as a side effect.

OTOH thread-safety is part of Rust's type system. Use of non-thread-safe data in a multi-threaded context is reliably and precisely caught at compile time, even across boundaries of 3rd party libraries and callbacks.
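A small sketch of what that looks like in practice, using only the standard library. The function name `parallel_count` is made up for illustration. The shared counter has to be wrapped in types that are safe to send and share across threads; swapping `Arc<Mutex<u64>>` for `Rc<RefCell<u64>>` makes `thread::spawn` fail to compile, because `Rc` does not implement `Send`.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increment one shared counter from several threads and return the total.
fn parallel_count(threads: usize, per_thread: u64) -> u64 {
    // Arc provides shared ownership across threads; Mutex provides
    // synchronized mutation. Both are required by the type system here.
    let counter = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Copy the value out before the guard is dropped.
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(4, 1000));
}
```

The equivalent C with a plain `long` and no lock compiles without complaint and races at runtime, which is the difference being described.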

kelnos 8 months ago

C is easier to write, Rust is easier to write correctly.

[-]

melodyogonna 8 months ago

Rust is also hard to write correctly. You just pay the complexity costs at different times in both languages - in Rust you make a lot of upfront investment.

[-]

j-krieger 8 months ago

Can't say I agree.

tapirl 8 months ago

Rust is harder than C++. It is C+++.

agentultra 8 months ago

> Using a Vec means we allocate memory every time we queue a waker. And that allocation is taken and released every time we have to wake.

Is that so? On every push back? I’d expect it’d only do an allocation when the current array segment is almost full… as a vector you might write by hand or like the ones in the C++ standard libraries do.

[-]

raptorfactor 8 months ago

This surprised me too, so I checked: https://doc.rust-lang.org/std/vec/struct.Vec.html#capacity-a...

"Vec does not guarantee any particular growth strategy when reallocating when full, nor when reserve is called. The current strategy is basic and it may prove desirable to use a non-constant growth factor. Whatever strategy is used will of course guarantee O(1) amortized push."

Seems it should be amortized just like in C++?

[-]

agentultra 8 months ago

I don't want to tell the Rust folks what they should do, but as a potential user I'd be more interested in the language if it was explicit (or the rules were clear and deterministic at the least). This claim took me a bit by surprise and I'd be upset if I'd encountered it in production code.

[-]

ratorx 8 months ago

I think explicitly stating what it doesn’t guarantee is the right thing to do. Otherwise, the API becomes tied to your implementation through implicit details, which can prevent future generic performance improvements (e.g. unordered_map pointer stability in C++ prevents the implementation from being changed to a different representation like absl::flat_hash_map, even though that’s a guarantee that most people don’t care about).

Re: performance considerations. This is important, but for a performance critical application, any compiler, library etc version change can cause regressions, so it seems better to benchmark often and then tackle this, rather than make assumptions based on implicit (or even explicit) guarantees.

LegionMammal978 8 months ago

You do get some determinism, in that as long as the length of the vector is less than its capacity, it will never reallocate the buffer when new elements are added. And there are plenty of functions like Vec::with_capacity(), Vec::reserve(), Vec::reserve_exact(), etc. which let you control the capacity. The only unspecified part is by how much the capacity grows when the vector does have to reallocate.
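A quick way to see this, as a sketch with a made-up helper name: reserve capacity up front, push until the vector is full, and check that neither the reported capacity nor the buffer address changed.

```rust
/// Returns true if filling a pre-sized Vec up to its reported capacity
/// leaves the capacity and the buffer address unchanged.
fn pushes_without_realloc(requested: usize) -> bool {
    let mut v: Vec<u64> = Vec::with_capacity(requested);
    // with_capacity may hand back more than requested, so use the real value.
    let cap = v.capacity();
    let ptr_before = v.as_ptr();

    for i in 0..cap {
        // len < capacity holds on every iteration, so push must not reallocate.
        v.push(i as u64);
    }

    v.len() == cap && v.capacity() == cap && v.as_ptr() == ptr_before
}

fn main() {
    println!("{}", pushes_without_realloc(16));
}
```

What happens on the next push after that, i.e. how far the capacity grows, is the part the documentation leaves unspecified.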

codewiz 8 months ago

The very last line: "Putting this in words was as hard as writing the code."

sgt 8 months ago

Can't wait for Swift to gain traction in systems programming.

handwarmers 8 months ago

[flagged]

[-]

whatshisface 8 months ago

If you aren't nice to rust people, they'll get RCE after overflowing your buffers and make you a furry.

f1shy 8 months ago

They come with the book in their hands... oh... convert or be burned in the fire!