Smart pointers for the kernel

(lwn.net)

149 points | by signa11 20 hours ago

96 comments

  • acbits 19 hours ago

    https://github.com/acbits/reftrack-plugin

    I wrote this GCC plugin for this exact reason. Let's see whether the kernel team is interested in adopting it.

    • brenns10 18 hours ago

      I searched lore.kernel.org and couldn't find any postings that propose using this in the kernel. I'd encourage you to share a proposal, otherwise the "kernel team" will never be interested, because they'll never hear of it.

      • acbits 15 hours ago

        I will send it again. It probably got lost in the high traffic volume of LKML last year.

    • pornel 7 hours ago

      That's cool, but it'd be nice to also have a distinction between a get and a copy (like ObjC) or borrow/move (like Rust) to avoid redundant increments and decrements.

  • smodo 20 hours ago

    I’m not very well versed in kernel development. But I am a Rust dev and have observed the discussion about Rust in Linux with interest… Having said that, this part of the article has me baffled:

    >> implementing these features for a smart-pointer type with a malicious or broken Deref (the trait that lets a programmer dereference a value) implementation could break the guarantees Rust relies on to determine when objects can be moved in memory. (…) [In] keeping with Rust's commitment to ensuring safe code cannot cause memory-safety problems, the RFC also requires programmers to use unsafe (specifically, implementing an unsafe marker trait) as a promise that they've read the relevant documentation and are not going to break Pin.

    To the uninformed this seems like crossing the very boundary that you wanted Rust to uphold? Yes it’s only an impl Trait but still… I can hear the C devs now. ‘We pinky promise to clean up after our mallocs too!’

    • GolDDranks 18 hours ago

      Try imagining trait AlwaysIndexableUnder100. There's a generic codebase/library that takes in types that implement that trait, and does indexing with indexes that are always below 100. Like `usersCustomSliceA[4] = usersCustomSliceB[5];`

      You'd be tempted, for performance, to use `get_unchecked` methods that skip the boundary checks. After all, the trait says that this should always succeed.

      However, if the user passes in a type that is not indexable with integers smaller than 100, whose fault is it if the program segfaults? The user's? But they managed to get the program to segfault _without_ writing unsafe code. The provider of the library? They are using `unsafe` to call `get_unchecked`, after all.

      Bingo. It's the library dev's fault. The API they provide is not sound. However, they can make it sound by marking the _trait_ unsafe to implement. Then the user needs to type `unsafe` when implementing the trait for their type: "I solemnly swear that this type is actually indexable with all integers smaller than 100." That shifts the blame onto the mistaken implementation: now the user is the one at fault.

      It's the same situation here. Deref is not unsafe to implement. That's why if you need to uphold a trust boundary, you need an unsafe trait.

      So, the whole thing doesn't amount to crossing the boundary willy-nilly; a big point of Rust's unsafe is for the compiler to force documenting and checking for accountability: who is required to do what, and who is allowed to rely on that.
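
      To make that concrete, a minimal sketch (the trait, its method, and the Buffer type are all invented for illustration, not taken from the RFC):

          // Implementors promise that indexes below 100 are always in bounds.
          unsafe trait AlwaysIndexableUnder100 {
              fn as_slice(&self) -> &[u8];
          }

          // Library code: because the trait is unsafe to implement, the library
          // may rely on that promise and skip the bounds check.
          fn copy_fifth<T: AlwaysIndexableUnder100>(src: &T) -> u8 {
              // SAFETY: the unsafe trait's contract guarantees index 5 is in bounds.
              unsafe { *src.as_slice().get_unchecked(5) }
          }

          struct Buffer([u8; 128]);

          // User code: typing `unsafe` here is the "I solemnly swear" part. If
          // Buffer held fewer than 100 bytes, this impl would be the bug.
          unsafe impl AlwaysIndexableUnder100 for Buffer {
              fn as_slice(&self) -> &[u8] {
                  &self.0
              }
          }

      Safe code can only reach the `get_unchecked` call through a type whose author wrote `unsafe impl`, which is exactly the accountability trail described above.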

    • gpm 19 hours ago

      This kind of use of unsafe has been in rust forever, for example with `Sync`, `Send`. Implementing an unsafe marker trait to promise to the compiler that other methods on the structure act a certain way.

      The scope of an unsafe block has always been interpreted to include guaranteeing things about its surroundings, up to other things in the same module. E.g. if I'm implementing `Vector::push` I'm going to rely on the fact that `self.capacity` really is the size of the allocation behind `self.ptr` without verifying it, and I'm going to feel good about that because those fields aren't public and nothing within the module lets you violate that constraint, so it's not possible for external safe code to violate it.

      The same applies to marker traits. If I'm writing `unsafe impl Send for MyStruct {}` I'm promising that the module exposes an interface where `MyStruct` will always comply with the requirements of `Send` no matter what safe external code does (i.e. sending MyStructs across threads is safe given the exposed API). With this proposal, if I write `unsafe impl PinCoerceUnsized for MyStruct {}` I'm promising the same with respect to that trait (whatever its documentation ends up saying, which should essentially be that I've implemented `Deref for MyStruct` in the same module and I don't expose any way for safe external code to change what reference `Deref` returns).
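
      As a rough sketch of that kind of promise (`MyStruct` here is invented, not from the proposal):

          use std::ptr::NonNull;

          // Owns a heap allocation through a raw pointer, so the compiler will
          // not derive Send automatically even though the type behaves like a Box.
          pub struct MyStruct {
              data: NonNull<u8>,
          }

          impl MyStruct {
              pub fn new(byte: u8) -> Self {
                  let ptr = Box::into_raw(Box::new(byte));
                  Self { data: NonNull::new(ptr).expect("Box::into_raw never returns null") }
              }
          }

          impl Drop for MyStruct {
              fn drop(&mut self) {
                  // SAFETY: `data` came from Box::into_raw in `new` and is freed exactly once.
                  unsafe { drop(Box::from_raw(self.data.as_ptr())) }
              }
          }

          // The promise: nothing in this module lets safe code alias or share `data`,
          // so moving a MyStruct to another thread cannot introduce a data race.
          unsafe impl Send for MyStruct {}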

    • foundry27 19 hours ago

      Rust’s whole premise of guaranteed memory safety through compile-time checks has always been undermined when confronted with the reality that certain foundational operations must still be implemented using unsafe. Inevitably folks concede that lower-level libraries will have these unsafe blocks and still expect higher-level code to trust them, and at that point we’ve essentially recreated the core paradigm of C: trust in the programmer’s diligence. Yeah, Rust makes this trust visible, but it doesn’t actually eliminate it in “hard” code.

      The punchline here, so to speak, is that for all Rust’s claims to revolutionize safety, it simply(!) formalizes the same unwritten social contract C developers have been meandering along with for decades. The uniqueness boils down to “we still trust the devs, but at least now we’ve made them swear on it in writing”.

      • kelnos 19 hours ago

        I don't think you're giving Rust enough credit here.

        For those projects that don't use any unsafe, we can say -- absent compiler bugs or type system unsoundness -- that there will be no memory leaks or data races or undefined behavior. That's useful! Very useful!

        For projects that do need unsafe, that unsafe code can be cordoned off into a corner where it can be made as small as possible, and can be audited. The rest of the code base is just as safe as one with no unsafe at all. This is also very useful!

        Now, sure, if most projects needed to use unsafe, and/or if most projects had to write a significant amount of unsafe, then sure, I'd agree with you. But that's just not the reality for nearly all projects.

        With C, everything is unsafe. Everything can have memory leaks or data races or undefined behavior. Audits for these issues need to examine every single line of code. Compilers and linters and sanitizers can help you here, but they can never be comprehensive or guarantee the absence of problems.

        I've been writing C for more than 20 years now. I still write memory leaks. I still write NULL pointer dereferences. I still struggle sometimes to get my data ownership (and/or locking) right when I have to write multithreaded code. When I get to write Rust, I'm so happy that I don't have to worry about those things, or spend time with valgrind or ASAN or clang's scan-build to figure out what I've done wrong. Rust lets me focus more on what I actually care about, the actual code and algorithms and structure of my program.

        • fshbbdssbbgdd 4 hours ago

          Just to give an experience report as someone maintaining a 50k-line Rust codebase at work. I didn’t write this code and have only read parts of it. I am not a Rust expert. I faced a really puzzling bug: basically, errors coming out of an API that had nothing to do with the call site. After struggling to debug, I searched for “unsafe” and looked at the 6 unsafe blocks in the project (totaling a few dozen lines of code), and found that one of those had a bug. It turns out the unsafe operation was corrupting the system the code was interacting with and causing errors that popped up during later calls. This bug would have been much more difficult to track down if I couldn’t narrow down the tricky code with “unsafe”.

        • weinzierl 16 hours ago

          "For projects that do need unsafe, that unsafe code can be cordoned off into a corner, where it can be made as small as possible, and can be audited. The rest of the code base is just as safe as one with no unsafe at all. This is also very useful!"

          Exactly this, and very well put!

          I'd just like to add one small but important detail. It's one of the things that is so obvious to one group that they rarely even mention it, but at the same time so obscure to the others that they are completely oblivious to it.

          While the unsafe code is cordoned off into a corner, its effects are not. A bug in an unsafe block in one part of your program can trigger an outcome in a completely different and safe part of your program, one that safe Rust would normally prevent.

          To put it more metaphorically, Rust restricts the places where bombs can be placed, it does not limit the blast radius in case a bomb goes off.

          This is still huge progress compared to C/C++, where the bombs can be, and usually are, everywhere, and trying to write it safely feels a lot like playing minesweeper.

          • tialaramex 9 hours ago

            An important element of Rust's culture of safety, which is if anything more important than its safety technology which merely enables that culture to flourish, is as follows:

            It is categorically the fault of that unsafe code when the bomb goes off. In a language like C++ it is very tempting for the person who planted the bomb to say "Oh, actually in paragraph sixteen of the documentation it does tell you about the bomb so it's not my fault" but nobody reads documentation, so Rust culturally requires that they mark the function unsafe, which is one last reminder to go read that documentation if you must use it.

            Because this is a matter of culture, not technology, we can expect further refinement both in terms of what exactly the rules are and the technology needed to deliver them. Rust 1.82, which shipped yesterday, adds unsafe extern (previously all extern functions were unsafe, but, er, maybe we should flag the whole block? This will become usual going forward) and unsafe attributes (the attributes which meddle with linking are not safe to just sprinkle on things, for example; again, this will become usual for those attributes).
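
            Roughly, the new spellings look like this (a sketch from memory, so treat the details as approximate):

                // 1.82: the extern block itself is marked unsafe, and each item
                // declares whether calling it still needs an unsafe block.
                unsafe extern "C" {
                    // Declaring the signature is the unsafe part; calling abs is safe.
                    pub safe fn abs(x: i32) -> i32;
                    // strlen has its own contract, so calls remain unsafe.
                    pub unsafe fn strlen(s: *const core::ffi::c_char) -> usize;
                }

                // 1.82: attributes that meddle with linking are spelled as unsafe.
                #[unsafe(no_mangle)]
                pub extern "C" fn my_exported_symbol() {}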

        • foobazgt 19 hours ago

          Yes, the drawback of unsafe is that one single goof in just one unsafe block can blow your entire program wide open. The advantage is that your entire program isn't one gigantic unsafe block (like C).

          The magnitude matters.

          • gauge_field 18 hours ago

            Also, in my experience, the locality of an unsafe API makes testing easier compared to an unsafe language. If I have unsafe code that provides a safe API with certain safety conditions:

            1) I have a more ergonomic/precise/local contract to satisfy for safety.

            2) Since this unsafe block is local, it is easier to set up its testing conditions for various scenarios. Otherwise, testing a bigger unsafe block (e.g. an unsafe language) would also have to handle the coupling between the API from which UB originates and the rest of the code.

        • foundry27 18 hours ago

          I’ll propose that most Rust projects that do useful work (in the potential energy sense?) depend on unsafe code, and it’s likely going to be found in the codebases of their dependencies and transitive dependencies. But I agree with almost all of what you’re saying about C and Rust; I work on a C operating system professionally, and I know those same pain points intimately. I program in Rust for fun, and it’s great to use.

          At the end of the day this isn’t a technical argument I’m trying to make, it’s a philosophical one. I think that the more we normalize eroding the core benefits the language safety features provide, one enhancement proposal at a time, one escape hatch added each year for special interfaces, the less implicit trust you can have in rust projects without reviewing them and their dependencies for correctness.

          I think that trust has enormous value, and I think it would suck to lose it. (reflect: what does seeing “written in rust” as a suffix make you think about a project’s qualities before you ever read the code)

          • GolDDranks 18 hours ago

            I’ll propose that ALL Rust projects that do useful work depend on unsafe code.

            If one claims otherwise, I say they have no understanding of Rust. But also, if one holds that against Rust's value promise, I, again, say that they have no understanding of Rust.

            • Dylan16807 15 hours ago

              I get the impression they're only counting code outside the standard library, in which case tons of useful programs are fully safe.

            • I_AM_A_SMURF 15 hours ago

              It's definitely all of them. Even HashMap uses unsafe.

              • steveklabnik 9 hours ago

                It’s more fundamental than that: the Rust language does not encode hardware specifics into the language, and so way deep down there, you have to write bytes to an address that Rust considers arbitrary. Unless you only want to run programs that accept no input and produce no output, which is not exactly a useful subset of programs.

          • jdiez17 14 hours ago

            Of course all software ultimately runs on hardware, which has things like registers and hidden internal state which affect how that hardware accesses or writes to physical memory and all sorts of other "unsafe" things.

            In a more practical sense, all software, even Python programs, ultimately call C functions that are unsafe.

            It's like that saying "all abstractions are wrong, some are useful".

            > what does seeing “written in rust” as a suffix make you think about a project’s qualities before you ever read the code

            By itself, that tells me very little about a project. Same thing if I see a project written in Python or Go, which are nominally memory safe programming languages. I perceive a statistically significant likelihood that software written in these languages will not segfault on me, but it's no guarantee. If I see two programs with the same functionality, where one is written in Python and another one in Rust, I also have some expectation that the one written in Rust will be more performant.

            But you cannot draw general conclusions from that piece of information alone.

            However, as a programmer, Rust is a tool that makes it easier for me to write code that will not segfault or cause data races.

          • kloop 17 hours ago

            > reflect: what does seeing “written in rust” as a suffix make you think about a project’s qualities before you ever read the code

            That the community is going to be significantly more dramatic than average

        • gary_0 18 hours ago

          Also Rust is far from the only language that gives you escape-hatches out of the safety sandbox where you can make a mess if you're reckless. Java, Python, Go, C#... (heck, C# also has an `unsafe` keyword) but hardly anyone would argue those languages have the same safety issues that C has.

          • Y_Y 14 hours ago

            In C unsafe code is typically marked by surrounding it with {braces}.

        • dzaima 19 hours ago

          nit - Rust does allow memory leaks in safe code: https://doc.rust-lang.org/std/mem/fn.forget.html#safety
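
          Concretely, this is all it takes, with no unsafe anywhere:

              fn main() {
                  let buf = vec![0u8; 1024];
                  // Entirely safe: the destructor never runs, the allocation is never
                  // freed, and Rust does not count that as a memory-safety violation.
                  std::mem::forget(buf);
              }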

          • thayne 15 hours ago

            It's also possible to leak memory in languages with tracing garbage collection, just create a data structure that holds strong references to objects that are no longer needed, which commonly happens when using something like a HashMap as a cache without any kind of expiration.
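
            The same shape of leak exists in safe Rust too; a sketch with an invented cache type:

                use std::collections::HashMap;

                // A "cache" with no eviction policy: every value stays reachable
                // forever, so memory grows without bound even though nothing dangles
                // and nothing is unsafe.
                struct Cache {
                    entries: HashMap<u64, Vec<u8>>,
                }

                impl Cache {
                    fn get_or_compute(&mut self, key: u64) -> &[u8] {
                        self.entries.entry(key).or_insert_with(|| vec![0u8; 4096])
                    }
                }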

          • eru 17 hours ago

            Yes, memory leaks are rarer in Rust than in C, but they are an entirely different topic than 'unsafe' blocks.

        • hansvm 17 hours ago

          This is giving Rust a bit too much credit though.

          - Memory leaks are not just possible in Rust, they're easy to write and mildly encouraged by the constraints the language places on you. IME I see more leaks in Rust in the wild than in C, C#, Python, C++, ...

          - You can absolutely have data races in a colloquial sense in Rust, just not in the sense of the narrower definition they created to be able to say they don't have data races (see the sketch after this list). An easy way to do so is choosing the wrong memory ordering for atomic loads and stores, including subtle issues like those arising from mixing `seq_cst` and `acquire`. I think those kinds of bugs are rare in the wild, but one project I inherited was riddled with data races in safe Rust.

          - Unsafe is a kind of super-unsafe that's harder to write correctly than C or C++, limiting its utility as an escape hatch. It'll trigger undefined behavior in surprising ways if you don't adhere to a long list of rules in your unsafe code blocks (in a way which safe code can detect). The list changes between Rust versions, requiring re-audits. Some algorithms (especially multi-threaded ones) simply can't even be written in small, easily verifiable unsafe blocks without causing UB. The unsafeness colors surrounding code.
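
          For the second point, a minimal sketch of the kind of ordering bug I mean (simpler than the `seq_cst`/`acquire` mix, but the same flavor):

              use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
              use std::thread;

              static DATA: AtomicU32 = AtomicU32::new(0);
              static READY: AtomicBool = AtomicBool::new(false);

              fn main() {
                  let writer = thread::spawn(|| {
                      DATA.store(42, Ordering::Relaxed);
                      // Relaxed imposes no ordering: another thread may observe
                      // READY == true while still seeing DATA == 0. Release here,
                      // paired with Acquire on the load below, would fix it.
                      READY.store(true, Ordering::Relaxed);
                  });

                  if READY.load(Ordering::Relaxed) {
                      // No UB and entirely safe Rust -- but on weakly ordered hardware
                      // this can print 0, which is a race in the everyday sense.
                      println!("{}", DATA.load(Ordering::Relaxed));
                  }

                  writer.join().unwrap();
              }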

          • simonask 14 hours ago

            Wait, when exactly did the soundness rules change since 1.0? When have you had to re-audit unsafe code?

            The Rustonomicon [1] serves as a decent introduction to what you can or can't do in unsafe code, and none of that changed to my knowledge.

            I agree that it's sometimes challenging to contain `unsafe` in a small blast zone, but it's pretty rare IME.

            [1]: https://doc.rust-lang.org/nomicon/intro.html

            • steveklabnik 9 hours ago

              There was at least one in the first year after 1.0, we had warnings on for like nine months and then finally broke the code later.

              That I only remember such things vaguely and not in a “oh yeah here’s the last ten times this happened and here’s the specifics” speaks to how often it happens, which is not often.

              Lots of times soundness fixes are found by people looking for them, not for code in the wild. Fixing cve-rs will mean a “breaking” change in the literal sense that that code will no longer compile, but outside of that example, no known code in the wild triggers that bug, so nobody will notice the breakage.

            • hansvm 10 hours ago

              > Wait, when exactly did the soundness rules change since 1.0? When have you had to re-audit unsafe code?

              At a minimum you have to check that the rules haven't changed for each version [0].

              The issue with destructors just before 1.0 dropped [1] would have been something to scrutinize pretty closely. I'm not aware of any major changes since then which would affect previously audited code, but new code for new Rust versions (e.g., when SIMD stabilized) needs to be considered with new rules as well.

              > none of that changed to my knowledge

              This is perhaps a bit pedantic, but the nomicon has bug fixes all the time (though the underlying UB scenarios in the compiler remain stable), and it's definitely worth re-examining your unsafe Rust when you see changes which might have incorrectly led a programmer to write some UB.

              [0] https://doc.rust-lang.org/reference/behavior-considered-unde... [1] https://cglab.ca/~abeinges/blah/everyone-poops/

          • littlestymaar 5 hours ago

            There's some truth in what you're saying, but it's also wildly exaggerated, and “everything that is exaggerated is insignificant”.

            • hansvm an hour ago

              > but its also wildly exaggerated

              Such as?

              > everything that is exaggerated is insignificant

              But are the non-exaggerated things significant?

        • lertn 12 hours ago

          With C you can take proven algorithms from CLRS and translate them directly without boilerplate.

          The same algorithms already become ugly/obfuscated in idiomatic C++.

          Looking at the macro in the LWN article, Rust's approach of using wrappers and boxes and complex macros to emulate features appears to go in the same direction as C++.

          Still in 2024, gdb is far less useful for C++ than for C. C++ error messages are far less useful.

          All of that matters for reliable software; crashes (which can occur anyway with unsafe) are just a tiny part of the equation.

          • ArtixFox 10 hours ago

            With C, you are not 100% sure that your code will work. You have to verify and extensively test it. With C++ you have some very vague guarantees about your code, but you can easily transition from C and even have some interesting type safety like mp-units. With Rust, you have some good guarantees that your code won't have UAF, will be thread-safe, etc., and you can probably invent some interesting type safety like mp-units.

            In all 3, you've got to verify [Frama-C, Astrée, Bedrock, the many projects working on Rust, esp. the Coq one] and extensively test it.

            But by default, all 3 provide a different level of guarantees.

        • littlestymaar 6 hours ago

          > For those projects that don't use any unsafe, we can say -- absent compiler bugs or type system unsoundness -- that there will be no memory leaks or data races or undefined behavior. That's useful! Very useful!

          It's very useful indeed, I've been programming in Rust daily for the past 7 years (wow time flies) and the times when I've needed unsafe code can still be counted on my two hands.

      • Rusky 18 hours ago

        It's not the same unwritten social contract: in Rust even the unsafe code has the same stricter type signatures as the safe code, so there is a formal way to judge which part of the program is at fault when the contract is broken. You might say the contract is now written. :-)

        In C, the type system does not express things like pointer validity, so you have to consider the system as a whole every time something goes wrong. In Rust, because the type system is sound, you can consider each part of the program in isolation, and know that the type system will prevent their composition from introducing any memory safety problems.

        This has major implications in the other direction as well: soundness means that unsafe code can be given a type signature that prevents its clients from using it incorrectly. This means the set of things the compiler can verify can be extended by libraries.

        The actual practice of writing memory-safe C vs memory-safe Rust is qualitatively different.

        • lertn 12 hours ago

          This got me intrigued. Is there a soundness proof for the Rust type system?

          The only language with such a proof that I am aware of is StandardML. Even OCaml is too complex for a soundness proof.

        • im3w1l 16 hours ago

          > In Rust, because the type system is sound

          Unfortunately, it's not. Now I do think it will be eventually fixed, but given how long it has taken it must be thorny. https://github.com/rust-lang/rust/issues/25860

          • simonask 14 hours ago

            In practice this is a compiler bug, though, and is treated as such, and not a soundness hole in the abstract design of the type system.

            There has also not been a single case of this bug being triggered in the wild by accident.

            • _flux 13 hours ago

              I take this to mean e.g. the Oxide project has proven the Rust type system sound?

              There was a Git repository demonstrating unsoundness issues in the compiler, but I seem to be unable to find it anymore :/. It seemed like there would be more than one underlying issue, but I can't really remember that.

          • johnisgood 13 hours ago

            That code looks horrendous.

      • wbl 19 hours ago

        The difference is every line of C can do something wrong while very few lines of Rust can. It's much easier to scrutinize a small well contained class with tools like formal methods than a sprawling codebase.

        • uecker 17 hours ago

          Only if you limit "wrong" to memory safety, and also ignore that unsafe parts violating invariants can make safe parts of Rust wrong.

          • Dylan16807 16 hours ago

            > If you limited wrong to "memory safe"

            Yes, because this is a discussion about the value of "unsafe", so we're only talking about the wrongs that are enabled by "unsafe".

            > and also ignore that unsafe parts violating invariants can make safe parts of Rust to be wrong.

            If I run a line of code that corrupts memory, and the program crashes 400 lines later, I don't say the spot where it crashes is wrong, I say the memory corrupting line is wrong. So I disagree with you here.

            • uecker 13 hours ago

              It does not invalidate an argument that you do not want to talk about it.

              Regarding the second point: yes, you can then blame the "unsafe" part but the issue is that the problem might not be so localized as the notion of "only auditing unsafe blocks is sufficient" implies. You may need to understand the subtle interaction of unsafe blocks with the rest of the program.

              • Dylan16807 2 hours ago

                Unsafe blocks have to uphold their invariants while accepting any possible input that safe code can give them. Any subtle interactions enabled by "unsafe" need to be part of the invariants. If they don't do that, it's a bug in the unsafe code, not the safe code using it.

                If done properly, you can and should write out all the invariants, and a third party could create a proof that your code upholds them and they prevent memory errors. That involves checking interactions between connected unsafe blocks as a combined proof, but it won't extend to "the rest of the program" outside unsafe blocks.

              • dwattttt 12 hours ago

                > the problem might not be so localized as the notion of "only auditing unsafe blocks is sufficient" implies

                It depends on what you consider "problem" to mean. An unsafe function needs someone to write unsafe in order to call it, and it's on that calling code to make sure the conditions needed to call the unsafe function are met.

                If that function itself is safe, but still lets you trigger the unsafe function unsafely? That function, which had to write 'unsafe', has a bug: either it's not upholding the preconditions of the unsafe function it's calling, or it _can't_ uphold the preconditions without its own callers also being in on it, in which case it needs to be an unsafe function itself (and its author should consider whether the design is a good one).

                In this way, you'll always find unsafe 'near' the bug.
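
                A tiny sketch of that rule, with made-up function names:

                    /// # Safety
                    /// `index` must be less than `data.len()`.
                    unsafe fn read_unchecked(data: &[u8], index: usize) -> u8 {
                        // SAFETY: the caller promises index < data.len().
                        unsafe { *data.get_unchecked(index) }
                    }

                    // Sound: the wrapper establishes the precondition itself, so no safe
                    // caller can ever reach the unsafe call with a bad index.
                    fn read(data: &[u8], index: usize) -> Option<u8> {
                        if index < data.len() {
                            // SAFETY: bounds checked just above.
                            Some(unsafe { read_unchecked(data, index) })
                        } else {
                            None
                        }
                    }

                    // Unsound: a "safe" wrapper that forwards the caller's index unchecked.
                    // The bug lives here, in the function that had to write `unsafe`, not in
                    // whatever distant safe code eventually passes in a bad index.
                    fn read_buggy(data: &[u8], index: usize) -> u8 {
                        unsafe { read_unchecked(data, index) }
                    }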

                • uecker 6 hours ago

                  In other words, somebody made an error somewhere.

                  • dwattttt 2 hours ago

                    You're thinking of C; Rust forced that somebody to write unsafe near it to create the bug.

              • Filligree 10 hours ago

                Unsafe blocks have a specific set of requirements they have to abide by.

                Assuming they successfully do so, it is then guaranteed that no safe code is able to trigger undefined behaviour by calling the unsafe code.

                Importantly, this can be checked without ever reading any of the safe code.

                • uecker 6 hours ago

                  Let's discuss this example:

                  https://github.com/ejmahler/transpose/blob/e70dd159f1881d86a...

                  The code is buggy. Where is the bug?

                  • lostdog 3 hours ago

                    The most common bug in that type of code is mixing up x and y, or width and height somewhere in your loops, or maybe handling partial blocks. It's not really what Rust aims to protect against, though bounds checking is intended to be helpful here.

                    I don't get the arguments here. In practice, Rust lowers the risk of most of your codebase. Yeah, it doesn't handle every logic bug, but mostly you can code with confidence, and only pay extra attention when you're coding something intricate.

                    A language which catches even these bugs would be incredible, and I would definitely try it out. Rust ain't that language, but it still does give you more robust programs.

                  • NobodyNada 2 hours ago

                    The code uses `unsafe` blocks to call `unsafe` functions that have the documented invariant that the parameters passed in accurately describe the size of the array. However, this invariant is not necessarily held if an integer overflow occurs when evaluating the `assert` statements -- for example, by calling `transpose(&[], &mut [], 2, usize::MAX / 2 + 1)`.

                    To answer the question of "where is the bug" -- by definition, it is where the programmer wrote an `unsafe` block that assumes an invariant which does not necessarily hold. Which I assume is the point you're trying to make -- that a buggy assert in "safe" code broke an invariant assumed by unsafe code. And indeed, that's part of the danger of `unsafe` -- by using an `unsafe` block, you are asserting that there is no possible path that could be taken, even by safe code you're interacting with, that would break one of your assumed invariants. The use of an `unsafe` block is not just an assertion that the programmer has verified the contents of the block to be sound given a set of invariants, but also that any inputs that go into the block uphold those invariants.

                    And indeed, I spotted this bug by thinking about the invariants in that way. I started by reading the innermost `unsafe` functions like `transpose_small` to make sure that they can't ever access an index outside of the bounds provided. Then, I looked at all the `unsafe` blocks that call those functions, and read the surrounding code to see if I could spot any errors in the bounds calculations. I observed that `transpose_recursive` and `transpose_tiled` did not check to ensure the bounds provided were actually valid before handing them off to `unsafe` code, which meant I also had to check any safe code that called those functions to see how the bounds were calculated; and there I found the integer overflow.

                    So you're right that this is a case of "subtle interaction of unsafe blocks with the rest of the program", but the wonderful part of `unsafe` is that you can reduce the surface area of interaction with the rest of the program to an absolute minimum. The module you linked exposes a single function with a public, safe interface; and by convention, a safe API visible outside of its module is expected to be sound regardless of the behavior of safe code in other modules. This meant I only had to check a handful of lines of code behind the safe public interface where issues like integer overflows could break invariants. Whereas if Rust had no concept of `unsafe`, I would have to worry about potentially every single call to `transpose` across a very large codebase.
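
                    To make the failure mode concrete, a compressed sketch of the shape of that bug (not the crate's actual code):

                        pub fn transpose(input: &[u8], output: &mut [u8], width: usize, height: usize) {
                            // In release builds this multiplication wraps on overflow, so for
                            // width = 2, height = usize::MAX / 2 + 1 it evaluates to 0 and both
                            // asserts pass with empty slices, while width and height still
                            // describe an enormous array to the unsafe code that follows.
                            assert_eq!(input.len(), width * height);
                            assert_eq!(output.len(), width * height);
                            // ... unsafe indexing driven by width and height ...
                        }

                        // The overflow-proof version of the same check:
                        pub fn transpose_fixed(input: &[u8], output: &mut [u8], width: usize, height: usize) {
                            let len = width.checked_mul(height).expect("width * height overflows usize");
                            assert_eq!(input.len(), len);
                            assert_eq!(output.len(), len);
                            // ... unsafe indexing driven by width and height ...
                        }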

      • mort96 16 hours ago

        It is literally impossible to build systems where you never at any point trust that underlying systems work correctly. This is a boring and uninformative criticism. It is the case for every language ever invented.

      • sqeaky 18 hours ago

        Constrained and minimized trust in programmer diligence is better than unconstrained and omnipresent trust in the same.

      • stackghost 19 hours ago

        I'm not a rust fanboy but isn't the point of rust to dramatically decrease the area in which null pointer dereferences and friends can occur, and thus make them more likely to be spotted?

        • steveklabnik 9 hours ago

          Not just spotted, but easier to find after the fact when the problematic behavior happens.

      • thfuran 18 hours ago

        Would you similarly say that Russian Roulette is the same game whether the revolver has two chambers or ten thousand?

      • pornel 13 hours ago

        Building safe abstraction around unsafe code works, because it reduces the scope of the code that has to be reviewed for memory safety issues.

        Instead of the whole codebase being suspect, and hunting for unsafety being like a million-line "Where's Waldo?", it reduces the problem to just verifying the `unsafe` blocks against safety of their public interface, "is this a Waldo?". This can still be tricky, but it has proven to be a more tractable problem.

      • jchw 19 hours ago

        I think when people come to these conclusions it's largely due to a misunderstanding of what exactly the point of most programming language safety measures is and why they make sense.

        Something that people often ponder is why you can't just solve the null safety problem by forcing every pointer dereference to be checked, with no other changes. Well of course, you can do that. But actually, simply checking to make sure the pointer is non-null at the point of dereference gets you surprisingly little. When you do this, what you're (ostensibly) trying to do is reduce the number of null pointer dereferences, but in practice what happens now is that you just have to explicitly handle them. But, in a lot of cases, there's really nothing particularly sensible to do: the pointer not being null is an invariant that was supposed to be upheld and it wasn't, and now at the point of dereference, at runtime, there's nothing to do except crash. Which is what would've happened anyways, so what's the point? What you really want to do isn't actually prevent null pointer dereferences, it's to uphold the invariant that the pointer is non-null in the first place, ideally before you leave compile time.
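
        In Rust terms (a toy sketch, not anyone's real API), the difference looks like this:

            // Checking at the point of use: the invariant was already broken somewhere
            // upstream, and there is rarely anything better to do here than crash,
            // which is what would have happened anyway.
            fn len_checked(name: *const String) -> usize {
                assert!(!name.is_null(), "name must not be null");
                // SAFETY: non-null was just checked; validity is still on the caller.
                unsafe { (*name).len() }
            }

            // Encoding the invariant in the type: a &str can never be null, and a
            // possibly-absent value is an Option that must be resolved wherever it is
            // produced, before this function is ever reachable.
            fn len(name: &str) -> usize {
                name.len()
            }

            fn demo(maybe_name: Option<&str>) -> usize {
                match maybe_name {
                    Some(name) => len(name),
                    None => 0, // the "what if it's missing" decision happens here
                }
            }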

        Disallowing "unsafe" operations without marking them explicitly unsafe doesn't give you a whole lot, but what you can do is expand the number of explicitly safe operations to cover more of what you want to do. How Rust, and many other programming languages, have been accomplishing this is by expanding the type system, and combining this with control flow analysis. Lifetimes in Rust are a prime example, but there are many more such examples. Nullability, for example, in languages like TypeScript. When you do it this way, the safety of such "safe" operations can be guaranteed, and while these guarantees do have some caveats, they are very strong to a lot of different situations that human code reviews are not, such as an unsafe combination of two otherwise-safe changesets.

        It's actually totally fine that some code will probably remain unable to be easily statically verified; the point is that we want to reduce the amount of code that can't be easily statically verified to be as small as possible. In the future we can use heavier-weight approaches to statically verify unsafe blocks, such as theorem provers that try to prove the correctness of "unsafe" code. But even just reducing the amount of not-necessarily-memory-safe code is an enormous win, for obvious reasons: it dramatically reduces the surface area for vulnerabilities. Moreover, time and time again, it is validated that most new vulnerabilities come from relatively recent changes in code, which is another huge win: a lot of the unsafe foundations actually don't need to be changed very often.

        There is absolutely nothing special about code written in Rust, it's doing the same shit that C code has been doing for decades (well, on the abstract anyway; I'm not trying to downplay how much more expressive it is by any means). What Rust mainly offers is a significantly more advanced type system that allows validating many more invariants at compile-time. God knows C developers on large projects like the Linux kernel care about validating invariants: large amounts of effort have been poured into static checking tools for C that do exactly this. Rust is a step further though, as the safe subset of Rust provides guarantees that you basically can't just tack onto C with only more static checking tools.

        • sfvisser 18 hours ago

          Isn’t the argument that by checking for NULL you can now safely crash/panic instead of going into undefined behavior and being a potential security hazard?

          • jchw 18 hours ago

            The potential for undefined behavior is, I will agree, potentially fairly serious, especially depending on specific circumstances... (In most cases it should reliably hit an unmapped page and cause an exception, but there are exceptions on weird targets or with huge offsets.) But, you can pretty much entirely ignore it if you can just guarantee that the pointer isn't NULL in the first place, which not only prevents you from needing to worry about the undefined behavior, but also about incorrect code that might violate the invariant in the first place, since it is statically-checked.

            If you were only afraid of the undefined behavior, you could augment the compiler to insert runtime checks anywhere undefined behavior could occur (which obviously can be done with Clang sanitizers.) However, the undefined behavior problem is really just a symptom of incorrect code, so it'd be even better if we could just prevent that instead.

            In high level languages like Java and Python there is just as much, if not more, interest in preventing null reference exceptions, even though they are "safe".

            • thayne 14 hours ago

              > In most cases it should reliably hit an unmapped page and cause an exception, but there are exceptions on weird targets or with huge offsets

              Perhaps the most important exception is when the optimizer assumed the pointer was non-null, so optimized it in a way that produces completely unexpected behavior when it is null.

              Also, use-after-free and use of uninitialized pointers are more likely to point to incorrect, but mapped, locations.

              • jchw 10 hours ago

                > Perhaps the most important exception is when the optimizer assumed the pointer was non-null, so optimized it in a way that produces completely unexpected behavior when it is null.

                > Also use-after-free and use of uninitialized pointers is more likely to point to incorrect, but mapped, locations.

                I stuck to a null pointer dereference because it's useful for demonstration since the side-effect of hitting one is usually not a huge deal, but actually it wouldn't matter if it were a huge deal or not. The point I'm trying to make, and maybe not making obvious enough, is that the null pointer dereference is just a symptom of the fact that other invariants are not being upheld; it's not just about preventing an unsafe operation, it's about preventing the kinds of incorrect code that lead to them. It's the same for a use-after-free. That's exactly why I am a fan of Rusts' borrow checker, you can statically eliminate the problem that causes use-after-frees.

                It isn't really that hard to construct a memory-safe programming language, but the "obvious" ways of doing it have trade-offs that are undesirable or infeasible for some use cases. Rather than make the operations "more safe" by duct-taping on runtime checks, Rust just forces the code to be more correct by statically checking invariants.

            • eru 17 hours ago

              > (In most cases it should reliably hit an unmapped page and cause an exception, but there are exceptions on weird targets or with huge offsets.)

              The kernel is one such exception.

              • jchw 11 hours ago

                Depends a lot on the system, but I don't think this is much of a problem with modern Linux systems. Looking on my machine, vm.mmap_min_addr is set to 65536, not to mention the mitigations modern CPUs have for preventing unintended access to user pages. Just as in userspace, a null dereference on a modern Linux system is almost guaranteed to hit a trap.

                That said, a potentially bigger problem is what happens when handling it. Instead of a kernel panic, nowadays you get a kernel oops. That's definitely going to have weird side-effects that could have e.g. security implications. But honestly, this all goes back to the original problem: in a lot of cases, there just isn't really a more sensible thing to do anyways. Even if the null dereference itself is potentially scary, by the time you get to the point where it might happen, you've already missed the actual underlying problem, and there might not be anything reasonable you can do.

                I will grant you though that there are definitely some exotic cases where null dereferences won't trap. But this wasn't the point, I glossed over it for a reason.

                • eru 10 hours ago
                  • jchw 10 hours ago

                    We're really going far out into the unrelated weeds now, but this relied on a myriad of bugs that were since fixed (like MMAP_PAGE_ZERO overriding mmap_min_addr, and MMAP_PAGE_ZERO not being cleared when exec'ing a setuid/setgid binary) and would be thwarted by modern processor mitigations (like SMAP and SMEP) which make this entire class of exploit usually impossible. You have to work a lot harder to have an exploitable null pointer dereference these days, and when you do, it's usually not related to the null pointer dereference itself, but actually what happens after trapping.

          • immibis 15 hours ago

            If that was the only point, we could simply add a compiler flag to make null pointer deref defined behaviour (raise SIGSEGV). It's already defined behaviour everywhere except the compiler's optimizer - unlike say a use after free.

        • eru 17 hours ago

          > But, in a lot of cases, there's really nothing particularly sensible to do: the pointer not being null is an invariant that was supposed to be upheld and it wasn't, and now at the point of dereference, at runtime, there's nothing to do except crash. Which is what would've happened anyways, so what's the point?

          Crashing is the lucky case! Specifically in the kernel, there can be valid memory at address 0, and there are exploits that capitalise on the friction between memory address 0 sometimes being valid and C's null pointer being full of undefined behaviour.

      • Ar-Curunir 17 hours ago

        This is nonsense. Just because some small parts of the code must be annotated as unsafe doesn’t mean that we’re suddenly back to C land. In comparison, with C the entire codebase is basically wrapped in a big unsafe. That difference is important, because in Rust you can focus your auditing and formal verification efforts on just those small unsafe blocks, whereas with C everything requires that same attention.

        Furthermore, Rust doesn’t turn off all checks in unsafe, only certain ones.

        • uecker 17 hours ago

          Only if you think correctness is only about memory safety can you "focus your auditing and formal efforts on just those small unsafe blocks". And this is a core problem with Rust: people think they can do this.

          • erik_seaberg 17 hours ago

            Memory and concurrency safety need to be the first steps, because how can you analyze results when the computer might not have executed your code correctly as written?

          • Ar-Curunir 14 hours ago

            My comment (and indeed this entire comment chain) is within the context of memory safety. This should have been clear from the focus on unsafe, which, compared to normal Rust, relaxes only memory safety.

            Obviously if you want to get formal guarantees beyond that property, you have to reason about safe code also.

            (Also, the comparison in this entire chain is against C, and the latter is better than Rust in this regard… how?)

            • uecker 14 hours ago

              Yes, this discussion is about memory safety, but this does not invalidate my argument. There is no point in only auditing your code with respect to memory safety, so the argument that you can simply ignore everything outside "unsafe" blocks is simply wrong.

    • rocqua 15 hours ago

      Rust is all about ring-fencing the scary parts in unsafe. A Rust program that doesn't use unsafe, and only uses dependencies that are sound in their use of unsafe, is guaranteed to be memory safe. And it is very easy to write code without using unsafe. Unlike C, where a coding style that guarantees memory safety is nigh impossible.

      The difficult bit with Rust is still the sound use of unsafe, but it is quite feasible to do that by hand. It does, sadly, require looking at the entire module that contains the unsafe code.

    • remram 18 hours ago

      I think the part that uses `unsafe` and can break Pin if done wrong is the implementation of the smart-pointer type, not its use.

    • mise_en_place 19 hours ago

      I'm not convinced you can have your cake and eat it too. Not a dig at Rust specifically; it's more like, you can have managed memory, or manually manage it yourself. There is no in-between. The implementation ends up being a kludge. Terry had it right, just put everything in the lower 2 GB and DMA it.

  • DoubleDecoded 19 hours ago

    "could break the guarantees" is a weak guarantee then.

    • josephcsible 19 hours ago

      It's still a guarantee. The point of the guarantee isn't that no code can cause certain kinds of problems, but rather that any code that can must be marked unsafe.

    • dzaima 5 hours ago

      Basically any guarantee can be broken by an evil guaranteer. All this is saying is that this is a case of the user being required to uphold a guarantee, instead of the stdlib/language.

  • stonethrowaway 19 hours ago

    Anyone who wants to do kernel-level development should first do Embedded hardware/software interfacing. No RTOS, plain “Embedded C”, with some bit banging and dealing with voltage spikes, transients, and people doing stupid shit to hardware (yes, really) and other things. Know the memory map and recite it from memory. Some might think I’m joking or being facetious - no, I’m pretty serious actually. I’d rather have an embedded person writing kernel drivers in slapped-together C than a Rustacean who complains about unsafe code and is an idiot about it. See [0] for a detailed explanation.

    People need to learn that the niceness of safety and perfect execution is a continuum of tolerances and flimsy guarantees from unmarked silicon that could be made in the US, but is most likely a knock-off made in China that will fail in a third of the expected time and give a false reading if you so much as look at it the wrong way.

    [0] https://www.usenix.org/system/files/1311_05-08_mickens.pdf

    • rcxdude 13 hours ago

      Plenty of people writing rust are coming from that background. And a substantial part of the reason for wanting it in linux is that the slapped-together C in linux drivers is often awful, both in terms of understanding of the hardware and in terms of the quality of the software. Rust can at least help the poor quality of the latter not affect the security and stability of the kernel so much.

    • sureglymop 18 hours ago

      You can do that in rust. Yes, you probably will have unsafe blocks. You can even write "C like" rust code that has a bunch of unsafe blocks but you'll benefit from better tooling. But maybe I misunderstand the article and there is somehow an implication that unsafe blocks are bad?

      When I was doing some embedded development using rust it was actually a great experience, with hal and pac crates available already for a lot of hardware or easy to generate.

    • jendjdndn 17 hours ago

      You sound a lot like "kids these days!"

      What applicable skills would someone writing a kernel driver gain from reciting a memory map? Abstractions exist for a reason.

      The skill is in creating useful and performant abstractions.

      • a5c11 12 hours ago

        The problem starts when the abstraction fails and you have to dive deeper, but no one taught you how to dive, only how to swim with your head above the water. Those who can dive can also swim; it doesn't work the other way round, though.

      • eru 16 hours ago

        > You sound a lot like "kids these days!"

        Exactly! Kids like stonethrowaway with their C. Real engineers write their compilers in assembly.

        (Less snarky, why stop at C, and not complain about even lower level stuff?)

        • stonethrowaway 16 hours ago

          > Real engineers write their compilers in assembly.

          Not sure where this misconception comes from. The engineering department mostly relies on Verilog twiddling, shoddy Spice models, debugging prototype boards with poorly crimped jumper wires, charged capacitors scattered around with no adults in the room, and freshly minted junior EEs who forget spice models and Verilog hacks aren’t the real thing.

          You have the wrong department. Software development is down the hall to the left. Those folks down there don’t even have an engineering degree.

          • eru 15 hours ago

            In any case, you might like https://docs.rust-embedded.org/book/

            I've recently done a bit of work on Rust in something like embedded systems. Only instead of real hardware, we are running our software on Zero Knowledge VMs, ie on math.

      • exmadscientist 15 hours ago

        See, I read the parent post's point not as "abstractions are bad" but as "it's much better to be someone who doesn't need the abstraction, but chooses to use it". I have worked with a number of crappy embedded developers in my career. Somehow, the ones who are capable of going beneath the many levels of abstraction are always really, really good at their jobs.

        So it's not that embedded Rust is bad. It's that developers who can't do their jobs without embedded Rust are usually very bad indeed. It's great when it's a choice. It's terrible when you lack the skills or perspective to work with what's under the hood.