Checked-size array parameters in C

(lwn.net)

68 points | by chmaynard 8 hours ago

26 comments

  • Arch-TK 6 hours ago

    That weird feeling when you realise that the people you hang out with form such a weird niche that something considered common knowledge among you is being described as "buried deep within the C standard".

    What's noteworthy is that the compiler isn't required to generate a warning if the array is too small. That's just GCC being generous with its help. The official stance is that it's simply undefined behaviour to pass a pointer to an object which is too small (yes, only to pass, even if you don't access it).
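
    For anyone who has not seen the syntax, a minimal sketch follows. `fold_key`, `demo_fold_key` and `KEY_LEN` are made-up names, not from the article. The `static KEY_LEN` inside the brackets is the caller's promise that the pointer refers to the first of at least that many elements.

    ```c
    #include <assert.h>
    #include <stddef.h>

    #define KEY_LEN 32  /* hypothetical key size, for illustration only */

    /* Caller promises: `key` points to the first of at least KEY_LEN bytes. */
    unsigned char fold_key(const unsigned char key[static KEY_LEN])
    {
        unsigned char acc = 0;
        for (size_t i = 0; i < KEY_LEN; i++)
            acc ^= key[i];
        return acc;
    }

    unsigned char demo_fold_key(void)
    {
        unsigned char key[KEY_LEN] = {0x0f, 0xf0};  /* remaining bytes are zero */
        assert(sizeof key >= KEY_LEN);              /* the promise holds here */
        /* Passing a 16-byte array instead would be undefined behaviour;
           GCC may warn about it, but is not required to. */
        return fold_key(key);
    }
    ```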

  • Veserv 6 hours ago

    Pointer to array is not only type-safe, it is also objectively correct and should have always been the syntax used when passing in the address of a known, fixed size array. This is all an artifact of C automatically decaying arrays to pointers in argument lists when an array argument should have always meant passing an array by value; then this syntax would have been the only way to pass in the address of an array and we would not have these warts. Automatic decaying is truly one of the worst actual design mistakes of the language (i.e. an error even when it was designed, not the failure to adopt new innovations).
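
    To make the contrast concrete, a small sketch (`zero_block`, `demo_zero_block` and `BLOCK_SIZE` are made-up names). With a pointer-to-array parameter the size is part of the type, so `sizeof *block` is the full array size, and passing the address of a differently sized array is an incompatible pointer type that the compiler has to diagnose.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    #define BLOCK_SIZE 16  /* hypothetical size, for illustration only */

    /* The parameter is a pointer to an array of exactly BLOCK_SIZE bytes. */
    size_t zero_block(unsigned char (*block)[BLOCK_SIZE])
    {
        memset(*block, 0, sizeof *block);  /* sizeof *block == BLOCK_SIZE */
        return sizeof *block;
    }

    size_t demo_zero_block(void)
    {
        unsigned char ok[BLOCK_SIZE] = {1, 2, 3};
        size_t n = zero_block(&ok);        /* note the explicit & */
        assert(ok[0] == 0 && ok[BLOCK_SIZE - 1] == 0);

        /* unsigned char small[8];
           zero_block(&small);    rejected: incompatible pointer type */
        return n;
    }
    ```

    The cost is visible at the call site and in the callee: callers write `&ok`, and the callee writes `(*block)[i]` to index.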

    • jacquesm 6 hours ago

      Fully agreed, and something that is hard to fix. This guy is trying really hard and with some success:

      https://news.ycombinator.com/item?id=45735877

      • wild_pointer 6 hours ago

        This guy is doing something else completely. In his words:

        > In my testing, it's between 1.2x and 4x slower than Yolo-C. It uses between 2x and 3x more memory. Others have observed higher overheads in certain tests (I've heard of some things being 8x slower). How much this matters depends on your perspective. Imagine running your desktop environment on a 4x slower computer with 3x less memory. You've probably done exactly this and you probably survived the experience. So the catch is: Fil-C is for folks who want the security benefits badly enough.

        (from https://news.ycombinator.com/item?id=46090332)

        We're talking about a lack of fat pointers here, and switching to GC and having a 4x slower computer experience is not required for that.

        • Veserv 5 hours ago

          I am actually not talking about the lack of fat pointers. That is almost entirely orthogonal to my point. I am talking about the fact that what would be the syntax for passing an array by value was repurposed for automatically decaying into a pointer. This results in a massive and unnecessary syntactic wart.

          The fact that the correct type signature, a pointer to fixed-size array, exists and that you can create a struct containing a fixed-size array member and pass that in by value completely invalidates any possible argument for having special semantics for fixed-size array parameters. Automatic decay should have died when it became possible to pass structs by value. Its continued existence continues to result in people writing objectively inferior function signatures (though part of this is the absurdity of C type declarations making the objectively correct type a pain to write or use, another one of the worst actual design mistakes).

          Fat pointers or argument-aware non-fixed size array parameters are a separate valuable feature, but it is at least understandable for them to not have been included at the time.

          • moefh 4 hours ago

            > The fact that the correct type signature, a pointer to fixed-size array, exists and that you can create a struct containing a fixed-size array member and pass that in by value completely invalidates any possible argument for having special semantics for fixed-size array parameters.

            That's not entirely accurate: "fixed-size" array parameters (unlike pointers to arrays or arrays in structs) actually say that the array must be at least that size, not exactly that size, which makes them way more flexible (e.g. you don't need a buffer of an exact size, it can be larger). The examples from the article are neat but fairly specific because cryptographic functions always work with pre-defined array sizes, unlike most algorithms.
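
            A quick sketch of the "at least" semantics (`sum_first4` and `demo_at_least` are made-up names): the same function accepts a larger buffer and also a pointer into the middle of one, as long as enough elements remain. A parameter of type `int (*)[4]` would accept neither `&big` nor `big + 6` without a cast.

            ```c
            #include <assert.h>

            /* Requires at least 4 elements; more is fine. Hypothetical function. */
            int sum_first4(const int v[static 4])
            {
                return v[0] + v[1] + v[2] + v[3];
            }

            int demo_at_least(void)
            {
                int big[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
                int head = sum_first4(big);      /* 10 elements available: 1+2+3+4 */
                int tail = sum_first4(big + 6);  /* exactly 4 remain: 7+8+9+10 */
                assert(head == 10 && tail == 34);
                return head + tail;
            }
            ```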

            Incidentally, that was one of the main complaints about Pascal back in the day (see section 2.1 of [1]): it originally had only fixed-size arrays and strings, with no way for a function to accept a "generic array" or a "generic string" with size unknown at compile time.

            [1] https://www.cs.virginia.edu/~evans/cs655/readings/bwk-on-pas...

        • jacquesm 5 hours ago

          This is not about performance.

  • nikeee 6 hours ago

    GCC also has an extension to support references to other parameters of the function:

        #include <stddef.h>
        void foo(size_t n, int b[static n]);
    
    https://godbolt.org/z/c4o7hGaG1

    It is not limited to compile-time constants. Doesn't work in clang, sadly.
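
    A slightly fuller sketch of the same form, with a definition and a call (`sum_n` and `demo_sum_n` are made-up names). The bound is an ordinary run-time value here, so whether a too-small argument gets diagnosed depends on the compiler and on what it can see at the call site.

    ```c
    #include <assert.h>
    #include <stddef.h>

    /* `n` has to be declared before it is used in the array bound. */
    int sum_n(size_t n, const int b[static n])
    {
        int total = 0;
        for (size_t i = 0; i < n; i++)
            total += b[i];
        return total;
    }

    int demo_sum_n(void)
    {
        int values[] = {5, 6, 7};
        size_t count = sizeof values / sizeof values[0];
        assert(count == 3);
        return sum_n(count, values);   /* 5 + 6 + 7 */
    }
    ```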

  • o11c 6 hours ago

    Better option: just wrap it in a unique struct.

    There are perhaps only 3 numbers: 0, 1, and lots. A fair argument might be made that 2 also exists, but for anything higher, you need to think about your abstraction.

    • pixl97 5 hours ago

      • kalterdev 2 hours ago

        Nice article, never seen that.

        I’ve always thought it’s good practice for a system to declare its limits upfront. That feels more honest than promising “infinity” but then failing to scale in practice. Prematurely designing for infinity can also cause over-engineering, like using quicksort on an array of four elements.

        Scale isn’t a binary choice between “off” and “infinity.” It’s a continuum we navigate with small, deliberate, and often painful steps—not a single, massive, upfront investment.

        That said, I agree the ZOI is a valuable guideline for abstraction, though less so for implementation.

  • anonymousiam an hour ago

    This would not be the first time that the "static" keyword in C was reused for something "new" (relative to the original K&R pre-ANSI C).

    https://developer.arm.com/community/arm-community-blogs/b/em...

  • Animats an hour ago

    But only for constant size arrays.

    You could just declare

        struct Nonce {
            char nonce_data[SIZE_OF_NONCE];
        };
    
    and pass those around to get roughly the same effect.
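
    A sketch of what that looks like in use, assuming a made-up value for `SIZE_OF_NONCE` (`scrub_copy` and `demo_nonce` are also hypothetical). The struct is passed by value, `sizeof` still reports the real array size inside the callee, and passing any other struct type is a hard compile error.

    ```c
    #include <assert.h>
    #include <string.h>

    #define SIZE_OF_NONCE 12  /* hypothetical value; the comment gives none */

    struct Nonce {
        char nonce_data[SIZE_OF_NONCE];
    };

    /* Takes the struct by value: the callee works on its own copy. */
    size_t scrub_copy(struct Nonce n)
    {
        memset(n.nonce_data, 0, sizeof n.nonce_data);
        return sizeof n.nonce_data;
    }

    int demo_nonce(void)
    {
        struct Nonce n = { .nonce_data = "abc" };
        size_t len = scrub_copy(n);
        /* The caller's array is untouched, because a copy was passed. */
        assert(n.nonce_data[0] == 'a');
        return len == SIZE_OF_NONCE;
    }
    ```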

  • aaaashley 6 hours ago

    Funny thing about that n[static M] array checking syntax–it was even considered bad in 1999, when it was included:

    "There was a unanimous vote that the feature is ugly, and a good consensus that its incorporation into the standard at the 11th hour was an unfortunate decision." - Raymond Mak (Canada C Working Group), https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_205.htm

    • jacquesm 6 hours ago

      It wasn't considered bad, it was considered ugly and in the context given that is a major difference. The proposed alternative in that post to me is even more ugly so I would have agreed with the option that received the most support, to leave it as it was.

      • moefh 5 hours ago

        It was always considered bad not (just) because it's ugly, but because it hides potential problems and adds no safety at all: a `[static N]` parameter tells the compiler that the parameter will never be NULL, but the function can still be called with a NULL pointer anyway.

        That is the current state of both gcc and clang: they will both happily, without warnings, pass a NULL pointer to a function with a `[static N]` parameter, and then REMOVE ANY NULL CHECK from the function, because the argument can't possibly be NULL according to the function signature, so the check is obviously redundant.

        See the example in [1]: note that in the assembly of `f1` the NULL check is removed, while it's present in the "unsafe" `f2`, making it actually safer.

        Also note that gcc will at least tell you that the check in `f1()` is "useless" (yet no warning about `g()` calling it with a pointer that could be NULL), while clang sees nothing wrong at all.

        [1] https://godbolt.org/z/ba6rxc8W5
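
        For readers who can't open the link, a sketch of the defensive pattern this implies. It is not the code behind the link, and `first_elem` and `safe_first` are made-up names. Since a NULL check inside the callee may be optimised away, the check has to happen in the caller, before the promise is made.

        ```c
        #include <assert.h>
        #include <stddef.h>

        /* `static 1` promises a non-NULL pointer, so a compiler is entitled
           to delete a NULL check written inside this function. */
        int first_elem(const int p[static 1])
        {
            return p[0];
        }

        /* The check belongs here, where the pointer may still be NULL. */
        int safe_first(const int *maybe_null, int fallback)
        {
            if (maybe_null == NULL)
                return fallback;
            return first_elem(maybe_null);
        }
        ```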

        • MobiusHorizons 10 minutes ago

          Wow, that’s crazy. Does anyone have any context on why they didn’t fix this by either disallowing NULL, or not treating the pointer as non-nullable? I’m assuming there is code that was expecting this not to error, but the combination really seems like a bug not just a sharp edge.

        • jacquesm 5 hours ago

          Interesting, I wasn't aware of that and thought the compiler would at least throw up a warning if it had seen that function prototype.

          • moefh 5 hours ago

            It's not intuitive, although it arguably conforms to the general C philosophy of not getting in the way unless the code has no chance of being right.

            For example, both compilers do complain if you try to pass a literal NULL to `f1` (because that can't possibly be right), the same way they warn about division by a literal zero but give no warnings about dividing by a number that is not known to be nonzero.

            • jacquesm 4 hours ago

              Right, so if the value is known at compile time it will flag the error but if it only appears at runtime it will happily consume the null and wreak whatever havoc that will lead to further down the line. Ok, thank you for pointing this out, I must have held that misconception for a really long time.

        • OneDeuxTriSeiGo 4 hours ago

          Note that the point of [static N] and [N] is to enforce type safety for "internal code". Any external ABI facing code should not use it and arguably there should be a lint/warning for its usage across an untrusted interface.

          Inside of a project that's all compiled together however it tends to work as expected. It's just that you must make sure your nullable pointers are being checked (which of course one can enforce with annotations in C).

          TLDR: Explicit non-null pointers work just fine but you shouldn't be using them on external interfaces and if you are using them in general you should be annotating and/or explicitly checking your nullable pointers as soon as they cross your external interfaces.

  • kazinator 4 hours ago

    The pointer-to-array solution is okay, with the caveat that pointer-to-array typedefs should be avoided.

    The problem is that they are attractive for reducing repeated declarations:

      typedef unsigned char thing_t[THING_SIZE];
    
      struct red_box_with_a_hook {
         thing_t thing1, thing2;
      };
    
      void shake_hands_with(thing_t *thing);
    
    That is all well. But thing_t is an array type which still decays to pointer.

    It looks as if thing_t can be passed by value, but since it is an array, it sneakily isn't passed by value:

      void catch_with_net(thing_t thing);  // thing's type is actually "unsigned char *"
    
      // ...
        unsigned char x[42];
        catch_with_net(x);        // pointer to first element passed; type checks
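
    A way to see the adjustment for yourself, as a sketch that assumes a C11 compiler for `_Generic`. `THING_SIZE` gets a made-up value, and `param_is_pointer`, `pointee_size` and `demo_thing` are hypothetical names.

    ```c
    #include <assert.h>

    #define THING_SIZE 42  /* hypothetical size, for illustration only */
    typedef unsigned char thing_t[THING_SIZE];

    /* Despite appearances, `thing` is adjusted to `unsigned char *`. */
    int param_is_pointer(thing_t thing)
    {
        return _Generic(thing, unsigned char *: 1, default: 0);
    }

    /* Through a pointer to the typedef, the array type survives. */
    int pointee_size(thing_t *thing)
    {
        return (int)sizeof *thing;
    }

    int demo_thing(void)
    {
        thing_t t = {0};
        assert(sizeof t == THING_SIZE);
        return param_is_pointer(t) && pointee_size(&t) == THING_SIZE;
    }
    ```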

  • 1over137 2 hours ago

    Anyone know if there's a flag to tell clang to treat `void fn(int array[N])` as if it was `void fn(int array[static N])`?

  • halayli 5 hours ago

    on a similar note, there are also these field attributes that are very helpful for catching similar issues:

    https://clang.llvm.org/docs/AttributeReference.html#counted-...

  • Philpax 5 hours ago

    Excited for Walter to drop by and extol the virtues of fat pointers :-)

    For reference: https://digitalmars.com/articles/C-biggest-mistake.html