The fix wasn't easy, or C precedence bites

(boston.conman.org)

28 points | by ingve 4 days ago ago

41 comments

  • procaryote 15 hours ago

    The code style on this makes my eyes bleed. It was a long time since I saw anyone do "if () single statement; else { block of statements }"

    Making the first thing a block doesn't add any lines and makes it less brittle, and makes future diffs better

    And they do some weird alignment of assignments, and for some reason carry on adding extra spaces for some assignments even when they're alone in a block?

    And they go out of their way to do pointer arithmetic rather than array operations that are more readable

    And the code is essentially sscanf(str, "%%%2x", &value) plus some checks, so why not write that instead?

    Also what kind of psycho uses CppStyleFunctionNames() in C?

    • zabzonk 13 hours ago

      > Also what kind of psycho uses CppStyleFunctionNames() in C?

      People influenced by the Win32 C API. I prefer it that way myself.

      • pjmlp 11 hours ago

        That style predates Windows.

        • GabrielTFS 10 hours ago

          I would guess a significant portion of people using the style (if not most), did so inspired by Windows, though

          • pjmlp 9 hours ago

            It was common across all not UNIX operating systems.

            You will find it on MS-DOS, Amiga, OS/2, Mac OS.

            It was based on what was common in ALGOL derived languages.

            On UNIX you will find it on X Windows, and Motif.

            • uxp100 8 hours ago

              Macintosh toolbox was Pascal first right? Or at least there was an era where it was. And I think this naming convention is kinda a pascal thing.

              • pjmlp 8 hours ago

                Yes, Apple is the actual creator of Object Pascal variant, and there was a Pascal based OS using P-Code for Apple II GS.

    • stevage 15 hours ago

      Tbh it's been a while since I've seen anyone manually format code. Automatically formatted code definitely does help avoid certain kinds of bugs.

      Does C not have the equivalent of Prettier?

      I am curious what their editing process was that changed:

      > assert(isxdigit((src+1)));

      to

      > if (!isxdigit(src+1)) return '\0';

      • commandersaki 14 hours ago

        Does C not have the equivalent of Prettier?

        There is clang-format, but it is not always easy to use on a code base because you don't want to have it run on imported/3rd party source which may mix in with your regular source tree. I've been meaning to use it on a project but only have run on an explicit list of files.

      • 9029 10 hours ago

        > I am curious what their editing process was

        They said:

        > I typed in the new code as that's faster than modifying the existing code

        • stevage 10 hours ago

          Ah. I bet there's a lot of vim users violently disagreeing...

      • ciupicri 11 hours ago

        There's the good old GNU indent https://www.gnu.org/software/indent/

        > But even if you fail in getting emacs to do sane formatting, not everything is lost: use indent.

        > Now, again, GNU indent has the same brain-dead settings that GNU emacs has, which is why you need to give it a few command line options. However, that’s not too bad, because even the makers of GNU indent recognize the authority of K&R (the GNU people aren’t evil, they are just severely misguided in this matter), so you just give indent the options -kr -i8 (stands for K&R, 8 character indents), or use scripts/Lindent, which indents in the latest style.

        > indent has a lot of options, and especially when it comes to comment re-formatting you may want to take a look at the man page. But remember: indent is not a fix for bad programming.

        (Linux kernel coding style, https://www.kernel.org/doc/html/latest/process/coding-style....)

    • lastdong 14 hours ago

      I was thinking the same thing; also being more explicit would have prevented the bug described in the first place

  • foofoo12 15 hours ago

    It's really easy to add a remote code execution feature to your project, especially if you code like this. You might even add that feature subconsciously at 3am, while hacking on a different feature.

  • johnfn 14 hours ago

    Is this really a difficult precedence issue? It seems quite obvious to me that *foo + 1 parses as (*foo) + 1.

    One that has gotten me more than once in Python (I really don’t code Python that much) is “1 + 2 if True else 3”. I keep thinking the parenthesis are “1 + (2 if True else 3)”, but it’s actually (1 + 2). Or am I lying to you and it’s the other way around?! I don’t know, why don’t you go check the Python interpreter :)

    • jolmg 10 hours ago

      Conditional expressions/operators, including e.g. the `?:` ternary operator in C-like languages, typically have about the lowest precedence, higher only than the assignment operators (and the comma operator in C-like languages). It's not just a Python thing; you'll find `+` having higher precedence basically everywhere.

      Think of

        x = 1 + 2 if True else 3
      
      like a shorthand for:

        if True:
          x = 1 + 2
        else:
          x = 3
      
      which can be a common pattern in languages that don't really have conditional expressions, like bash.
    • II2II 10 hours ago

      > Is this really a difficult precedence issue? It seems quite obvious to me that foo + 1 parses as (foo) + 1.

      Keep in mind that precedence rules are arbitrary constructs, typically based upon what the rule maker perceived as more convenient. Perceptions will vary from person to person, so there is no objective obvious about them. Heck, there isn't even anything obvious about infix notation (see Forth or Lisp). Or, in the case of unary operators, it isn't obvious that the operator should come before or after the object it is operating on (consider how we negate as a prefix, while factorial is a suffix).

      • johnfn 5 hours ago

        Do you think ++foo + 1 parses as ++(foo + 1)? Despite what you say, this seems obvious to me.

    • magicalhippo 10 hours ago

      I really dislike relying on precedence when it's more complicated than a few terms of basic arithmetic.

      Parentheses are free and makes it absolutely clear what the intention is.

    • afiori 14 hours ago

      I have often felt this doubt but the only two cases where my intuition was actually wrong was with `new` operators and php's ternary operator

    • ErroneousBosh 14 hours ago

      Yes but as I said yesterday on another post, "Yngwie Malmsteen Code".

      You could write it clearly by saying foo[1] instead of *(foo+1) which is what they ended up doing, but hey, pointer arithmetic looks complicated and clever, so let's show off with a WEEDLYWEEDLYWEEDLY guitar solo bit of code.

      • quietbritishjim 29 minutes ago

        When you are manipulating (mostly) one pointed-to element at a time, and incrementing the pointer itself in between, then that's quite a different mindset compared to using an index into an array. I agree that the subscript operator is the cleanest solution here. My point is just that I think it was missed because it's easy to overlook, rather than because, as you say, someone is deliberately trying to be too clever.

  • bsder 17 hours ago

    Even as someone who doesn't mind writing in C, I would absolutely flag that function as way, way, way, way too terse for no good reason.

    The code gets ridiculously easier to read if you write src[0] and src[1] instead of (*src) and (*(src+1)). And, as a bonus, the precedence problem disappears.

    I really don't understand people why write C code like the original code. It's just asking for a bug.

    • shash 17 hours ago

      In general avoid frivolous use of pointer arithmetic. foo[k] and *(foo+k) will usually generate identical asm, and the former is just easier to read…

      • danhau 9 hours ago

        Usually? I‘m willing to bet it will always. I wouldn‘t be surprised if the standard even specifies these two to be identical.

        • 1718627440 6 hours ago

          a[b] is defined as syntactic sugar for *(a+b), so yes.

      • cozzyd 8 hours ago

        And if you want to go for eclectic, you can do [k]foo

        • loeg an hour ago

          You mean k[foo].

          • cozzyd 4 minutes ago

            Yes indeed.

    • hyghjiyhu 8 hours ago

      For me it would be natural to use src[1] for an array but *(src+1) for an iterator pointer.

    • 17 hours ago
      [deleted]
    • commandersaki 14 hours ago

      I know the precedence because I memorise the terse implementation of strcpy, but (also) write *src++ as *(src++) and even though it remains a bit of a mouthful the parenthesis ensures there is no doubt how to evaluate the expression.

    • 15 hours ago
      [deleted]
    • stevage 15 hours ago

      To be fair, OP rewrote it as `src[1]`. It sounds like this was old code, maybe they weren't a good coder when they started.

    • spyrja 16 hours ago

      And for the love of God, please don't do 0[src].

  • loeg 18 hours ago

    There's another bug, where you don't update psrc in those error return cases. The parser will be stuck at the malformed % forever. Or maybe that is desired; it's hard to tell.

    The precedence stuff here is pretty basic. When in doubt, using parentheses to make order explicit is never wrong. Or consult https://en.cppreference.com/w/c/language/operator_precedence... .

    • augustk 10 hours ago

      Better stick to a single point of exit.

      https://news.ycombinator.com/item?id=20311080

      • foofoo12 2 hours ago

        There is nothing wrong with early exit, but you have to be sensible.

        Just like there's nothing wrong with the ternary operator if you are sensible. I've seen nested ?: abominations that would make Jesus give you a funny look.

      • loeg an hour ago

        Well, maybe. Could also just use psrc directly instead of doing the manipulation in src.

  • Surac 12 hours ago

    I have a little cheat sheet sticking on the side of my monitor that I update everything I have to look up some syntax. There I have a table on operator precedence and a reminder when to use & and &&. I never need the sheet but I always have ist in clear eyesight

    • abanana 11 hours ago

      > I never need the sheet

      Often, just the act of making a cheat sheet somehow helps to fix the principles in the brain, so you rarely if ever need to refer to it afterwards. Something I've personally found multiple times anyway.