You can use C-Reduce for any language

(bernsteinbear.com)

162 points | by Tomte 3 hours ago ago

27 comments

  • asmeurer 2 hours ago

    A paper by the authors John Regehr et al. from 2012 explaining how it works https://fsl.cs.illinois.edu/publications/regehr-chen-cuoq-ei....

    • gnulinux 2 hours ago

      I read this paper and I still feel lost as to how can this even be possible. It seems to understand how to tokenize, merge lines, remove tokens etc for arbitrary programming languages. Is there another paper that explains this algorithm alone?

      • IshKebab an hour ago

        It basically runs transformation passes over the code until they stop changing the code or break its behaviour.

        It seems like a small number of the passes are not specific to C:

        https://github.com/csmith-project/creduce/blob/master/creduc...

        `"C" => 1,` means it is only for C.

        I would guess `pass_lines` is the most important for non-C code; I'm guessing (it's written in unreadable Perl) that it removes lines.

        So while it can work for languages other than C, most of its features are C-specific so it's not going to work nearly as well. Still I'd never heard of C-Reduce; pretty cool tool!

        Someone make one based on Tree Sitter, quick!

    • nyrikki 27 minutes ago

      This paper is not about C-Reduce as a whole but 3 new "domain-specific test-case reducers" that the authors added to the project.

      Most of the non-domain specific reductions in C-Reduce is simply brute force IIRC.

  • lambda an hour ago

    Note that as recommended by John Regehr, author of C-Reduce, for this use case you might also want to try Shrinkray, a tool that was written to be format independent and works well for cases that C-Reduce dow not: https://mastodon.social/@regehr/113489759789563570

  • graypegg 2 hours ago

    Found this article with examples of before and after: https://pramodkumbhar.com/2024/01/c-reduce-systematically-ta...

    But still having trouble understanding how it knows WHAT to remove when trying each iteration. There must be some tokenization going on, but then I don't know how that would work across programming languages

  • jchw 2 hours ago

    I did not know about C-Reduce until just now, and I'm already hooked. This is basically like discovering git bisect for the first time.

    I'll have to stuff that into the back of my mind until I run into something where it might fit.

  • wood_spirit 2 hours ago

    Creduce is awesome.

    I used a test script that spent hours using CSmith to generate random test programs when developing an esoteric llvm target. When they crashed it automatically Creduced and left them for inspection. Invaluable!

  • def- an hour ago

    Works well for SQL too, I‘ve been using it at my day job, found out via https://github.com/sqlancer/sqlancer?tab=readme-ov-file#redu...

  • judofyr an hour ago

    Well, aren't you going to share the reduced file?

    Okay fine, let's see for ourselves:

        # Setup
        git clone git@github.com:RustPython/RustPython.git && cd RustPython && cargo build --release
        git clone git@github.com:tekknolagi/scrapscript.git && cp scrapscript/scrapscript.py .
    
        # Download interesting.sh, replace the path to RustPython
        chmod +x interesting.sh
    
        # Command in article:
        nix run nixpkgs#creduce -- --not-c interesting.sh scrapscript.py
    
    
    Niiice:

        (93.2 %, 13717 bytes)
        (93.2 %, 13683 bytes)
        (93.3 %, 13614 bytes)
        (93.3 %, 13592 bytes)
        (93.3 %, 13571 bytes)
        (93.3 %, 13517 bytes)
        (93.3 %, 13449 bytes)
        (93.4 %, 13412 bytes)
        (93.4 %, 13365 bytes)
        (93.4 %, 13333 bytes)
        (93.4 %, 13313 bytes)
    
    
    It seems to have stopped at "(96.4 %, 7347 bytes)" with the following output: https://gist.github.com/judofyr/47cba8a20cb2cd5798943ef975d0...
  • furyofantares 2 hours ago

    Without an explanation of why it works on languages other than C, that is a hard claim to believe! I mean, I don't think they're lying but (given it doesn't use an LLM) I am bewildered.

    • wiml 2 hours ago

      I recommend skimming the PLDI paper linked by asmeurer, it has a good summary. Some of its transforms are pretty C-specific, using the Clang frontend; some are pretty generic and probably work on any Algol-descended language. It's meant to be a modular tool, so you could add transforms that understand other languages if you like.

    • bqmjjx0kac 2 hours ago

      Yes, and without understanding how it works, I'm left wondering whether it's even safe to use this way. Will creduce execute mutated versions of the input script, potentially deleting my files or eating my lunch?

      • None4U 2 hours ago

        it will execute whatever script you pass it (immutable) and mutate whatever other files you pass it

        • defen an hour ago

          I think the question is, how do you know C-reduce isn't going to mutate your test case into calling `remove("/path/to/my/important/file");`

          • frabert 10 minutes ago

            C-reduce is meant to... Reduce your files, it would not add stuff that was not there in the first place. Also, I think it's meant to only be run against the "frontend" of most languages, not full execution

    • Kuinox 2 hours ago

      Without looking at the paper, I would guess it would be like a fuzzer that apply mutations that reduce the size of the input.

      • gnulinux an hour ago

        Isn't that potentially unsafe though? Random permutations can reduce arbitrary programs to destructive programs, in particular any program can be mutated to `rm -rf /` given enough time. Also even the problem of finding syntactically valid programs is combinatoric, it's pretty surprising that it can go toward a local optima in such a short time without having any understanding of the function or its derivatives.

        • jewel an hour ago

          For compiled languages it should be fine, as you're only going to compile the permuted source code, not execute it.

          Given a sufficiently bad compiler bug anything is possible, but I think given that it's trying to minimize the size of an input that gives a particular output I don't think it's likely to explore distant branches.

          • Quekid5 an hour ago

            > Given a sufficiently bad compiler bug anything is possible, ...

            I can't help but wonder about the consteval/constexpr machinery in the various C++ compilers... It'd be interesting to see how much adversarial input has been tested for those.

            (I guess Zig's comptime might also be relevant, but I'm not sure what that's allowed to do. Maybe execute arbitrary code?)

            ... anyway just a stray thought.

    • johnisgood 2 hours ago

      Yeah, I would have liked to see more examples and an explanation.

      • evmar 2 hours ago

        I glanced at the code. It seems to have multiple possible "passes" which reduce the code in various ways, and the passes here not tagged with "C"=>1 are used in the not-c mode recommended in the post.

        https://github.com/csmith-project/creduce/blob/31e855e290970...

        You can see the pass implementation in files on the left.

        • wat10000 2 hours ago

          The key thing is that the transforms it makes aren’t required to always produce a program that can actually run, or even build.

          The user provides a script which determines if a program is “interesting.” A program with build-time errors should be considered “not interesting” in most cases. (And if you’re hunting an incorrectly reported error, you’ll want to make sure you only consider that error to be interesting.)

          Then it yoinks out parts of your program and checks if the result is still interesting. If it’s not, that attempt is discarded and it tries something else. If the reduced program is still interesting, it will try yoinking out more stuff. Repeat until you like the result.

          There doesn’t need to be any understanding of the program in order for this to work. You just need something where removing some bit of the program has a chance of producing a new program that’s still interesting. That works in most programming languages.

          This process can take a long time, and it’s slower when there’s a lower probability of a removal producing an interesting program. Thus heuristics are added: don’t remove random character ranges, but work at a token level. When removing a brace, find the matching brace and remove the insides. That sort of thing. Most modern languages are similar enough to C that many of these rules are helpful for them too. And even if they don’t help, this just slows the process down, but it still works.

    • keybored 2 hours ago

      > I mean, I don't think they're lying but (given it doesn't use an LLM) I am bewildered.

      I’m pretty sure (based on the last time I saw this) that this is just good old fashioned computer science.[1]

      I really hope that HN hasn’t forgotten about good old fashioned computer science stuff already.

      [1] All the algorithm and whatnot stuff, not the spooky machine learning stuff. Even things like Prolog-for-AI, although that has the slight downside of not working (for the purposes of making AI).

      • furyofantares 30 minutes ago

        To be clear my comment was meant to be an awareness that it is good old fashioned computer science. Without LLMs, which this predates, it is surprising to me that you'd have a lot of success reducing a program in an arbitrary language and still having something that's valid syntax!

        I do get that it will reject a lot of stuff as not working (and has to even in the target language) and I also get that algol-like languages are all very similar, but I am still surprised that it works well enough to function on ~arbitrary languages.

        These are very LLM-era properties for a program to have. The question is not "does it work for language x" but "how well does it work for language x", and the answer is not "is it one of the languages it was designed for" but instead "idunno try it out and see".

  • andrewchambers an hour ago

    I wrote a general purpose delta debugger in python - https://github.com/andrewchambers/ddmin-python. Can be used to do the same thing.