Making memcpy(NULL, NULL, 0) well-defined

(developers.redhat.com)

234 points | by gslin 10 months ago ago

262 comments

whytevuhuni 10 months ago

How interesting. GCC does indeed remove that branch.

https://godbolt.org/z/aPcr1bfPe

[-]

ndesaulniers 10 months ago

> For example, GCC will happily remove the dest == NULL branch in the following code

I think the blog should mention `-fno-delete-null-pointer-checks`

https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#ind...

[-]

AceJohnny2 10 months ago

> -fdelete-null-pointer-checks

> [...]

> This option is enabled by default on most targets.

What a footgun.

I understand that, in an effort to compete with other compilers for relevance, GCC pursued performance over safety. Has that era passed? Could GCC choose safer over fast?

Alternatively, has someone compiled a list of flags one might want to enable in latest GCC to avoid such kinds of dangerous optimizations?

[-]

comex 10 months ago

Just for the record, that's not the main purpose of -fdelete-null-pointer-checks.

Normally, it only deletes null checks after actual null pointer dereferences. In principle this can't change observable behavior. Null dereferences are guaranteed to trap, so if you don't trap, it means the pointer wasn't null. In other words, unlike most C compiler optimizations, -fdelete-null-pointer-checks should be safe even if you do commit undefined behavior.

This once caused a kerfuffle with the Linux kernel. At the time, x86_64 CPUs allowed the kernel to dereference userspace addresses, and the kernel allowed userspace to map address 0. Therefore, it was possible for userspace to arrange for null pointers to not trap when dereferenced in the kernel. Which meant that the null check optimization could actually change observable behavior. Which introduced a security vulnerability. [1]

Since then, Linux has been compiled with `-fno-delete-null-pointer-checks`, but it's not really necessary: Linux systems have long since enforced that userspace can't map address 0, which means that deleting null pointer checks should be safe in both kernel and userspace. (Newer CPU security features also protect the kernel even if userspace is allowed to map address 0.)

But anyway, I didn't know that -fdelete-null-pointer-checks treated "memcpy with potentially-zero size" as a condition to remove subsequent null pointer checks. That means that the optimization actually isn't safe! Once GCC is updated to respect the newly well-defined behavior, though, it should become truly safe. Probably.

The same can't be said for most UB optimizations – most of which can't be turned off.

[1] https://lwn.net/Articles/342330/

[-]

robinsonb5 10 months ago

> Null dereferences are guaranteed to trap, so if you don't trap, it means the pointer wasn't null.

[-]

dfe 10 months ago

I once spent hours if not days debugging a problem with some code I had recently written because of this exact optimization.

It wasn't an embedded system, but rather an x86 BIOS boot loader, which is sort of halfway there. Protected mode enabled without paging, so there's nothing to trap a NULL.

Completely by accident I had dereferenced a pointer before doing a NULL check. I think the dereference was just printing some integer, which of course had a perfectly sane-looking value so I didn't even think about it.

The compiler, I can't remember if it was gcc or clang by this point, decided that since I had already successfully dereferenced the pointer it could just elide the null check and the code path associated with it.

Finally I ran it in VMware and attached a debugger, which skipped right over the null check even though I could see in the debugger the value was null. So then I went to look at the assembly the compiler generated, and that's when I started to understand what had happened.

It was a head-slapper when I found the dereference above. I added a second null check or moved that code or some such, and that was it.

[-]

pjmlp 10 months ago

Now map the hours and days spent into actual money, being taken from project budget, and then you realise why some business prefer some languages over others.

rcxdude 10 months ago

There was a more egregoius one which got Linus further pissed off with GCC, which was due to a 'dereference' that would not trap, but still deleted a later null check (because e.g. int *foo = &bar->baz is basically just calculating an offset to bar, and so will not fail at runtime, but it is still a dereference according to the abstract machine and so is undefined if bar is NULL). I think the risk of something like that is why it's still disabled.

ryao 10 months ago

Usually, when one marks an argument as nonnull via a function attribute, one wants NULL checks to be removed.

[-]

ndesaulniers 10 months ago

There's two similar but distinct function attributes for nullability. One affects codegen, one affects diagnostics only.

[-]

ryao 10 months ago

Which are those? I only know about nonnull, nonnull_if_nonzero and returns_nonnull:

https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attribute...

AceJohnny2 10 months ago

Irrelevant, because delete-null-pointer-checks happens even in absence of nonnull function attribute, see GP's godbolt link, and the documentation that omits any reference to that function attribute.

That's what makes it dangerous!

[-]

ryao 10 months ago

That is a side effect of passing the pointer as a function parameter marked nonnull. It implies that the pointer is nonnull and any NULL checks against it can be removed. Pass it to a normal function and you will not see the NULL check removed.

mjg59 10 months ago

Explanation for the above: passing NULL as the destination argument to memcpy() is undefined behaviour at present. gcc assumes that the fact that memcpy() is called therefore means that the destination argument can't be NULL, so "knows" that the dest == NULL check can never be true, and so removes the test and the do_thing1() branch entirely.

Interestingly, replacing len in the memcpy() call results in gcc instead removing the memcpy() call and retaining the check - presumably a different optimisation routine decides that it's a no-op in that case. https://godbolt.org/z/cPdx6v13r is, therefore, interesting - despite this only ever calling test() with a len of 0, the elision of the dest == NULL check is still there, but test() has been inlined without the memcpy (because len == 0) but with do_thing2() (because the behaviour is undefined and so it can assume dest isn't NULL even though there's a NULL literally right there!)

Fucking compilers, man.

[-]

jpollock 10 months ago

How does gcc infer anything about memcpy? Can't I replace the c-library memcpy with my own, so how does it know that dest == NULL can never be true?

[-]

ryao 10 months ago

You can, but gcc may replace it with an equivalent set of instructions as a compiler optimization, so you would have no guarantee it is used unless you hack the compiler.

On a related note, GCC optimizing away things is a problem for memset when zeroing buffers containing sensitive data, as GCC can often tell that the buffers are going to be freed and thus the write is deemed unnecessary. That is a security issue and has to be resolved by breaking the compiler’s optimization through a clever trick:

https://github.com/openzfs/zfs/commit/d634d20d1be31dfa8cf06e... 12352

Similarly, GCC may delete a memcpy to a buffer about to be freed, although I have never observed that as you generally don’t do that in production code.

[-]

sfink 10 months ago

> Similarly, GCC may delete a memcpy to a buffer about to be freed, although I have never observed that as you generally don’t do that in production code.

It's not that crazy. You could have a refcounted object that poisons itself when the refcount drops to zero, but doesn't immediately free itself because many malloc implementations can have bad lock contention on free(). So you poison the object to detect bugs, possibly only in certain configurations, and then queue the pointer for deferred freeing on a single thread at a better time.

(Ok, this doesn't quite do it: poisoning is much more likely to use memset than memcpy, but I assume gcc would optimize out a doomed memset too?)

[-]

Chaosvex 10 months ago

Yes, it potentially could be optimised out, which is why platforms provide functions like SecureZeroMemory() for cases where you want to be sure that memory is zeroed out.

ryao 10 months ago

That would be why I introduced an explicit_memset() into the OpenZFS encryption module in the commit that I linked. It uses two different techniques to guard against the compiler deleting it.

mjg59 10 months ago

The valid inputs to memcpy() are defined by the C specification, so the compiler is free to make assumptions about what valid inputs are even if the library implementation chooses to allow a broader range of inputs

int_19h 10 months ago

Per ISO C, the identifiers declared or defined with external linkage by any C standard library header are considered reserved, so the moment you define your own memcpy, you're already in UB land.

MindSpunk 10 months ago

Many standard C functions are treated as “magic” by compilers. Malloc is treated as if it has no side effects (which of course it does, it changes allocator state) so the optimiser can elide allocations. If not you wouldn’t be able to elide the call because malloc looks like it has side effects, which it does but not ones we care about observing.

[-]

gpderetta 10 months ago

Not only that, malloc is also assumed to return pointer that don't alias anything else.

0xffff2 10 months ago

If I'm understanding the OP correctly, the C standard says so, i.e. the semantics of memcpy are defined by the standard and the standard says that it's UB to pass NULL.

[-]

tialaramex 10 months ago

Unlike all the more complicated languages the "freestanding" mode C doesn't even have a memcpy feature, so it may not define how one works - maybe you've decided to use the name "memcpy" for your function which generates a memorandum about large South American rodents, and "memo_capybara" was too much typing.

In something like C++ or Rust, even their bare metal "What do you mean Operating System?" modes quietly require memcpy and so on because we're not savages, clearly somebody should provide a way to copy bytes of memory, Rust is so civilised that even on bare metal (in Rust's "core" library) you get a working sort_unstable() for your arbitrary slice types!

[-]

bonzini 10 months ago

The compiler is free to give a meaning to memcpy if run in the (default) hosted mode. There's -ffreestanding for freestanding environments.

[-]

tialaramex 10 months ago

Right, though I guess I wasn't clear enough about that for the down voters, but whatever.

bonzini 10 months ago

If you do so you have to add -fno-builtins (or just -fno-builtin-memcpy).

mpweiher 10 months ago

> that memcpy() is called therefore means that the destination argument can't be NULL

The whole idea that undefined behavior cannot happen and you can therefore do optimization based on "knowing" it cannot happen is incredibly bonkers.

[-]

UncleMeat 10 months ago

Imagine this program.

   int foo() {
     int x = 1;
     havoc();
     return x;
   }

Can this function be compiled to store x in a register? Can it be compiled to remove x entirely and return the constant 1? That relies on "knowing that undefined behavior cannot happen." This program will behave differently if we store x on the stack and then return it after we call havoc() than if we call havoc() and then return the constant 1, if havoc() just writes to out of bounds memory addresses or whatever.

In this case the undefined behavior just feels "more extreme" to most people, but it is remarkably hard for people to rigorously define the undefined behavior that should and should not be considered when making optimizations.

[-]

mpweiher 10 months ago

> That relies on "knowing that undefined behavior cannot happen."

No it doesn't.

[-]

UncleMeat 10 months ago

Yes it does. The optimizing this to return the constant 1 is not producing an equivalent program unless we make assumptions about the behavioral bounds of havoc().

[-]

mpweiher 10 months ago

That’s not at all the same thing.

[-]

UncleMeat 10 months ago

What is the difference between "writing past the end of an array is UB" and "dereferencing a null pointer is UB" and "passing null as the destination argument to memcpy is UB"? The two programs I listed above are only observationally equivalent if writing past the end of valid allocations is UB.

A core problem with this discussion in almost all circumstances is that people have a vibe for which of these things it feels okay for a compiler to make logical deductions from and which it feels not okay but if you actually sit down and try to formalize this in a way that would be meaningful to compiler vendors, you can't.

[-]

mpweiher 10 months ago

You are still completely missing the point.

This example is not "I know that UB doesn't happen, therefore ...", which is what the memcpy() case is.

It is "I don't care that UB might happen, I am going to act as if it didn't. If the UB then makes the program behave differently than without the UB, that's not my problem".

Which, incidentally, is one of the suggested/permitted responses to UB in the standards text (that was made non-binding).

nayuki 10 months ago

> Fucking compilers, man.

They're just acting as agents that derive the logical consequences of the code.

The fact that the given example code is "surprising" is analogous to this mathematical derivation:

    a = b
    a*a = b*a
    a*a - b*b = b*a - b*b
    (a - b)(a + b) = b(a - b)
    (a - b)(a + b)/(a - b) = b(a - b)/(a - b)
    ^ Divide by 0, undefined behavior!
    Everything below is not necessarily true.
    a + b = b
    b + b = b
    2b = b
    2 = 1
    2 - 1 = 1 - 1
    1 = 0

The source of truth about what is/isn't allowed is the C standard, not your personal simplified model of it that may contain dangerous misconceptions. The fact that your mental model doesn't match the document is an education problem, not a problem with the compiler.

[-]

marssaxman 10 months ago

> They're just acting as agents that derive the logical consequences of the code.

In a particularly pedantic, uptight, and sometimes un-helpful way, yes.

Compilers don't have to be designed this way; in fact it is a relatively recent development in the history of such tools.

saurik 10 months ago

> The fact that your mental model doesn't match the document is an education problem, not a problem with the compiler.

Or it is a problem with the document, which is the entire reason we are having this discussion: N3322 argued the document should be fixed, and now it will be for C2y.

badmintonbaseba 10 months ago

I just skimmed through the proposed wording in [N3322]. It looks like it silently fixes a defect too, NULL == NULL was also undefined up until C23. Hilarious.

[N3322] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3322.pdf

[-]

mananaysiempre 10 months ago

This is probably related to the issue with NULL - NULL mentioned in the article.

Imagine you’re working in real mode on x86, in the compact or large memory model[1]. This means that a data pointer is basically struct{uint16_t off,seg;} encoding linear address (seg<<4)+off. This makes it annoying to have individual allocations (“objects”) >64K in size (because of the weird carries), so these models don’t allow that. (The huge model does, and it’s significantly slower.) Thus you legitimately have sizeof(size_t) == 2 but sizeof(uintptr_t) == 4 (hi Rust), and God help you if you compare or subtract pointers not within the same allocation. [Also, sizeof(void *) == 4 but sizeof(void (*)(void)) == 2 in the compact model, and the other way around in the medium model.]

Note the addressing scheme is non-bijective. The C standard is generally careful not to require the implementation to canonicalize pointers: if, say, char a[16] happens to be immediately followed by int b[8], an independently declared variable, it may well be that &a+16 (legal “one past” pointer) is {16,1} but &b is {0,2}, which refers to the exact same byte, but the compiler doesn’t have to do anything special because dereferencing &a+16 is UB (duh) and comparing (char *)(&a+16) with (char *)&b or subtracting one from the other is also UB (pointers to different objects).

The issue with NULL == NULL and also with NULL - NULL is that now the null pointer is required to be canonical, or these expressions must canonicalize their operands. I don’t know why you’d ever make an implementation that has non-canonical NULLs, but I guess the text prior to this change allowed such.

[1] https://devblogs.microsoft.com/oldnewthing/20200728-00/?p=10...

[-]

amluto 10 months ago

> now the null pointer is required to be canonical

Yikes! This particular oddity seems annoying but sort of harmless in x86 real mode, but not necessarily in protected mode. Imagine code that wants to load a pointer into a register: it loads the offset into an ordinary register and the selector portion into a segment register. It’s permissible to load the 0 (null) selector, but loading garbage will fault immediately. So, if you allow non canonical NULL, then knowing that a pointer is either valid or NULL does not allow you to hoist a segment load above a condition that might mean you never actually dereference the pointer.

(I have plenty of experience with low-level OS code in all kinds of nasty x86 modes but, thankfully, not so much experience writing ordinary C code targeting protected mode. It sometimes boggles my mind that anyone ever got decent performance with anything involving far data pointers. Segment loads are slow, and there are not a lot of segment registers to go around.)

[-]

bonzini 10 months ago

In real mode assembly days, ES and sometimes DS were just another base register that you could use in a loop. Given the dearth of addressing modes it was quite nice to assume that large arrays started at xxxx0h and therefore that the offset part of the far pointer was zero.

pm215 10 months ago

If so, it's one that's been introduced at some point post C99 -- the C99 spec explicitly defines the behaviour of NULL == NULL. Section 6.5.9 para 6 says "Two pointers compare equal if and only if both are null pointers, both are pointers to the same object [etc etc]".

[-]

dwattttt 10 months ago

I don't imagine NULL is defined as "pointing to an object", so I don't expect that clause to apply.

[-]

tsimionescu 10 months ago

You completely skipped over the first part: "Two pointers compare equal if and only if both are null pointers"

[-]

lelanthran 10 months ago

> You completely skipped over the first part: "Two pointers compare equal if and only if both are null pointers"

Maybe he elided it in an optimisation pass?

dwattttt 10 months ago

Can't get much more of a reading comprehension failure than that. Good thing I don't write compilers.

[-]

tsimionescu 10 months ago

Happens to everyone, don't worry, especially when trying to focus on details, ironically...

nikic 10 months ago

NULL == NULL was already defined -- but NULL <= NULL wasn't :)

[-]

badmintonbaseba 10 months ago

My mistake.

IWeldMelons 10 months ago

Cannot find any confirmation to your statement. Otoh "All null pointer values (of compatible typewithin the same address space) are already required to compare equal. " in the limked paper.

[-]

PaulDavisThe1st 10 months ago

NULL is not single type in any conventional sense (and is actually tricky to define in a way that makes it usable in the way most programmers expect).

Thus:

  T1* a = NULL;
  T2* b = NULL
  a == b; /* may be undefined at present, depending on the nature of T1 & T2 */

[-]

IWeldMelons 10 months ago

"NULL" in fact is a macro, not a part of the language. null (zero pointer) is, and it is explicitly defined in standard, that comparison of two null pointers lead to equality. You example simply won't compile, it is not undefined; the pointers simply are of different type, period.

here what standard says:

"A pointer to void may be converted to or from a pointer to any object type.

Conversion of a null pointer to another pointer type yields a null pointer of that type. Any two null pointers shall compare equal."

therefore, convert any of them or both to void amd compare. you'll get equality.

10 months ago

[deleted]

voidUpdate 10 months ago

I feel like I've misunderstood something here... shouldn't memcpy(anything, anything, 0) just do nothing, because you're copying 0 bytes?

[-]

mjg59 10 months ago

That's a reasonable intuitive interpretation of how it should behave, but according to the spec it's undefined behaviour and compilers have a great degree of freedom in what happens as a result.

[-]

david-gpu 10 months ago

More information on this behavior in the link below.

> Note that, apart from contrived examples with deleted null checks, the current rules do not actually help the compiler meaningfully optimize code. A memcpy implementation cannot rely on pointer validity to speculatively read because, even though memcpy(NULL, NULL, 0) is undefined, slices at the end of a buffer are fine. [And if the end of the buffer] were at the end of a page with nothing allocated afterwards, a speculative read from memcpy would break

https://davidben.net/2024/01/15/empty-slices.html

[-]

Someone 10 months ago

> [And if the end of the buffer] were at the end of a page with nothing allocated afterwards, a speculative read from memcpy would break

‘Only’ on platforms that have memory protection hardware. Even there, the platform can always allocate an overflow page for a process, or have the page fault handler check whether the page fault happened due to a speculative read, and repair things (I think the latter is hugely, hugely, hugely impractical, but the standard cannot rule it out)

[-]

immibis 10 months ago

Platforms without memory protection hardware also have no problem reading NULL.

[-]

Someone 10 months ago

My comment is a reply to (part of) a comment that isn’t talking about reading from NULL. That’s what the [And if the end of the buffer] part implies.

Even if it didn’t, I don’t think the standard should assume that “Platforms without memory protection hardware also have no problem reading NULL”

An OS could, for example, have a very simple memory protection feature where the bottom half of the memory address range is reserved for the OS, the top half for user processes, and any read from an address with the high bit clear by code in the top half of the address range traps and makes the OS kill the process doing the read.

[-]

BenjiWiebe 10 months ago

Doesn't it take memory protection hardware to trap on a memory read?

[-]

lmm 10 months ago

As a philosophical matter, by definition that would be memory protection hardware, sure. But the point is that it's at least conceivable that some platforms might have some crude, hardwired memory protection without having a full MMU.

kevin_thibedeau 10 months ago

They may also expect writes to address 0.

hun3 10 months ago

Not really. MMIO mapped at 0x0 for example.

[-]

david-gpu 10 months ago

Yikes! I would love sipping coffee watching the chief architect chew up whoever suggested that. That sounds awful even on a microcontroller.

[-]

bonzini 10 months ago

On s390 the memory at address 0 (low core) has all sorts of important stuff. Of course s390 has paging enabled pretty much always but still...

colejohnson66 10 months ago

AVR’s registers are mapped to address 0. So reading and writing NULL is actually modifying r0.

[-]

formerly_proven 10 months ago

AVR’s r0 is also a totally normal register, unlike most other RISC which typically have r0 == 0.

[-]

david-gpu 10 months ago

Thanks for saving me a search, because I was expecting r0 to be hardcoded to zero.

Sometimes hardware is designed with insufficient input from software folks and the result is something asinine like that. That, or some people like watching the world burn.

Zondartul 10 months ago

What does "speculative" mean in this case? I understand it as CPU-level speculative execution a.k.a. branch mis-prediction, but that shouldn't have any real-world effects (or else we'd have segfaults all the time due to executing code that didn't really happen)

[-]

dwattttt 10 months ago

Turns out you can have that kind of speculative failure too! https://randomascii.wordpress.com/2018/01/07/finding-a-cpu-d...

voidUpdate 10 months ago

Why didn't they just... define it, back when they wrote it?

[-]

larschdk 10 months ago

When C was conceived, CPU architectures and platforms were more varied than what we see today. In order to remain portable and yet performant, some details were left as either implementation defined, or completely undefined (i.e. the responsibility of the programmer). Seems archaic today, but it was necessary when C compilers had to be two-pass and run in mere kilobytes of RAM. Even warnings for risky and undefined behavior is a relatively modern concept (last 10-20 years) compared to the age of C.

[-]

actionfromafar 10 months ago

When C was conceived, it was made for a specific DEC CPU, for making an operating system. The idea of a C standard was in the future.

If you wanted to know what (for instance) memcpy actually did, you looked at the source code, or even more likely, the assembler or machine code output. That was "the standard".

[-]

da_chicken 10 months ago

I think it's reasonable to assume that GP clearly meant the C standard being conceived, as, obviously, K&R's C implementation of the language was ad hoc rather than exhibiting any prescribed specification.

anticensor 10 months ago

No, K&R's book was the standard.

[-]

actionfromafar 10 months ago

First came the language, then a few years later they described it in a book.

scoutt 10 months ago

> Seems archaic today ... run in mere kilobytes of RAM

There is an entire industry that does pretty much that... today. They might run in flash instead of RAM, but still, a few kilobytes.

Probably there are more embedded devices out there than PCs. PIC, AVR, MSP, ARM, custom archs. There might be one of those right now under your hand, in that thing you use to move the cursor.

[-]

krisoft 10 months ago

> There is an entire industry that does pretty much that... today.

Which industry runs C compilers on embeded devices? Because that is what the part you elipsised out was talking about.

[-]

scoutt 10 months ago

Oh... yes. You are right. My bad.

sitzkrieg 10 months ago

many do tho. i have targetted c89 and maybe c99 on several embedded devices

[-]

vlovich123 10 months ago

They cross compile. No one is compiling code on these machines.

0xffff2 10 months ago

But you're running the compiler on the device rather than cross-compile?

Narishma 10 months ago

I doubt you're running C compilers on those devices.

killerstorm 10 months ago

From what I understand:

1. Initially, they just wanted to give compiler makers more freedom: both in the sense "do whatever is simplest" and "do something platform-specific which dev wants". 2. Compiler devs found that they can use UB for optimization: e.g. if we assume that a branch with UB is unreachable we can generate more efficient code. 3. Sadly, compiler devs started to exploit every opportunity for optimization, e.g. removing code with a potential segfault.

I.e. people who made a standard thought that compiler would remove no-op call to memcpy, but GCC removes the whole branch which makes the call as it considers the whole branch impossible. Standard makers thought that compiler devs would be more reasonable

[-]

kllrnohj 10 months ago

> Standard makers thought that compiler devs would be more reasonable

This is a bit of a terrible take? Compiler devs never did anything "unreasonable", they didn't sit down and go "mwahahaha we can exploit the heck out of UB to break everything!!!!"

Rather, repeatedly applying a series of targeted optimizations, each one in isolation being "reasonable", results in an eventual "unreasonable" total transformation. But this is more an emergent property of modern compilers having hundreds of optimization passes.

At the time the standards were created, the idea of compilers applying so many optimization passes was just not conceivable. Compilers struggled to just do basic compilation. The assumption was a near 1:1 mapping between code & assembly, and that just didn't age well at all.

[-]

LegionMammal978 10 months ago

One could argue that "optimizing based on signed overflow" was an unreasonable step to take, since any given platform will have some sane, consistent behavior when the underlying instructions cause an overflow. A developer using signed operations without poring over the standard might have easily expected incorrect values (or maybe a trap if the platform likes to use those), but not big changes in control flow. In my experience, signed overflow is generally the biggest cause of "they're putting UB in my reasonable C code!", followed by the rules against type punning, which are violated every day by ordinary usage of the POSIX socket functions.

[-]

kllrnohj 10 months ago

> One could argue that "optimizing based on signed overflow" was an unreasonable step to take

That optimization allows using 64-bit registers / offset loads for signed ints which it can't do if it has to overflow, since that overflow must happen at 32-bits. That's not an uncommon thing.

uecker 10 months ago

I started to like signed overflow rules, because it is really easy to find problems using sanitizers.

The strict aliasing rules are not violated by typical POSIX socket code as a cast to a different pointer type, i.e. `struct sockaddr` by itself is well-defined behavior. (and POSIX could of course just define something even if ISO C leaves it undefined, but I don't think this is needed here)

[-]

LegionMammal978 10 months ago

> The strict aliasing rules are not violated by typical POSIX socket code as a cast to a different pointer type, i.e. `struct sockaddr` by itself is well-defined behavior.

Basically all usage of sendmsg() and recvmsg() with a static char[N] buffer is UB, is one big example I've run into. Unless you memcpy every value into and out of the buffer, which literally no one does. Also, reading sa_family from the output of accept() (or putting it into a struct sockaddr_storage and reading ss_family) is UB, unless you memcpy it out, which literally no one does.

[-]

uecker 10 months ago

Using a static char buffer would indeed UB but we just made the change to C2Y that this ok (and in practice it always was). Incorrect use of sockaddr_storage may lead to UB. But again, most socket code I see is actually correct.

lmm 10 months ago

> Compiler devs never did anything "unreasonable", they didn't sit down and go "mwahahaha we can exploit the heck out of UB to break everything!!!!"

Many compiler devs are on record gleefully responding to bug reports with statements on the lines of "your code has undefined behaviour according to the standard, we can do what we like with it, if you don't like it write better code". Less so in recent years as they've realised this was a bad idea or at least a bad look, but in the '00s it was a normal part of the culture.

killerstorm 10 months ago

What stops compiler makers from treating UB as platform-specific behavior rather than as something which cannot happen?

"You are not allowed to do this, and thus..." reasoning assumes that programmers are language lawyers, which is unreasonable.

[-]

kllrnohj 10 months ago

    bool foo(some_struct* bar) {
        if (bar->blah()) {
            return true;
        }
        if (bar == nullptr) {
            return false;
        }
        return true;
    }

Can the compiler eliminate that nullptr comparison in your opinion yes or no? While this example looks stupid, after inlining it's quite plausible to end up with code in this type of a pattern. Dereferencing a nullptr is UB, and typically the "platform-specific" behavior is a crash, so... why should that if statement remain? And then if it can't remain, why should an explicit `_Nonnull` assertion have different behavior than an explicit deref? What if the compiler can also independently prove that some_struct->blah() always evaluates to false, so it eliminates that entire branch - does the `if (bar == nullptr)` still need to remain in that specific case? If so, why? The code was the same in both cases, the compiler just got better at eliminating dead code.

UncleMeat 10 months ago

There isn't a "find UB branches" pass that is seeking out this stuff.

Instead what happens is that you have something like a constant folding or value constraint pass that computes a set of possible values that a variable can hold at various program points by applying constraints of various options. Then you have a dead code elimination pass that identifies dead branches. This pass doesn't know why the "dest" variable can't hold the NULL value at the branch. It just knows that it can't, so it kills the branch.

Imagine the following code:

   int x = abs(get_int());
   if (x < 0) {
     // do stuff
   }

Can the compiler eliminate the branch? Of course. All that's happened here is that the constraint propagation feels "reasonable" to you in this case and "unreasonable" to you in the memcpy case.

[-]

meonukk 10 months ago

Why is it allowed to eliminate the branch? In most architectures abs(INT_MIN) returns INT_MIN which is negative

[-]

plorkyeran 10 months ago

Calling abs(INT_MIN) on twos-complement machine is not allowed by the C standard. The behavior of abs() is undefined if the result would not fit in the return value.

[-]

ryao 10 months ago

Where does it say that? I thought this was a famous example from formal methods showing why something really simple could be wrong. It would be strange for the standard to say to ignore it. The behavior is also well defined in two’s complement. People just don’t like it.

[-]

plorkyeran 10 months ago

https://busybox.net/~landley/c99-draft.html#7.20.6.1

"The abs, labs, and llabs functions compute the absolute value of an integer j. If the result cannot be represented, the behavior is undefined. (242)"

242 The absolute value of the most negative number cannot be represented in two's complement.

Sohcahtoa82 10 months ago

I didn't believe this so I looked it up, and yup.

Because of 2's complement limitations, abs(INT_MIN) can't actually be represented and it ends up returning INT_MIN.

UncleMeat 10 months ago

It's possible that there is an edge case in the output bounds here. I'm just using it as an example.

Replace it with "int x = foo() ? 1 : 2;" if you want.

robinsonb5 10 months ago

> value constraint pass that computes a set of possible values that a variable can hold

Surely that value constraint pass must be using reasoning based on UB in order to remove NULL from the set of possible values?

Being able to disable all such reasoning, then comparing the generated code with and without it enabled would be an excellent way to find UB-related bugs.

[-]

UncleMeat 10 months ago

There are many such constraints, and often ones that you want.

"These two pointers returned from subsequent calls to malloc cannot alias" is a value constraint that relies on UB. You are going to have a bad time if your compiler can't assume this to be true and comparing two compilations with and without this assumption won't be useful to you as a developer.

There are a handful of cases that people do seem to look at and say "this one smells funny to me", even if we cannot articulate some formal reason why it feels okay for the compiler to build logical conclusions from one assumption and not another. Eliminating null checks that are "dead" because they are dominated by some operation that is illegal if performed on null is the most widely expressed example. Eliminating signed integral bounds checks by assuming that arithmetic operations are non-overflowing is another. Some compilers support explicitly disabling some (but not all) optimizations derived from deductions from these assumptions.

But if you generalize this to all UB you probably won't end up with what you actually want.

mjevans 10 months ago

More reasonable: Emit a warning or error to make the code and human writing it better.

NOT-reasonable: silently 'optimize' a 'gotcha' into behavior the programmer(s) didn't intend.

[-]

gpderetta 10 months ago

NOT-reasonable: expecting the compiler to read the programmer's mind.

[-]

mjevans 10 months ago

OK, you want a FORMAL version?

Acceptable UB: Do the exact same type of operation as for defined behavior, even if the result is defined by how the underlying hardware works.

NOT-acceptable UB: Perform some operation OTHER than the same as if it were the valid code path, EXCEPT: Failure to compile or a warning message stating which code has been transformed into what other operation as a result of UB.

[-]

gpderetta 10 months ago

I don't understand, if the operation is not defined, what exactly the compiler should do?

If I tell you "open the door", that implies that the door is there. If the door is not there, how would you still open the door?

Concretely, what do you expect this to return:

  #include <cstddef>
  void sink(ptrdiff_t);
  ptrdiff_t source();

  int foo() {    
    int x = 1;
    int y;
    sink(&y-&x);
    *(&y - source()) = 42;
    return x;
  }

assuming that source() returns the parameter passed to sink()?

Incidentally I had to launder the offset through sink/source, because GCC has a must-alias oracle to mitigate miscompiling some UB code, so in a way it already caters to you.

[-]

mjevans 10 months ago

Evaluated step by step...

Offhand, *sink(&y-&x);* the compiler is not _required_ to lay out variables adjacently. So the computation of the pointers fed to sink does not have to be defined and might not be portable.

It would be permissible for the compiler to refuse to compile that ('line blah, op blah' does not conform the the standard's allowed range of behavior).

It would also be permissible to just allow that operation to happen. It's the difference of two pointer sized units being passed. That's the operation the programmer wrote, that's the operation that will happen. Do not verify bounds or alter behavior because the compiler could calculate that the value happens to be PTRMAX-sizeof(int)+1 (it placed X and Y in reverse of how a naive assumption might assume).

The = 42 line might write to any random address in memory. Again, just compile the code to perform the operation. If that happens to write 42 somewhere in the stack frame that leads to the program corrupting / a segfault that's fine. If the compiler says 'wait that's not a known memory location' or 'that's going to write onto the protected stack!' it can ALSO refuse to compile and say why that code is not valid.

I would expect valid results to be a return of: 42, 1 (possibly with a warning message about undefined operations and the affected lines), OR the program does not compile and there is an error message which says what's wrong.

[-]

gpderetta 10 months ago

&y-&x doesn't require the variables to adjacent, just to exist in the same linear address space. It doesn't even imply any specific ordering .

> Again, just compile the code to perform the operation. If that happens to write 42 somewhere in the stack frame that leads to the program corrupting / a segfault that's fine. If the compiler says 'wait that's not a known memory location' or 'that's going to write onto the protected stack!

As far as the compiler is concerned, source() could return 0 and the line be perfectly defined, so there is no reason to produce an error. In fact as far as the compiler is concerned 0 is the only valid value that source could return, so that line can only be writing to y. As that variable is a local variable that going out of scope, the compiler omits the store. Or you also believe that dead store elimination is wrong?

> possibly with a warning message about undefined operations and the affected lines

There is no definitely undefined operation in my example; there can be UB depending on the behaviour of externally compiled functions, but that's true of almost any C++ statement.

What most people in the "compiler must warn about UB" camp fail to realize, is that 99.99% of the time the complier has no way of realizing some code is likely to cause UB: From the compiler point of view my example is perfectly standard compliant [1], UB comes only from the behaviour of source and sink that are not analysable by the compiler.

[1] technically to be fully conforming the code should cast the pointers to uintptr_t before doing the subtraction.

[-]

mjevans 10 months ago

I'm not familiar with the stack-like functions mentioned, but that is indeed something it should NOT eliminate.

In fact, the compiler should not eliminate 'dead stores'. That should be a warning (and emit the code) OR an error (do not emit a program).

The compiler should inform the programmer so the PROGRAM can be made correct. Not so it's particular result can be faster.

menaerus 10 months ago

Charitable interpretation may be: Back then when the contract of this function was standardized, presumably in C89 which is ~35 years ago, CPUs but also C compilers were not as powerful so wasting an extra couple of CPU cycles to check this condition was much more expensive than it is today. Because of that contract, and which can be seen in the example in the below comments, the compiler is also free to eliminate the dead code which also has the effect of shaving off some extra CPU cycles.

ynik 10 months ago

Probably because they did not think of this special case when writing the standard, or did not find it important enough to consider complicating the standard text for.

In C89, there's just a general provision for all standard library functions:

> Each of the following statements applies unless explicitly stated otherwise in the detailed descriptions that follow. If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer), the behavior is undefined. [...]

And then there isn't anything on `memcpy` that would explicitly state otherwise. Later versions of the standard explicitly clarified that this requirement applies even to size 0, but at that point it was only a clarification of an existing requirement from the earlier standard.

People like to read a lot more intention into the standard than is reasonable. Lots of it is just historical accident, really.

lmm 10 months ago

Back when they wrote it they were trying to accommodate existing compilers, including those who did useful things to help people catch errors in their programs (e.g. making memcpy trap and send a signal if you called it with NULL). The current generation of compilers that use undefined behaviour as an excuse to do horrible things that screw over regular programmers but increase performance on microbenchmarks postdates the standard.

wat10000 10 months ago

The original C standard was more descriptive than prescriptive. There was probably an implementation where it crashed or misbehaved.

FartyMcFarter 10 months ago

Because the benefit was probably seen as very little, and the cost significant.

When you're writing a compiler for an architecture where every byte counts you don't make it write extra code for little benefit.

Programmers were routinely counting bytes (both in code size and data) when writing Assembly code back then, and I mean that literally. Some of that carried into higher-level languages, and rightly so.

hyperman1 10 months ago

memcpy used to be a rep movsb on 8086 DOS compilers. I don't remember if rep movsb stops if cx=0 on entry, or decrements first and wraps around, copying 64K of data.

[-]

dfox 10 months ago

The specification does not explicitly say that, but the clear intention is that REP with CX=0 should be no-op (you get exactly that situation when REP gets interrupted during the last iteration, in that case CX is zero and IP points to the REP, not the following instruction).

[-]

bonzini 10 months ago

Rep movsb copies 64K if CX=0 (that's actually very useful), but memcpy could be implemented as two instructions:

    jcxz skip 
    rep movsb
    skip:

connicpu 10 months ago

I know at least MSVC's memcpy on x86_64 still results in a rep movsb if the cpuid flag that says rep movsb is fast is set, which it should be on all x86 chips from about 2011/2012 and onward ;)

10 months ago

[deleted]

frabert 10 months ago

Every time they leave something undefined, they do so to leave implementations free to use the underlying platform's default behavior, and to allow compilers to use it as an optimization point

[-]

lucozade 10 months ago

> time they leave something undefined, they do so to leave implementations free to use the underlying platform's default behavior

That's implementation defined (more or less) ie teh compiler can do whatever makes mst sense for its implementation.

Undefined means (more or less) that the compiler can assume the behaviour never happens so can apply transforms without taking it into account.

> to allow compilers to use it as an optimization point

That's the main advantage of undefined behaviour ie if you can ignore the usage, you may be able to apply optimisations that you couldn't if you had to take it into account. In the article, for example, GCC eliminated what it considered dead code for a NULL check of a variable that couldn't be NULL according to the C spec.

That's also probably the most frustrating thing about optimisations based on undefined behaviour ie checks that prevent undefined behaviour are removed because the compiler thinks that the check can't ever succeed because, if it did, there must have been undefined behaviour. But the way the developer was ensuring defined behaviour was with the check!

[-]

frabert 10 months ago

AFAIK, something having undefined behavior in the spec does not prevent an implementation- (platform-)specific behavior being defined.

As to your point about checks being erased, that generally happens when the checks happen too late (according to the compiler), or in a wrong way. For example, checking that `src` is not NULL _after_ memcpy(sec, dst, 0) is called. Or, checking for overflow by doing `if(x+y<0) ...` when x and y are nonnegative signed ints.

jcelerier 10 months ago

Here it's more that it allows to assume that this is never the case, thus no need to have an additional check in it I assume ?

10 months ago

[deleted]

nephanth 10 months ago

I mean, they might not have given thought to that particular corner case, they probably wrote something like

> memcpy(void* ptr1, void* ptr2, int n)

Copy n bytes from ptr1 to ptr2. UNDEFINED if ptr1 is NULL or ptr2 is NULL

‐------

It might also have come from a "explicit better than implicit" opinion, as in "it is better to have developers explicitly handle cases where the null pointer is involved

[-]

jbverschoor 10 months ago

I think it's more a strategy. C was not created to be safe. It's pretty much a tiny wrapper around assembler. Every limitation requires extra cycles, compile time or runtime, both of which were scarce.

Of course, someone needs to check in the layers of abstraction. The user, programmer, compiler, cpu, architecture.. They chose for the programmer, who like to call themselves "engineers" these days.

[-]

poincaredisk 10 months ago

I disagree with your premise. C was designed to be a high level (for its time) language, abstracted from actual hardware

>It's pretty much a tiny wrapper around assembler

Assebler has zero problem with adding "null + 4" or computing "null-null". C does, because it's not actually a tiny wrapper.

[-]

jbverschoor 10 months ago

Not high-level.. Portable. Portable layer above assembler/arch.

NULL doesn't exist in assembler, and in C, NULL is only a defined as a macro. It's not something built-in.

C doesn't have any problems adding 4 to NULL nor subtracting NULL from NULL.

[-]

teo_zero 10 months ago

> C doesn't have any problems adding 4 to NULL nor subtracting NULL from NULL.

"Having problems" is not a fair description of what's at stake here. The C standard simply says that it doesn't guarantee that such operations give the results that you expect.

Also please note that the article and this whole thread is about the address zero, not about the number zero. If NULL is #defined as 0 in your implementation and you use it in an expression only involving integers, of course no UB is triggered.

jbverschoor 10 months ago

  #include <stddef.h>
  int main() {
    if (NULL + 4) {}
    if (NULL - NULL) {}
    return 0;
  }

wruza 10 months ago

Not sure what your last remark means wrt everything else.

captainmuon 10 months ago

I feel strongly they should split undefined behavior in behavior that is not defined, and things that the compiler is allowed to assume. The former basically already exists as "implementation defined behavior". The latter should be written out explicitly in the documentation:

> memcpy(dest, src, count)

> Copies count bytes from src to dest. [...] Note this is not a plain function, but a special form that applies the constraints dest != NULL and src != NULL to the surrounding scope. Equivalent to:

    assume(dest != NULL)
    assume(src != NULL)
    actual_memcpy(dest, src, count)

The conflation of both concepts breaks the mental model of many programmers, especially ones who learned C/C++ in the 90s where it was common to write very different code, with all kinds of now illegal things like type punning and checking this != NULL.

I'd love to have a flag "-fno-surprizing-ub" or "-fhighlevel-assembler" combined with the above `assume` function or some other syntax to let me help the compiler, so that I can write C like in the 90s - close to metal but with less surprizes.

[-]

Thorrez 10 months ago

>Note this is not a plain function, but a special form that applies the constraints dest != NULL and src != NULL to the surrounding scope.

Plain functions can apply constraints to the surrounding code:

https://godbolt.org/z/fP58WGz9f

tialaramex 10 months ago

> I'd love to have a flag "-fno-surprizing-ub" or "-fhighlevel-assembler" combined with the above `assume` function or some other syntax to let me help the compiler, so that I can write C like in the 90s - close to metal but with less surprizes.

The problem, which you may realise with some more introspection is that "surprising" is actually a property of you, not of the compiler, so you're asking for mind-reading and that's not one of the options. You want not to experience surprise.

You can of course still get 1990s compilers and you're welcome to them. I cannot promise you won't still feel surprised despite your compiler nostalgia, but I can pretty much guarantee that the 1990s compiler results in slower and buggier software, so that's nice, remember only to charge 1990s rates for the work.

jancsika 10 months ago

I get that for the library. But I'm a bit puzzled about the optimizations done by a compiler based on this behavior.

E.g., suppose we patch GCC to preserve any conditional containing the string 'NULL' in it. Would that have a measurable performance impact on Linux/Chromium/Firefox?

xbar 10 months ago

Upon which some people may rely...

[-]

int_19h 10 months ago

People will only rely on UB when it is well defined by a particular implementation, either explicitly or because of a long history of past use. E.g. using unions for type punning in gcc, or allowing methods to be called on null pointers in MSVC.

But there's nothing like that here.

[-]

pjmlp 10 months ago

Until a compiler version comes out and since it was UB anyway, the compiler sundenly now behaves in a different way.

bluetomcat 10 months ago

A trivial implementation wouldn't dereference dest or src in case the length is 0. That's how a student would write it with a for loop (byte-by-byte copy). A non-trivial implementation might do something with the pointers before entering the copy loop.

pkhuong 10 months ago

It does nothing, but is only defined when the pointers point into or one past the end of valid objects (live allocations), because that's how the standard defines the C VM, in terms of objects, not a flat byte array.

[-]

whytevuhuni 10 months ago

What if the objects are non-NULL, but invalid (not actually allocated)?

For example, Rust will use address 1 with length 0 for static empty strings, because 1 is a properly aligned non-null pointer.

I would imagine such strings end up being passed to C code sometimes, which may end up calling memcpy with a length of 0 on them.

[-]

creshal 10 months ago

> What if the objects are non-NULL, but invalid (not actually allocated)?

Still UB, since they're restricted pointers that must be valid to begin with.

[-]

bonzini 10 months ago

This is wrong. If you do p=malloc(256), p+256 is valid even though it does not point to a valid address (it might be in an unmapped page; check out ElectricFence). Rust's non-null aligned other pointer is the same, memcpy can't assume it can be dereferenced if the size is zero. The standard text in the linked paper says the same.

pkhuong 10 months ago

also UB according to the spec, but LLVM is free to define it. e.g., clang often converts trivial C++ copy constructors to memcpy, which is UB for self-assignment, but I assume that's fine because the C++ front-end only targets LLVM, and LLVM presumably defines the behaviour to do what you'd expect.

[-]

whytevuhuni 10 months ago

Where I work, it is quite normal to link together C code compiled with GCC and Rust code compiled with LLVM, due to how the build system is set up.

As far as I know that disables LTO, but the build system is so complex, and the C code so large, that nobody bothers switching the C side to Clang/LLVM as well.

badmintonbaseba 10 months ago

Still technically UB according to the proposed wording. The proposed wording only deals with allowing null pointers explicitly.

ryao 10 months ago

I have asked this question in the past and was told that memcpy() is allowed to preemptively read before it has determined it needs to write to make it faster on some CPUs. The presumption is that if you are going to be copying data, there is at least one cache line there already, so reading can start early.

[-]

10 months ago

[deleted]

rcxdude 10 months ago

Purely mechanically, yes, but in terms of the definition of the behaviour in the C abstract machine, no, because certain operations on null pointers are undefined, even if the obvious low-level compilation turns into nothing.

[-]

codedokode 10 months ago

Maybe we should get rid of "abstract machine" and treat pointers as memory addresses?

[-]

NobodyNada 10 months ago

If you do this, your C code will run significantly slower than, say, Java, Go, or C#, because the compiler is unable to apply even the most basic optimizations (which it can do still in all those other languages).

So, at that point why even use C at all? Today, C is used where the overhead of a managed language is unacceptable. If you could just eat the performance cost, you'd probably already be using a managed language. There's not much desire for a variant of C with what would be at least a 10x slowdown in many workloads.

[-]

cv5005 10 months ago

Or it could be made faster because certain manual optimizations become possible.

An example would a table of interned strings that you wanna match against (say you're writing a parser). Since standard C says thou shall not compare pointers with < or > unless they both point into the same 'object' you are forbidden from doing the speed of light code:

  char *keywords_begin, *keywords_end;
  if(some_str >= keywords_begin && some_str < keywords_end) ...

Official standard sanctioned workarounds would require extra indirection (using indices for example) which is suboptimal.

[-]

gpderetta 10 months ago

You can cast them to uintptr_t and compare them to your heart's desire.

gpderetta 10 months ago

  int* oracle();
  int foo() {
      int x = 1;
      *oracle() = 42;
      return x;
  }

Is the above program allowed to return anything other than 1 in your language?

[-]

kibwen 10 months ago

To elaborate, we treat pointers as more than just integers because it gives optimizers the latitude to reorder and eliminate pointer operations. In the example above we cannot do this, because we cannot prove at compile time that x doesn't live at the address returned by oracle.

For some high-quality further discussion, see Ralf Jung's series of blog posts starting with https://www.ralfj.de/blog/2018/07/24/pointers-and-bytes.html

[-]

shultays 10 months ago

  However, given how low-level a language C++ is, we can actually break this assumption by setting i to y-x. Since &x[i] is the same as x+i, this means we are actually writing 23 to &y[0].

But that is undefined, you can't do x + (y - x) ie a pointer arithmetic that ends outside of bounds of an array. Since it is undefined, shouldn't C++ assume that changing x[..] can't change y[0]

edit: welp, if I read a few more lines into article I would see that it also tells it is undefined

gpderetta 10 months ago

to be clear, in my example the result of oracle() cannot possibly alias with 'x' in C or C++ (and in fact gcc will optimize accordingly). In a different language where addresses are mere integers, things would be more complicated.

[-]

codedokode 10 months ago

The result of oracle can point to anything if you write it as return (int *)rand();

Note that rand() returns 32-bit value so you have to call it twice and merge the results to obtain a 64-bit pointer.

[-]

gpderetta 10 months ago

The numerical value returned by oracle might physically match the address of the stack slot for 'x', assuming that it exists, but it doesn't mean that, from a language point of view, it is a valid pointer.

If forging pointers had defined behaviour, it would be impossible to use the language sanely or perform any kind of optimization.

10 months ago

[deleted]

shultays 10 months ago

Is it allowed to return anything else in C? Is there anything in standard C that would allow oracle() to access memory address of x?

Sure different compilers might allow inlining assembly or some other ways to access x on previous stack perhaps but then it is not really "C"

[-]

wat10000 10 months ago

That’s the point. C allows this function to be optimized to always return 1. A “pointers are addresses, just emit reads and writes and stop trying to be so clever” version of C would require x to be spilled to the stack, then the write, then reload x and return whatever it contained.

[-]

cv5005 10 months ago

Then use the register keyword or just reword the standard to assume the register behavior if a variables address hasn't been taken.

The majority of useful optimizations can be kept in a "Sane C" with either code style changes (cache stuff in local vars to avoid aliasing for example) or with minor tweaks to the standard.

[-]

wat10000 10 months ago

Register behavior is what you want essentially all of the time. So we’d just have to write `register` all over the place for no gain.

“Don’t optimize this, read and write it even if you think it’s not necessary” is a very rare case so it shouldn’t be the default. If you want it, use the volatile keyword.

There’s no need to reword the standard to assume the register behavior if the variable’s address hasn’t been taken. That’s already how it works. In this example, if you escape the value of `&x`, it’s not legal to optimize this function to always return 1.

codedokode 10 months ago

When using C, this can return anything (or crash of oracle function returns an invalid pointer, or rewrite its own code if the code section is writable). So if you get rid of "abstract machine", nothing changes - the program can return anything or crash.

[-]

atq2119 10 months ago

The point is that the C standard does guarantee that the function returns 1 if the program is a valid C program - which means there is no UB.

For example: If the oracle function returns an invalid pointer, then dereferencing that pointer is UB, and therefore the program isn't a valid C program.

wat10000 10 months ago

A conforming C compiler is allowed to emit that function to perform the write and then return the constant 1. Should that be allowed?

alerighi 10 months ago

Well even in C is not guaranteed to return anything other than 1, since oracle() may return the memory address of variable 1.

[-]

gpderetta 10 months ago

the literal 1 is not an object in C or C++ hence it does not have an address. If you meant 'x', then also no, oracle() can't return the address of 'x' because of pointer provenance rules.

layer8 10 months ago

That would restrict C to memory models with a linear address space. That is usually the case nowadays for C implementations, but maybe we don’t want to set that in stone, because it would be virtually impossible to revert such a guarantee.

There’s also cases like memory address ranges that map to non-memory hardware (i.e. that don’t behave like “dumb” memory), and how would you have the C standard define behavior for those?

Lastly, CPU caches require some sort of abstract model as soon as you have multi-threading.

Measter 10 months ago

The value of an abstract machine is that it allows you to specify how a given program behaves without needing to point to a specific piece of hardware. Compilers then have this as a target when compiling a program for a specific piece of hardware so that they know when the compiler's output is correct.

The issue here is that the abstract machine is under or badly specified.

sixfiveotwo 10 months ago

How would you define what a memory address is without first defining in which context it has a meaning?

[-]

codedokode 10 months ago

C was written as a portable assembly language, so I think a memory address is a number that CPU considers to be a memory address.

[-]

layer8 10 months ago

That’s currently the case in C, in that you can convert pointers to and from uintptr_t. However, not every number representable in that type needs to be valid memory (that’s true on the assembly level as well), hence it’s only defined for valid pointers.

sixfiveotwo 10 months ago

> I think a memory address is a number that CPU considers to be a memory address

I meant to say that, indeed, there must be some concept of CPU for a memory address to have a meaning, and for this concept of CPU to be as widely applicable as possible, surely defining it as abstract as possible is the way to go. Ergo, the idea of a C abstract machine.

Anyway, other people in this thread are discussing the matter more accurately and in more details than I could hope to do, so I'll leave it like that.

davidt84 10 months ago

Congratulations, you've invented an entirely new language.

Now, who's going to write the compiler for it?

[-]

anticensor 10 months ago

No, it's C at -O0.

[-]

davidt84 10 months ago

No, it's not.

Undefined behaviour is undefined behaviour whatever optimisation level you use.

Some -f flags may extend the C standard and remove undefined behaviour in some cases (e.g. strict aliasing, signed integer overflow, writable string constants, etc.)

lmm 10 months ago

20 years ago, making a C compiler that provided sane behaviour and better guarantees (going beyond the minimum defined in the standard) to make code safer and programmers' lives easier, even at the cost of some performance, might have been a good idea. Today any programmer who thinks things like not having security bugs are more important than having bigger numbers on microbenchmarks has already moved on from C.

[-]

uecker 10 months ago

This is certainly not true. Many programmers also learned to the use tools available to write reasonably safe code in C. I do not personally find this problematic.

[-]

lmm 10 months ago

> Many programmers also learned to the use tools available to write reasonably safe code in C.

And then someone compiled their code with a new compiler and got a security bug. This happens consistently. Every C programmer thinks their code is reasonably safe until someone finds a security bug in it. Many still think so afterwards.

[-]

uecker 10 months ago

There are couple of cases where compiler optimizations caused security issues, but that this happens all the time is a huge exaggeration. And many of the practically relevant cases can be avoided by using tools such as UBSan. The actual practical issue in C is people getting their pointer arithmetic wrong, which can also be avoided by having safe abstractions for buffer and string handling.

The other fallacy is that these issue then suddenly would disappear when using Rust, which is also not the case. Because the programmer cutting corners in C or prioritizing performance over safety will also use Rust "unsafe" carelessly.

Rust has a clear advantage for temporal memory safety. But it is also possible to have a clear strategy about what data structure owns what other object in C.

[-]

lmm 10 months ago

> And many of the practically relevant cases can be avoided by using tools such as UBSan.

"can be", but aren't.

> The other fallacy is that these issue then suddenly would disappear when using Rust, which is also not the case. Because the programmer cutting corners in C or prioritizing performance over safety will also use Rust "unsafe" carelessly.

The vast majority of these programmers aren't making a deliberate choice at all though. They pick C because they heard it's fast, they write it in the way that the language nudges them towards, or the way that they see done in libraries and examples, and they end up with unsafe code. Sure, someone can deliberately choose unsafe in Rust, but defaults matter.

> it is also possible to have a clear strategy about what data structure owns what other object in C.

Is it though? How can one distinguish a codebase that does from a codebase that doesn't? Other than the expensive static analysis tool mentioned elsewhere in the thread (at which point you're not really writing "C"), I've never seen a way that worked and was distinguishable from the ways that don't work.

[-]

uecker 10 months ago

> > And many of the practically relevant cases can be avoided by using tools such as UBSan.

> "can be", but aren't.

It is a possible option when one needs improved safety, and IMHO often the better option than using Rust.

> > The other fallacy is that these issue then suddenly would disappear when using Rust, which is also not the case. Because the programmer cutting corners in C or prioritizing performance over safety will also use Rust "unsafe" carelessly.

> The vast majority of these programmers aren't making a deliberate choice at all > though. They pick C because they heard it's fast, they write it in the way that the > language nudges them towards, or the way that they see done in libraries and examples, > and they end up with unsafe code. Sure, someone can deliberately choose unsafe in > Rust, but defaults matter.

The choice of handcoding some low-level string manipulation is similar to the choice of using unsafe rust. One can do it or not. There is certainly a better security culture in Rust at this time, but it is unclear to what extend this will be true in the long run. Also C security culture improves too and Rust culture will certainly deteriorate when usage spreads from highly motivated early adopters to the masses.

> > it is also possible to have a clear strategy about what data structure owns what other object in C.

> Is it though? How can one distinguish a codebase that does from a > codebase that doesn't?

This leads to the argument that it is trivial to see unsafe code in Rust because it is marked "unsafe" and just a small amount of code while in C you would need to look at everything. But this largely a theoretical argument: In practice you need to do some quality control for all code anyway, because memory safety is just a small piece of overall the puzzle. (and even for memory safety, you also need to look at the code surrounding code in RUst.) In practice, it is not hard to recognize the C code which is dangerous, it is the one where pointer arithmetic and string manipulation is not encapsulated in safe interfaces and it is the code where ownership of pointers is not clear.

>Other than the expensive static analysis tool mentioned elsewhere in the thread (at which point you're not really writing "C"), I've never seen a way that worked and was distinguishable from the ways that don't work.

I see some very high quality C code with barely any memory safety problems. Expensive static analysis can be used when no mistakes are acceptable, but then you should also formally verify the unsafe code in Rust.

[-]

lmm 10 months ago

> The choice of handcoding some low-level string manipulation is similar to the choice of using unsafe rust. One can do it or not.

But most of the time programmers don't make a conscious choice at all. So opt-out unsafety versus opt-in unsafety is a huge difference.

> In practice you need to do some quality control for all code anyway, because memory safety is just a small piece of overall the puzzle.

Memory safety is literally more than half of real-world security issues.

> In practice, it is not hard to recognize the C code which is dangerous

> I see some very high quality C code with barely any memory safety problems

I hear a lot of C people saying this sort of thing, but they never make it concrete - there's no list of which popular open-source libraries are dangerous and which are not, it's only after a vulnerability is discovered that we hear "oh, that project always had poor quality code". If I pick a random library to maybe use in my project (even big-name ones e.g. libpq or libtiff), no-one can ever actually answer whether that's high quality C code or low quality C code, or give me a simple algorithm that I can actually apply without having to read a load of code and make a subjective judgement. Whereas I don't have to read or judge anything or even properly know rust to do "how much of this rust code is unsafe".

[-]

uecker 10 months ago

> > The choice of handcoding some low-level string manipulation is similar to the choice of using unsafe rust. One can do it or not.

> But most of the time programmers don't make a conscious choice at all. So opt-out unsafety versus opt-in unsafety is a huge difference.

I don't think so. A programmer being careless will be careless with Rust "unsafe" too.

Don't get me wrong, I think marking code without guaranteed memory safety is a good idea. I just don't think it is a fundamental game changer.

> > In practice you need to do some quality control for all code anyway, because memory safety is just a small piece of overall the puzzle.

> Memory safety is literally more than half of real-world security issues.

https://www.horizon3.ai/attack-research/attack-blogs/analysi...

But I think even this is likely overstating it by looking at CVEs and not real world impact.

> > > In practice, it is not hard to recognize the C code which is dangerous

> > I see some very high quality C code with barely any memory safety problems

> I hear a lot of C people saying this sort of thing, but they never make it > concrete - there's no list of which popular open-source libraries are dangerous > and which are not, it's only after a vulnerability is discovered that we hear > "oh, that project always had poor quality code". If I pick a random library > to maybe use in my project (even big-name ones e.g. libpq or libtiff), no-one > can ever actually answer whether that's high quality C code or low quality C code > or give me a simple algorithm that I can actually apply without having to read > a load of code and make a subjective judgement. Whereas I don't have to read or > judge anything or even properly know rust to do "how much of this rust code is unsafe".

So you look at all the 300 unmaintained dependencies a typical Rust projects pulls in via cargo and look at all the "unsafe" blocks to screen it? Seriously, the issue is lack of open-source man power and this will hit Rust very hard once the ecosystem gets larger and this goes even more beyond the highly motivated first adopters. I would be more tempted to buy this argument if Rust would have no "unsafe" and I could pull in arbitrary code from anywhere and be safe. And this idea existed before with managed languages... Safe Java in the browser and so. Also sounded plausible but was similarly highly exaggerated as the Rust story.

[-]

lmm 10 months ago

> A programmer being careless will be careless with Rust "unsafe" too.

Programmers will be careless, sure, but you can't really use unsafe without going out of your way to. Like, no-one is going to write "unsafe { *arr.get_unchecked(index) }" instead of "arr[index]" when they're not thinking about it.

> So you look at all the 300 unmaintained dependencies a typical Rust projects pulls in via cargo and look at all the "unsafe" blocks to screen it?

No, of course not, I run "cargo geiger" and let the computer do it.

I think unmaintained dependencies are less likely, and easier to check, in the Rust world. Ultimately what defines the attack surface is the number of lines of code, not how they're packaged, and C's approach tends to lead to linking in giant do-everything frameworks (e.g. people will link to GLib or APR when they just wanted some string manipulation functions or a hash table, which means you then have to audit the whole framework to audit that program's dependencies. And while the framework might look well-maintained, that doesn't mean that the part your program is using is), reimplementing or copy-pasting common functions because they're not worth adding a dependency for (which is higher risk, and means that well-known bugs can keep reappearing, because there's no central place to fix it once and for all), or both. And C's limited dependency management means that people often resort to vendoring, so even if your dependency is being maintained, those bugfixes may not be making their way into your program.

> And this idea existed before with managed languages... Safe Java in the browser and so. Also sounded plausible but was similarly highly exaggerated as the Rust story.

Java has quietly worked. It didn't succeed in the browser or on the open-source or consumer-facing desktop for reasons that had nothing to do with safety (in some cases they had to do with the perception of safety), but backend processing or corporate internal apps are a lot safer than they used to be, without really having to change much.

quotemstr 10 months ago

> safe code in C

You're like a Japanese holdout in the 60s refusing to leave his bunker long after the war is over.

C lost. Memory safety is a huge boon for security. Human beings, even the best of them, cannot consistently write correct C code. (Look at OpenBSD.) You can keep fighting the war your side has already lost or you can move on.

[-]

uecker 10 months ago

Well, memory safety is great but it seems Rust programmers also manage to create memory safety issues just fine:

https://rustsec.org/advisories/RUSTSEC-2024-0401.html https://rustsec.org/advisories/RUSTSEC-2024-0400.html https://rustsec.org/advisories/RUSTSEC-2024-0377.html https://rustsec.org/advisories/RUSTSEC-2024-0374.html etc.

[-]

whytevuhuni 10 months ago

I think the first one, stack overflow, is technically not a memory safety issue, just denial-of-service on resource exhaustion. Stack overflow is well defined as far as I know.

The other three are definitely memory safety issues.

[-]

ryao 10 months ago

I would consider a stack overflow to be a memory safety issue. The C++ language authors likely would too. C++ famously refused to support variable length stack allocated arrays because of memory safety concerns. In specific, they were worried that code at runtime would make an array so big so big that it would jump the OS guard page, allowing access to unallocated memory that of course is not noticed ahead of time during development. This is probably easy to do unintentionally if you have more stack variables after an enormous stack allocated array and touch them before you touch the array. The alternative is to force developers to use compiler extensions such as alloca(). That makes it easy to pass pointers outside of the stack frame where they are valid and is a definite safety issue. The C++ nitpicking over variable length arrays is silly since it gives us a status quo where C++ developers use alloca() anyway, but it shows that stack overflows are considered a memory safety issue.

[-]

whytevuhuni 10 months ago

In the general case, I think you might be right, although it's a bit mitigated by the fact that Rust does not have support for variable length arrays, alloca, or anything that uses them, in the standard library. As you said though, it's certainly possible.

I was more referring to that specific linked advisory, which is unlikely to use either VLAs or alloca. In that case, where stack overflow would be caused by recursion, a guard frame will always be enough to catch it, and will result in a safe abort [0].

[0] https://github.com/rust-lang/rust/pull/31333

[-]

ryao 10 months ago

I cited the complaints against VLAs as support for stack overflows being a memory safety issue. I did not mean to imply that Rust supported them.

quotemstr 10 months ago

C++ is a better unsafe language than unsafe Rust, IMHO. The thing about the social dynamic of Rust, though, is that it keeps unsafe code to a minimum.

[-]

uecker 10 months ago

This may be true, but the minimum unsafe code still seems not that small. Maybe I just had bad luck, but one of the first things I looked at more closely was an implementation of a matrix transpose in Rust (as an example of something relevant to my problem domain) and that directly used unsafe Rust to be reasonably fast and then already had a CVE. This was a revealing experience because was just the same type of bug you might have had in similar C code, but in a language where countless people insist that this "can not happen".

uecker 10 months ago

I agree that one shouldn't have been included. My favorite ones aren't included here anyway, e.g. how a Rust programmer managed to create a safety issue in a matrix transpose or how the messed up str::repeat in their standard library.

And don't get me wrong. I think Rust is as safer language in C. Just the idea that C is completely unsafe and it is impossible even for experts to write reasonable safe code while it is completely impossible in Rust to create an issue is just a lot of nonsense. In reality, it is possible to screw up in both languages and people do this, and reality is that safety in Rust is only somewhat better when compared to C with good security practices. But this is not how it is presented. I also think the difference will become even smaller when C safety continues to improve as it did in the last years due to better tooling while Rust is being picked up by average programmers under time pressure who will use "unsafe" just as carelessly as they carelessly hand-roll pointer arithmetic in C today.

ryao 10 months ago

Use a sound static analyzer like astree and you can produce memory safe C code:

https://www.absint.com/astree/index.htm

Note that the key word here is sound. The more common static analyzers are unsound tools that will miss cases. Sound tools do not, but few people know of them, they are rare and they are typically proprietary and expensive.

[-]

quotemstr 10 months ago

Sure. I'm also a big fan of what Microsoft has done with SAL. And of course you have formally proven C, as used in seL4. I'd say that the contortions you have to go through to write code with these systems takes you out of the domain of "C" and into a domain of a different, safer language merely resembling C. Such a language might be a fine tool! But it's not arbitrary C.

[-]

uecker 10 months ago

Note that my original comment above was "reasonably safe" and not "perfectly memory safe". You can formally prove something with a lot of effort, but you can also come reasonably close for practical purposes with a lot less effort and more commonly available tools.

You are right that "arbitrary C" is not safe while safe Rust is safe, but this is mostly begging the question. The question is what can you do with the language depending on your requirements. If you need safe C this doable with enough effort, if you need reasonably safe C this is even practical in most projects, and all this should be compared to Rust as used in a similar situation which very well may include use of unsafe Rust or C libraries which may also limit the safety.

ryao 10 months ago

It is C. It is just written with the aid of formal methods. It would be nice if all software were written that way. That said, if you want another language “resembling C”, there is always WUFFS:

https://github.com/google/wuffs

The output of the WUFFS compiler certainly resembles C because it is C.

[-]

lmm 10 months ago

> It is C. It is just written with the aid of formal methods.

It is not C in the sense that many of the usual reasons to use C no longer apply. E.g. a common reason to use C is the availability of libraries, but most popular libraries will not pass that analyser so you can't use them if you're depending on that analyser. E.g. a common reason to use C is standard tooling for e.g. automated refactoring, but will those standard tools preserve analyser-passing? Probably not.

IcePic 10 months ago

"man bcopy" on BSD:

'If len is zero, no bytes are copied.'

Seems reasonable.

[-]

crest 10 months ago

As I understand that doesn't imply that it's not undefined to pass NULL pointers. While not what most users expect/want it's possible to this is just a wrapper around an memcpy() which will only be correct to call with valid destination and source pointers even if the length is zero.

ryukoposting 10 months ago

Yes and no.

No, because ISO never said it must behave this way.

Yes, because every libc I've personally encountered acts this way. At a glance, glibc's x86 implementation[1, 2], musl, and picolibc all handle 0-length memcpy as you'd expect. I'm sure other folks could dig up the code for Newlib, uclibc, and others, and they'd see the same thing.

On a related note, ISO C has THREE different things that most people tend to lump together as "undefined behavior." They are:

Implementation-defined behavior: ISO doesn't require any particular behavior, but they do require implementations to consistently apply a particular behavior, and document that behavior.

Unspecified behavior: ISO doesn't require any particular behavior, but they do require implementations to consistently use a particular behavior, but they don't require that behavior to be documented.

Undefined behavior: ISO doesn't require any particular behavior, and they don't require implementations to define any particular behavior either.

[1]: https://github.com/lattera/glibc/blob/master/string/memcpy.c [2]: https://github.com/lattera/glibc/blob/895ef79e04a953cac14938...

nmilo 10 months ago

> However, the most vocal opposition came from a static analysis perspective: Making null pointers well-defined for zero length means that static analyzers can no longer unconditionally report NULL being passed to functions like memcpy—they also need to take the length into account now.

How does this make any sense? We don't want to remove a low hanging footgun because static analyzers can no longer detect it?

[-]

sfink 10 months ago

No, it means the static analyzers can't report on a different error because a subset of that class of errors is no longer an error, and the static analysis can't usually distinguish between that subset and the rest.

    memcpy(NULL, NULL, 0); // Formerly bad, now ok.
    memcpy(NULL, NULL, s); // Formerly bad, now unknown (unless it can be proven that s != 0).

and

    memcpy(NULL, b, c); // Same issue.

(where "NULL" == "statically known to be NULL", not necessarily just a literal NULL. Not that that changes the difficulty here.)

Previously: warn if either address might be NULL.

Now: warn if either address might be NULL and the length might be nonzero, and prepare for your users to be annoyed and shut this warning off due to the false alarms.

Any useful static analysis tool does a careful balance between false positives and false negatives (aka false alarms and missed bugs). Too many false positives, and that warning will be disabled, or users will get used to ignoring it, or it will be routinely annotated away at call sites without anyone bothering to figure out whether it's valid or not. Soon the tool will cease to be useful and may be entirely abandoned. In actual practice, the sophistication of a static analysis tool is far less relevant than its precision. It's quite common to have an incredibly powerful static analysis tool that is used for only a small handful of blazingly obvious warnings, sometimes ones that the compiler already has implemented! (All the tool's fancy warnings got disabled one by one and nobody noticed.)

hatthew 10 months ago

My understanding is that with this change, static analyzers have three options:

1. False positive on code that would have been an issue previously

2. False negative on a ton of similar footguns

3. Add complexity to differentiate between these cases

None of these options are fun.

[-]

nitwit005 10 months ago

Yes, but that tradeoff exists for most things those tools do. If you can easily and perfectly detect an error, it should just go into the compiler (and perhaps language spec).

[-]

Dylan16807 10 months ago

> If you can easily and perfectly detect an error, it should just go into the compiler (and perhaps language spec).

Nobody seems to care much about removing UB even when it's super easy. For example, a bunch of basic syntax errors like forgetting the closing quote on a string or not having a newline at the end of the file are UB.

[-]

Chaosvex 10 months ago

I know this is a discussion about C but I'll add that C++ removed the newline requirement.

MrMcCall 10 months ago

Isn't it more sensible to just check that the params that are about to be sent to memcpy be reasonable?

That is why I tend to wrap my system calls with my own internal function (which can be inlined in certain PLs), where I can standardize such tests. Otherwise, the resulting code that performs the checks and does the requisite error handling is bloated.

Note that I am also loath to #DEFINE such code because C is already rife with them and my perspective is that the less of them the better.

At the end of the day, quick and dirty fixes will prove the adage "short cuts make long delays", and OpenBSD's approach is the only really viable long-term solution, where you just have to rewrite your code if it has ill-advised constructs.

For designing libraries such as C's stdlib, I don't believe in 'undefined behavior', clearly define your semantics and say, "If you pass a NULL to memcpy, this is what will happen." Same for providing a (n == 0), or should (src == dst).

And if, for some strange reason, fixing the semantics breaks calling code, then I can't imagine that their code wasn't f_cked in the first place.

[-]

hwc 10 months ago

> internal function

every time you introduce something nonstandard, you add one little hardship to anyone trying to read or modify your code.

if a programmer is familiar with the language, it's standard library, and the normal idioms, then they should be able to just jump in.

int_19h 10 months ago

As the article points out, all major memcpy implementations already do this check inside memcpy. Sure, the caller can also check, but given that it's both redundant in practice and makes some common patterns harder to use than they would otherwise be, there's no reason to not just standardize what's already happening anyway and make everyone's lives easier in the process.

ape4 10 months ago

Only about 1000 more functions to do this to.

hwc 10 months ago

Well, that seems like something that should have been there from the beginning .

MuffinFlavored 10 months ago

> because NULL + 0 is undefined behavior in C.

Why? It's 2024. Make it not be? Sure, some older stuff already written might no longer compile and need to be updated. Put it behind a "newer" standard flag/version or whatever.

Or is it that it can't be caught at compile time and only run time... hmm...

[-]

sophiebits 10 months ago

They are making it not be. That’s the whole point of the article.

10 months ago

[deleted]

10 months ago

[deleted]

high_na_euv 10 months ago

>On the one hand, UB can be important for compiler optimizations

e.g?

[-]

GuB-42 10 months ago

Generally, undefined behavior removes the need for systematically checking for special cases, the most common being out of bounds access.

But it can go further than that. Dereferencing a NULL pointer is undefined behavior, so if a pointer is dereferenced, it can be assumed by the compiler not to be NULL and the code can be optimized. For example:

  void foo(int *p) {
    *p++;
    if (p == NULL) {
      printf("val is NULL\n");
    } else {
      printf("val is %d\n", *p);
    }
  }

can be optimized to:

  void foo(int *p) {
    *p++;
    printf("val is %d\n", *p);
  }

Note that static analyzers will most likely issue a warning here as such a trivial case is most likely a mistake. But the check for NULL may be part of an inline function that is used in many places, and thanks to the undefined behavior, the code that handles the NULL case will only be generated when relevant. The problem, of course, is that it assumes that the programmer knows what he is doing and doesn't make mistakes.

In the case of memcpy(NULL, NULL, 0), there probably isn't much to gain making it undefined. It most likely doesn't help with the memcpy implementation (len=0 is a generally no-op), and inference based on the fact that the arguments can't be NULL is more likely to screw the programmer up than to improve performance.

[-]

high_na_euv 10 months ago

But how much actual performance is gained here?

[-]

bagels 10 months ago

It all adds up. All those instructions you don't have to execute, especially memory access and cache misses from jumps, pipeline stalls from conditionals, not just from this optimization.

menaerus 10 months ago

It depends on your CPU microarchitectural details, on the complexity and size of your binary executable and the workload of your binary.

So there's no universal answer to your question but it could very well be "much".

ncruces 10 months ago

Imagine that you created a function GetPixel that reads an RGB pixel at a memory address, and which has a NULL check as a precondition.

If the compiler can "prove" that the pointer is not NULL it can (after inlining the call) remove 20 million checks for a 20 megapixel image.

The silly issue is the compiler using "you accessed it before" (aka "undefined behaviour") to "prove" that the pointer is not NULL.

But I can attest that avoiding 20 million such checks does indeed make a huge difference.

[-]

cv5005 10 months ago

Just make a non null checking version: GetPixelUnsafe() and let the responsibility onto the user to do the null check before the loop.

All of these 'problems' have simple and straigtforward workarounds, I'm not convinced these UB are needed at all.

[-]

nemothekid 10 months ago

>All of these 'problems' have simple and straigtforward workarounds, I'm not convinced these UB are needed at all.

He gave you a simple and straightforward example, but that example may not be representative of a real world program where complex analysis leads to better performing code.

As a programmer, its far easier to just insert bounds checks everywhere, and trust the system to remove them when possible. This is what Rust does, and it safe. The problem isn't the compiler, the problem is the standard. More broadly, the standard wasn't written with optimizing compilers in mind.

ncruces 10 months ago

That's a non solution for existing code that already calls GetPixel 20 million times.

It's not like I'm saying C is the best possible way to write new code.

I'm just commenting why this matters for performance, and “remove all undefined behavior" from C compilers is a non-starter.

Now go write Rust for all I care.

Dylan16807 10 months ago

If we're inlining the call, then we can hoist the NULL check out of the loop. Now it's 1 check per 20 million operations. There's no need to eliminate it or have UB at that point.

cesarb 10 months ago

The simplest example of a compiler optimization enabled by UB would be the following:

  int my_function() {
    int x = 1;
    another_function();
    return x;
  }

The compiler can optimize that to:

  int my_function() {
    another_function();
    return 1;
  }

Because it's UB for another_function() to use an out-of-bounds pointer to access the stack of my_function() and modify the value of x.

And the most important example of a compiler optimization enabled by UB is related to that: being UB to access local variables through out-of-bounds pointers allows the compiler to place them in registers, instead of being forced to go through the stack for every operation.

[-]

MrMcCall 10 months ago

I don't find those compelling reasons and, to the contrary, I think that kind of semantic circumvention to be a symptom of a poorly developed industry.

How can we have properly functioning programs without clearly-defined, and sensible, semantics?

If the developer needs to use registers, then they should choose a dev env/PL that provides them, otherwise such kludges will crash and burn, IMO.

[-]

wat10000 10 months ago

Are you saying that C compilers should change every local variable access to read and write to the stack just in case some function intentionally does weird pointer arithmetic to change their values without referring to them in the source code?

gpderetta 10 months ago

We stopped explicitly declaring locals with the 'register' keyword circa 40 years ago. Register allocation is a low hanging fruit and one of those things that is definitely best left to a compiler for most code.

wruza 10 months ago

And now they have to manage register pressure for it to keep being faster. And false dependencies. And some more. It doesn’t work like that. Developers can’t optimize like compilers do, not with modern CPUs. The compilers do the very heavy lifting in exchange for the complexity of a set of constraints they (and you as a consequence, must) rely on. The more relaxed these constraints are, the less performant code you get. Modern CPUs run modern interpreters as fast as dumbest-compiled C code basically, so if you want sensible semantics, then Typescript is one of the absolutely non-ironic answers.

bagels 10 months ago

We pay for the flexibility of not wearing seatbelts for increasing the consequences of crashes.

cv5005 10 months ago

You dont need UB for that.

A simple model for both compilers and programmers to understand:

"A variable whose address has not been taken need not be reachable via a random pointer".

I mean that's how an assembly programmer would think - if I put something in r0 I don't expect a store instruction to clobber it.

[-]

UncleMeat 10 months ago

What you describe there is UB. If you define this in the standard, you are defining a kind of runtime behavior that can never happen in a well formed program and the compiler does not have to make a program that encounters this behavior do anything in particular.

alerighi 10 months ago

Does this still matters today? I mean, first registers are anyway saved on the stack when calling a function, and caches of modern processors are really nearly as fast (if not as fast!) as a register. Registers these days are merely labels, since internally the processor (at least for x86) executes the code in a sort of VM.

To me it seems that all these optimizations were really something useful back in the day, but nowadays we can as well just ignore them and let the processor figure it out without that much loss of performance.

Assuming that the program is "bug free" to me is a terrible idea, since even mitigations that the programmer puts in place to mitigate the effect of bugs (and no program is bug free) are skipped because the compiler can assume the program has no bug. To me security is more important than a 1% more boost in performance.

[-]

gpderetta 10 months ago

Register allocation is one of the most basic optimizations that a compiler can do. Some modern cpus can alias stack memory with internal registers, but it is still not as fast as not spilling at all.

You can enjoy -O0 today and the compiler will happily allocate stack slots for all your variables and keep them up to date (which is useful for debugging). But the difference between -O0 and -O3 is orders of magnitude on many programs.

cesarb 10 months ago

> I mean, first registers are anyway saved on the stack when calling a function

No, they aren't. For registers defined in the calling convention as "callee-saved", they don't have to be saved on the stack before calling a function (and the called function only has to save them if it actually uses that register). And for registers defined as "caller-saved", they only have to be saved if their value needs to be kept. The compiler knows all that, and tends to use caller-saved registers as scratch space (which doesn't have to be preserved), and callee-saved registers for longer-lived values.

> and caches of modern processors are really nearly as fast (if not as fast!) as a register.

No, they aren't. For instance, a quick web search tells me that the L1D cache for a modern AMD CPU has at least 4 cycles of latency. Which means: even if the value you want to read is already in the L1 cache, the processor has to wait 4 cycles before it has that value.

> Registers these days are merely labels, since internally the processor (at least for x86) executes the code in a sort of VM.

No, they aren't. The register file still exists, even though register renaming means which physical register corresponds to a logical register can change. And there's no VM, most common instructions are decoded directly (without going through microcode) into a single µOp or pair of µOps which is executed directly.

> To me it seems that all these optimizations were really something useful back in the day, but nowadays we can as well just ignore them and let the processor figure it out without that much loss of performance.

It's the opposite: these optimizations are more important nowadays, since memory speeds have not kept up with processor speeds, and power consumption became more relevant.

> To me security is more important than a 1% more boost in performance.

Newer programming languages agree with you, and do things like checking array bounds on every access; they rely on compiler optimizations so that the loss of performance is only that "1%".

wbl 10 months ago

Many calling conventions use registers. And no loads and stores are extremely complex and not free at all: fewer can issue in each cycle and there's some very expensive hardware spent to maintain the ordering on execution.

rwmj 10 months ago

This explanation of why signed int overflow is undefined is interesting (although the behaviour is still very annoying): https://kristerw.blogspot.com/2016/02/how-undefined-signed-o... (HN discussion: https://news.ycombinator.com/item?id=11146384)

More examples here: http://blog.llvm.org/2011/05/what-every-c-programmer-should-...

Arch-TK 10 months ago

http://blog.llvm.org/2011/05/what-every-c-programmer-should-...

In a real world program removing all UB is some cases impossible without adding new breaking features to the C language. But, taking a real world program and removingh all UB which IS possible to remove will introduce an overhead. In some programs this overhead is irrelevant. In others, it is probably the reason why C was picked.

If you want speed without overhead, you need to have more statically checked guarantees. This is what languages such as Rust attempt to achieve (quite successfully).

[-]

uecker 10 months ago

Many real world C programs have no UB.

What Rust attempts to achieve is the possibility of accidentally introducing UB by designing the language in away that makes it impossible to have UB when sticking to the safe subset.

It also possibly to make sure to ensure that C programs have no UB and this does not require any breaking features to C. It usually requires some refactoring the program.

[-]

tialaramex 10 months ago

> Many real world C programs have no UB.

A bold claim, I've written a whole lot of software in C, and most of it I'd be astonished if it truly has no UB. Even some of the relatively small, carefully written programs probably have edge case UB I never worried about when writing them.

[-]

uecker 10 months ago

It is certainly true that many C programs have edge cases which trigger UB. I also have written many such programs where I did not care. This does not contradict my statement though. There are programmers who meticulously care (and/or have to care) about getting the edge cases right and this is entirely possible.

Arch-TK 10 months ago

I think I worded it poorly. In a real world program, a lot of optimizations rely on assumptions of not triggering UB.

Rephrased:

In a real world program removing all opportunities for UB is in some cases impossible without adding new breaking features to the C language.

This has nothing to do with whether you can or can't write a program without invoking UB. I am talking about a hypothetical large program which does not exhibit undefined behaviour but where if you modified it then you could trigger UB in many ways. The idea I am positing is that to make it such that you could not modify such a program in any way which could trigger UB, would be impossible without adding new breaking features to the C language (e.g. you would need to figure out some way of preventing pointers from being used outside of the lifetime of the object they point to).

[-]

uecker 10 months ago

This is exactly what I am working on.

But this does not need breaking features, it only needs 1) a opt-in safe mode, and 2) annotations to express additional invariant such as for lifetime. This would not break anything.

[-]

Arch-TK 10 months ago

It doesn't break existing code, unless you want to statically guarantee that it does not trigger UB, in which case it does. The point is that if you need an opt-in safe mode or annotations to express additional invariants then you can't magically make existing code safe.

[-]

uecker 10 months ago

A lot of existing code is already safe. You can't prove all (or even most) existing code safe automatically. This is also true for Rust if you do not narrowly define safe as memory safe. You could transform a lot of C code to be memory safe by adding annotations and do some light refactoring and maybe pushing some residual pieces to "unsafe" blocks. This would be very similar to Rust.

[-]

Arch-TK 10 months ago

> A lot of existing code is already safe.

Again, I am not trying to argue either way. The point I was making was about how you can't define away all UB in the C standard without needing to modify the language in a breaking way.

> You can't prove all (or even most) existing code safe automatically.

No but rust provides a proper type system which goes a long way to being able to prove and enforce a lot more about program behavior at compile time.

> You could transform a lot of C code to be memory safe by adding annotations and do some light refactoring and maybe pushing some residual pieces to "unsafe" blocks. This would be very similar to Rust.

It would only be somewhat similar to super basic entry level rust which ignores all the opportunities for type checking.

[-]

uecker 10 months ago

> > A lot of existing code is already safe.

> Again, I am not trying to argue either way. The point I was making was about how you can't define away all UB in the C standard without needing to modify the language in a breaking way.

This depends on how you define "breaking". I think one can add annotations and transform a lot of C code to memory safe C with slight refactoring without introducing changes into the language that would break any existing code. You can not simply switch on a flag make existing code safe ... except you can do this too ... it just then comes with a high run-time cost for checking.

> > > No but rust provides a proper type system which goes a long way to being able to prove and enforce a lot more about program behavior at compile time.

> > You could transform a lot of C code to be memory safe by adding annotations and do some light refactoring and maybe pushing some residual pieces to "unsafe" blocks. This would be very similar to Rust.

> It would only be somewhat similar to super basic entry level rust which ignores all the opportunities for type checking.

I do not believe you can solve a lot more issues with strong typing than you can already solve in C simply by building good abstractions.

[-]

Arch-TK 10 months ago

> You can not simply switch on a flag make existing code safe ... except you can do this too ... it just then comes with a high run-time cost for checking.

I don't think you can reasonably implement this even at a high runtime cost without breaking programs. Either way, you've managed to re-state the crux of my argument.

> I do not believe you can solve a lot more issues with strong typing than you can already solve in C simply by building good abstractions.

Then I don't think you have much familiarity with strong typing or are underestimating the performance impact of equivalently "safe" (in a broader sense than what rust uses the term for) abstractions in C.

The only way to get equivalent performance while maintaining the same level of guarantees in C is to generate C code, at which point you're definitely better off using another programming language.

cwzwarich 10 months ago

The example in this blurb is a pretty good one: https://www.hboehm.info/c++mm/why_undef.html

10 months ago

[deleted]

10 months ago

[deleted]

guerrilla 10 months ago

[flagged]

[-]

bla3 10 months ago

It's a big jump from "this behavior annoys me" to "I hate a person".

redleader55 10 months ago

Not only on your phone. It does the same on my laptop on Firefox.

gsck 10 months ago

Yeah its fine updating the url, but adding history checkpoints really does suck

1kaizen 10 months ago

[flagged]