A core part of the problem with UB in C and C++ is that it is gratuitously over-applied.
Mercifully, the article calls out the BS argument of "old hardware" justifying UB. It is simply a false argument. The overwhelming majority of UB in C and C++ should be either implementation-defined or unspecified behaviour. Security vulnerabilities due to overflow or null dereferences being UB should never have been possible, because there is no platform on which those operations are undefined (some trap, some wrap, some go to infinity); all of that falls under the banner of implementation-defined behaviour. Labelling these things as UB is _solely_ to allow performance optimizations in narrow cases, at the cost of safety in all cases.
In committee meetings I've been in recently, the new refrain I'm hearing/reading, the one that has replaced "we need to support various hardware", is an even more stupid argument: if we make these things not UB, people will rely on the common behaviour and write code that is incorrect on platforms that behave differently. In other words, instead of software that is always wrong on one platform, you get software that is semi-randomly wrong on all platforms, because whether a compiler exploits UB in a given case depends on compiler version, flags, inlining, etc., and if any of those change, the same code you had yesterday suddenly ships with a security bug.
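To make the null-dereference case concrete, here is a sketch (hypothetical function name) of the kind of narrow optimization that labelling it UB buys: once a pointer has been dereferenced, the compiler may assume it was non-null and delete any later check.

```c
#include <stddef.h>

/* Sketch of a UB-enabled optimization. Because dereferencing a null
   pointer is UB, after *p executes the compiler may assume p != NULL
   and fold the later check to false, deleting the error path. */
int deref_then_check(int *p) {
    int v = *p;          /* UB if p is NULL ... */
    if (p == NULL)       /* ... so an optimizer may remove this branch */
        return -1;
    return v;
}
```

Whether the check is actually removed depends on the compiler and flags, which is precisely the "semi-randomly wrong on all platforms" problem described above.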
My favourite description of undefined behaviour. The poster is corrected later on in the thread about whether the specific operation discussed would invoke undefined behaviour, but the description of what happens when undefined behaviour occurs is gold:
https://groups.google.com/g/comp.lang.c/c/ZE2B2UorTtM/m/1ROv...
Joona I Palaste, 2001-01-19, comp.lang.c
This isn't about the post-increment operator, this is about the order
of evaluation of the operands.
Since you're modifying the value of i twice without a sequence point
in between, either of the two results are exactly as much "expected".
Also, equally "expected" behaviour includes incrementing every
variable in the array, flipping all the bits in every variable in the
array, converting all instances of the text string "/usr" in memory
to "fsck", changing the colours of your screen to purple, calling the
police on your modem line and telling them you're being attacked by
a one-eyed Martian wielding a herring while singing "Hi ho, it's off
to work we go", and even weirder stuff.
So... what it all boils to... when writing your compiler, just flip
a coin and use the one of the two behaviours you listed that
corresponds with the coin's face.
Not the first discussion of this topic, by any means. In this case, I've tried to boil it down to the essential points a practical programmer needs to know, but the article still ended up longer than I initially aimed for.
Here's another interesting post if you want to delve further into an example of undefined behavior created by gcc optimization: https://thephd.dev/c-undefined-behavior-and-the-sledgehammer....
Also, this quote comes to mind: "C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do it blows your whole leg off": https://www.stroustrup.com/quotes.html
In the bit where he shows
    void error(const char* msg);

    int successor(int a) {
        if (a + 1 < a) error("Integer overflow!");
        return a + 1;
    }
and says the "if" is compiled away at -O3: does anyone know if it remains at any lower optimization level? I know some of the more aggressive optimizations intentionally ignore some checks; I don't know if that applies here. I found the -O3 odd for trying to help make his point, unless it doesn't work at -O2.

I recommend reading the resources under https://en.cppreference.com/w/c/language/behavior#External_l... (-> External links).
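Regardless of optimization level, the deeper issue is that "a + 1 < a" tests for overflow after committing the overflow, which is itself UB for signed ints. A version that tests before computing a + 1 never invokes UB, so no conforming optimizer may delete it (sketch; the saturating fallback here is my assumption, not the article's code):

```c
#include <limits.h>

/* Well-defined variant: compare against INT_MAX *before* computing
   a + 1, so no signed overflow (UB) ever occurs and the check cannot
   legally be optimized away at any -O level. Saturates instead of
   calling error(); that fallback is this sketch's simplification. */
int successor_checked(int a) {
    if (a == INT_MAX)
        return INT_MAX;  /* the original would call error("Integer overflow!") */
    return a + 1;
}
```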
I don't get it.

How can UB on double-free, use-after-free, dangling pointers, etc., lead to optimizations?
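One way to think about it: lifetime UB lets the compiler assume nothing ever reads or writes an object after it is freed, so it never has to reload values from memory "just in case" freed storage was recycled. A sketch (hypothetical function) of the kind of assumption this enables:

```c
#include <stdlib.h>

/* Sketch: because use-after-free is UB, the compiler may assume no
   code touches *p after free(p). It can keep the loaded value in a
   register across the free, and never has to consider the freed block
   being recycled and rewritten by another allocation. */
int sum_then_free(int *p) {
    int a = *p;   /* load once; may stay in a register */
    free(p);
    /* a `return *p;` here would be UB; ruling it out is exactly what
       frees the optimizer from re-reading memory defensively */
    return a;
}
```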
Once again, I want to plead: at least give us a warning option that annotates any place the compiler relies on undefined behavior. The goal should be to promote such optimizations into the written code and improve code quality, not leave them as the output of one particular compiler.
I keep finding myself angry about the recent (some number of years) focus on C and C++'s undefined behavior. I have been writing C and C++ for 27 years, 16 years professionally, and despite all the scary implications, I do not understand why ANYONE cares. I do not get it. This is yet another article that goes on and on about nonsensical situations that are just shitty code. Integer overflow? Who cares? Unless you're targeting a specific compiler and architecture, it doesn't matter. C and C++ have footguns. Everyone knows that. Who cares?
I am anger commenting, because I'm just sick of this, but this article still says nothing to convince me that any of this matters.