> String to float conversion had a table missing four values. This caused an array access overflow which resulted in imprecise values in some cases.
I once wrote a function to parse a date format from log files that Go doesn't natively support, and forgot to add November. I quit that job in April, so I never saw any issues. However, when the 1st of November came, my ex-colleagues saw no logs for that day, and when they found out the reason they created the hashtag #nolognovember, which you can probably still find somewhere to this day :)
> the vast bulk of sanitizer complaints came from invoking undefined or implementation-defined behavior in harmless ways
This is patently false. Any undefined behavior is harmful because it allows the optimizer to transform the code in arbitrary ways, and this is not purely theoretical: it has repeatedly been demonstrated in real compilers. So even if your UB code is never called, the simple fact that it exists may make some seemingly unrelated code misbehave.
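A commonly cited illustration, as a minimal sketch (the function names are made up for this example): once a pointer has been dereferenced, a compiler is allowed to assume it is non-null, so a later null check may be removed as dead code. Whether that actually happens depends on the compiler and the optimization level.

```c
#include <stddef.h>

/* Dereferences first, checks second. If p is NULL the load is undefined
   behavior, so an optimizer may assume p != NULL and drop the check. */
int read_then_check(const int *p) {
    int v = *p;
    if (p == NULL)
        return -1;   /* may be removed as unreachable */
    return v;
}

/* Checks before dereferencing: no undefined behavior on any input. */
int check_then_read(const int *p) {
    if (p == NULL)
        return -1;
    return *p;
}
```

Both functions agree on every valid pointer; they can only differ when `p` is NULL, which is exactly the case the check was written for.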
> Passing pointers to the middle of a data structure. For example, free takes a pointer to the start of an allocation. The management structure appears just before that in memory; computing the address of which appears to be undefined behavior to the compiler.
To clarify, the undefined behavior here is that the sanitizer sees `free` trying to access memory outside the bounds of what was returned by `malloc`.
It's perfectly valid to compute the address of a struct just before memory pointed to by a pointer you have, as long as the result points to valid memory:
void not_free(void *p) {
    struct header *h = (struct header *) (((char *)p) - sizeof(struct header));
    // ...
}
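To make the valid case concrete, here is a minimal sketch of a wrapper allocator that places its own header in front of the payload. The names `my_alloc`, `my_size`, `my_free` and the layout of `struct header` are hypothetical, not taken from any real `malloc`; the point is only that stepping back by `sizeof(struct header)` is fine when the header really lives inside the same allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping header. _Alignas pads it so the payload that
   follows stays aligned for any object type (C11). */
struct header {
    _Alignas(max_align_t) size_t size;
};

void *my_alloc(size_t n) {
    if (n > SIZE_MAX - sizeof(struct header))
        return NULL;                       /* size computation would wrap */
    struct header *h = malloc(sizeof(struct header) + n);
    if (h == NULL)
        return NULL;
    h->size = n;
    return h + 1;                          /* payload starts right after the header */
}

size_t my_size(void *p) {
    assert(p != NULL);
    /* Step back into the same allocation that my_alloc obtained. */
    struct header *h = (struct header *)((char *)p - sizeof(struct header));
    return h->size;
}

void my_free(void *p) {
    if (p != NULL)
        free((char *)p - sizeof(struct header));
}
```

Here the pointer handed to the caller is in the middle of a block that the wrapper itself got from `malloc`, so the header address is inside a known object and a sanitizer has nothing to complain about.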
In the case of `free`, that resulting pointer is technically "invalid" because it's outside what was returned by `malloc`, even though the implementation of `malloc` presumably returned a pointer to memory just past the header.
> [...] detect places where the program wanders into parts of the C language specification [...]
Small nitpick: the UB sanitizer also has some checks specific to C++. https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
And don't forget -fbounds-safety, which is in Apple's clang/llvm and perhaps other versions. https://clang.llvm.org/docs/BoundsSafety.html
That arithmetic shift right implementation is also what I came up with for a video game fantasy architecture that only has logical shift right. (16-bit registers)
; asr rd, rs1, rs2 ; rd = signed(rs1) >> rs2
and rt, rs1, 0x8000 ; isolate sign bit
lsr rt, rt, rs2 ; shift sign bit to final position
neg rt, rt ; sign-extended part of final result
lsr rd, rs1, rs2 ; base part of final result
or rd, rd, rt ; combine both parts
It might be easier to understand broken down this way for anyone who didn't understand the article's one-liner.
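For anyone who prefers C to assembly, the same five steps can be sketched like this. The function name `asr16` is made up for the example, and it assumes a shift count of 0 to 15, since the fantasy architecture's behavior for larger counts isn't stated here.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic shift right of a 16-bit value using only logical shifts,
   one statement per instruction in the listing above. */
uint16_t asr16(uint16_t x, unsigned n) {
    assert(n < 16);                          /* assumption: count is 0..15 */
    uint16_t t = x & 0x8000u;                /* and: isolate sign bit */
    t = (uint16_t)(t >> n);                  /* lsr: sign bit at its final position */
    t = (uint16_t)(0u - t);                  /* neg: ones from that bit up to bit 15 */
    uint16_t base = (uint16_t)(x >> n);      /* lsr: base part of the result */
    return (uint16_t)(base | t);             /* or: combine both parts */
}
```

For example, `asr16(0xF000, 4)` gives `0xFF00`: the sign bit lands at `0x0800`, negating that yields `0xF800`, and OR-ing it with the logical shift `0x0F00` fills in the upper bits. For a non-negative input the isolated sign bit is zero, so the result is just the logical shift.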
Wow, this: "random() was returning values in int range rather than long." is a very nice bug find. Randomness is VERY hard for humans to check. For example, Python's binomial distribution is very bad on some inputs [1], giving wildly wrong values, but nobody found it. I bumped into it when I implemented an algorithm to compute the approximate volume of solutions to a DNF, and the results were clearly wrong [2]. The algorithm is explained here by Knuth, in case you are interested [3].
[1] https://www.cs.toronto.edu/~meel/Slides/meel-distform.pdf
[2] https://github.com/meelgroup/pepin
[3] https://cs.stanford.edu/~knuth/papers/cvm-note.pdf