If I were the author, I would skip the part about using the compiler explorer and reading the assembly. When I write C code, I need to satisfy the rules of the C language. Unless I’m debugging the compiler or dealing with performance issues, my experience is that reading and understanding the generated assembly is usually a slow way of debugging. The author eventually did compile with -fsanitize=undefined, but I would honestly do that first. It prints a nice error message about the exact issue, as the author has shown.
Understanding what the C compiler generates is interesting, but without a corresponding intuition about the optimizer passes, such understanding is shallow and unlikely to be generalized to other problems in the future. You probably won’t even remember this the next time you debug another undefined behavior. On the other hand, if you were to know the optimizer passes employed by the compiler and could deduce this code from that, then it is a useful exercise to enhance your intuition about them.
I do agree that knowledge of compiler optimizations is really important to working this way, though you'll eventually pick them up anyway. I don't see much value in looking at -O0 or -Og disassembly. You want the strongest stuff the compiler can generate if you're going to do this, which is usually either -O3 or -Oz (both of which are strong in their own ways). -O0 disassembly is... just so much pain for so little gain. Besides, -O3 breaks more stuff anyway!
For someone without this level of experience (and who isn't interested in learning)... yeah, I can see why you'd want to do this another way. But if you've got the experience already, it's plenty fast enough.
> Option 1) seemed like the easiest one, but it also felt a bit like kicking the can down the road – plus, it introduced the question of which standard to use.
Arguably, that's the sanest one: you can't expect the old C code to follow the rules of the new versions of the language. In a better world, each source file would start with something like
#pragma lang_ver stdc89
and it would automatically kick off the compatibility mode in the newer compilers, but oh well. Even modern languages such as Go miss this obvious solution.
On the topic of the article, yeah, sticking anything other than 0 or 1 into C99 bool type is UB. Use ints.
If you’re just a packager, it’s your job to get the package to build and work correctly; for your own sanity, you should be making minimal changes to the underlying code to facilitate that. Get it building with the old language version and file a bug report.
Well, to be pedantic, the entire point of the C standard, and the standard body, is that you should expect it to work, as long as you're working within the standard!
Yikes. I think this article undersells the point somewhat. This line of code undermines the type system by spraying -1's into an array of structs, so the only surprise to me is that it took this long to break.
typedef struct
{
// If false use 0 for any position.
// Note: as eight entries are available,
// we might as well insert the same name eight times.
boolean rotate;
// Lump to use for view angles 0-7.
short lump[8];
// Flip bit (1 = flip) to use for view angles 0-7.
byte flip[8];
} spriteframe_t;
which is okay to splat with all-ones as long as `boolean` is either a typedef for a fundamental type(*) or an enum – because C enums are just ints in a trenchcoat and have no forbidden bit patterns! The C99 `bool`/`_Bool` is, AFAICS, the first type in C that has fewer values than possible bit patterns.
So yeah, on C99 and C++ this always had UB and could've broken at any time – though I presume compiler devs were not particularly eager to make it ill-behaved just because. But in pre-C99 it's entirely fine, and `rotate == true || rotate == false` could easily be false without UB.
---
(*) other than `char` for which setting the MSB is… not UB but also not the best idea in general.
> Ah-ha! The generated instructions were ever so slightly different. This would be great news, if it wasn't for me forgetting about one little detail: I have zero knowledge of x86 assembly.
Lol'd at this. I've been there: "ah hah! I found you. hrm, now what does this mean..."
TFA makes me thankful my work doesn't involve C / C++. Learning it earlier in life was enough.
Ah yes, obfuscation
if (sprtemp[frame].rotate == false)
Note that this is explicitly comparing two values, which is very different from checking whether a single value is true. Surely you wouldn't expect -1 == 0 to evaluate to true.
I wouldn't, no - but that's exactly what's happening in the test case.
Likewise, I wouldn't expect -1 == 1 to evaluate to true, but here we are.
The strict semantics of the new bool type may very well be "correct", and the reversed-test logic used by the compiler is certainly understandable and defensible - but given the long-established practice with integer types - i.e. "if(some_var) {...}" and "if(!some_var) {...}" - that non-zero is "true" and zero is "false", it's a shame that the new type is inconsistent with that.
if (something == true)
I haven't done so ever since (1997), and thus I avoid the contrary (with == false) as well, using ! instead. But I would be a lot less ashamed if I knew that there are such conditions in production software.
I would also never guess that the problem described in the article may occur...