Wednesday, July 4, 2018

A one-word change to the C standard to make undefined behavior sane again

A lot has been written on C undefined behavior, some of it by me and a lot more by people who know far more about compilers than I do. However, I now believe that a seemingly innocuous but far-reaching change to the standard has given permission for the current craziness, and I think undoing that change could be a start in rectifying the situation.


In section 3.4.3, change the word "possible" back to "permissible", the way it was in C89.


In all versions of the standard I have checked, section 3.4.3 defines the term "undefined behavior".
undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

So that seems pretty clear: the compiler can do whatever it wants. But wait, there is a second paragraph that clarifies:

Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

So it's not a free-for-all. In fact, it is pretty clear about what the compiler is and is not allowed to do, as there are essentially three options:
  1. It "ignores" the situation completely. If the CPU hardware produces an overflow or underflow on an arithmetic operation, well, that's what you get. If you write to a string constant, the compiler emits the write, and either the string constant gets changed (if there is no memory protection for string constants) or you get a segfault (if there is).
  2. It "behaves in a documented manner characteristic of the environment". So no "demons flying out of your nose" nonsense, and no arbitrary transformations of programs. Whatever the compiler does, it has to be documented, though it is not required to print a diagnostic.
  3. It terminates translation or execution with a diagnostic message.
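
I would suggest that what current optimizing compilers do, namely assuming that undefined behavior never happens and transforming the program on that basis, is not one of these three, and it's not in the range bounded by these three either. It is clearly outside that defined range of "permissible" undefined behavior.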
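
To make the contrast concrete, here is a minimal sketch in C using signed overflow, the arithmetic case from option 1. The function names are made up for this illustration and appear nowhere in the standard. The first function is the kind of code that relies on "that's what you get" from the hardware; an optimizer that treats overflow as something that cannot happen is free to reduce it to a constant false. The other two show ways to stay within defined or implementation-defined behavior.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical example: undefined when x == INT_MAX, because x + 1
   overflows a signed int. A compiler that assumes undefined behavior
   never occurs may compile this as if it were "return false". */
bool naive_will_overflow(int x)
{
    return x + 1 < x;
}

/* Well-defined for every x: compare against the limit instead of
   performing the addition that might overflow. */
bool checked_will_overflow(int x)
{
    return x == INT_MAX;
}

/* Roughly what "ignoring the situation" yields on two's complement
   hardware. Unsigned arithmetic wraps by definition; converting an
   out-of-range result back to int is implementation-defined rather
   than undefined, and GCC and Clang document it as wrapping. */
int wrapping_increment(int x)
{
    return (int)((unsigned int)x + 1u);
}
```

Calling naive_will_overflow(INT_MAX) is exactly the situation under dispute: under the "permissible" reading one would expect either the wrapped hardware result, a documented behavior, or a trap, whereas under the "possible" reading anything goes.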

But of course compiler writers have an out, because more recent versions of the standard (C99 and later) changed the word "permissible", which clearly restricts what you are allowed to do, to "possible", which means this is just an illustration of what might happen.

So let's change the word back to "permissible".