Wednesday, July 4, 2018

A one word change to the C standard to make undefined behavior sane again

A lot has been written on C undefined behavior, some of it by myself and a lot more by people who know a lot more about compilers than I do. However, I now believe that a seemingly innocuous but far-reaching change to the standard has given permission for the current craziness, and I think undoing that change could be a start in rectifying the situation.

Proposal

In section 3.4.3, change the word "possible" back to "permissible", the way it was in C89.

Background

In all versions of the standard I have checked, section 3.4.3 defines the term "undefined behavior".
undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

So that seems pretty clear, the compiler can do whatever it wants. But wait, there is a second paragraph that clarifies:

Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
So it's not a free-for-all, in fact it is pretty clear about what the compiler is and is not allowed to do, as there are essentially three options:
  1. It "ignores" the situation completely, so if the CPU hardware produces an overflow or underflow on an arithmetic operation, well that's what you get. If you write to a string constant, the compiler emits the write and either the string constant might get changed if there is no memory protection for string constants or you might get a segfault if there is.
  2. It "behaves in a manner characteristic of the environment". So no "demons flying out of your nose" nonsense, and no arbitrary transformations of programs. And whatever you do, you have to document it, though you are not required to print a diagnostic.
  3. It can terminate with an error message.
I would suggest that current behavior is not one of these three, and it's not in the range bounded by these three either. It is clearly outside that defined range of "permissible" undefined behavior.

But of course compiler writers have an out, because more recent versions of the standard changed the word "permissible", which clearly restricts what you are allowed to do, to "possible", which means this is just an illustration of what might happen.

So let's change the word back to "permissible".

15 comments:

Shafik Yaghmour said...

That second paragraph is labeled as a "Note" which means that it is non-normative ( informational) and do not contain requirements so they are not binding on the implementation.

See this Stack Overflow answer https://stackoverflow.com/a/21364467/1708801 which quotes the relevant ISO/IEC directive.

Shafik Yaghmour said...

Interesting it looks like C89 did not have notes, but a lot of content was moved to notes in C99. Though the intent looks like this paragraph was not meant to be binding.

Marcel Weiher said...

Well, speaking of a range of "permissible" things does sound fairly binding to me.

https://www.merriam-webster.com/dictionary/permissible

YMMV.

Peter said...

I don't think that permissible/possible is the problem. "Ignoring the situation completely with unpredictable results" is precisely what current compilers do: they apply a series of program transformations, each preserving all defined behavior, and ignoring any undefined behavior.

Consider how "deleted NULL tests" happen. The compiler sees a pointer dereference. If that pointer were NULL, the program would have undefined behavior, so the compiler can ignore that situation completely, even following the C89 wording. If the compiler then sees a comparison against NULL, it considers all situations that the standard requires it to consider, and it sees that in those situations the comparison is always false.

That's how that phrase is understood.

mcherm said...

Believe it or not, the compiler writers are not malicious operators out to get you. They have not designed their compilers to make demons fly out of your nose simply in order to teach those careless programmers a lesson about careful attention to standards.

C (and, to be fair, nearly every programming language) purports to operates on a very simple machine: one in which some CPU performs steps one at a time, reading from and writing to variables (stored in main memory) before and after each step. You can still purchase CPUs that work this way, but no modern computer does.

In a modern computer, it takes hundreds to thousands of times longer to read data from memory and write it back as it does to perform a step of a calculation. And doing just one step of a calculation is terribly wasteful when pipelining and other parallellism can perform many steps simultaneously. And that's just on a single core -- even your phone likely is running multiple cores. To make these things run thousands of times faster, we build layer after layer of caching and cleverly pack multiple commands into pipelines that can be executed together.

To take an analogy that one can understand without knowing anything about modern CPUs, imagine writing an algorithm to sort an array. Now imagine the change in how that algorithm would perform if you ran it as written versus if you performed a bounds check before every read or write. Your code might spend 80% of its time performing bounds checks, or (to put it differently) run 5 times faster without the checks.

Now, compiler-writers cannot eliminate most of the "checks" that they use to ensure that the complex web of caches and pipelining we use is producing the results that we desire. But they can eliminate a lot of them. Doing so requires making assumptions. And the C spec provides the assumptions that they can make -- specifically, that programmers are not doing any of the things that are listed as "undefined" behavior.

If they ARE doing those things, the compilers do not actually go out of their way to program in flying nose demons. Instead, what they do is to ignore the problem -- one of the specific actions you propose that the spec permit. The problem is that you are wrong when you say that ignoring the problem will clearly and obviously lead to simple, understandable results (like changing the string constant or segfaulting). In fact, ignoring it can lead to changes like modification of a different value in memory -- the compiler writer doesn't have a way to know just what that other value does, but given how this programmer writes, it's probably the flag that controls the daemon behavior and setting it to "3" will direct the daemons out through your nose.

In other words, changing the spec would NOT have the effect you desire of making modern computer hardware behave in a simple, straightforward fashion. It MIGHT have the effect of drastically slowing down properly written programs in order to slightly constrain the kinds of errors we could get from running invalid programs.

victor yodaiken said...

This is essentially the proposal I have been circulating for the last couple of months.

https://docs.google.com/document/d/1xouelPcphQ-o7DmdSwz5UcL42M6bdA3t93Nm_5Hbomc/edit?usp=sharing

victor yodaiken said...

mcherm repeats some common justifications for the current sad state of C undefined behavior, but those are wrong.

Modern CPUs don't have any behavior that would have shocked CPU designers of the 1970s or, really, even before. In partcular, pipelining was common and even used in the PDP11 processors. The C abstract machine is totally compatible with pipelines, OO, caches, speculation and with standard techniques of compiler optimization like peephole optimization, constant reduction etc. etc. None of these things are new discoveries.

Nothing abuot pipelining or any other feature of processors requires bounds checking. In fact, the original purpose of the C undefined behavior provision was to license compilers to *ignore* bounds checking, which was left as a responsibility of programmers.

It is wrong to claim that our choice is between the compiler *preventing* undefined behavior and the compiler abandoning the C semantics when it enounters undefined behavior. Nobody is asking compilers to transform C into a safe language. Everyone understands that e.g. writing past an array boundary might cause unpredictable program behavior - what they are insisting on is that the compiler does not create surprises. It is really irritating to have people consistently attempt to defend the misbehavior of C compilers by (incorrectly) explaining some elementary aspect of processor design or compilation.

Marcel Weiher said...

Thank you, Victor, my thoughts exactly.

@mchen: I know a tiny little bit about optimization (hint: check out the book!), and about CPUs. I've been programming in C for well over 30 years. What you write has absolutely nothing to do with the things compilers do these days, and nothing to do with the hardware. The 68K I wrote my first C programs on had two's complement arithmetic just like modern CPUs do, and I even wrote code on our school's PDP-11.

No, C does not do bounds checks. It is modern optimizing compilers that check the bounds and eliminate the code if the last iteration goes outside bounds. Including the parts of the loop that do not go out of bounds. It is modern C compilers that eliminate bounds checks that the programmer put in, because they "proved" that the accesses cannot go out of bounds. Incorrectly, it turns out.

And no, those compilers are not "ignoring" the undefined behavior. Ignoring it would leave the out-of-bounds access in, and the rest of the loop, and it would also leave the bounds-check in. What they are doing is actively exploiting undefined behavior.

And no, there is no evidence that doing so has any noticeable positive effect on performance, except in contrived benchmark situations, and even there it's nothing to write home about. See http://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_29.pdf

Also compare Proebsting's law: the overall contribution of compiler technology to performance is not all that overwhelming in the first place, and most of the gains come from the low-hanging fruit.

Victor Yodaiken said...

Peter De Wachter:
This is a situation where it's easy to make something seem plausible, even when it is false. The compiler is not igoring the previous null use, it is deducing unsoundly

deference of pointer -> pointer is not null

There is nothing in the standard that supports that deduction. The standard specifies how if(p == null ){... }else{...} should work and the compiler is ignoring that specification because some other code element has undefined behavior. The standard permits omission of calculation only when the calculation is redundant and has no side effects. In this case, no calculation was done previously and there may well be a side effect.

Code: ... if(p== null) ... if(p == null) ...

can be optimized per standard because this is a redundant computation with no side effects. However

Code: ... p->x; .... if(p== null) ...

does not meet the requirements. The idea that undefined behavior is somehow impossible makes no sense at all.

Horacio Mijail said...

On one hand I wanted to say that it's interesting how defenders of Undefined Behavior keep boiling down the subject to favoring performance over correctness – AND doing so while seemingly misunderstanding or not even being well informed about the subject, so they will probably be particularly prone to trigger UB. It's just scary.

That's why personally I would like some of the proposals for a safer C to be implemented - Regehr's or DJB's, for example.

On the other hand I'm surprised to see other kind of proposals here, both Marcel's and Victor's, to "just" remove the active exploiting of UB.

So I wanted to ask: why? I don't see any explained rationale, and to me sounds risky; adding minimal changes to the text sounds suspiciously easy when both standard and compiler writers are lawyering the hell out of the text already. Why not go for a full clamp down?

I'd like to understand the point of view.

Unknown said...

Much of the debate over UB would be rendered moot if people on both sides would recognize a couple of simple facts: the Standard makes no attempt to forbid obtuse compiler writers from producing implementations that are "conforming", but are of such poor quality as to be useless. It also makes no attempt to specify everything that might be necessary to make a compiler suitable for any particular purpose.

A useful principle that quality compilers intended for most purposes should follow is that actions which have no defined side-effects should have no side effects at all beyond those that would result from attempting to perform them in "literal" fashion. Having integer overflows trigger an implementation-defined trap may be useful for some purposes, and having it cause no side-effects but yielding a value that might behave "strangely" may be useful for others. Having integer overflows affect the behavior of other code even in situations where the result ends up being ultimately ignored, however, serves no useful purpose, and is unlikely to facilitate any significant useful "optimizations" outside contrived circumstances.

The Standard does not recognize any situation where a value of type "int" might be observed to be greater than INT_MAX, without the program invoking Undefined Behavior. Consequently, it includes no means by which a compiler could be allowed to optimize "x+1 > y" into "x >= y" while still imposing any constraints whatsoever on how `x+1` will behave when `x==INT_MAX`. The fact that the design of the Standard is incapable of imposing any constraints, however, does not imply that quality implementations shouldn't uphold at least some behavioral guarantees (such as the above regarding side-effects) anyhow.

Unknown said...

Looking at the linked proposal document's proposals for UB and optimization, I would suggest that what would be most helpful would be to have the Standard recognize a category of implementation that runs within an environment it does not control (but the programmer might), and which documents a "nominal" relationship between most actions and the underlying environment. There should then be a standard means via which programs can invite or forbid various optimizations that would be allowed to affect observable program behavior in some cases.

The Standard's normal approach of trying to specify allowable optimizations purely by characterizing cases where programs have fully-defined behavior and others where implementations are allowed to jump the rails entirely, will make things needlessly difficult for spec writers, compiler writers, and programmers alike. There are many situations where an optimization may allow some action to be performed in several different ways, but all satisfy a program's requirements. Inviting compilers to jump the rails in such cases would often prevent programmers from writing what would otherwise be optimal code. On the other hand, trying to precisely characterize all such cases would make the Standard absurdly complicated. Saying that a compiler configured to use a particular set of optimizations will exhibit loosely-characterized behavior in a certain situation and letting programmers decide whether the behavior is limited enough to be useful would make it possible to simplify the spec, compilers, and user programs.

Incidentally, touching briefly on the third subject, "aliasing", I'd suggest that should in some measure tie in with the above. The authors of the Standard tried to characterize allowable actions in a program by saying that writing to objects in memory causes the bytes therein to become associated with types, but that model ends up being needlessly ambiguous and complicated, while failing to fit the needs of compilers and programmers alike. If one instead recognizes that most useful "aliasing" optimizations involve reordering of accesses to seemingly-unrelated objects, one could then say that compilers are allowed to reorder accesses to different objects except in the presence of certain kinds of evidence suggesting that such ordering may adversely affect required aspects of program behavior. Trying to characterize all programs which contain sufficient evidence to force correct orderings in all cases that matter would be complicated and difficult; but saying that a program has defined behavior iff its behavior would be defined under all allowable orderings is much simpler.

Unknown said...

Many of the "proponents" of UB would love to have the Standard define ways to do all of the things their code needs to do, as straightforwardly as their code can do them.

Unfortunately, in cases where programmers trying to perform some tasks need to perform some action and have it behave predictably, but implementations intended to be suitable for other tasks would benefit from not having to define the behavior, the Standards Committee almost always give the latter priority. They do so on the premise that people seeking to write high-quality implementations that are suitable for tasks that would require action in question should be capable of defining the action themselves whether or not the Standards Committee does so.

The problem with the Standard which led to the rift between the dialect processed by gcc/clang optimizers and dialects that are suitable for low-level programming isn't that it allows compilers to behave in nonsensical fashion. Instead, the problem is that the authors relegated some important principles to the rationale, rather than mentioning them in the Standard. According to the rationale, the question of how to behave in particular situations where the Standard imposes no requirements is often a Quality of Implementation issue. Consequently, the proper answer to many questions about the Standard would allow a conforming compiler to do X when given a particular piece of code should be "Doing X would not make a compiler non-conforming; a compiler that does X, however, may be less suitable for many purposes than one that would reliably do Y instead".

The Standard allows garbage-quality implementations to be conforming. The authors didn't think that should be a problem, though, because they expected compiler writers to seek to produce good quality ones that were suitable for their stated purposes. Unfortunately, they failed to sufficiently obvious the fact that what the authors of gcc and clang have been arguing for is the right to their compiler behave in garbage-quality-but-conforming fashion.

Horacio Mijail said...

John Payson:

'actions which have no defined side-effects should have no side effects at all beyond those that would result from attempting to perform them in "literal" fashion.' -> The very fact that you are quoting the word literal (oh the irony!) shows how handwavy is your definition. How is that useful? Also, "side effect" has a very concrete meaning in the Standard.

'yielding a value that might behave "strangely" may be useful for others' ->
What does "strangely" (quotes and all) even mean?
FWIW, current UB has already been documented as able to generate values that can appear to be true and false at the same time. Sounds like what you're wishing here, and I fail to see how would anyone find this non-scary, let alone useful.

Maybe what you are saying is that you would prefer unspecified or implementation-defined behavior? But of course those are already in the Standard, and even gcc voluntarily turns some UB into them IIRC.

More generally, every time you use expressions like "quality compiler", "for most purposes", "may be useful for others", "in some measure", "all cases that matter" etc, a language-lawyer gets his wings.

"what would be most helpful would be to have the Standard recognize a category of implementation [...] which documents a "nominal" relationship between most actions and the underlying environment." -> Sounds like you're wanting to turn compiler and runtime implementation details into part of the Standard? Good luck with that. Remember that the Standard stays so purposefully abstract that it doesn't even mention a stack.

"a standard means via which programs can invite or forbid various optimizations " -> FWIW, DJB argued at some point that the only sensible way to make a powerful optimizing compiler is by making the language expressive enough to express those optimizations in itself. I think that this is a great point. But to me sounds like C is by far not such a language, and doesn't want to be.

Regarding aliasing: on one hand you want the compiler to gather evidence of non-aliasing, on the other you discard types as such an evidence? Sounds "No True Scotsman" to me. Note, I'm not defending strict aliasing, but I don't see how your idea improves anything over it.

Unknown said...

The meaning of "literal fashion" depends somewhat upon the operation, but in most cases the meaning is fairly straightforward. Adding two numbers means trying to add two numbers. Writing to an object means attempting to write the bit pattern associated with the object into the storage associated with it, etc.

Some compiler writers insist that there is no reason they should be expected to support any behaviors beyond what the Standard mandates. Because the Standard makes no attempt to require that all compilers be suitable for every purpose, a compiler that seeks to do nothing more than what the Standard requires will be unsuitable for any purpose for which that is not sufficient.

C was invented as language that can be used for low-level programming. What features and guarantees are available in a low-level programming language should vary in significant measure upon those of the underlying platform.

Given "float *p; ... *(uint32_t)p += 1;", the pointer/lvalue formed in the compound assignment expression accesses p, but doesn't "alias" it, because between the creation and last actual or hoisted use thereof, there are no other actual or hoisted accesses to the storage. The only way that lvalue could be said to "alias" p would be if a compiler ignored evidence that it is derived from p and thus doesn't alias it.