Does the C++ standard allow for an uninitialized bool to crash a program?

325

I know that "undefined behaviour" in C++ can pretty much allow the compiler to do anything it wants. However, I had a crash that surprised me, as I would have assumed the code looked safe enough. In this case, the real problem happened only on a specific platform using a specific compiler, and only if optimization was enabled.

I tried several things in order to reproduce the problem and simplify it as much as possible. Here's an extract of a function called Serialize that takes a bool parameter and copies the string "true" or "false" to an existing destination buffer. Were this function in a code review, there would be no way to tell that it could, in fact, crash if the bool parameter was an uninitialized value.

// Zero-filled global buffer of 16 characters
char destBuffer[16];

void Serialize(bool boolValue) {
    // Determine which string to print based on boolValue
    const char* whichString = boolValue ? "true" : "false";

    // Compute the length of the string we selected
    const size_t len = strlen(whichString);

    // Copy string into destination buffer, which is zero-filled (thus already null-terminated)
    memcpy(destBuffer, whichString, len);
}

If this code is compiled with clang 5.0.0 with optimizations enabled, it can crash.

The ternary expression boolValue ? "true" : "false" looked safe enough to me; I was assuming, "Whatever garbage value is in boolValue doesn't matter, since it will evaluate to true or false anyhow."

I have set up a Compiler Explorer example that shows the problem in the disassembly; here is the complete example. Note: the combination I found that reproduces the issue is Clang 5.0.0 with -O2 optimisation.

#include <iostream>
#include <cstring>

// Simple struct, with an empty constructor that doesn't initialize anything
struct FStruct {
    bool uninitializedBool;

    __attribute__ ((noinline))  // Note: the constructor must be declared noinline to trigger the problem
    FStruct() {};
};

char destBuffer[16];

// Small utility function that copies the string "true" or "false" into destBuffer, depending on the value of the parameter
void Serialize(bool boolValue) {
    // Determine which string to print depending if 'boolValue' is evaluated as true or false
    const char* whichString = boolValue ? "true" : "false";

    // Compute the length of the string we selected
    size_t len = strlen(whichString);

    memcpy(destBuffer, whichString, len);
}

int main()
{
    // Locally construct an instance of our struct here on the stack. The bool member uninitializedBool is uninitialized.
    FStruct structInstance;

    // Copy "true" or "false" into destBuffer
    Serialize(structInstance.uninitializedBool);
    return 0;
}

The problem arises because of the optimizer: it was clever enough to deduce that the strings "true" and "false" differ in length by only 1. So instead of really calculating the length, it uses the value of the bool itself, which should technically be either 0 or 1, and goes like this:

const size_t len = strlen(whichString); // original code
const size_t len = 5 - boolValue;       // clang clever optimization

While this is "clever", so to speak, my question is: Does the C++ standard allow a compiler to assume a bool can only have an internal numerical representation of '0' or '1' and use it in such a way?

Or is this a case of implementation-defined behaviour, in which case the implementation assumed that all its bools will only ever contain 0 or 1, and any other value is undefined behaviour territory?

edited 28 mins ago by Ian Kemp
asked Jan 10 at 1:39 by Remz

138

It's a great question. It's a solid illustration of how undefined behavior isn't just a theoretical concern. When people say anything can happen as a result of UB, that "anything" can really be quite surprising. One might assume that undefined behavior still manifests in predictable ways, but these days with modern optimizers that's not at all true. OP took the time to create a MCVE, investigated the problem thoroughly, inspected the disassembly, and asked a clear, straightforward question about it. Couldn't ask for more.

– John Kugelman
Jan 10 at 2:04

3

Observe that the requirement that “non-zero evaluates to true” is a rule about Boolean operations including “assignment to a bool” (which might implicitly invoke a static_cast<bool>() depending on specifics). It is however not a requirement about the internal representation of a bool chosen by the compiler.

– Euro Micelli
Jan 10 at 3:48

1

Comments are not for extended discussion; this conversation has been moved to chat.

– Samuel Liew♦
2 days ago

On a very related note, this is a "fun" source of binary incompatibility. If you have an ABI A that zero-pads values before calling a function, but compiles functions such that it assumes parameters are zero-padded, and an ABI B that's the opposite (doesn't zero-pad, but doesn't assume zero-padded parameters), it'll mostly work, but a function using the B ABI will cause issues if it calls a function using the A ABI that takes a 'small' parameter. IIRC you have this on x86 with clang and ICC.

– TLW
yesterday

@TLW: Although the Standard does not require that implementations provide any means of calling or being called by outside code, it would have been helpful to have a means of specifying such things for implementations where they are relevant (implementations where such details aren't relevant could ignore such attributes).

– supercat
yesterday

c++ llvm undefined-behavior abi

5 Answers

196

Yes, ISO C++ allows (but doesn't require) implementations to make this choice.

But also note that ISO C++ allows a compiler to emit code that crashes on purpose (e.g. with an illegal instruction) if the program encounters UB, e.g. as a way to help you find errors. (Or because it's a DeathStation 9000. Being strictly conforming is not sufficient for a C++ implementation to be useful for any real purpose). So ISO C++ would allow a compiler to make asm that crashed (for totally different reasons) even on similar code that read an uninitialized uint32_t. Even though that's required to be a fixed-layout type with no trap representations.

It's an interesting question about how real implementations work, but remember that even if the answer was different, your code would still be unsafe because modern C++ is not a portable version of assembly language.

You're compiling for the x86-64 System V ABI, which specifies that a bool as a function arg in a register is represented by the bit-patterns false=0 and true=1 in the low 8 bits of the register¹. In memory, bool is a 1-byte type that again must have an integer value of 0 or 1.

(An ABI is a set of implementation choices that compilers for the same platform agree on so they can make code that calls each other's functions, including type sizes, struct layout rules, and calling conventions.)

ISO C++ doesn't specify it, but this ABI decision is widespread because it makes bool->int conversion cheap (just zero-extension). I'm not aware of any ABIs that don't let the compiler assume 0 or 1 for bool, for any architecture (not just x86). It allows optimizations like implementing !mybool with xor eax,1 to flip the low bit (see "Any possible code that can flip a bit/integer/bool between 0 and 1 in single CPU instruction"), or compiling a&&b to a bitwise AND for bool types. Some compilers do actually take advantage of this (see "Boolean values as 8 bit in compilers. Are operations on them inefficient?").
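To make those identities concrete, here is a minimal sketch in portable C++. The helper names are hypothetical and not from the question; every bool involved is properly initialized, so nothing here relies on UB. It only shows why a guaranteed 0-or-1 representation lets the compiler swap logical operations for single integer instructions.

```cpp
// Illustrative helpers (hypothetical names). Each one spells out, in
// well-defined C++, an identity that holds because a valid bool converts
// to exactly 0 or 1.

// bool -> int conversion: always yields 0 or 1.
int as_int(bool b) { return static_cast<int>(b); }

// Logical NOT written as flipping the low bit, the source-level
// counterpart of `xor eax, 1`.
bool not_via_xor(bool b) { return static_cast<bool>(as_int(b) ^ 1); }

// Logical AND of two already-evaluated bools written as a bitwise AND.
bool and_via_bitand(bool a, bool b) {
    return static_cast<bool>(as_int(a) & as_int(b));
}
```

If the byte behind a bool held, say, 2, the xor trick would produce 3 rather than a flipped truth value, which is exactly why the compiler's assumption matters.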

In general, the as-if rule allows the compiler to take advantage of things that are true on the target platform being compiled for, because the end result will be executable code that implements the same externally-visible behaviour as the C++ source. (With all the restrictions that Undefined Behaviour places on what is actually "externally visible": not with a debugger, but from another thread in a well-formed / legal C++ program.)

The compiler is definitely allowed to take full advantage of an ABI guarantee in its code-gen, and make code like you found which optimizes strlen(whichString) to 5U - boolValue. (BTW, this optimization is kind of clever, but maybe shortsighted vs. branching and inlining memcpy as stores of immediate data².)

Or the compiler could have created a table of pointers and indexed it with the integer value of the bool, again assuming it was a 0 or 1. (This possibility is what @Barmar's answer suggested.)
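As a rough source-level picture of that table-based alternative, the sketch below indexes a two-entry array with the bool. The function name is hypothetical. For a valid bool the index is 0 or 1; a compiler that did the same thing with a garbage byte would read far outside the table.

```cpp
// Hypothetical helper: pick the string by indexing a two-entry table.
const char* bool_name(bool b) {
    static const char* const names[2] = {"false", "true"};
    return names[b];  // a valid bool converts to index 0 or 1
}
```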

Your __attribute__((noinline)) constructor with optimization enabled led to clang just loading a byte from the stack to use as uninitializedBool. It made space for the object in main with push rax (which is smaller and for various reasons about as efficient as sub rsp, 8), so whatever garbage was in AL on entry to main is the value it used for uninitializedBool. This is why you actually got values that weren't just 0.

5U - random garbage can easily wrap to a large unsigned value, leading memcpy to go into unmapped memory. The destination is in static storage, not the stack, so you're not overwriting a return address or something.
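The wrap-around is easy to check with a small model. This is only a sketch: `modeled_length` is a hypothetical name, and `raw` stands in for whatever byte the callee finds where it expects a bool. Taking it as an unsigned char keeps the example itself well-defined.

```cpp
#include <cstddef>

// Hypothetical model of the optimized length computation. `raw` plays the
// role of the byte read in place of the bool; as an unsigned char it can
// hold any value 0..255 without UB in this example.
std::size_t modeled_length(unsigned char raw) {
    // Unsigned arithmetic wraps modulo 2^N, so any raw > 5 yields a huge
    // length instead of a negative one.
    return std::size_t{5} - raw;
}
```

With raw equal to 0 or 1 the result is 5 or 4, the lengths of "false" and "true". Values 2 to 5 give a short but harmless length, and anything from 6 upward wraps to a value close to the maximum size_t, far larger than the 16-byte destination.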

Other implementations could make different choices, e.g. false=0 and true=any non-zero value. Then clang probably wouldn't make code that crashes for this specific instance of UB. (But it would still be allowed to if it wanted to.) I don't know of any implementations that choose anything other than what x86-64 does for bool, but the C++ standard allows many things that nobody does or even would want to do on hardware that's anything like current CPUs.

ISO C++ leaves it unspecified what you'll find when you examine or modify the object representation of a bool. (e.g. by memcpying the bool into unsigned char, which you're allowed to do because char* can alias anything. And unsigned char is guaranteed to have no padding bits, so the C++ standard does formally let you hexdump object representations without any UB. Pointer-casting to copy the object representation is different from assigning char foo = my_bool, of course, so booleanization to 0 or 1 wouldn't happen and you'd get the raw object representation.)
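A minimal sketch of that kind of inspection, applied to an initialized bool, might look like this. The helper name is hypothetical, and the static_assert records the assumption that bool occupies one byte, which ISO C++ does not guarantee.

```cpp
#include <cstring>

// Assumption of this sketch: bool is one byte, as on mainstream ABIs.
static_assert(sizeof(bool) == 1, "this sketch assumes a 1-byte bool");

// Hypothetical helper: copy the object representation of an initialized
// bool into an unsigned char. No value conversion takes place, so the
// result is the stored byte, not a booleanized 0 or 1.
unsigned char raw_byte_of(const bool& b) {
    unsigned char raw;
    std::memcpy(&raw, &b, sizeof raw);
    return raw;
}
```

On an ABI like x86-64 System V one would expect 0 for false and 1 for true, but that comes from the ABI rather than from the standard.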

You've partially "hidden" the UB on this execution path from the compiler with noinline. Even if it doesn't inline, though, interprocedural optimizations could still make a version of the function that depends on the definition of another function. (First, clang is making an executable, not a Unix shared library where symbol-interposition can happen. Second, the definition is inside the class{} definition so all translation units must have the same definition. Like with the inline keyword.)

So a compiler could emit just a ret or ud2 (illegal instruction) as the definition for main, because the path of execution starting at the top of main unavoidably encounters Undefined Behaviour. (Which the compiler can see at compile time if it decided to follow the path through the non-inline constructor.)

Any program that encounters UB is totally undefined for its entire existence. But UB inside a function or if() branch that never actually runs doesn't corrupt the rest of the program. In practice that means that compilers can decide to emit an illegal instruction, or a ret, or not emit anything and fall into the next block / function, for the whole basic block that can be proven at compile time to contain or lead to UB.

GCC and Clang in practice do actually sometimes emit ud2 on UB, instead of even trying to generate code for paths of execution that make no sense. Or for cases like falling off the end of a non-void function, gcc will sometimes omit a ret instruction. If you were thinking that "my function will just return with whatever garbage is in RAX", you are sorely mistaken. Modern C++ compilers don't treat the language like a portable assembly language any more. Your program really has to be valid C++, without making assumptions about how a stand-alone non-inlined version of your function might look in asm.

Another fun example is Why does unaligned access to mmap'ed memory sometimes segfault on AMD64?. x86 doesn't fault on unaligned integers, right? So why would a misaligned uint16_t* be a problem? Because alignof(uint16_t) == 2, and violating that assumption led to a segfault when auto-vectorizing with SSE2.
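One common way to sidestep that alignment assumption is to go through memcpy instead of dereferencing a possibly misaligned uint16_t*. The sketch below uses hypothetical helper names; compilers typically turn a fixed-size memcpy like this into a plain load or store on targets where that is safe.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers: read and write a uint16_t at an address that may
// not be 2-byte aligned. memcpy makes no alignment promise to the
// compiler, unlike dereferencing a std::uint16_t*.
std::uint16_t load_u16(const void* p) {
    std::uint16_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

void store_u16(void* p, std::uint16_t v) {
    std::memcpy(p, &v, sizeof v);
}
```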

See also What Every C Programmer Should Know About Undefined Behavior #1/3, an article by a clang developer.

Key point: if the compiler noticed the UB at compile time, it could "break" (emit surprising asm) the path through your code that causes UB even if targeting an ABI where any bit-pattern is a valid object representation for `bool`.

Expect total hostility toward many mistakes by the programmer, especially things modern compilers warn about. This is why you should use -Wall and fix warnings. C++ is not a user-friendly language, and something in C++ can be unsafe even if it would be safe in asm on the target you're compiling for. (e.g. signed overflow is UB in C++ and compilers will assume it doesn't happen, even when compiling for 2's complement x86, unless you use clang/gcc -fwrapv.)
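For the signed-overflow case, a check that stays within defined behaviour has to happen before the addition rather than after it. This is a minimal sketch with a hypothetical function name, not something from the question.

```cpp
#include <climits>

// Hypothetical helper: reports whether a + b would overflow int.
// All arithmetic here stays in range, so the check itself has no UB.
bool add_would_overflow(int a, int b) {
    if (b > 0) return a > INT_MAX - b;  // INT_MAX - b cannot overflow for b > 0
    if (b < 0) return a < INT_MIN - b;  // INT_MIN - b cannot overflow for b < 0
    return false;                        // adding zero never overflows
}
```

A post-hoc test such as a + b < a (for positive b) is the pattern compilers are entitled to fold away, because they assume the overflow never happened.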

Compile-time-visible UB is always dangerous, and it's really hard to be sure (with link-time optimization) that you've really hidden UB from the compiler and can thus reason about what kind of asm it will generate.

Not to be over-dramatic; often compilers do let you get away with some things and emit code like you're expecting even when something is UB. But maybe it will be a problem in the future if compiler devs implement some optimization that gains more info about value-ranges (e.g. that a variable is non-negative, maybe allowing it to optimize sign-extension to free zero-extension on x86-64). For example, in current gcc and clang, doing tmp = a+INT_MIN doesn't let them optimize a<0 as always-false, only that tmp is always negative. (So they don't backtrack from the inputs of a calculation to derive range info, only on the results based on the assumption of no signed overflow: example on Godbolt. I don't know if this is intentional user-friendliness or simply a missed optimization.)

Also note that implementations (aka compilers) are allowed to define behaviour that ISO C++ leaves undefined. For example, all compilers that support Intel's intrinsics (like _mm_add_ps(__m128, __m128) for manual SIMD vectorization) must allow forming mis-aligned pointers, which is UB in C++ even if you don't dereference them. __m128i _mm_loadu_si128(const __m128i *) does unaligned loads by taking a misaligned __m128i* arg, not a void* or char*. See also "Is `reinterpret_cast`ing between hardware vector pointer and the corresponding type an undefined behavior?"

GNU C/C++ also defines the behaviour of left-shifting a negative signed number (even without -fwrapv), separately from the normal signed-overflow UB rules. (This is UB in ISO C++, while right shifts of signed numbers are implementation-defined (logical vs. arithmetic); good quality implementations choose arithmetic on HW that has arithmetic right shifts, but ISO C++ doesn't specify). This is documented in the GCC manual's Integer section, along with defining implementation-defined behaviour that C standards require implementations to define one way or another.

There are definitely quality-of-implementation issues that compiler developers care about; they generally aren't trying to make compilers that are intentionally hostile, but taking advantage of all the UB potholes in C++ (except ones they choose to define) to optimize better can be nearly indistinguishable at times.

Footnote 1: The upper 56 bits can be garbage which the callee must ignore, as usual for types narrower than a register.

(Other ABIs do make different choices here. Some do require narrow integer types to be zero- or sign-extended to fill a register when passed to or returned from functions, like MIPS64 and PowerPC64. See the last section of this x86-64 answer which compares vs. those earlier ISAs.)

For example, a caller might have calculated a & 0x01010101 in RDI and used it for something else, before calling bool_func(a&1). The caller could optimize away the &1 because it already did that to the low byte as part of and edi, 0x01010101, and it knows the callee is required to ignore the high bytes.

Or if a bool is passed as the 3rd arg, maybe a caller optimizing for code-size loads it with mov dl, [mem] instead of movzx edx, [mem], saving 1 byte at the cost of a false dependency on the old value of RDX (or other partial-register effect, depending on CPU model). Or for the first arg, mov dil, byte [r10] instead of movzx edi, byte [r10], because both require a REX prefix anyway.

This is why clang emits movzx eax, dil in Serialize, instead of sub eax, edi. (For integer args, clang violates this ABI rule, instead depending on the undocumented behaviour of gcc and clang to zero- or sign-extend narrow integers to 32 bits; see "Is a sign or zero extension required when adding a 32bit offset to a pointer for the x86-64 ABI?". So I was interested to see that it doesn't do the same thing for bool.)

Footnote 2: After branching, you'd just have a 4-byte mov-immediate, or a 4-byte + 1-byte store. The length is implicit in the store widths + offsets.

OTOH, glibc memcpy will do two 4-byte loads/stores with an overlap that depends on length, so this really does end up making the whole thing free of conditional branches on the boolean. See the L(between_4_7): block in glibc's memcpy/memmove. Or at least, go the same way for either boolean in memcpy's branching to select a chunk size.

If inlining, you could use 2x mov-immediate + cmov and a conditional offset, or you could leave the string data in memory.

Or if tuning for Intel Ice Lake (with the Fast Short REP MOV feature), an actual rep movsb might be optimal. glibc memcpy might start using rep movsb for small sizes on CPUs with that feature, saving a lot of branching.

Tools for detecting UB and usage of uninitialized values

In gcc and clang, you can compile with -fsanitize=undefined to add run-time instrumentation that will warn or error out on UB that happens at runtime. That won't catch uninitialized variables, though, because it doesn't increase type sizes to make room for an "uninitialized" bit.

See https://developers.redhat.com/blog/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/

To find usage of uninitialized data, there's Address Sanitizer and Memory Sanitizer in clang/LLVM. https://github.com/google/sanitizers/wiki/MemorySanitizer shows examples of clang -fsanitize=memory -fPIE -pie detecting uninitialized memory reads. It might work best if you compile without optimization, so all reads of variables end up actually loading from memory in the asm. They show it being used at -O2 in a case where the load wouldn't optimize away. I haven't tried it myself. (In some cases, e.g. not initializing an accumulator before summing an array, clang -O3 will emit code that sums into a vector register that it never initialized. So with optimization, you can have a case where there's no memory read associated with the UB. But -fsanitize=memory changes the generated asm, and might result in a check for this.)

It will tolerate copying of uninitialized memory, and also simple logic and arithmetic operations with it. In general, MemorySanitizer silently tracks the spread of uninitialized data in memory, and reports a warning when a code branch is taken (or not taken) depending on an uninitialized value.

MemorySanitizer implements a subset of functionality found in Valgrind (Memcheck tool).

It should work for this case because the call to glibc memcpy with a length calculated from uninitialized memory will (inside the library) result in a branch based on length. If it had inlined a fully branchless version that just used cmov, indexing, and two stores, it might not have worked.

Valgrind's memcheck will also look for this kind of problem, again not complaining if the program simply copies around uninitialized data. But it says it will detect when a "Conditional jump or move depends on uninitialised value(s)", to try to catch any externally-visible behaviour that depends on uninitialized data.

Perhaps the idea behind not flagging just a load is that structs can have padding, and copying the whole struct (including padding) with a wide vector load/store is not an error even if the individual members were only written one at a time. At the asm level, the information about what was padding and what is actually part of the value has been lost.

edited 17 hours ago

answered Jan 10 at 9:42

Peter Cordes

1

I've seen a worse case where the variable took a value not in range of an 8 bit integer, but only of the entire CPU register. And Itanium has a worse one yet, use of an uninitialized variable can crash outright.

– Joshua
Jan 11 at 3:27

3

xkcd.com/499 is pretty good explanation of what UB is.

– val
Jan 11 at 4:30

5

Moreover, this also illustrates why the UB featurebug was introduced in the design of the languages C and C++ in the first place: because it gives the compiler exactly this kind of freedom, which has now permitted the most modern compilers to perform these high-quality optimizations that make C/C++ such high-performance mid-level languages.

– The_Sympathizer
Jan 11 at 7:04

1

And so the war between C++ compiler writers and C++ programmers trying to write useful programs continues. This answer, totally comprehensive in answering this question, could also be used as is as convincing ad copy for vendors of static analysis tools ...

– davidbak
2 days ago

3

@The_Sympathizer: UB was included to allow implementations to behave in whatever ways would be most useful to their customers. It was not intended to suggest that all behaviors should be considered equally useful.

– supercat
yesterday

The compiler is allowed to assume that a boolean value passed as an argument is a valid boolean value (i.e. one which has been initialised or converted to true or false). The true value doesn't have to be the same as the integer 1 -- indeed, there can be various representations of true and false -- but the parameter must be some valid representation of one of those two values, where "valid representation" is implementation-defined.

So if you fail to initialise a bool, or if you succeed in overwriting it through some pointer of a different type, then the compiler's assumptions will be wrong and Undefined Behaviour will ensue. You have been warned:

50) Using a bool value in ways described by this International Standard as “undefined”, such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false. (Footnote to para 6 of §6.9.1, Fundamental Types)
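As a small illustration of what "valid representation" means in practice, the sketch below reads the first byte of a bool's object representation by copying it out with memcpy, which is well-defined for an initialized bool. The helper name bool_byte is invented for this example, and the values it returns are implementation-defined; on the mainstream x86-64 and AArch64 ABIs they are 0 for false and 1 for true.

```cpp
#include <cstring>

// Hypothetical helper: return the first byte of the object representation
// of an initialized bool. Copying bytes out with memcpy is allowed; the
// value you see is whatever the implementation chose to represent b with.
inline unsigned char bool_byte(const bool& b) {
    unsigned char byte;
    std::memcpy(&byte, &b, 1);  // sizeof(bool) is at least 1
    return byte;
}
```

Going the other way (writing an arbitrary byte such as 2 into a bool and then using it as a bool) is exactly the situation the footnote warns about.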

edited Jan 10 at 2:32

answered Jan 10 at 1:59

rici

10

The "true value doesn't have to be the same as the integer 1" is kind of misleading. Sure, the actual bit pattern could be something else, but when implicitly converted/promoted (the only way you'd see a value other than true/false), true is always 1, and false is always 0. Of course, such a compiler would also be unable to use the trick this compiler was trying to use (using the fact that bools actual bit pattern could only be 0 or 1), so it's kind of irrelevant to the OP's problem.

– ShadowRanger
Jan 10 at 2:08

3

@ShadowRanger You can always inspect the object representation directly.

– T.C.
Jan 10 at 2:12

6

@shadowranger: my point is that the implementation is in charge. If it limits valid representations of true to the bit pattern 1, that's its prerogative. If it chooses some other set of representations, then it indeed could not use the optimisation noted here. If it does choose that particular representation, then it can. It only needs to be internally consistent. You can examine the representation of a bool by copying it into a byte array; that is not UB (but it is implementation-defined)

– rici
Jan 10 at 2:28

2

Yes, optimizing compilers (i.e. real-world C++ implementations) will sometimes emit code that depends on a bool having a bit-pattern of 0 or 1. They don't re-booleanize a bool every time they read it from memory (or a register holding a function arg). That's what this answer is saying. Examples: gcc4.7+ can optimize return a||b to or eax, edi in a function returning bool, or MSVC can optimize a&b to test cl, dl. x86's test is a bitwise and, so if cl=1 and dl=2 test sets flags according to cl&dl = 0.

– Peter Cordes
Jan 10 at 8:21

3

The point about undefined behavior is that the compiler is allowed to draw far more conclusions about it, e.g. to assume that a code path which would lead to accessing an uninitialized value is never taken at all, as ensuring that is precisely the responsibility of the programmer. So it’s not just about the possibility that the low level values could be different than zero or one.

– Holger
Jan 10 at 10:47

The function itself is correct, but in your test program, the statement that calls the function causes undefined behaviour by using the value of an uninitialized variable.

The bug is in the calling function, and it could be detected by code review or static analysis of the calling function. Using your compiler explorer link, the gcc 8.2 compiler does detect the bug. (Maybe you could file a bug report against clang that it doesn't find the problem).

Undefined behaviour means anything can happen, which includes the program crashing a few lines after the event that triggered the undefined behaviour.

NB. The answer to "Can undefined behaviour cause _____ ?" is always "Yes". That's literally the definition of undefined behaviour.
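Since the bug is on the caller's side, the fix belongs there too. A minimal sketch, assuming a struct shaped like the one in the question: the names FStructFixed, initializedBool and bool_name are made up for illustration. Giving the member a default member initializer means an empty constructor no longer leaves it indeterminate.

```cpp
#include <cstring>

// Hypothetical variant of the question's struct. The default member
// initializer applies even though the constructor body is empty.
struct FStructFixed {
    bool initializedBool = false;
    FStructFixed() {}
};

// Same ternary as in the question; well-defined once the bool holds a
// real value.
inline const char* bool_name(bool b) {
    return b ? "true" : "false";
}
```

With this change every read of the member sees a real true or false, so whatever assumption the optimizer makes about the bit pattern holds.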

answered Jan 10 at 2:12

M.M

1

Is the first clause true? Does merely copying an uninitialized bool trigger UB?

– Joshua Green
Jan 10 at 3:25

9

@JoshuaGreen see [dcl.init]/12 "If an indeterminate value is produced by an evaluation, the behaviour is undefined except in the following cases:" (and none of those cases have an exception for bool). Copying requires evaluating the source

– M.M
Jan 10 at 3:34

7

@JoshuaGreen And the reason for that is that you might have a platform that triggers a hardware fault if you access some invalid values for some types. These are sometimes called "trap representations".

– David Schwartz
Jan 10 at 11:15

3

Itanium, while obscure, is a CPU that's still in production, has trap values, and has two at least semi-modern C++ compilers (Intel/HP). It literally has true, false and not-a-thing values for booleans.

– MSalters
Jan 10 at 20:03

3

On the flip side, the answer to "Does the standard require all compilers to process something a certain way" is generally "no", even/especially in cases where it's obvious that any quality compiler should do so; the more obvious something is, the less need there should be for the authors of the Standard to actually say it.

– supercat
Jan 10 at 21:23

A bool is only allowed to hold the values 0 or 1, and the generated code can assume that it will only hold one of these two values. The code generated for the ternary in the assignment could use the value as the index into an array of pointers to the two strings, i.e. it might be converted to something like:

     // the compiler could make asm that "looks" like this, from your source

static const char *strings[] = {"false", "true"};

const char *whichString = strings[boolValue];

If boolValue is uninitialized, it could actually hold any integer value, which would then cause accessing outside the bounds of the strings array.
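For comparison, here is the same table lookup written out as ordinary source, as a sketch only (the function name pick_string is hypothetical). With a valid bool the index is always 0 or 1, so this is well-defined; the point is that a compiler may lower the ternary to something like this without adding any range check of its own.

```cpp
#include <cstring>

// Hypothetical source-level equivalent of the table lookup. A valid bool
// converts to exactly 0 or 1 when used as an index, so no bounds check is
// needed, and the compiler is entitled to omit one.
inline const char* pick_string(bool b) {
    static const char* const strings[] = {"false", "true"};
    return strings[b];
}
```

If the byte backing b held, say, 7, the generated code could read strings[7], which is past the end of the table.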

edited Jan 10 at 9:45

Peter Cordes

answered Jan 10 at 2:02

Barmar

1

@SidS Thanks. Theoretically, the internal representations could be the opposite of how they cast to/from integers, but that would be perverse.

– Barmar
Jan 10 at 2:09

1

You are right, and your example will also crash. However it is "visible" to a code review that you are using an uninitialized variable as an index to an array. Also, it would crash even in debug (for example some debugger/compiler will initialize with specific patterns to make it easier to see when it crashes). In my example, the surprising part is that the usage of the bool is invisible: The optimizer decided to use it in a calculation not present in the source code.

– Remz
Jan 10 at 2:25

2

@Remz I'm just using the array to show what the generated code could be equivalent to, not suggesting that anyone would actually write that.

– Barmar
Jan 10 at 2:28

1

@Remz Recast the bool to int with *(int *)&boolValue and print it for debugging purposes, see if it is anything other than 0 or 1 when it crashes. If that's the case, it pretty much confirms the theory that the compiler is optimizing the inline-if as an array which explains why it is crashing.

– Havenard
Jan 10 at 2:57

1

@Havenard, int is likely to be bigger than bool so that wouldn't prove anything.

– Sid S
Jan 10 at 4:11

Summarising your question a lot, you are asking: does the C++ standard allow a compiler to assume a bool can only have an internal numerical representation of '0' or '1' and use it in such a way?

The standard says nothing about the internal representation of a bool. It only defines what happens when casting a bool to an int (or vice versa). Mostly, because of these integral conversions (and the fact that people rely rather heavily on them), the compiler will use 0 and 1, but it doesn't have to (although it has to respect the constraints of any lower level ABI it uses).

So the compiler, when it sees a bool, is entitled to assume that the bool contains either the 'true' or the 'false' bit pattern, and to do anything it likes on that basis. So if the values for true and false are 1 and 0, respectively, the compiler is indeed allowed to optimise strlen to 5 - <boolean value>. Other fun behaviours are possible!
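A quick sketch of why that strlen rewrite is legitimate for valid bools (the function names here are invented for illustration): both functions return the same length whenever b is a real true or false, so the compiler may substitute the arithmetic form for the library call.

```cpp
#include <cstddef>
#include <cstring>

// What the source says: pick a string, then measure it.
inline std::size_t len_via_strlen(bool b) {
    return std::strlen(b ? "true" : "false");
}

// What an optimizer may compute instead, assuming b is 0 or 1:
// "false" has 5 characters, "true" has 4.
inline std::size_t len_via_arithmetic(bool b) {
    return 5u - static_cast<std::size_t>(b);
}
```

The two only diverge when the bool's storage holds something other than 0 or 1, in which case the arithmetic form can yield a huge length, and that is the crash from the question.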

As is repeatedly stated here, undefined behaviour has undefined results, including but not limited to:

Your code working as you expected it to

Your code failing at random times

Your code not being run at all.

See What every programmer should know about undefined behavior

answered Jan 10 at 11:48

Tom Tanner

196

Yes, ISO C++ allows (but doesn't require) implementations to make this choice.

Or the compiler could have created a table of pointers and indexed it with the integer value of the bool, again assuming it was a 0 or 1. (This possibility is what @Barmar's answer suggested.)

See also What Every C Programmer Should Know About Undefined Behavior #1/3, an article by a clang developer.

Key point: if the compiler noticed the UB at compile time, it could "break" (emit surprising asm) the path through your code that causes UB even if targeting an ABI where any bit-pattern is a valid object representation for `bool`.

Footnote 1: The upper 56 bits can be garbage which the callee must ignore, as usual for types narrower than a register.

Footnote 2: After branching, you'd just have a 4-byte mov-immediate, or a 4-byte + 1-byte store. The length is implicit in the store widths + offsets.

If inlining, you could use 2x mov-immediate + cmov and a conditional offset, or you could leave the string data in memory.

Tools for detecting UB and usage of uninitialized values

See https://developers.redhat.com/blog/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/

It will tolerate copying of uninitialized memory, and also simple logic and arithmetic operations with it. In general, MemorySanitizer silently tracks the spread of uninitialized data in memory, and reports a warning when a code branch is taken (or not taken) depending on an uninitialized value.

MemorySanitizer implements a subset of functionality found in Valgrind (Memcheck tool).

edited 17 hours ago

answered Jan 10 at 9:42

Peter Cordes

121k17184312

1

I've seen a worse case where the variable took a value not in range of an 8 bit integer, but only of the entire CPU register. And Itanium has a worse one yet, use of an uninitialized variable can crash outright.

– Joshua
Jan 11 at 3:27

3

xkcd.com/499 is pretty good explanation of what UB is.

– val
Jan 11 at 4:30

5

Moreover, this also illustrates why the UB featurebug was introduced in the design of the languages C and C++ in the first place: because it gives the compiler exactly this kind of freedom, which has now permitted the most modern compilers to perform these high-quality optimizations that make C/C++ such high-performance mid-level languages.

– The_Sympathizer
Jan 11 at 7:04

1

And so the war between C++ compiler writers and C++ programmers trying to write useful programs continues. This answer, totally comprehensive in answering this question, could also be used as is as convincing ad copy for vendors of static analysis tools ...

– davidbak
2 days ago

3

@The_Sympathizer: UB was included to allow implementations to behave in whatever ways would be most useful to their customers. It was not intended to suggest that all behaviors should be considered equally useful.

– supercat
yesterday

|
show 5 more comments

196

Yes, ISO C++ allows (but doesn't require) implementations to make this choice.

Or the compiler could have created a table of pointers and indexed it with the integer value of the bool, again assuming it was a 0 or 1. (This possibility is what @Barmar's answer suggested.)

See also What Every C Programmer Should Know About Undefined Behavior #1/3, an article by a clang developer.

Key point: if the compiler noticed the UB at compile time, it could "break" (emit surprising asm) the path through your code that causes UB even if targeting an ABI where any bit-pattern is a valid object representation for `bool`.

Footnote 1: The upper 56 bits can be garbage which the callee must ignore, as usual for types narrower than a register.

Footnote 2: After branching, you'd just have a 4-byte mov-immediate, or a 4-byte + 1-byte store. The length is implicit in the store widths + offsets.

If inlining, you could use 2x mov-immediate + cmov and a conditional offset, or you could leave the string data in memory.

Tools for detecting UB and usage of uninitialized values

See https://developers.redhat.com/blog/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/

It will tolerate copying of uninitialized memory, and also simple logic and arithmetic operations with it. In general, MemorySanitizer silently tracks the spread of uninitialized data in memory, and reports a warning when a code branch is taken (or not taken) depending on an uninitialized value.

MemorySanitizer implements a subset of functionality found in Valgrind (Memcheck tool).

edited 17 hours ago

answered Jan 10 at 9:42

Peter Cordes

121k17184312

1

I've seen a worse case where the variable took a value not in range of an 8 bit integer, but only of the entire CPU register. And Itanium has a worse one yet, use of an uninitialized variable can crash outright.

– Joshua
Jan 11 at 3:27

3

xkcd.com/499 is pretty good explanation of what UB is.

– val
Jan 11 at 4:30

5

Moreover, this also illustrates why the UB featurebug was introduced in the design of the languages C and C++ in the first place: because it gives the compiler exactly this kind of freedom, which has now permitted the most modern compilers to perform these high-quality optimizations that make C/C++ such high-performance mid-level languages.

– The_Sympathizer
Jan 11 at 7:04

1

And so the war between C++ compiler writers and C++ programmers trying to write useful programs continues. This answer, totally comprehensive in answering this question, could also be used as is as convincing ad copy for vendors of static analysis tools ...

– davidbak
2 days ago

3

@The_Sympathizer: UB was included to allow implementations to behave in whatever ways would be most useful to their customers. It was not intended to suggest that all behaviors should be considered equally useful.

– supercat
yesterday

|
show 5 more comments

196

Yes, ISO C++ allows (but doesn't require) implementations to make this choice.

Or the compiler could have created a table of pointers and indexed it with the integer value of the bool, again assuming it was a 0 or 1. (This possibility is what @Barmar's answer suggested.)

See also What Every C Programmer Should Know About Undefined Behavior #1/3, an article by a clang developer.

Key point: if the compiler noticed the UB at compile time, it could "break" (emit surprising asm) the path through your code that causes UB even if targeting an ABI where any bit-pattern is a valid object representation for `bool`.

Footnote 1: The upper 56 bits can be garbage which the callee must ignore, as usual for types narrower than a register.

Footnote 2: After branching, you'd just have a 4-byte mov-immediate, or a 4-byte + 1-byte store. The length is implicit in the store widths + offsets.

If inlining, you could use 2x mov-immediate + cmov and a conditional offset, or you could leave the string data in memory.

Tools for detecting UB and usage of uninitialized values

See https://developers.redhat.com/blog/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/

It will tolerate copying of uninitialized memory, and also simple logic and arithmetic operations with it. In general, MemorySanitizer silently tracks the spread of uninitialized data in memory, and reports a warning when a code branch is taken (or not taken) depending on an uninitialized value.

MemorySanitizer implements a subset of functionality found in Valgrind (Memcheck tool).

edited 17 hours ago

answered Jan 10 at 9:42

Peter Cordes

121k17184312

Yes, ISO C++ allows (but doesn't require) implementations to make this choice.

Or the compiler could have created a table of pointers and indexed it with the integer value of the bool, again assuming it was a 0 or 1. (This possibility is what @Barmar's answer suggested.)

See also What Every C Programmer Should Know About Undefined Behavior #1/3, an article by a clang developer.

Key point: if the compiler noticed the UB at compile time, it could "break" (emit surprising asm) the path through your code that causes UB even if targeting an ABI where any bit-pattern is a valid object representation for `bool`.

Footnote 1: The upper 56 bits can be garbage which the callee must ignore, as usual for types narrower than a register.

Footnote 2: After branching, you'd just have a 4-byte mov-immediate, or a 4-byte + 1-byte store. The length is implicit in the store widths + offsets.

If inlining, you could use 2x mov-immediate + cmov and a conditional offset, or you could leave the string data in memory.

Tools for detecting UB and usage of uninitialized values

See https://developers.redhat.com/blog/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/

It will tolerate copying of uninitialized memory, and also simple logic and arithmetic operations with it. In general, MemorySanitizer silently tracks the spread of uninitialized data in memory, and reports a warning when a code branch is taken (or not taken) depending on an uninitialized value.

MemorySanitizer implements a subset of functionality found in Valgrind (Memcheck tool).

edited 17 hours ago

answered Jan 10 at 9:42

Peter Cordes

121k17184312

edited 17 hours ago

answered Jan 10 at 9:42

Peter Cordes

121k17184312

answered Jan 10 at 9:42

Peter Cordes

121k17184312

answered Jan 10 at 9:42

Peter Cordes

121k17184312

1

I've seen a worse case where the variable took a value not in range of an 8 bit integer, but only of the entire CPU register. And Itanium has a worse one yet, use of an uninitialized variable can crash outright.

– Joshua
Jan 11 at 3:27

3

xkcd.com/499 is pretty good explanation of what UB is.

– val
Jan 11 at 4:30

5

Moreover, this also illustrates why the UB featurebug was introduced in the design of the languages C and C++ in the first place: because it gives the compiler exactly this kind of freedom, which has now permitted the most modern compilers to perform these high-quality optimizations that make C/C++ such high-performance mid-level languages.

– The_Sympathizer
Jan 11 at 7:04

1

And so the war between C++ compiler writers and C++ programmers trying to write useful programs continues. This answer, totally comprehensive in answering this question, could also be used as is as convincing ad copy for vendors of static analysis tools ...

– davidbak
2 days ago

3

@The_Sympathizer: UB was included to allow implementations to behave in whatever ways would be most useful to their customers. It was not intended to suggest that all behaviors should be considered equally useful.

– supercat
yesterday

|
show 5 more comments

1

I've seen a worse case where the variable took a value not in range of an 8 bit integer, but only of the entire CPU register. And Itanium has a worse one yet, use of an uninitialized variable can crash outright.

– Joshua
Jan 11 at 3:27

3

xkcd.com/499 is pretty good explanation of what UB is.

– val
Jan 11 at 4:30

5

Moreover, this also illustrates why the UB featurebug was introduced in the design of the languages C and C++ in the first place: because it gives the compiler exactly this kind of freedom, which has now permitted the most modern compilers to perform these high-quality optimizations that make C/C++ such high-performance mid-level languages.

– The_Sympathizer
Jan 11 at 7:04

1

And so the war between C++ compiler writers and C++ programmers trying to write useful programs continues. This answer, totally comprehensive in answering this question, could also be used as is as convincing ad copy for vendors of static analysis tools ...

– davidbak
2 days ago

3

@The_Sympathizer: UB was included to allow implementations to behave in whatever ways would be most useful to their customers. It was not intended to suggest that all behaviors should be considered equally useful.

– supercat
yesterday

I've seen a worse case where the variable took a value not in range of an 8 bit integer, but only of the entire CPU register. And Itanium has a worse one yet, use of an uninitialized variable can crash outright.

– Joshua
Jan 11 at 3:27

xkcd.com/499 is pretty good explanation of what UB is.

– val
Jan 11 at 4:30

Moreover, this also illustrates why the UB featurebug was introduced in the design of the languages C and C++ in the first place: because it gives the compiler exactly this kind of freedom, which has now permitted the most modern compilers to perform these high-quality optimizations that make C/C++ such high-performance mid-level languages.

– The_Sympathizer
Jan 11 at 7:04

And so the war between C++ compiler writers and C++ programmers trying to write useful programs continues. This answer, totally comprehensive in answering this question, could also be used as is as convincing ad copy for vendors of static analysis tools ...

– davidbak
2 days ago

@The_Sympathizer: UB was included to allow implementations to behave in whatever ways would be most useful to their customers. It was not intended to suggest that all behaviors should be considered equally useful.

– supercat
yesterday

|
show 5 more comments

50) Using a bool value in ways described by this International Standard as “undefined”, such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false. (Footnote to para 6 of §6.9.1, Fundamental Types)

edited Jan 10 at 2:32

answered Jan 10 at 1:59

rici

153k19134199

10

The "true value doesn't have to be the same as the integer 1" is kind of misleading. Sure, the actual bit pattern could be something else, but when implicitly converted/promoted (the only way you'd see a value other than true/false), true is always 1, and false is always 0. Of course, such a compiler would also be unable to use the trick this compiler was trying to use (using the fact that bools actual bit pattern could only be 0 or 1), so it's kind of irrelevant to the OP's problem.

– ShadowRanger
Jan 10 at 2:08

3

@ShadowRanger You can always inspect the object representation directly.

– T.C.
Jan 10 at 2:12

6

@shadowranger: my point is that the implementation is in charge. If it limits valid representations of true to the bit pattern 1, that's its prerogative. If it chooses some other set of representations, then it indeed could not use the optimisation noted here. If it does choose that particular representation, then it can. It only needs to be internally consistent. You can examine the representation of a bool by copying it into a byte array; that is not UB (but it is implementation-defined)

– rici
Jan 10 at 2:28

2

Yes, optimizing compilers (i.e. real-world C++ implementation) often will sometimes emit code that depends on a bool having a bit-pattern of 0 or 1. They don't re-booleanize a bool every time they read it from memory (or a register holding a function arg). That's what this answer is saying. examples: gcc4.7+ can optimize return a||b to or eax, edi in a function returning bool, or MSVC can optimize a&b to test cl, dl. x86's test is a bitwise and, so if cl=1 and dl=2 test sets flags according to cl&dl = 0.

– Peter Cordes
Jan 10 at 8:21

3

The point about undefined behavior is that the compiler is allowed to draw far more conclusions about it, e.g. to assume that a code path which would lead to accessing an uninitialized value is never taken at all, as ensuring that is precisely the responsibility of the programmer. So it’s not just about the possibility that the low level values could be different than zero or one.

– Holger
Jan 10 at 10:47

|
show 6 more comments

50) Using a bool value in ways described by this International Standard as “undefined”, such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false. (Footnote to para 6 of §6.9.1, Fundamental Types)

edited Jan 10 at 2:32

answered Jan 10 at 1:59

rici

153k19134199

10

The "true value doesn't have to be the same as the integer 1" is kind of misleading. Sure, the actual bit pattern could be something else, but when implicitly converted/promoted (the only way you'd see a value other than true/false), true is always 1, and false is always 0. Of course, such a compiler would also be unable to use the trick this compiler was trying to use (using the fact that bools actual bit pattern could only be 0 or 1), so it's kind of irrelevant to the OP's problem.

– ShadowRanger
Jan 10 at 2:08

3

@ShadowRanger You can always inspect the object representation directly.

– T.C.
Jan 10 at 2:12

6

@shadowranger: my point is that the implementation is in charge. If it limits valid representations of true to the bit pattern 1, that's its prerogative. If it chooses some other set of representations, then it indeed could not use the optimisation noted here. If it does choose that particular representation, then it can. It only needs to be internally consistent. You can examine the representation of a bool by copying it into a byte array; that is not UB (but it is implementation-defined)

– rici
Jan 10 at 2:28

2

Yes, optimizing compilers (i.e. real-world C++ implementation) often will sometimes emit code that depends on a bool having a bit-pattern of 0 or 1. They don't re-booleanize a bool every time they read it from memory (or a register holding a function arg). That's what this answer is saying. examples: gcc4.7+ can optimize return a||b to or eax, edi in a function returning bool, or MSVC can optimize a&b to test cl, dl. x86's test is a bitwise and, so if cl=1 and dl=2 test sets flags according to cl&dl = 0.

– Peter Cordes
Jan 10 at 8:21
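
The cl=1, dl=2 arithmetic from the comment above can be worked through in plain C++ by passing the raw register bytes explicitly. This is only a model of the instruction's effect under that assumption, not compiler output; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Models an AND implemented with "test cl, dl": the outcome is true only if
// the bitwise AND of the two raw bytes is non-zero.
bool and_via_bitwise_test(std::uint8_t cl, std::uint8_t dl) {
    return (cl & dl) != 0;
}

// Models the same operation if each byte were first re-booleanized
// (any non-zero byte counts as true).
bool and_after_rebooleanizing(std::uint8_t cl, std::uint8_t dl) {
    return (cl != 0) && (dl != 0);
}

// With valid representations (0 or 1) both models agree; with a stray 2 they differ.
void check_bitwise_and_model() {
    assert(and_via_bitwise_test(1, 1) == and_after_rebooleanizing(1, 1));
    assert(and_via_bitwise_test(1, 0) == and_after_rebooleanizing(1, 0));
    assert(and_via_bitwise_test(1, 2) != and_after_rebooleanizing(1, 2));
}
```

The two models only diverge once a byte holds something other than 0 or 1, which is exactly the situation a compiler is allowed to assume never occurs for a valid bool.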

3

The point about undefined behavior is that the compiler is allowed to draw far more conclusions about it, e.g. to assume that a code path which would lead to accessing an uninitialized value is never taken at all, as ensuring that is precisely the responsibility of the programmer. So it’s not just about the possibility that the low level values could be different than zero or one.

– Holger
Jan 10 at 10:47

The function itself is correct, but in your test program, the statement that calls the function causes undefined behaviour by using the value of an uninitialized variable.

Undefined behaviour means anything can happen, which includes the program crashing a few lines after the event that triggered the undefined behaviour.

NB. The answer to "Can undefined behaviour cause _____ ?" is always "Yes". That's literally the definition of undefined behaviour.

answered Jan 10 at 2:12

M.M

105k11116237
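
A minimal sketch of a call site that avoids the undefined behaviour described in this answer, assuming the struct from the question can simply be given a default member initializer. `FStructInitialized`, `SerializeDefault` and `CheckSerializeDefault` are hypothetical names introduced for illustration; `Serialize` and `destBuffer` follow the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Zero-filled global buffer, as in the question.
char destBuffer[16];

void Serialize(bool boolValue) {
    const char* whichString = boolValue ? "true" : "false";
    const std::size_t len = std::strlen(whichString);
    // destBuffer is zero-filled, so the copied text stays null-terminated.
    std::memcpy(destBuffer, whichString, len);
}

// Hypothetical variant of the question's struct: the default member
// initializer guarantees the bool holds a valid value before it is read.
struct FStructInitialized {
    bool initializedBool = false;
};

void SerializeDefault() {
    FStructInitialized s;          // initializedBool is false here
    Serialize(s.initializedBool);  // reads an initialized bool, so no UB
}

// Self-check: the buffer should now contain the text "false".
void CheckSerializeDefault() {
    SerializeDefault();
    assert(std::strcmp(destBuffer, "false") == 0);
}
```

The function body is unchanged from the question; only the caller differs, which matches the answer's point that the function itself is correct.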

1

Is the first clause true? Does merely copying an uninitialized bool trigger UB?

– Joshua Green
Jan 10 at 3:25

9

@JoshuaGreen See [dcl.init]/12: "If an indeterminate value is produced by an evaluation, the behaviour is undefined except in the following cases:" (and none of those cases makes an exception for bool). Copying requires evaluating the source.

– M.M
Jan 10 at 3:34

7

@JoshuaGreen And the reason for that is that you might have a platform that triggers a hardware fault if you access some invalid values for some types. These are sometimes called "trap representations".

– David Schwartz
Jan 10 at 11:15

3

Itanium, while obscure, is a CPU that's still in production, has trap values, and has two at least semi-modern C++ compilers (Intel/HP). It literally has true, false and not-a-thing values for booleans.

– MSalters
Jan 10 at 20:03

3

On the flip side, the answer to "Does the standard require all compilers to process something a certain way" is generally "no", even/especially in cases where it's obvious that any quality compiler should do so; the more obvious something is, the less need there should be for the authors of the Standard to actually say it.

– supercat
Jan 10 at 21:23

// the compiler could make asm that "looks" like this, from your source

static const char *strings[] = {"false", "true"};

const char *whichString = strings[boolValue];

If boolValue is uninitialized, it could actually hold any integer value, which would then cause an access outside the bounds of the strings array.

edited Jan 10 at 9:45

Peter Cordes

121k17184312

answered Jan 10 at 2:02

Barmar

421k35245346
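
The table-lookup model in this answer can be explored safely by passing the raw byte explicitly instead of reading an uninitialized bool. The sketch below is an illustration under that assumption; `kStrings`, `checked_lookup` and `check_lookup` are hypothetical names, and the bounds check is added here only to make the out-of-range case observable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// The two-entry table an optimizer could conceptually index with the bool's byte.
static const char* const kStrings[] = {"false", "true"};
constexpr std::size_t kStringCount = sizeof(kStrings) / sizeof(kStrings[0]);

// Returns the entry an unchecked lookup would select, or nullptr when the raw
// byte falls outside the table (where an unchecked lookup reads out of bounds).
const char* checked_lookup(unsigned char raw) {
    return raw < kStringCount ? kStrings[raw] : nullptr;
}

// Valid representations 0 and 1 hit the table; anything else misses it.
void check_lookup() {
    assert(std::strcmp(checked_lookup(0), "false") == 0);
    assert(std::strcmp(checked_lookup(1), "true") == 0);
    assert(checked_lookup(0xCD) == nullptr);
}
```

Generated code has no such check, because a valid bool can never produce an index other than 0 or 1.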

1

@SidS Thanks. Theoretically, the internal representations could be the opposite of how they cast to/from integers, but that would be perverse.

– Barmar
Jan 10 at 2:09

1

You are right, and your example will also crash. However, it is "visible" in a code review that you are using an uninitialized variable as an index into an array. Also, it would crash even in debug builds (for example, some debuggers/compilers initialize memory with specific patterns to make it easier to see when it crashes). In my example, the surprising part is that the usage of the bool is invisible: the optimizer decided to use it in a calculation not present in the source code.

– Remz
Jan 10 at 2:25

2

@Remz I'm just using the array to show what the generated code could be equivalent to, not suggesting that anyone would actually write that.

– Barmar
Jan 10 at 2:28

1

@Remz Recast the bool to int with *(int *)&boolValue and print it for debugging purposes, see if it is anything other than 0 or 1 when it crashes. If that's the case, it pretty much confirms the theory that the compiler is optimizing the inline-if as an array which explains why it is crashing.

– Havenard
Jan 10 at 2:57

1

@Havenard, int is likely to be bigger than bool so that wouldn't prove anything.

– Sid S
Jan 10 at 4:11

Summarising your question a lot, you are asking: "Does the C++ standard allow a compiler to assume a bool can only have an internal numerical representation of '0' or '1' and use it in such a way?"

As gets repeatedly stated here, undefined behaviour has undefined results, including but not limited to:

- Your code working as you expected it to
- Your code failing at random times
- Your code not being run at all

See What every programmer should know about undefined behavior

answered Jan 10 at 11:48

Tom Tanner

7,97522250
