It's even more confusing than described. C23 6.7.3.1p7 says,
> Each of the comma-separated multisets designates the same type, except that for bit-fields, it is implementation-defined whether the specifier `int` designates the same type as `signed int` or the same type as `unsigned int`.
That means a bit-field member using plain `int` as the underlying type might itself be signed or unsigned, similar to whether plain `char` is signed or unsigned. There's a proposal to address this (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3828.pdf). Fortunately, it found that GCC, Clang, MSVC, and ICC all treat `int` bit-fields as `signed int`, and the recommendation is to require this behavior.
That said, I don't think I've ever seen a bit-field deliberately used for signed values, so there's some logic to the allowance granted by the standard. And I'm sure there's plenty of real-world code erroneously using `int` bit-fields for unsigned values that just happens to work because of two's-complement representations and the semantics of bitwise operations. But better to limit this kind of flexibility, especially when real-world implementations seem not to have taken the alternative route.
tl;dr the standard is unclear about whether implementations should respect the signedness of the bit-field's declaration (MSVC), or always promote to `int` before converting to a receiving type (GCC, Clang).
I suppose you could say MS's choice reflects a commitment to backwards compatibility, whereas GCC/Clang are always chomping at the bit to exploit the more aggressive optimizations that signed-integer undefined behavior affords?
It is much safer to pack/unpack bits manually than to trust that bit-fields will work as expected.
>The troublesome behavior is demonstrated by the lines performing the left shift. We take a 12-bit wide bit-field, shift it left by 20 bits so ...
This is nonsense. I don't know what they expect would happen, but who cares? I wouldn't shift a 12-bit field by more than ±11 bits.
You can shift the "enclosing" word of memory if you want; just put the original definition in a union.