Recently I ran into a problem while extracting an IPv4 netmask from an IPv4 address in CIDR slash notation. My idea was to make use of bit shifting. Since an IPv4 address is 4 bytes long, it can be treated as a 32-bit C++ integer. The following approach seemed promising:
netmask = (1 << bitmask) - 1
If you left-shift a 1 n times, you get a 1 followed by n zeros. Subtracting 1 turns that into n ones, and you get the netmask. This seems to work, as the following example shows. Let’s say we want to generate the netmask of a /8 network:
/8 means our bitmask is 8 = 1000b. So, apply the formula:
(1 << 1000b) - 1 = 100000000b - 1 = 11111111b, which, placed in the leading bits of a 32-bit integer, gives 11111111000000000000000000000000b
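The /8 example can be checked at compile time with a small C++ sketch. The helper names low_bits and leading_bits are made up for illustration, and the sketch assumes prefix lengths from 1 to 31 only, so it deliberately avoids the edge cases.

```cpp
#include <cstdint>

// Hypothetical helper: the formula from the text, (1 << n) - 1.
// Only valid for n in 0..31; n == 32 is not covered here.
constexpr std::uint32_t low_bits(unsigned n) {
    return (std::uint32_t{1} << n) - 1;
}

// Hypothetical helper: move the n one-bits to the leading end of a
// 32-bit value. Only valid for n in 1..31.
constexpr std::uint32_t leading_bits(unsigned n) {
    return low_bits(n) << (32 - n);
}

static_assert(low_bits(8) == 0xFFu, "8 low bits set");
static_assert(leading_bits(8) == 0xFF000000u, "netmask of a /8 network");
```

Using an unsigned 32-bit type keeps the shift well defined for every count below 32.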
However, this fails if the bitmask equals 32. One would expect a result with all 32 bits set to 1 (0xFFFFFFFF), but the computer returns 0:
(1 << bitmask(32)) - 1 = 0
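This can be fatal: a netmask of 0 matches every address. Just imagine passwordless login to your server, intended for a single machine (a /32 network), being granted to anyone …
Another funky thing: if you replace “bitmask” with the hard-coded number 32, the result is correct, presumably because the compiler evaluates the constant expression itself instead of emitting a shift instruction:
netmask = (1 << 32) - 1 = 0xFFFFFFFF
This seems broken…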
A look at the generated assembly code reveals that the compiler does exactly what was asked (stripped down to the relevant part for ease of understanding):
mov -0xc(%rbp),%ecx // ECX = bitcount
mov $0x1,%eax // EAX = 1
shl %cl,%eax // shift EAX by CL
sub $0x1,%eax // EAX = EAX - 1
Seems valid. So what’s going wrong? Well, Intel reveals the answer (IA-32 Intel Architecture Software Developer’s Manual):
The 8086 does not mask the shift count. However, all other IA-32 processors (starting with the Intel 286 processor) do mask the shift count to 5 bits, resulting in a maximum count of 31. This masking is done in all operating modes (including the virtual-8086 mode) to reduce the maximum execution time of the instructions.
So all processors since the 286 use only the low 5 bits of the shift count, which results in a maximum shift count of 31! A count of 32 (100000b) is masked to 0, so 1 << 32 yields 1, and subtracting 1 gives 0.
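The masking can be modelled in portable C++ by applying the 5-bit mask explicitly. This is only a sketch of the behaviour described in the Intel quote; the function name shl_x86_model is made up for illustration and is not a real intrinsic.

```cpp
#include <cstdint>

// Hypothetical model of SHL on a 32-bit register: only the low 5 bits
// of the count are used, so the effective count is count & 31.
constexpr std::uint32_t shl_x86_model(std::uint32_t value, unsigned count) {
    return value << (count & 31u);
}

static_assert(shl_x86_model(1, 8) == 0x100u, "a count of 8 behaves as expected");
static_assert(shl_x86_model(1, 32) == 1u, "a count of 32 is masked to 0");
static_assert(shl_x86_model(1, 32) - 1 == 0u, "the netmask formula yields 0");
```

Because the mask is written out, the shift count never reaches 32, so the model itself stays within what the language defines.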
My further investigations showed that the compiler’s optimizer generates SAL instead of SHL instructions. On x86 these are two mnemonics for the same operation, so the masking applies either way. So not only is the result of a left shift undefined in C and C++ when the shift count is greater than or equal to the width of the operand, and should not be relied on, but the hardware also limits the shift count for 32-bit operands to 5 bits.
Damn. Lesson learned: read the docs.
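One way to avoid the problem is to handle the edge cases explicitly, so that the shift count always stays between 1 and 31. The following is a minimal sketch, not the code from the original program. The name netmask_from_prefix is hypothetical, and treating prefix lengths above 32 like /32 is an assumption made here for simplicity.

```cpp
#include <cstdint>

// Hypothetical helper: builds the netmask with the one-bits in the
// leading positions. The shift is only reached for prefix 1..31,
// so the shift count is always in the range 1..31.
constexpr std::uint32_t netmask_from_prefix(unsigned prefix) {
    return prefix == 0  ? 0u
         : prefix >= 32 ? 0xFFFFFFFFu  // assumption: clamp to /32
                        : static_cast<std::uint32_t>(0xFFFFFFFFu << (32 - prefix));
}

static_assert(netmask_from_prefix(8) == 0xFF000000u, "/8");
static_assert(netmask_from_prefix(32) == 0xFFFFFFFFu, "/32");
```

A real program might prefer to reject prefix lengths above 32 instead of clamping them.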