I’m in a computer architecture class at UT. This is fine and dandy. Recently we got a homework/lab, and some people started noticing that one of the reference functions didn’t perform correctly on all machines. Here’s the offending code:
    int test_fitsBits(int x, int n) {
        int TMin_n = -(1 << (n-1));
        int TMax_n = (1 << (n-1)) - 1;
        return x >= TMin_n && x <= TMax_n;
    }
This function is supposed to return 1 if the number x can be represented as a two’s-complement number with n bits, and 0 otherwise. For instance, test_fitsBits(1, 2) = 1 and test_fitsBits(-3, 2) = 0.
Can you spot the error? It took me an hour to figure out what the error was, and then another hour to determine that it wasn’t a compiler bug. If you enjoy hunting for bugs like you enjoy a good murder mystery novella, read on. Otherwise, you might want to stop here, because this is the tale of how I learned that two’s complement is not required by the C spec.
So I’m getting errors on Ubuntu when I compile with clang 3.5 using “-O -Wall”, but not when I use gcc 4.8 on the same machine. So there’s gotta be some sort of optimization thing going on there.
Oh, but check this out. From tests.c, this is how GCC compiled the fitsBits function:
    test_fitsBits:
    .LFB4:
        .cfi_startproc
        leal    -1(%rsi), %ecx
        movl    $1, %eax
        sall    %cl, %eax
        movl    %eax, %ecx
        leal    -1(%rax), %eax
        cmpl    %eax, %edi
        setle   %al
        negl    %ecx
        cmpl    %ecx, %edi
        setge   %dl
        movzbl  %dl, %edx
        andl    %edx, %eax
        ret
And how LLVM did the same thing:
    test_fitsBits:                          # @test_fitsBits
        .cfi_startproc
    # BB#0:
                                            # kill: ESI<def> ESI<kill> RSI<def>
        leal    -1(%rsi), %ecx
        movl    $1, %eax
                                            # kill: CL<def> CL<kill> ECX<kill>
        shll    %cl, %eax
        movl    %eax, %ecx
        negl    %ecx
        cmpl    %edi, %eax
        setg    %al
        cmpl    %ecx, %edi
        setge   %cl
        andb    %al, %cl
        movzbl  %cl, %eax
        retq
The code was, of course:
    int test_fitsBits(int x, int n) {
        int TMin_n = -(1 << (n-1));
        int TMax_n = (1 << (n-1)) - 1;
        return x >= TMin_n && x <= TMax_n;
    }
So the LLVM code is setting %al to 1 if %eax > %edi. Then it sets %cl if %edi >= %ecx. Then it ANDs %al and %cl and returns the result.
LLVM:
- %edi is the argument x
- %cl (and %ecx) holds n - 1, where n is the second argument (computed via the lea instruction)
- sets %eax to 1<<%cl
- copies %eax to %ecx
- (arithmetically) negates %ecx (so %ecx = ~%ecx + 1)
- sets %al if %eax > %edi (this is the “x <= TMax_n” comparison)
- sets %cl if %edi >= %ecx (this is the “x >= TMin_n” comparison)
- returns %al AND %cl
GCC, on the other hand:
- %edi is the argument x
- %cl (and %ecx) holds n - 1, where n is the second argument (computed via the lea instruction)
- sets %eax to 1<<%cl
- copies %eax to %ecx
- Subtracts 1 from %eax (via the lea instruction)
- sets %al if %edi <= %eax (this is the “x <= TMax_n” comparison)
- (arithmetically) negates %ecx (so %ecx = ~%ecx + 1)
- sets %dl if %edi >= %ecx, and copies the result to %edx padding with zeros (this is the “x >= TMin_n” comparison)
- returns %edx AND %eax.
So let’s get down into the nitty-gritty. What are the differences?
- GCC subtracts 1 from %eax before checking if %edi <= %eax.
- LLVM doesn’t subtract 1, but instead checks for %eax > %edi.
Which are the same (the case where %eax is Tmin will never happen).
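To make that concrete, here’s my own rough hand-translation of what each compiler’s output computes, based on the register walkthroughs above. The function names are mine, and this is just my reading of the assembly, not anything the compilers emitted:

    /* What GCC's output effectively computes, as I read it: */
    int fits_gcc(int x, int n) {
        int v = 1 << (n - 1);
        return (x <= v - 1) && (x >= -v);
    }

    /* What LLVM's output effectively computes, as I read it: the "- 1" has
       been folded away, turning "x <= v - 1" into "v > x". */
    int fits_llvm(int x, int n) {
        int v = 1 << (n - 1);
        return (v > x) && (x >= -v);
    }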
Hrm, that didn’t pan out. Well, time to look at the errors. The errors that occur with LLVM only occur when n==32==0x20.
Time for some traditional debugging. I added this printf right before the return in fitsBits:
printf("%i %i %i %i\n", TMin_n, TMax_n, x>=TMin_n, x<=TMax_n);
And what do we get? The errors still occur in LLVM, so that’s a good sign. However, check this out:
    -2147483648 2147483647 1 0
    Test fitsBits(1[0x1],32[0x20]) failed.
    Gives 1[0x1]. Should be 0[0x0]
So it’s checking to see if 1>=-2147483648 and getting 1, which is good. Then it checks to see if 1<=2147483647, but it’s getting 0. What gives?
Alright, back to the assembly output. Some differences between them:
- LLVM’s check is setting %al if %eax > %edi.
- GCC sets %al if %edi <= %eax.
So let’s say n is 32. Then %cl is 31. %eax becomes 1 shifted left 31 times, or 0x80 00 00 00. This is where trouble begins - keen eyes will notice that this is a large negative number. INT_MIN, to be exact. GCC subtracts 1 from this number to get 0x7F FF FF FF, which is a large positive number - in fact, every 32-bit int is less than or equal to it. However, LLVM optimized away the -1, turning “x <= v - 1” into “x < v”, so it compares x directly against that large negative number - and no int is ever less than INT_MIN, so the check always fails.
This leads us to the simplest reproducible manifestation of the bug: subtract 1 from a number, then check whether something else is <= the result. Like so:
    #include <stdlib.h>
    #include <stdio.h>
    #include <limits.h>

    int fn(int x, int n) {
        int v = x - 1;
        return n <= v;
    }

    int main(int argc, char **argv) {
        int x = INT_MIN;
        int n = 14;
        if (fn(x, n)) {
            printf("Good!\n");
        } else {
            printf("Busted!\n");
        }
        return 0;
    }
And the output:
    lane@beryllium:~/ut/fall2014/CS429/lab1$ gcc -O -Wall llvm-bug.c
    lane@beryllium:~/ut/fall2014/CS429/lab1$ ./a.out
    Good!
    lane@beryllium:~/ut/fall2014/CS429/lab1$ clang -O -Wall llvm-bug.c
    lane@beryllium:~/ut/fall2014/CS429/lab1$ ./a.out
    Busted!
    lane@beryllium:~/ut/fall2014/CS429/lab1$
So this could be either a weird quirk in the spec, or a bug in the LLVM compiler. We check without using the -O flag:
    lane@beryllium:~/ut/fall2014/CS429/lab1$ gcc -Wall llvm-bug.c
    lane@beryllium:~/ut/fall2014/CS429/lab1$ ./a.out
    Good!
    lane@beryllium:~/ut/fall2014/CS429/lab1$ clang -Wall llvm-bug.c
    lane@beryllium:~/ut/fall2014/CS429/lab1$ ./a.out
    Good!
    lane@beryllium:~/ut/fall2014/CS429/lab1$
Hrmm… Looks like the C spec is on our side here. Unless there’s something undefined, as Mr. Selfridge pointed out could be the case.
Unfortunately, the C99 specification costs 198 Swiss Francs to download, and I have 0 Swiss Francs.
Fortunately, things happened and my formal verification friends lent me their copy. At least that’s the story I’m going by.
Quick digression: C99 only guarantees support for 63 nested levels of parentheses within a full expression or declarator, and only 63 significant initial characters in an internal identifier or macro name. So if you have two macros, each named 64 a’s followed by a “1” or a “2”, respectively, then according to the C99 specification they are the same name. Also, an implementation is only required to support 1023 members in a single structure or union.
Un-digress. Note that C99 requires ints to be at least 16 bits, longs to be at least 32 bits, and long longs to be at least 64 bits.
Also note that unsigned overflow is defined to wrap modulo 2^N, where N is the width of the type. We knew that.
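As a quick illustration (a toy example of mine, not from the lab), the unsigned wraparound really is guaranteed:

    #include <stdio.h>
    #include <limits.h>

    int main(void) {
        unsigned int u = UINT_MAX;
        /* Unsigned arithmetic wraps modulo 2^N (C99 6.2.5(9)), so this is
           guaranteed to print 0 at any optimization level. */
        printf("%u\n", u + 1u);
        return 0;
    }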
If you have a copy of ISO/IEC 9899:1999, you can follow along. Note in section 6.2.6.2(2) that two’s complement is not required for representing integers. If we go back to section 5.2.4.2.1(1), we see that the minimum required range for int is -32767 to 32767, whereas a 16-bit two’s-complement representation covers -32768 to 32767.
According to the definition of addition (6.5.6(5)), “the result of the binary + operator is the sum of the operands.” Thank you, I didn’t know that before I read that. I feel enlightened now.
But then I found my search button, and went to 3.4.3(3), an example of undefined behavior. They say that “an example of undefined behavior is the behavior on integer overflow.”
Go back to 6.2.5(9), which says that “a computation involving unsigned operands can never overflow.”
Go to the way bottom: section H.2.2(1) states that “The signed C integer types […] are compatible with LIA-1. […] An implementation that defines signed integer types as also being modulo need not detect integer overflow.” (emphasis added)
Alright, so what’s the take? C99 doesn’t define how signed integer overflow works. So all that stuff we learned in class may or may not apply, and you just need to know when it does and when it doesn’t. Not that anyone’s read this far.
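Here’s one more sketch of my own (again, not part of the lab) showing the same class of surprise: since signed overflow is undefined, an optimizer is free to assume it never happens and fold the whole comparison away.

    #include <stdio.h>
    #include <limits.h>

    /* Because signed overflow is undefined, a compiler may assume that
       x + 1 > x always holds and compile this function to "return 1". */
    int always_bigger(int x) {
        return x + 1 > x;
    }

    int main(void) {
        /* Unoptimized builds typically print 0 here (the addition wraps);
           with -O2 you may well get 1 instead. */
        printf("%d\n", always_bigger(INT_MAX));
        return 0;
    }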
Going to the “C Extensions to the C Language Family” page for GCC, we learn that they do not define signed integer overflow either.
So it’s not a compiler bug. Mr. Selfridge was correct. I guess if you’re a part of the formal verification group you get good insight into these things.
And sure enough, GCC exhibits the same behavior once you turn up the optimizer:
    lane@beryllium:~/ut/fall2014/CS429/lab1$ gcc -O1 -Wall llvm-bug.c
    lane@beryllium:~/ut/fall2014/CS429/lab1$ ./a.out
    Good!
    lane@beryllium:~/ut/fall2014/CS429/lab1$ gcc -O2 -Wall llvm-bug.c
    lane@beryllium:~/ut/fall2014/CS429/lab1$ ./a.out
    Busted!
    lane@beryllium:~/ut/fall2014/CS429/lab1$
And so, from the analysis above, I suggest that fitsBits in test.c be changed to the following code for uniformity across C optimizers:
    int test_fitsBits(int x, int n) {
        /* If n is at least the width of int, x always fits; return before
           computing the shifted bounds, which would overflow for n == 32. */
        if (n >= (int)(sizeof(int) << 3))
            return 1;
        int TMin_n = -(1 << (n-1));
        int TMax_n = (1 << (n-1)) - 1;
        return x >= TMin_n && x <= TMax_n;
    }
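And a quick sanity check of my own (a throwaway driver, not part of the handout) against the case that originally blew up, plus the examples from the top of the post:

    #include <stdio.h>

    int test_fitsBits(int x, int n) {
        if (n >= (int)(sizeof(int) << 3))
            return 1;
        int TMin_n = -(1 << (n-1));
        int TMax_n = (1 << (n-1)) - 1;
        return x >= TMin_n && x <= TMax_n;
    }

    int main(void) {
        printf("%d\n", test_fitsBits(1, 32));  /* expect 1 - the case that failed */
        printf("%d\n", test_fitsBits(1, 2));   /* expect 1 */
        printf("%d\n", test_fitsBits(-3, 2));  /* expect 0 */
        return 0;
    }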