Sunday, February 3, 2013

The "Other" Integer Overflow

If you've been programming in C or a similar language for any period of time, you've inevitably had to pick up some of the esoteric intricacies of the underlying hardware. Defined behaviour doesn't always protect you, and can sometimes be surprising.

One of my favourite examples is how divide errors work on the Intel architecture, simply because it always surprises developers who've never seen it before. High level languages try to protect the programmer from having to know these edge cases, so the developers who have made the transition to C/C++ from other languages aren't always aware of it.

Here is what the C99 Standard says on the subject of integer division:
5 The result of the / operator is the quotient from the division of the first operand by the
second; the result of the % operator is the remainder. In both operations, if the value of
the second operand is zero, the behavior is undefined.
No surprises there, but the Intel Architecture manuals do make a curious generalisation on the subject of the IDIV instruction:
The action of this instruction depends on the operand size (dividend/divisor).
Non-integral results are truncated (chopped) towards 0. The remainder is always less than the divisor in magnitude. Overflow is indicated with the #DE (divide error) exception rather than with the CF flag.
A careful reader might notice the last sentence, indicating that all overflow conditions result in #DE, not just divide by zero. Later versions of the C standard account for this, with this language.

If the quotient a/b is representable, the expression
(a/b)*b + a%b shall equal a; otherwise, the behavior of both a/b and a%b is

If this is news to you, here's what this means. On twos complement systems, the absolute value of INT_MIN is higher than INT_MAX, so dividing INT_MIN by -1 overflows.  It's easy to see why if you look at the bit patterns for these values.

(gdb) p/t (char) -128 / (char) -1
$11 = 10000000
(gdb) p/t (char) -128
$12 = 10000000

It's always fun to try this on software that does integer division on user specified values. Few developers forget to check for division by zero, but you can always find someone who didn't know this, and the system will crash in surprising ways. Here is a neat one I just spotted on Windows:

// kd> vertarget
// Windows 8 Kernel Version 9200 MP (1 procs) Free x86 compatible 
// Product: WinNt, suite: TerminalServer SingleUserTS 
// Built by: 9200.16384.x86fre.win8_rtm.120725-1247 
// Machine Name: 
// Kernel base = 0x81a5a000 PsLoadedModuleList = 0x81c44de8 
// Debug session time: Mon Jan 28 12:10:37.216 2013 (UTC - 8:00) 
// System Uptime: 0 days 0:06:15.595 
// kd> .trap 9df1baf8 
// ErrCode = 00000000 
// eax=80000000 ebx=8e07fac7 ecx=94eedaf8 edx=ffffffff esi=94eed720 edi=9df1bbd4 
// eip=8e0763cc esp=9df1bb6c ebp=9df1bba4 iopl=0 ov up ei ng nz na pe cy 
// cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010a87 
// win32k!GreScaleWindowExtEx+0x8f: 
// 8e0763cc f77d10 idiv eax,dword ptr [ebp+10h] ss:0010:9df1bbb4=ffffffff 
// kd> kv 
// *** Stack trace for last set context - .thread/.cxr resets it 
// ChildEBP RetAddr Args to Child 
// 9df1bba4 8e07faeb 600107ba 80000000 ffffffff win32k!GreScaleWindowExtEx+0x8f (FPO: [Non-Fpo]) 
// 9df1bbf4 81bc52fc 600107ba 80000000 ffffffff win32k!NtGdiScaleWindowExtEx+0x24 (FPO: [Non-Fpo]) 
// 9df1bbf4 77b96954 600107ba 80000000 ffffffff nt!KiFastCallEntry+0x12c (FPO: [0,3] TrapFrame @ 9df1bc14) 
// 00a8faec 7634515b 76346d4b 600107ba 80000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0]) 
// 00a8faf0 76346d4b 600107ba 80000000 ffffffff GDI32!NtGdiScaleWindowExtEx+0xa (FPO: [6,0,0]) 
// 00a8fb28 00f91036 600107ba 80000000 ffffffff GDI32!ScaleWindowExtEx+0xeb (FPO: [Non-Fpo]) 

No comments: