Defeating Shadow Walker

Shadow Walker's TLB desynchronization is undoubtedly a great idea, and its implementation should evolve to make the scanning process as complicated as possible.
The current implementation of the page fault handler makes it possible to reveal all hidden pages in seconds and, moreover, to completely remove Shadow Walker's IDT hook, leaving all rootkit pages visible to ordinary memory scanning tools.

I've coded a Shadow Walker remover to speed up the Shadow Walker improvement process.

The main problem of the memory cloaking concept is the visibility of the page fault handler. Roughly speaking, the algorithm of SW's page fault handler is this:
- Did this fault occur in one of the hidden pages?
- If yes, do something and iretd; otherwise, pass execution to the original page fault handler.
So it's possible to "bruteforce" SW's page fault handler with fake page faults in different pages and monitor its execution to see whether it reaches iretd for a particular page or instead reaches the original handler.
If we iterate through all the kernel pages, generating fake page faults for them and tracing the SW handler, we'll get a list of the hidden pages: these are the pages for which SW returned without passing the exception down to the original handler.
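
The scan loop can be modelled in user mode. The sketch below is only a simulation, not the remover's code: trace_fn is a hypothetical callback that stands in for single-stepping SW's handler with a fake fault at a given address, and mock_sw_handler is a made-up handler that "hides" two pages so the loop has something to find.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES 0x1000u

typedef enum { OUTCOME_IRETD, OUTCOME_PASSDOWN } outcome_t;

/* Hypothetical tracer callback: reports how the traced handler ended
   for a fake page fault at `address`. */
typedef outcome_t (*trace_fn)(uint32_t address);

/* Walk [start, end) page by page and record every page for which the
   traced handler ended with iretd. Returns the number of pages stored
   in `out` (at most `max`). A 64-bit counter avoids wrap-around at 4GB. */
static size_t find_hidden_pages(trace_fn trace, uint32_t start, uint32_t end,
                                uint32_t *out, size_t max)
{
    size_t found = 0;
    for (uint64_t page = start; page < end && found < max; page += PAGE_BYTES) {
        if (trace((uint32_t)page) == OUTCOME_IRETD)
            out[found++] = (uint32_t)page;
    }
    return found;
}

/* Mock of SW's handler: swallows faults in two arbitrary pages,
   passes everything else down. */
static outcome_t mock_sw_handler(uint32_t address)
{
    uint32_t page = address & ~(PAGE_BYTES - 1);
    return (page == 0x80402000u || page == 0x80405000u)
               ? OUTCOME_IRETD : OUTCOME_PASSDOWN;
}
```

In the real tool the callback would be the expensive part (a traced run of the handler per page), which is why pruning the trace matters.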

To speed up tracing, we may abort it as soon as we reach a code branch that ends with iretd or with exception passdown in any case. Obviously, linear (jcc-free) code that ends with iretd or with a jump to the original handler is an example of such a branch. These linear iretd or passdown endpoints are marked at the disassembly stage. After that, all upper nodes are examined. For example, a branch that has one jcc and whose child branches both end with iretd should be treated as an iretd branch too. The general algo is this:

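    // Lines marked [assumed] are missing from this listing and are
    // reconstructed from context; the list link name is a guess.
    do {
        bNewKnownBranch = FALSE;                        // [assumed]
        CurrentBranch = Branches.Next;                  // [assumed]

        while (CurrentBranch != (PBRANCH)&Branches) {
            bLeftChildKind = bRightChildKind = KIND_UNKNOWN;    // [assumed]

            if (CurrentBranch->LeftChild)
                bLeftChildKind = CurrentBranch->LeftChild->Kind;    // [assumed]
            if (CurrentBranch->RightChild)
                bRightChildKind = CurrentBranch->RightChild->Kind;  // [assumed]
            else
                // unconditional jumps do not have a right branch, but we should
                // assign them the corresponding kind if their left branch's kind is known.
                bRightChildKind = bLeftChildKind;       // [assumed]

            // if both children end equally (both iretd or both passdown), then
            // change the current branch's kind to that of the children.
            if ((CurrentBranch->Kind == KIND_UNKNOWN) &&
                (bLeftChildKind != KIND_UNKNOWN) &&
                (bLeftChildKind == bRightChildKind)) {
                CurrentBranch->Kind = bLeftChildKind;   // [assumed]
                bNewKnownBranch = TRUE;                 // [assumed]
            }

            CurrentBranch = CurrentBranch->Next;        // [assumed]
        }
    } while (bNewKnownBranch);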
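
For experimenting with the propagation outside the kernel, here is a minimal self-contained sketch of the same idea. The branch_t layout and the array-based storage are assumptions made for the example (the remover keeps its branches in a linked list); only the rule itself, "a branch whose children all end the same way inherits their kind", comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { KIND_UNKNOWN, KIND_IRETD, KIND_PASSDOWN } kind_t;

typedef struct branch {
    struct branch *left;   /* fall-through or jump target; NULL for endpoints */
    struct branch *right;  /* jcc-taken target; NULL for unconditional jumps */
    kind_t kind;
} branch_t;

/* Repeat passes over all branches until a pass resolves nothing new. */
static void propagate_kinds(branch_t *branches, size_t count)
{
    bool changed;
    do {
        changed = false;
        for (size_t i = 0; i < count; i++) {
            branch_t *b = &branches[i];
            if (b->kind != KIND_UNKNOWN || b->left == NULL)
                continue;                      /* already known, or an endpoint */
            kind_t l = b->left->kind;
            /* an unconditional jump simply inherits its only child's kind */
            kind_t r = b->right ? b->right->kind : l;
            if (l != KIND_UNKNOWN && l == r) {
                b->kind = l;
                changed = true;
            }
        }
    } while (changed);
}
```

A branch whose children disagree (one iretd, one passdown) stays KIND_UNKNOWN, and that is exactly where the tracer still has to follow the code at run time.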

After the bruteforcing process finishes, we have a list of the hidden pages. Now we can uninstall SW's page fault handler and set our own handler that will prevent these pages from being marked as not present by the rootkit code. To do that, we must get the address of ntoskrnl's real int 0x0e handler.
The needed address (_KiTrap0E) can be found in the _IDT array in ntoskrnl. The array resides in the INIT section, so it's freed after system startup. I've done it this way:
1. Map the ntoskrnl.exe file and begin iterating through its relocs.
2. We need to find the reloc that points to _IDT. So get the address the current reloc points to and check for the typical _IDT array layout: all records are 8 bytes long; the first 4 bytes are the handler address, and the last 4 bytes are the IDT gate flags and the segment selector, in reversed order.
3. Since all even DWORDs in _IDT are handler addresses, they all have relocs on them. Check for this.
4. Check for the standard IDT entry flags and selector: the first 3 entries have segment selector 8, DPL 0, segment present, and 32-bit gate flags. Int 3 has DPL 3.
5. If it is _IDT, get the address of the int 0x0e handler from it.
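
Steps 2, 4 and 5 can be sketched as a check over a candidate table. This is an illustration under assumptions, not the remover's code: the static_idt_entry_t field order (handler, then flags word, then selector) is my reading of "flags and selector in reversed order", the bit positions follow the x86 gate descriptor access byte placed in the high byte of the flags word, and the candidate is assumed to be already copied into an array on a little-endian host. The reloc check from step 3 is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed on-disk record layout: 8 bytes per entry. */
typedef struct {
    uint32_t handler;   /* handler address (has a reloc on it) */
    uint16_t flags;     /* gate flags; access byte in the high byte */
    uint16_t selector;  /* code segment selector */
} static_idt_entry_t;

#define GATE_PRESENT 0x8000u   /* P bit */
#define GATE_32BIT   0x0800u   /* D bit: 32-bit gate */
#define INT0E_INDEX  14u       /* page fault vector */

static unsigned gate_dpl(uint16_t flags)
{
    return (flags >> 13) & 3u;
}

/* Step 4: first 3 entries use selector 8, DPL 0, present, 32-bit gate;
   int 3 has DPL 3. */
static bool looks_like_idt(const static_idt_entry_t *t, size_t count)
{
    if (count <= INT0E_INDEX)
        return false;
    for (size_t i = 0; i < 3; i++) {
        if (t[i].selector != 8 ||
            !(t[i].flags & GATE_PRESENT) ||
            !(t[i].flags & GATE_32BIT) ||
            gate_dpl(t[i].flags) != 0)
            return false;
    }
    return gate_dpl(t[3].flags) == 3;
}

/* Step 5: the int 0x0e handler address is the first DWORD of entry 14. */
static uint32_t int0e_handler(const static_idt_entry_t *t)
{
    return t[INT0E_INDEX].handler;
}
```

The value read from the file is still relative to the preferred image base, so it has to be rebased to the loaded ntoskrnl before use.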

Our page fault handler is simple. Its main purpose is to restore the "present" bit in the PTEs of the hidden pages when the rootkit code clears it after accessing its global variables.

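// Lines marked [assumed] are missing from this listing and are
// reconstructed from context; the list link name is a guess.

// return TRUE if this page fault is handled (page is hidden), FALSE otherwise
static BOOLEAN NTAPI HandleSWPageFaults(PVOID Address)
{
    PHIDDEN_PAGE    HiddenPage;

    HiddenPage = g_HiddenPages.Next;    // [assumed]
    while (HiddenPage != (PHIDDEN_PAGE)&g_HiddenPages) {
        if (HiddenPage->Address == (PVOID)((ULONG)Address & 0xfffff000)) {
            // some code (possibly the hidden code itself) cleared the P bit in this PTE;
            // it's a hidden page: mark it present and handle the page fault
            // (the line that sets the P bit in the PTE is missing from this listing)
            return TRUE;
        }
        HiddenPage = HiddenPage->Next;  // [assumed]
    }

    // this page is not hidden: pass the page fault down
    return FALSE;
}

static __declspec(naked) VOID Int0EHandler()
{
    __asm {
        pushad                          // [assumed] implied by the [esp+32+4] offset below
        push    fs

        mov     bx, 0x30
        mov     fs, bx

        mov     eax, [esp+32+4]         // ErrorCode
        test    eax, 4
        jnz     passdown_pagefault      // it's a usermode pagefault

        mov     eax, cr2                // faulting address

        push    eax
        call    HandleSWPageFaults
        or      al, al
        jz      passdown_pagefault

        // page fault was handled
        pop     fs
        popad                           // [assumed]
        add     esp, 4                  // drop the error code
        iretd                           // [assumed]

passdown_pagefault:
        // page fault was not handled: call the original handler
        pop     fs
        popad                           // [assumed]
        jmp     [g_RealInt0EHandler]
    }
}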
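
The decision logic of this handler is easy to test in isolation. The following is a user-mode model, assuming a hypothetical hidden_page_t record that carries a mock PTE value instead of touching real page tables; the only hardware facts it relies on are that bit 2 (U/S) of the page fault error code is set for user-mode faults and that bit 0 of a PTE is the present bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PF_ERR_USER  0x4u          /* U/S bit of the page fault error code */
#define PTE_PRESENT  0x1u          /* P bit of a PTE */
#define PAGE_MASK_32 0xfffff000u

/* Hypothetical record for the model: a hidden page and a mock PTE. */
typedef struct {
    uint32_t address;  /* page-aligned virtual address */
    uint32_t pte;      /* stand-in for the real page table entry */
} hidden_page_t;

/* Returns true if the fault is swallowed (handler would iretd),
   false if it would be passed down to the original handler. */
static bool model_int0e(hidden_page_t *pages, size_t count,
                        uint32_t error_code, uint32_t cr2)
{
    if (error_code & PF_ERR_USER)
        return false;                        /* usermode fault: not ours */

    for (size_t i = 0; i < count; i++) {
        if (pages[i].address == (cr2 & PAGE_MASK_32)) {
            pages[i].pte |= PTE_PRESENT;     /* undo the rootkit's P-bit clear */
            return true;
        }
    }
    return false;                            /* not a hidden page */
}
```

Checking the U/S bit first keeps the list walk off the path of ordinary user-mode page faults, which are by far the most frequent ones.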

After installing our own page fault handler we should invalidate all TLB entries for the hidden pages.

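// updates TLB entries for the hidden pages
// (lines marked [assumed] are missing from this listing and are reconstructed from context)
static VOID TouchHiddenPages()
{
    PHIDDEN_PAGE    HiddenPage;
    PVOID    Address;

    HiddenPage = g_HiddenPages.Next;        // [assumed]
    while (HiddenPage != (PHIDDEN_PAGE)&g_HiddenPages) {
        Address = HiddenPage->Address;      // [assumed]

        __asm {
            mov     eax, Address
            invlpg  [eax]        // flush the TLB entry for this page
            mov     eax, [eax]   // touch the page: force our int 0x0e handler to make this page present
        }

        HiddenPage = HiddenPage->Next;      // [assumed]
    }
}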

By the way, Shadow Walker's HookMemoryPage() routine has a bug.

        //Go ahead and flush the TLBs.  We want to guarantee that all      
        //subsequent accesses to this hooked page are filtered
        //through our new page fault handler.                              
        __asm invlpg pExecutePage       

This will not flush the needed TLB entry unless the compiler has placed the pExecutePage value in a register: it will flush the TLB entry for the page where the pExecutePage variable itself resides, which is one of the stack pages. Use mov reg32, Address/invlpg [reg32] instead, or call KiFlushSingleTb(), which does the same.
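
The difference is easy to see without touching the TLB at all. invlpg takes a memory operand and invalidates the entry for the page that contains that operand, so the question is only which address ends up as the operand. The sketch below is a portable model; flushed_page is a hypothetical helper that computes the page an invlpg with the given operand location would hit.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_MASK_PTR (~(uintptr_t)0xfff)

/* Page that `invlpg <operand>` would flush: the page holding the operand. */
static uintptr_t flushed_page(const void *operand_location)
{
    return (uintptr_t)operand_location & PAGE_MASK_PTR;
}

/* `__asm invlpg pExecutePage`: the operand is the variable itself,
   so the page holding the variable (a stack page) gets flushed. */
static uintptr_t buggy_target(void **pVariable)
{
    return flushed_page(pVariable);
}

/* `mov eax, pExecutePage / invlpg [eax]`: the operand is the memory the
   variable points to, so the hooked page gets flushed. */
static uintptr_t fixed_target(void **pVariable)
{
    return flushed_page(*pVariable);
}
```

With a local pExecutePage holding some page address, buggy_target(&pExecutePage) yields a stack page and fixed_target(&pExecutePage) yields the page stored in the variable.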

As a solution to the described problem, strong polymorphism with antitrace code snippets comes to mind. SW's page fault handler should be generated in such a way that static code analysis becomes impossible: lots of polymorphic code like "mov reg32, Address/jmp reg32", i.e. unpredictable execution transfers, will help. This may significantly decrease the speed of tracing.

wbr, 90210//HI-TECH

© 2014-2017 Сергей Воробьев