© 2014 by Zack Smith. All rights reserved.
Data-execution prevention (DEP), implemented through the no-execute (NX) bit of modern CPUs, wards off code-injection attacks by letting an operating system flag non-code pages as non-executable. Since it became available, two attack methods have come into use to get around NX: Return-Oriented Programming (ROP) and Jump-Oriented Programming (JOP).
These are code-reuse attacks, meaning they do not supply code of their own. Instead they use data in non-executable pages to cause snippets of code (called gadgets) in executable pages, such as libc or the program being attacked, to run under the control of that data. One way of doing this is to overrun a data buffer that is located on the stack. The attacker supplies a sequence of addresses to return to (ROP, the stack-based approach) or to jump to (JOP). These addresses always point to gadgets within existing software that is fully allowed to run. Both JOP and ROP rely on the ability of hackers to analyze library and program code to locate the gadgets in the specific versions that they are targeting.
Gadgets typically consist of just one or two instructions, followed by either an indirect jump instruction in the case of JOP, or a return instruction in the case of ROP. When the targeted software is updated, it is quite possible that some gadgets will disappear and others will move, rendering the addresses to them useless. Thus code reuse attacks have a shorter shelf life than did older code injection attacks.
A ROP attack works by using a buffer overrun on the stack to overwrite the stack with a series of return addresses that point to these gadgets, which are then executed sequentially.
JOP is more complicated. A buffer overrun in the heap must overwrite an address, for instance in a setjmp buffer or elsewhere, causing a jump to a dispatcher gadget. Without a dispatcher, JOP cannot work. JOP then proceeds sequentially through a set of gadget addresses, and each gadget must be able to increment the index into the sequence and jump back to the dispatcher.
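The dispatcher loop can be pictured with a toy model in Python. This is only an abstract sketch: the gadget "addresses" (0x1000 and so on), the gadget effects, and the names make_gadget and dispatcher are all invented for illustration, and nothing here touches real machine code. It shows only the control flow described above, where each gadget does one small thing, advances the index, and hands control back to the dispatcher.

```python
def make_gadget(name, effect):
    """Build a toy gadget: do one small thing, bump the index, go back."""
    def gadget(state):
        effect(state)
        state["trace"].append(name)
        state["index"] += 1  # the gadget itself advances the sequence
    return gadget


def dispatcher(table, gadgets, state):
    """Toy dispatcher: keep jumping to whatever address the index selects."""
    while state["index"] < len(table):
        address = table[state["index"]]
        gadgets[address](state)  # stands in for an indirect jump
    return state


# Hypothetical gadget addresses and effects, invented for this sketch.
gadgets = {
    0x1000: make_gadget("load", lambda s: s.update(acc=5)),
    0x2000: make_gadget("add", lambda s: s.update(acc=s["acc"] + 3)),
    0x3000: make_gadget("double", lambda s: s.update(acc=s["acc"] * 2)),
}

# The attacker-controlled data is just this list of addresses.
table = [0x1000, 0x2000, 0x3000]
result = dispatcher(table, gadgets, {"index": 0, "acc": 0, "trace": []})
print(result["acc"], result["trace"])
```

If any one address in the table no longer points at a usable gadget, the whole sequence breaks, which is why reducing the pool of gadgets matters.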
Both ROP and JOP have been proven by researchers and hackers to work.
AntiJOP is my nascent project that I began in mid-February 2014 to reduce the number of viable gadgets in computer programs, focusing on the Intel64 instruction set.
AntiJOP works by rewriting assembly language code, so it must be able to take as input the assembly code generated by a compiler, and then provide output to the assembler and linker. Naturally this process requires recompilation of existing, installed software to achieve protection.
My program currently takes as its input NASM- or YASM-formatted assembly code, whereas popular compilers like GCC and LLVM use a different format. I may eventually update AntiJOP to accept these compilers' syntax.
Assembly code rewriting works as follows. ROP and JOP gadgets terminate with certain instructions that contain certain bytes. AntiJOP replaces those specific bytes wherever they may occur. In the case of ROP on x86, each gadget typically ends with a near-return instruction, whose opcode is 0xC3 or 0xC2. A gadget of a JOP exploit typically ends with one of a range of indirect jump instructions whose opcode is 0xFF.
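To make the byte values concrete, here is a minimal Python sketch that scans a buffer of machine code for them. It is not AntiJOP's code; the function name scan and the sample bytes are assumptions made for this illustration. For 0xFF it also looks at bits 5..3 of the following byte, which on x86 select which operation the 0xFF opcode performs (only some of them are indirect jumps or calls).

```python
# Near-return opcodes named in the text.
ROP_TERMINATORS = {0xC3: "ret", 0xC2: "ret imm16"}

# For opcode 0xFF, bits 5..3 of the next (ModR/M) byte select the operation.
FF_OPERATIONS = {
    0: "inc",
    1: "dec",
    2: "call near indirect",
    3: "call far indirect",
    4: "jmp near indirect",
    5: "jmp far indirect",
    6: "push",
}


def scan(code: bytes):
    """Return (offset, description) for each 0xC3, 0xC2 or 0xFF byte."""
    hits = []
    for offset, byte in enumerate(code):
        if byte in ROP_TERMINATORS:
            hits.append((offset, ROP_TERMINATORS[byte]))
        elif byte == 0xFF:
            if offset + 1 < len(code):
                selector = (code[offset + 1] >> 3) & 7
                hits.append((offset, FF_OPERATIONS.get(selector, "undefined")))
            else:
                hits.append((offset, "0xFF at end of buffer"))
    return hits


# Sample bytes: mov eax, 0xC3 ; jmp rax ; ret
sample = bytes([0xB8, 0xC3, 0x00, 0x00, 0x00, 0xFF, 0xE0, 0xC3])
for offset, description in scan(sample):
    print(f"offset {offset}: {description}")
```

The scan deliberately treats every byte as a possible opcode, so it reports the 0xC3 inside the immediate of the mov as well as the real ret at the end.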
Note that there is a second way to classify gadgets besides the ROP versus JOP distinction:
- Those based on intended instructions i.e. those that were generated by the compiler.
- Those based on unintended instructions i.e. those that start somewhere within the intended instructions of a program e.g. within a branch offset or immediate values.
On the x86, gadgets that start within instructions are more numerous than those based on intended instructions.
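The distinction can be sketched in Python by keeping track of where the intended instructions begin. This is a simplified illustration under stated assumptions: the instruction boundaries are supplied by hand (a real tool would take them from the assembler), only 0xC3 is checked, and the name classify_ret_bytes is invented here.

```python
def classify_ret_bytes(instructions):
    """Classify each 0xC3 byte as an intended or unintended return.

    instructions: list of bytes objects, one per intended instruction.
    A 0xC3 counts as intended only when it is a one-byte instruction
    by itself; any 0xC3 inside a longer instruction is unintended.
    """
    results = []
    offset = 0
    for insn in instructions:
        for i, byte in enumerate(insn):
            if byte == 0xC3:
                kind = "intended" if len(insn) == 1 else "unintended"
                results.append((offset + i, kind))
        offset += len(insn)
    return results


program = [
    bytes([0xB8, 0xC3, 0x00, 0x00, 0x00]),  # mov eax, 0xC3 (0xC3 in the immediate)
    bytes([0x89, 0xC3]),                    # mov ebx, eax  (0xC3 is the ModR/M byte)
    bytes([0xC3]),                          # ret
]
for offset, kind in classify_ret_bytes(program):
    print(f"offset {offset}: {kind} return")
```

In this small sample two of the three 0xC3 bytes are unintended, which matches the observation that such gadgets outnumber the intended ones.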
AntiJOP must address both intended and unintended types of gadgets. I do this in the case of JOP gadgets by attempting to replace every instance of a 0xFF byte with something else, and for ROP gadgets I attempt to replace 0xC3 and 0xC2.
Where can the 0xFF, 0xC3 and 0xC2 bytes be found? They can appear in any part of an x86 instruction, and the meaning of each value is different in each place.
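One possible way to get such a byte out of an immediate value is to split the immediate in two, so that a mov followed by an xor produces the same result. The Python sketch below illustrates that idea only; it is an assumption about how a rewriter could work, not a description of what AntiJOP actually does, and the names split_byte, split_imm32 and rewrite_mov are hypothetical.

```python
FORBIDDEN = {0xC2, 0xC3, 0xFF}


def split_byte(value):
    """Return (x, y) with x ^ y == value and neither x nor y forbidden."""
    if value not in FORBIDDEN:
        return value, 0
    for y in range(1, 256):
        if y not in FORBIDDEN and (value ^ y) not in FORBIDDEN:
            return value ^ y, y
    raise ValueError(f"cannot split 0x{value:02X}")


def split_imm32(imm):
    """Split a 32-bit immediate byte by byte into two clean halves."""
    raw = (imm & 0xFFFFFFFF).to_bytes(4, "little")
    pairs = [split_byte(b) for b in raw]
    a = int.from_bytes(bytes(p[0] for p in pairs), "little")
    b = int.from_bytes(bytes(p[1] for p in pairs), "little")
    return a, b


def rewrite_mov(reg, imm):
    """Emit NASM-style lines that load imm without forbidden immediate bytes."""
    a, b = split_imm32(imm)
    lines = [f"mov {reg}, 0x{a:08X}"]
    if b:
        lines.append(f"xor {reg}, 0x{b:08X}")
    return lines


print("\n".join(rewrite_mov("eax", 0xC3)))
print("\n".join(rewrite_mov("eax", 0xFFFFFFFF)))
```

This sketch checks only the immediate bytes, not the opcode or ModR/M bytes of the replacement instructions. Note also that xor modifies the flags while mov does not, so a real rewriter would have to confirm the flags are not in use at that point.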
- If 0xFF is the MOD/RM byte, this can signify an operation involving DI/EDI/RDI and/or R15/R15D/R15W.
- If 0xFF is the scale-index-base or SIB byte, this can signify EDI*8 + EDI.
- If 0xFF is within a branch offset, it can mean a backward branch, as in a loop.
- If 0xFF is the offset within an effective address, it means a negative offset.
- Et cetera.
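The cases in the list above come down to how the same eight bits are split into fields. A short Python sketch of that arithmetic follows; the function names are made up for this illustration, and the register table covers only the 32-bit names without any REX prefix.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_modrm(byte):
    """Split a ModR/M byte into its mod (2 bits), reg and rm (3 bits each)."""
    return {"mod": byte >> 6, "reg": (byte >> 3) & 7, "rm": byte & 7}


def decode_sib(byte):
    """Split a SIB byte into scale factor, index register and base register."""
    return {
        "scale": 1 << (byte >> 6),
        "index": REG32[(byte >> 3) & 7],
        "base": REG32[byte & 7],
    }


def as_signed8(byte):
    """Interpret a byte as a signed 8-bit offset."""
    return byte - 256 if byte >= 0x80 else byte


print(decode_modrm(0xFF))  # mod 3, reg 7, rm 7: register 7 on both sides
print(decode_sib(0xFF))    # scale 8, index edi, base edi
print(as_signed8(0xFF))    # -1: a backward branch or negative offset
```

Register number 7 is DI/EDI/RDI, or R15 when the corresponding REX bit is set, which is why 0xFF as a ModR/M byte involves those registers.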
Why would this approach be effective?
What is essential for JOP or ROP to work is the existence of a large number and variety of gadgets in existing code. By removing from the assembly code the intended and unintended instructions with which gadgets must terminate, the overall number of gadgets available to an attacker can be significantly reduced. It cannot be brought to zero, but the more the number of gadgets is whittled down, the more difficult the hacker's job becomes.
AntiJOP now attempts to remove both JOP and ROP gadgets.
This software is in an alpha stage.
AntiJOP still parses NASM/YASM assembly code, not the less legible GCC/LLVM format.