    RAVM: compile once and run anywhere

    Revision 14
    © by Zack Smith
    All rights reserved.

    What is RAVM?

    RAVM is my nascent write-once, eventually-run-anywhere project. RAVM stands for RISC-Approximating Virtual Machine, meaning it uses a load-store architecture and a simplified instruction format to maximize performance.

    The project currently consists of the virtual machine itself, which I coded mostly in x86 assembly language, and a simple assembler (RASM) that produces bytecode, which is to say virtual machine language, that RAVM runs.

    What is the justification for RAVM?

    A series of rational deductions

    Q: What causes poor security in Java, Flash and JavaScript?

    A: Exploits based on converting data pages to code pages, which is what just-in-time (JIT) compilation requires.

    Q: If many exploits rely on such just-in-time compilation, what is the alternative?

    A: Emulation.

    Q: What is the cause of poor performance of emulation?

    A: It may be the complexity of certain instruction sets.

    Q: Is emulation of x86 insufferably slow?

    A: Yes.

    Q: Can we identify an instruction set that is faster to emulate?

    A: All instruction sets that are designed to be optimal in hardware will be complex to emulate in software.

    Q: Can we design a new instruction set that is fast to emulate in software?

    A: That is the question that RAVM and its predecessors RXVM et cetera were meant to answer.

    Q: Was it indeed proven?

    A: Yes. And successively refined.

    Q: Why do you call this a series of rational deductions?

    A: Because the thinking is motivated neither by the commercial interests of hardware companies nor by careerism, and the development of RAVM, RXVM, etc. proceeded as a series of scientific experiments.

    Speed

    As far as I know, RAVM implements the machine language that is fastest to emulate on x86. The speed with which a given machine language can be emulated is inversely related to the complexity of the instruction set's encoding. After much experimentation, I settled on an encoding that can be emulated faster than ARM or MIPS ever could be, and both of those are in turn considerably faster to emulate on x86 than x86 itself.

    Compared to ARM, MIPS or x86 emulation, RAVM has fewer decisions to make to execute an instruction and needs fewer x86 instructions overall to execute one RAVM instruction. It should also use less of the caches, allowing for better use of Intel's hyperthreading (2 threads per core), because one emulator won't be continually wiping out the cache lines of the other.

    Security

    RAVM is fast despite the fact that I check for out-of-bounds accesses that could compromise RAVM itself.

    Simplicity

    The instruction set encoding is really quite simple and it would be easy to adapt e.g. LLVM to generate its machine language.

    Why emulate at all?

    • Emulators are portable and therefore architecture independent.
    • Emulation provides an isolated environment in which to run software.
    • Emulation avoids the security risks of virtualization e.g. breakout bugs.
    • Emulation avoids the security risks of just-in-time compilation, i.e. converting data pages to code pages.

    Why did I start RAVM?

    After hearing for the 100th time that browser-borne Java is a major attack vector, not unlike mosquitoes carrying malaria, I asked myself the rhetorical question: could I do better than Java? Rather than assume the affirmative or the negative without evidence, which would be irrational, I set about experimenting in order to answer this question.

    There are two problems to solve:

    1. Achieving fast execution
    2. Achieving security

    One without the other limits usefulness. In the early years, Java had neither. Even today, Java is fast only because of JIT (just-in-time) compilation, which is considered a good hack even though it adds enormously to the complexity of the VM and opens a gaping hole through which exploits can flow.

    Complexity is also the enemy of security.

    JIT compilation relies on a major security weakness, namely the ability to convert a region of memory that is flagged as non-executable data into one that is executable code. This breaks a cardinal rule of computer security: Never execute data.
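
    For readers who have not seen the mechanism, the conversion looks roughly like this on a POSIX system. This is a minimal sketch using the standard mmap and mprotect calls; the function name make_executable_copy is made up for illustration, and the code only flips the page permissions without ever jumping into the page.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: copies `len` bytes of freshly generated machine
   code into a new writable mapping, then asks the OS to make that same
   memory executable. This data-to-code flip is the step a JIT depends on.
   Returns the mapping, or NULL on failure. */
void *make_executable_copy(const void *code, size_t len)
{
    /* Step 1: ordinary read/write data pages. */
    void *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return NULL;

    memcpy(page, code, len);

    /* Step 2: the same pages become read/execute code pages. */
    if (mprotect(page, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(page, len);
        return NULL;
    }
    return page;
}
```

    An interpreter such as RAVM never needs step 2, so a hardened system could refuse that mprotect call outright without breaking it.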

    Modern attackers use this feature when they deploy ROP and JOP attacks to overcome NX (no-execute) bit protection, and it permits them to inject code.

    Operating systems should never allow applications to convert data areas into code areas, but they do because of software like Java and other languages that do JIT compilation, such as JavaScript and Flash.

    Achieving speed

    RAVM is the end result of a series of assembly-language experiments in which I devised and tested various instruction set architectures (ISAs), instruction word sizes, and various implementations of the interpreter code before settling on the current one. I tried an 8-bit instruction size, 16-bit instruction size and 32-bit instruction size. I tried register sets with 8 registers, 16 registers, 32 registers and 256 registers.

    The maximum performance that I saw, running all experiments on my Intel Core i5 laptop, was from a virtual machine that I called RXVM, which used 8-bit instructions, and had only 8 registers, four of which resided within the 32-bit x86 register set. It peaked at about 800 MIPS on my 2.4 GHz Core i5, albeit only for an empty loop. Most arithmetic operations ran at half that speed. RXVM was (sometimes) fast but rather impractical, having only 8 registers. The rule of thumb is that most programs require at least 12 registers.

    The 16-bit instruction word size turned out to be not as fast as 8-bit and not faster than the 32-bit size. Why? Because register indices had to be shifted and ANDed. The 32-bit instruction word size with 256 registers is fastest because each register index occupies a whole byte, so the shift-and-AND pair can be replaced by a single move instruction: MOVZX.

    For this reason, RAVM provides 256 registers and runs very fast. On my 2.4 GHz Core i5, RAVM can execute its machine code at about 380 MIPS for most instructions with full bounds checking at runtime for newly calculated memory addresses. (Without checks it would be faster but less secure.)
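
    As a rough sanity check, the MIPS figures quoted above can be converted into average host clock cycles per emulated instruction. This sketch assumes the CPU runs at its nominal 2.4 GHz throughout (no turbo boost or throttling), which the measurements above do not state; the function names are made up for illustration.

```c
#include <stdio.h>

/* Average host clock cycles spent per emulated instruction.
   clock_mhz: host clock in MHz (assumed constant).
   mips:      millions of emulated instructions per second. */
double cycles_per_instruction(double clock_mhz, double mips)
{
    return clock_mhz / mips;
}

/* Prints the cycle budget for the three figures quoted in the text. */
void print_cycle_budget(void)
{
    printf("RXVM empty loop, 800 MIPS: %.1f cycles\n",
           cycles_per_instruction(2400.0, 800.0));
    printf("RXVM arithmetic, 400 MIPS: %.1f cycles\n",
           cycles_per_instruction(2400.0, 400.0));
    printf("RAVM typical,    380 MIPS: %.1f cycles\n",
           cycles_per_instruction(2400.0, 380.0));
}
```

    Under that assumption the budget is about 3 cycles per instruction for RXVM's empty loop, 6 for its arithmetic, and roughly 6.3 for a typical bounds-checked RAVM instruction.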

    Really, 256 registers? Yes. This many registers, while not strictly necessary for most purposes, permits the fastest decoding of instructions by making use of MOVZX instead of shift and mask.
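
    The difference between the two decoding styles can be sketched in C. Both field layouts below are assumptions made for illustration, since the actual RAVM encoding is not given here, and the type and function names are hypothetical. The point is only that byte-wide fields can be fetched with plain byte loads, which x86 compilers typically emit as MOVZX, whereas 4-bit fields need a shift and a mask each.

```c
#include <stdint.h>

typedef struct {
    uint8_t opcode;
    uint8_t dst;
    uint8_t src;
} Decoded;

/* Assumed 32-bit layout, lowest byte first: opcode, dst, src, unused.
   Every field is a whole byte, so each one is a single zero-extending
   byte load with no further arithmetic. */
Decoded decode32(const uint8_t *insn)
{
    Decoded d = { insn[0], insn[1], insn[2] };
    return d;
}

/* Assumed 16-bit layout: opcode in bits 0-7, dst in bits 8-11, src in
   bits 12-15. Two register indices share one byte, so each has to be
   shifted into place and masked. */
Decoded decode16(uint16_t insn)
{
    Decoded d;
    d.opcode = (uint8_t)(insn & 0xFF);
    d.dst    = (uint8_t)((insn >> 8) & 0x0F);
    d.src    = (uint8_t)((insn >> 12) & 0x0F);
    return d;
}
```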

    Then how was RXVM faster still? Because every instruction opcode was an index causing a jump to hard-coded operations. No register indices were ever shifted and masked. An instruction moving R3 to R7 executed different code than one moving R3 to R6.
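
    The idea can be sketched in C with a table of handlers, one per opcode, in which the register numbers are compile-time constants. This is only an analogy: RXVM itself was written in x86 assembly and jumped to code fragments rather than calling C functions, and the opcode numbering, macro and names below are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

enum { NUM_REGS = 8 };

typedef struct { uint32_t r[NUM_REGS]; } Regs;
typedef void (*Handler)(Regs *);

/* One handler per (dst, src) pair. The register numbers are baked into
   the code, so nothing is decoded, shifted or masked at run time. */
#define DEFINE_MOV(dst, src) \
    void mov_r##dst##_r##src(Regs *m) { m->r[dst] = m->r[src]; }

DEFINE_MOV(6, 3)   /* "move R3 to R6" */
DEFINE_MOV(7, 3)   /* "move R3 to R7": separate code from the one above */

void op_nop(Regs *m) { (void)m; }

/* Hypothetical opcode numbering; a full table would have 256 entries. */
enum { OP_NOP, OP_MOV_R6_R3, OP_MOV_R7_R3, OP_COUNT };

const Handler dispatch[OP_COUNT] = {
    [OP_NOP]       = op_nop,
    [OP_MOV_R6_R3] = mov_r6_r3,
    [OP_MOV_R7_R3] = mov_r7_r3,
};

/* Runs `len` one-byte instructions. The opcode byte is used directly as
   the table index. Returns -1 if an opcode has no handler. */
int run(Regs *m, const uint8_t *code, size_t len)
{
    for (size_t pc = 0; pc < len; pc++) {
        if (code[pc] >= OP_COUNT)
            return -1;
        dispatch[code[pc]](m);
    }
    return 0;
}
```

    The cost of this scheme is code size: every register combination of every operation needs its own handler.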

    Couldn't the RXVM approach be used for 16-bit instructions? Yes, in theory. I briefly tried it but my tests showed poor performance. Doing a lot of jumping to code segments seems to cause efficiency problems, perhaps because the instruction cache is not large enough.

    What CPUs was I targeting?

    My focus was on current x86 CPUs running in 32-bit mode. (However, these days I rarely run any code in 32-bit mode, so the next revision of RAVM will be 64-bit.)

    Early x86 CPUs like the Pentium 4 are not the focus of my work because most consumers have newer Intel CPUs. Nor are AMD CPUs as I simply don't have access to them. I would however be curious to know how well RAVM performs on current AMD CPUs.

    The ARM processors are naturally of interest as well.

    Achieving security

    To protect against buffer overruns and the like, RAVM does bounds-checking for four memory regions, each of which is separate in memory.
    • The program stack
    • The code area
    • The data area
    • The register set
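
    A bounds-checked memory access of this kind might look like the following in C. This is a minimal sketch, not RAVM's actual code (which is x86 assembly): the Region type, the function names and the use of region-relative 32-bit addresses are all assumptions.

```c
#include <stdint.h>
#include <string.h>

/* One of the VM's separate memory regions (stack, code, data, ...). */
typedef struct {
    uint8_t *base;   /* host address where the region starts */
    uint32_t size;   /* region length in bytes */
} Region;

/* Returns 1 if the bytes [addr, addr + len) lie entirely inside the
   region. Written so that addr + len is never computed and therefore
   cannot wrap around. */
int in_bounds(const Region *r, uint32_t addr, uint32_t len)
{
    return len <= r->size && addr <= r->size - len;
}

/* Checked 32-bit load: returns 0 on success, -1 if out of range. */
int load32(const Region *r, uint32_t addr, uint32_t *out)
{
    if (!in_bounds(r, addr, (uint32_t)sizeof *out))
        return -1;
    memcpy(out, r->base + addr, sizeof *out);
    return 0;
}

/* Checked 32-bit store: returns 0 on success, -1 if out of range. */
int store32(Region *r, uint32_t addr, uint32_t value)
{
    if (!in_bounds(r, addr, (uint32_t)sizeof value))
        return -1;
    memcpy(r->base + addr, &value, sizeof value);
    return 0;
}
```

    Because each region has its own descriptor, an address that is valid for the data area cannot be used to reach the stack or the code area.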

    What is RAVM's architecture?

    Registers:
    • 256 32-bit registers.
    • A stack pointer.
    • An instruction pointer.
    There are no CPU flags.
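
    Expressed as a C structure, the machine state listed above is small. This is a sketch only: the names are hypothetical, and the 32-bit width of the stack and instruction pointers is an assumption, since only the general registers' width is stated.

```c
#include <stdint.h>
#include <string.h>

enum { RAVM_NUM_REGS = 256 };

typedef struct {
    uint32_t reg[RAVM_NUM_REGS];  /* 256 general-purpose 32-bit registers */
    uint32_t sp;                  /* stack pointer (width assumed) */
    uint32_t ip;                  /* instruction pointer (width assumed) */
    /* deliberately no flags field */
} RavmState;

/* Clears every register and sets the entry point and initial stack. */
void ravm_reset(RavmState *vm, uint32_t entry, uint32_t stack_top)
{
    memset(vm, 0, sizeof *vm);
    vm->ip = entry;
    vm->sp = stack_top;
}

/* An 8-bit index can only hold 0..255, so with exactly 256 registers it
   can never point outside the register file. */
uint32_t ravm_read_reg(const RavmState *vm, uint8_t index)
{
    return vm->reg[index];
}
```

    One side effect of pairing 256 registers with byte-wide indices is that a register access needs no explicit range test.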

    Unlike the Java VM, RAVM does not implement an object-oriented machine language. No instructions take an object pointer. Instead RAVM implements a somewhat ordinary 32-bit processor and this helps to keep it simple, small, fast and secure.

    Object-oriented programming is all well and good; however, it is not needed in every case.

    The KISS rule also applies. But the goal is to put a wall of separation between machine code and OOP. The conflation of the two risks adding more security holes.

    Instruction set architecture:
    • 32-bit instruction format.
    • 1- and 2-operand instructions.
    • Can support 3 operands in the future.
    • Presently only integer instructions.
    • Presently no vector instructions.

    The next iteration of RAVM will provide 64-bit registers.

    No flags? For now there are none, but some operations require a carry flag so I may add a carry flag later.
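
    Multi-word addition is the classic case. Without a carry flag, the carry has to be recomputed with an unsigned comparison, the same idiom used on flagless instruction sets such as MIPS. The sketch below shows it in C for a 64-bit add built from 32-bit halves; the function names are made up for illustration.

```c
#include <stdint.h>

/* Adds two 32-bit values and returns the carry out (0 or 1) without any
   flags register: an unsigned sum wrapped around exactly when it ends
   up smaller than one of its operands. */
uint32_t add32_carry(uint32_t a, uint32_t b, uint32_t *sum)
{
    *sum = a + b;
    return *sum < a;
}

/* 64-bit addition built from 32-bit halves, as a 32-bit VM without a
   carry flag would have to do it. */
void add64(uint32_t a_lo, uint32_t a_hi,
           uint32_t b_lo, uint32_t b_hi,
           uint32_t *r_lo, uint32_t *r_hi)
{
    uint32_t carry = add32_carry(a_lo, b_lo, r_lo);
    *r_hi = a_hi + b_hi + carry;
}
```

    With a carry flag the comparison disappears, which is the argument for adding one later.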

    Tools

    RASM

    My assembler is called rasm. It takes .asm files and outputs RAVM machine code.

    What about a compiler?

    At this time there is no compiler. I may eventually adapt a C compiler to produce RAVM machine code.

    Downloads

    x86 RAVM

    Downloads available here.

    ARM RAVM

    In mid-July 2013, I began a port of RAVM to the ARM processor, in the form of an iOS app. This effort has somewhat stalled.

    Intel64 (x86_64) RAVM

    This has not yet been started.

    Contributing

    There is a need for someone to adapt a C compiler (or a compiler for any other language) to generate RAVM bytecode. Perhaps LLVM could be adapted to this purpose.



    © Zack Smith