zsmith.co

Object-Oriented Assembly Language Programming

Revision 6
© 2017-2018 by Zack Smith. All rights reserved.

Introduction

Object-oriented programming (OOP) became a dominant paradigm a few decades ago. People think of C++, Java and Objective-C as prominent examples of object-oriented languages, and it is easy to assume that OOP requires such a language. This is not the case. OOP is just a paradigm, so it is not strictly necessary to use an object-oriented programming language to do object-oriented programming. You can do OOP in C, for instance, and you can even do it in assembly language. (See my article on Object-Oriented C programming.)

Four approaches to object-oriented x86 assembly programming

1. Object strucs pointing to class strucs

A simple technique for implementing an object and its class in x86 assembly is to allocate a chunk of RAM for an object struc, which holds the instance variables (ivars) and a class pointer that gives access to the methods. That pointer points to a second chunk of RAM, the class struc, of which there is one per class. The class struc contains method pointers plus a pointer to the superclass struc. If an object's class lacks a method, the object can reach the superclass's methods through the class's super pointer.

 ; Base object structure
 struc object
  .class resq 1 ; class pointer
  .size:
 endstruc
 
 ; Base class structure
 struc class
  .super resq 1 ; parent class
  .name resq 1 ; description string ivar
  .new resq 1 ; instantiation method
  .destroy resq 1 ; destruction method
  .print resq 1 ; diagnostic print method
  .size:
 endstruc
 
 section .data
 Class_name_string: db 'Class', 10, 0
 Class_print_string: db 'This is a test', 10, 0
 
 ; Hard-coded base class
 Class:
 istruc class
   at class.super, dq 0
   at class.name, dq Class_name_string
   at class.new, dq Class_new
   at class.destroy, dq Class_destroy
   at class.print, dq Class_print
 iend

In use:

 ; Allocate an object
   call [rel Class + class.new] ; object returned in rax
   mov rdi, rax
 
 ; Call the print method (rdi = object)
   mov rax, [rdi + object.class] ; get pointer to class
   call [rax + class.print] ; call method
 
 ; Call the superclass's print method
   mov rax, [rdi + object.class] ; get pointer to class
   mov rax, [rax + class.super] ; get pointer to superclass
   call [rax + class.print] ; call the superclass's method
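The fallback described earlier, where a class that lacks a method defers to its superclass, can be sketched as a small lookup routine. This is only an illustration under stated assumptions: find_method is a hypothetical helper that does not appear in the article, a method slot is assumed to hold 0 when the class does not implement that method, and .super is assumed to sit at offset 0 of every class struc.

```nasm
; Sketch only: find_method is a hypothetical helper, not part of the article.
; Assumes a method slot holds 0 when the class does not implement the method.
default rel

struc class
  .super   resq 1 ; parent class, 0 for the root class
  .name    resq 1 ; description string
  .new     resq 1 ; instantiation method
  .destroy resq 1 ; destruction method
  .print   resq 1 ; diagnostic print method
endstruc

section .text
global find_method

; In:  rax = class pointer, rcx = byte offset of the method slot
; Out: rax = method address, or 0 if no class in the chain implements it
find_method:
.next:
    test rax, rax                  ; walked past the root class?
    jz   .done                     ; yes: return 0
    mov  rdx, [rax + rcx]          ; read this class's slot
    test rdx, rdx
    jnz  .found                    ; non-zero: implemented at this level
    mov  rax, [rax + class.super]  ; otherwise try the parent class
    jmp  .next
.found:
    mov  rax, rdx
.done:
    ret
```

A caller would load the object's class pointer into rax, put class.print into rcx, call find_method, and then call rax if the result is non-zero. The walk costs one extra memory read per inheritance level, which is part of why this first approach is the slower one.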

String class:

 ; String class structure
 struc string_class
  .super resq 1 ; parent class (Object)
  .name resq 1 ; description string ivar
  .new resq 1 ; instantiation method
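  .destroy resq 1 ; destruction method
  .print resq 1 ; diagnostic print method
  ; Inherited slots above keep the same offsets as in struc class,
  ; so code that calls [rax + class.print] also works on a String.
  .setUTF8String resq 1 ; method
  .characterAt resq 1 ; method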
  .size:
 endstruc
 
 ; String object structure
 struc string_object
  .class resq 1 ; class pointer
  ; Instance variables:
  .buffer resq 1 ; ivar
  .length resq 1 ; ivar
  .size:
 endstruc

StringMutable class:

 ; StringMutable class structure
 struc string_mutable_class
  .super resq 1 ; parent class (String)
  .name resq 1 ; description string ivar
  .new resq 1 ; instantiation method
  .destroy resq 1 ; destruction method
  .print resq 1 ; diagnostic print method
  ; Methods inherited from String:
  .setUTF8String resq 1
  .characterAt resq 1
  ; Methods for mutability:
  .append resq 1
  .truncateAt resq 1
  .toupper resq 1
  .tolower resq 1
  .size:
 endstruc
 
 ; StringMutable object structure
 struc string_mutable_object
  .class resq 1 ; class pointer
  ; String instance variables:
  .buffer resq 1
  .length resq 1
  ; StringMutable instance variables:
  .bufferSize resq 1
  .size:
 endstruc
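The strucs above only describe layout. Each class also needs one instance of its class struc in memory, just as Class was hard-coded earlier. Here is a minimal sketch of what that could look like for String. It assumes that the String_* routines and the Class label are defined in another file, and the slot order shown is an assumption of this example (inherited slots first, so they keep the base-class offsets).

```nasm
; Sketch only: the String_* routines and Class are assumed to exist elsewhere.
default rel
extern Class
extern String_new, String_destroy, String_print
extern String_setUTF8String, String_characterAt

struc string_class
  .super         resq 1 ; parent class
  .name          resq 1 ; description string
  .new           resq 1 ; inherited slot, overridden
  .destroy       resq 1 ; inherited slot, overridden
  .print         resq 1 ; inherited slot, overridden
  .setUTF8String resq 1 ; new in String
  .characterAt   resq 1 ; new in String
endstruc

section .data
String_name_string: db 'String', 10, 0

; The one and only String class record
global String
String:
istruc string_class
  at string_class.super,         dq Class
  at string_class.name,          dq String_name_string
  at string_class.new,           dq String_new
  at string_class.destroy,       dq String_destroy
  at string_class.print,         dq String_print
  at string_class.setUTF8String, dq String_setUTF8String
  at string_class.characterAt,   dq String_characterAt
iend
```

Every String object then stores the address of String in its .class field, so the per-object cost stays at one pointer however many methods the class has.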

A possible pitfall of this approach:

  • An extra pointer (from object to class) has to be followed every time a method is called.

Benefits include:

  1. Objects are quite small.
  2. Object initialization is fast.

2. No class strucs, objects combine data and methods

An extra level of indirection to access a method may be unacceptable in some cases. To alleviate the slowness, we can do away with the class strucs. Here is a simple approach that does not allow for inheritance.

 ; StringMutable object structure
 struc string_mutable
  ;--------------
  ; from Object
  .magic resq 1 ; class identifier
  .name resq 1 ; class ivar
  .new resq 1 ; instantiation method
  .destroy resq 1 ; destruction method
  .print resq 1 ; diagnostic print method
  ;--------------
  ; from String
  .buffer resq 1 ; object ivar
  .length resq 1 ; object ivar
  .setUTF8String resq 1 ; method
  .characterAt resq 1 ; method
  ;--------------
  ; StringMutable
  .bufferSize resq 1
  .setCharacterAt resq 1
  .insertCharacterAt resq 1
  .append resq 1
  .truncateAt resq 1
  .toupper resq 1
  .tolower resq 1
  .size:
 endstruc

In use:

 ; Allocate an object
   call [rel StringMutable + string_mutable.new]
   mov rdi, rax
 
 ; Give the string a value.
   lea rsi, [rel test_string]
   call [rdi + string_mutable.setUTF8String]
 
 ; Call the print method (rdi = object)
   call [rdi + string_mutable.print]
 
 ; Call the print method of an object in rdx
   push rdi
   mov rdi, rdx
   call [rdi + string_mutable.print]
   pop rdi
 
 ; Call the superclass's print method
 ; NOTE! There is no super class struc.
   call String_print

Possible pitfalls of this approach are:

  • There is no super class struc so there is no way to call the super's methods except directly.
  • Objects are bigger.
  • Object initialization takes longer.
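The initialization cost comes from having to fill in every method slot of every new object. One way to do that, sketched here under assumptions, is to keep a pre-filled prototype object in the data section and copy it over freshly allocated memory. The names proto and StringMutable_init are hypothetical, the struc is trimmed to a few slots to keep the example short, and the StringMutable_* routines are assumed to be defined elsewhere.

```nasm
; Sketch only: proto and StringMutable_init are hypothetical names.
default rel
extern StringMutable_new, StringMutable_destroy, StringMutable_print

struc string_mutable            ; trimmed version of the struc above
  .magic   resq 1 ; class identifier
  .name    resq 1 ; class ivar
  .new     resq 1 ; method
  .destroy resq 1 ; method
  .print   resq 1 ; method
  .buffer  resq 1 ; object ivar
  .length  resq 1 ; object ivar
endstruc

section .data
name_string: db 'StringMutable', 0

; Prototype object: method slots pre-filled, ivars zeroed
proto:
istruc string_mutable
  at string_mutable.magic,   dq 0x53744d75   ; arbitrary identifier
  at string_mutable.name,    dq name_string
  at string_mutable.new,     dq StringMutable_new
  at string_mutable.destroy, dq StringMutable_destroy
  at string_mutable.print,   dq StringMutable_print
  at string_mutable.buffer,  dq 0
  at string_mutable.length,  dq 0
iend

section .text
global StringMutable_init

; In:  rdi = pointer to string_mutable_size bytes of writable memory
; Out: rax = the same pointer, now holding an initialized object
StringMutable_init:
    mov  rax, rdi                     ; keep the object pointer for the caller
    lea  rsi, [rel proto]             ; source: the prototype
    mov  ecx, string_mutable_size / 8 ; number of quadwords to copy
    cld
    rep  movsq                        ; one store per slot
    ret
```

The copy touches every slot, so the cost of creating an object grows with the number of methods, which is exactly the pitfall listed above.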

Note that although one might expect an assembler such as NASM to let a struc inherit members from another struc, neither NASM nor YASM allows this. The struc directive is actually just a limited macro.
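Because struc is itself a macro, one possible workaround is to put each class's members in a macro of their own and invoke that macro inside every struc that should "inherit" them. The following is a sketch, not something the article's code does: the macro names OBJECT_FIELDS and STRING_FIELDS and the struc name string_obj are hypothetical. It relies on NASM attaching local labels such as .magic to the struc name that precedes them.

```nasm
; Sketch only: OBJECT_FIELDS, STRING_FIELDS and string_obj are hypothetical.

%macro OBJECT_FIELDS 0
  .magic   resq 1 ; class identifier
  .name    resq 1 ; class ivar
  .new     resq 1 ; instantiation method
  .destroy resq 1 ; destruction method
  .print   resq 1 ; diagnostic print method
%endmacro

%macro STRING_FIELDS 0
  OBJECT_FIELDS             ; everything from Object comes first
  .buffer        resq 1 ; object ivar
  .length        resq 1 ; object ivar
  .setUTF8String resq 1 ; method
  .characterAt   resq 1 ; method
%endmacro

struc string_obj
  STRING_FIELDS
endstruc

struc string_mutable
  STRING_FIELDS             ; same offsets as in string_obj
  .bufferSize resq 1
  .append     resq 1
  .truncateAt resq 1
endstruc
```

With this arrangement string_obj.print and string_mutable.print have the same offset, and adding a member to OBJECT_FIELDS updates every derived struc at the next assembly.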

3. Object contains inherited methods

To support fast inheritance without following pointers to superclass strucs, we have to include inherited methods in each object's struc, further worsening the memory use situation seen in approach 2.

 ; StringMutable object structure
 struc string_mutable
  ;--------------
  ; from Object
  .magic resq 1 ; class identifier
  .name resq 1 ; ivar
  .object_new resq 1 ; instantiation method
  .object_destroy resq 1 ; destruction method
  .object_print resq 1 ; diagnostic print method
  ;--------------
  ; from String
  .buffer resq 1 ; ivar
  .length resq 1 ; ivar
  .string_new resq 1 ; method
  .string_destroy resq 1 ; method
  .string_print resq 1 ; method
  .string_setUTF8String resq 1 ; method
  .string_characterAt resq 1 ; method
  ;--------------
  ; StringMutable
  .bufferSize resq 1 ; ivar
  .string_mutable_new resq 1 ; method
  .string_mutable_destroy resq 1 ; method
  .string_mutable_print resq 1 ; method
  .string_mutable_setCharacterAt resq 1
  .string_mutable_insertCharacterAt resq 1
  .string_mutable_append resq 1
  .string_mutable_truncateAt resq 1
  .string_mutable_toupper resq 1
  .string_mutable_tolower resq 1
  .size:
 endstruc
 
 ; The same struc rewritten for concision (use one definition or the other):
 struc string_mutable
  ;--------------
  ; from Object
  .magic resq 1 ; class identifier
  .name resq 1 ; ivar
  .__new resq 1 ; method
  .__destroy resq 1 ; method
  .__print resq 1 ; method
  ;--------------
  ; from String
  .buffer resq 1 ; ivar
  .length resq 1 ; ivar
  ._new resq 1 ; method
  ._destroy resq 1 ; method
  ._print resq 1 ; method
  ._setUTF8String resq 1 ; method
  ._characterAt resq 1 ; method
  ;--------------
  ; StringMutable
  .bufferSize resq 1 ; ivar
  .new resq 1 ; method
  .destroy resq 1 ; method
  .print resq 1 ; method
  .setCharacterAt resq 1
  .insertCharacterAt resq 1
  .append resq 1
  .truncateAt resq 1
  .toupper resq 1
  .tolower resq 1
  .size:
 endstruc

Possible pitfalls of this approach are:

  • Objects are potentially much bigger, taking up multiple cache lines.
  • Typing out longer method names could be tedious.
  • Previously assembled code becomes invalid if a programmer adds a method or ivar to any superclass, or removes one from it.
  • It stores pointers to methods that will be little used, e.g. String_new.

Possible benefits:

  • It discourages bad coding practices:
    • Excessive inheritance depth.
    • Excessive methods per class.
  • It may be faster than the 1st approach due to fewer memory accesses.
  • Proper multi-level initialization and destruction can be done.
  • Multiple-inheritance is a no-brainer.
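The multi-level destruction mentioned above works because each level's destructor has its own slot. Here is a minimal sketch, assuming the concise naming scheme (.destroy, ._destroy, .__destroy). The struc is trimmed to the slots the example touches, the routine bodies are placeholders that only clear an ivar, and Object's destructor is assumed to be stored in .__destroy and to end with ret.

```nasm
; Sketch only: routine bodies are placeholders; the struc is trimmed.
default rel

struc string_mutable
  .magic      resq 1 ; class identifier
  .__destroy  resq 1 ; Object's destructor
  .buffer     resq 1 ; String ivar
  .length     resq 1 ; String ivar
  ._destroy   resq 1 ; String's destructor
  .bufferSize resq 1 ; StringMutable ivar
  .destroy    resq 1 ; StringMutable's destructor
endstruc

section .text
global StringMutable_destroy
global String_destroy

; In: rdi = object. Tears down StringMutable state, then chains to String.
StringMutable_destroy:
    mov  qword [rdi + string_mutable.bufferSize], 0
    jmp  [rdi + string_mutable._destroy]   ; tail call, rdi still = object

; In: rdi = object. Tears down String state, then chains to Object.
; (Releasing the buffer itself is omitted from this sketch.)
String_destroy:
    mov  qword [rdi + string_mutable.length], 0
    jmp  [rdi + string_mutable.__destroy]  ; Object's destructor returns to the original caller
```

A caller starts the chain with call [rdi + string_mutable.destroy]. Each level needs only one indirect jump through the object itself, with no class or superclass struc to follow.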

4. Method pointers are not stored

For reasons of computer security, it may be undesirable to even store method pointers, since they can be overwritten with pointers to malware routines.

This is also perhaps the fastest and simplest approach to object-oriented assembly (OOA) coding. The catch is that, without a stored method pointer, you always have to know which version of a method is appropriate for a given object.

There is no automatic resolution of which function implements a given method. In the three approaches above, that resolution is accomplished by reading a method pointer.

So let's say you have an Array of heterogeneous objects and you want to run the same doThis method on each of them. You'd have to check the class of each object at runtime and call the appropriate method by name, e.g. String_doThis, Matrix_doThis etc.

String and StringMutable objects:

 ; String object structure
 struc string_object
  .class resq 1 ; class pointer
  ; Instance variables:
  .buffer resq 1 ; ivar
  .length resq 1 ; ivar
  .size:
 endstruc
 
 ; StringMutable object structure
 struc string_mutable_object
  .class resq 1 ; class pointer
  ; String instance variables:
  .buffer resq 1
  .length resq 1
  ; StringMutable instance variables:
  .bufferSize resq 1
  .size:
 endstruc

In use:

 ; Allocate an object
   call StringMutable_new
   mov rdi, rax
 
 ; Give the string a value.
   lea rsi, [rel test_string]
   call StringMutable_setUTF8String
 
 ; Call the print method (rdi = object)
   call StringMutable_print
 
 ; Call the print method of an object in rdx
   push rdi
   mov rdi, rdx
   call StringMutable_print
   pop rdi
 
 ; Call the superclass's print method
   call String_print
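For the heterogeneous Array case described earlier, the runtime class check could look like the sketch below. All names in it are hypothetical: doThis_any is the dispatcher, String_doThis and Matrix_doThis are assumed to be defined elsewhere, and StringClass and MatrixClass are assumed to be labels whose addresses serve only as unique class identifiers stored in each object's .class field.

```nasm
; Sketch only: every name below is hypothetical.
default rel
extern String_doThis, Matrix_doThis
extern StringClass, MatrixClass   ; one unique address per class

struc object
  .class resq 1 ; identifies the class; no method pointers behind it
endstruc

section .text
global doThis_any

; In: rdi = object of unknown class
doThis_any:
    mov  rax, [rdi + object.class]  ; which class is this?
    lea  rdx, [rel StringClass]
    cmp  rax, rdx
    jne  .not_string
    jmp  String_doThis              ; tail call, rdi unchanged
.not_string:
    lea  rdx, [rel MatrixClass]
    cmp  rax, rdx
    jne  .unknown
    jmp  Matrix_doThis              ; tail call, rdi unchanged
.unknown:
    ret                             ; unrecognized class: do nothing
```

Every call target here is fixed at assembly time, so overwriting data in an object cannot redirect the call to an arbitrary address. The price is that the dispatcher has to be edited whenever a class is added.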

Possible pitfalls of this approach are:

  • Methods may end up simply calling superclass methods, e.g. StringMutable_length just calls String_length. With stored method pointers, by contrast, a derived class only overwrites the superclass's pointer when it actually overrides the method.

Possible benefits:

  • It removes a method pointer lookup making the code faster.

Summary

Ease of programming

Coding in assembly is inherently more difficult than coding in a high-level language. The NASM assembler does not make OOP easier. For instance, calculating indices of ivars and methods is tedious, and strucs cannot inherit members from other strucs. However, if you want to produce the absolute fastest possible object-oriented code, this may be the only way.

Inheritance

If inheritance doesn't matter, as might be the case in a very restrictive embedded system, the faster 2nd approach that uses only object strucs can be used.

If inheritance does matter, but speed is desired, the faster 3rd approach that uses only object strucs but with inherited methods included can be used, so long as the struc is not allowed to grow too large.

If speed is less important, and inheritance is needed because you plan to call the superclass's methods, then the slower first approach would be best.

For small object size and small code size, doing away with stored method pointers is the solution.