zsmith.co

Object-Oriented Assembly Language Programming

Revision 9
© 2017-2019 by Zack Smith. All rights reserved.

Introduction

Object-oriented programming (OOP) became a dominant paradigm a few decades ago. People think of C++, Java and Objective-C as prominent examples of object-oriented languages, and it is easy to assume that OOP requires such a language. This is not the case. OOP is just a paradigm, and it does not strictly require an object-oriented programming language. You can do OOP in C, for instance, and you can even do it in assembly language. (See my article on Object-Oriented C programming.)

The ideal approach to object-oriented x86 assembly programming

A simple technique for implementing an object and its class in assembly is to have two chunks of RAM:

  • A class struc, which holds method pointers.
  • An object struc, which holds the instance variables (ivars) and a pointer to a class struc, through which the method pointers are reached.

Because you're coding in assembly, where every memory access is another instruction you must write out, it is best to have one class struc per class that does not point to a superclass struc, lest accessing method pointers become tedious. Thus each class struc should include all method pointers, i.e. those of the class you're writing plus those of all its superclasses.

Likewise, each object struc should have superclass instance variables followed by ivars for subclasses including the class you're writing, so that if you call a superclass method, its idea of where its ivars are matches the layout of your object struc.

Here's an example of a class struc for a class with two superclasses:

  • StringMutable
  • inherits from String
  • inherits from Object

 ; StringMutable class structure
 struc string_mutable_class
  ;--------------
  ; from Object
  .magic resq 1 ; class identifier
  .destroy resq 1 ; destructor
  .print resq 1 ; print method
  ;--------------
  ; from String
  .length resq 1 ; method to get length
  .characterAt resq 1 ; method to get character
  ;--------------
  ; StringMutable methods:
  .setString resq 1
  .setCharacterAt resq 1
  .insertCharacterAt resq 1
  .appendString resq 1
  .appendCharacter resq 1
  .truncateAt resq 1
  .toupper resq 1
  .tolower resq 1
  .reverse resq 1
  .size:
 endstruc

The initialization of the class struc calls the superclasses' initializers first, in order.

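 StringMutable_init: ; rdi = pointer to the class struc being filled in
  call Object_init ; superclass initializers run first (assumed to preserve rdi)
  call String_init
  ;=== Set StringMutable's methods:
  lea rax, [rel StringMutableClass_struct]
  mov [rdi + string_mutable_class.magic], rax ; offset 0: class identifier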
  lea rax, [rel StringMutableClass_print]
  mov [rdi + string_mutable_class.print], rax
  lea rax, [rel StringMutableClass_destroy]
  mov [rdi + string_mutable_class.destroy], rax
  lea rax, [rel StringMutableClass_insertCharacterAt]
  mov [rdi + string_mutable_class.insertCharacterAt], rax
  lea rax, [rel StringMutableClass_setCharacterAt]
  mov [rdi + string_mutable_class.setCharacterAt], rax
  lea rax, [rel StringMutableClass_append]
  mov [rdi + string_mutable_class.appendString], rax
  lea rax, [rel StringMutableClass_truncateAt]
  mov [rdi + string_mutable_class.truncateAt], rax
  lea rax, [rel StringMutableClass_toupper]
  mov [rdi + string_mutable_class.toupper], rax
  lea rax, [rel StringMutableClass_tolower]
  mov [rdi + string_mutable_class.tolower], rax
  mov rax, rdi
  ret

If any superclass method is to be overridden by a subclass, that subclass's init method must overwrite the superclass method pointer for it to point to its own variant, as I do in OOC.

Each object has to have a memory layout that is compatible with parent and derived classes. Here is an example:

 ; StringMutable object structure
 struc string_mutable_object
  ;=== Object instance variables:
  .magic resd 1 ; code indicating object type
  .retainCount resd 1 ; number of users of this object
  .class resq 1 ; class struc pointer
  ;=== String instance variables:
  .buffer resq 1
  .length resq 1
  ;=== StringMutable instance variables:
  .allocatedSize resq 1
  .size:
 endstruc
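
To see why the shared prefix matters, here is a minimal sketch of a superclass method. The string_object struc and the String_length label are assumptions for illustration, not code from my implementation. Because .length sits at the same offset in both layouts, the method works unchanged on a StringMutable object.

```nasm
; Sketch only: string_object and String_length are assumed names.
struc string_object
  ;=== Object instance variables:
  .magic        resd 1   ; offset 0
  .retainCount  resd 1   ; offset 4
  .class        resq 1   ; offset 8
  ;=== String instance variables:
  .buffer       resq 1   ; offset 16
  .length       resq 1   ; offset 24
endstruc

section .text
; rdi = pointer to a String object or to any subclass object
String_length:
  mov rax, [rdi + string_object.length] ; offset 24 in string_mutable_object too
  ret
```

If a subclass inserted an ivar before .length instead of appending it after the String ivars, this method would read the wrong field.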

In use:

 ; Allocate an object
   call StringMutable_new ; object returned in rax
   mov rdi, rax ; (rdi = our object)
 ; Give it a value
   mov rsi, 'X'
   mov rax, [rdi + string_mutable_object.class] ; get pointer to class
   call [rax + string_mutable_class.appendCharacter] ; get method pointer and call it
 ; Call the print method
   mov rax, [rdi + string_mutable_object.class] ; get pointer to class
   call [rax + string_mutable_class.print] ; get method pointer and call it
 ; Decrement retain count and release the object
   call Object_release
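
The two-instruction dispatch sequence repeats for every method call, so it could be wrapped in a macro. The following is a sketch, assuming the two strucs defined earlier; the SEND macro name is hypothetical. It also assumes, as the listing above does, that methods leave rdi intact.

```nasm
; Hypothetical helper macro, not part of the original code.
; %1 = method offset within the class struc,
;      e.g. string_mutable_class.print
; Expects the object pointer in rdi; clobbers rax.
%macro SEND 1
  mov  rax, [rdi + string_mutable_object.class] ; get pointer to class
  call [rax + %1]                               ; get method pointer and call it
%endmacro

; Usage, with the object in rdi and the character in rsi:
  mov  rsi, 'X'
  SEND string_mutable_class.appendCharacter
  SEND string_mutable_class.print
```

Since every object struc keeps .class at the same offset, the same macro would serve objects of any class in the hierarchy.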

Summary

Ease of programming

Coding in assembly language is inherently more difficult and error-prone than coding in a high-level language.

The NASM assembler can make OOP difficult because struc layouts cannot be inherited (same as C).

Calculating indices of ivars and methods is tedious but could be made simpler with macros, in theory.

Object strucs must be carefully laid out so that superclass methods can work on subclass objects.

However if you want to produce the absolute fastest possible object-oriented code, this may be the only way.
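
The macro idea mentioned above might look like the following sketch. The *_IVARS macro names are hypothetical. Each macro emits its superclass's fields first and then its own, which approximates layout inheritance.

```nasm
; Sketch only: the *_IVARS macro names are hypothetical.
%macro OBJECT_IVARS 0
  .magic         resd 1   ; code indicating object type
  .retainCount   resd 1   ; number of users of this object
  .class         resq 1   ; class struc pointer
%endmacro

%macro STRING_IVARS 0
  OBJECT_IVARS            ; superclass fields come first
  .buffer        resq 1
  .length        resq 1
%endmacro

%macro STRING_MUTABLE_IVARS 0
  STRING_IVARS            ; superclass fields come first
  .allocatedSize resq 1
%endmacro

struc string_object
  STRING_IVARS
endstruc

struc string_mutable_object
  STRING_MUTABLE_IVARS
endstruc
```

With this arrangement, adding an ivar to OBJECT_IVARS shifts every subclass layout consistently, and NASM still defines string_object_size and string_mutable_object_size automatically. The same pattern could be applied to the class strucs that hold method pointers.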

Alternatives

While OOA is an interesting curiosity, it does tie you to a particular architecture.

A wiser approach is to use Object-Oriented C (OOC), which achieves the speed of C++ without the need for a 1000-page book to explain the language.

In my variant of OOC, which I use in my bandwidth benchmark, the syntax slightly resembles that of Objective-C.