Solution Exercise 1

I will give here the solution for 32 bits and at the end of the post briefly mention the differences in 64 bits.

The exercise noted: “This post is relevant to help: Spur’s new object format”. Aside from the actual Slang code, that post explains how objects are represented in memory in Spur. This solution does *not* repeat all the information in that post; please read it before attempting the exercise or reading the solution.

Question 1:

During the execution of the method, two objects are allocated: the person named John (we will call that object John for the rest of the solution) and the runtime array holding the siblings.

In 32 bits, John is a fixed-sized pointer object (it has instance variables but no variable fields accessible with at:). John has 5 instance variables, so it needs 5 pointers, that is 5 times 32 bits, to store its fields. As it has fewer than 255 slots, John’s header is 64 bits wide. The total size of John is therefore 64 + (5*32) = 224 bits.

For the array holding the siblings, we have a similar logic and the object is 64 + (2*32) = 128 bits wide.

Now how many bytes are allocated in total? Even in 32 bits, objects are aligned on a 64-bit boundary. As John’s size in bits is not a multiple of 64, a 32-bit free slot is left between the 2 objects for alignment. Then, depending on where the object allocated just before John ends, alignment bits may also be required before John.

The figure below summarizes the memory representation of the two objects and the amount of memory allocated. The total allocated during the execution of the method is therefore 224 bits (John) + 32 bits (alignment for the array) + 128 bits (the array) = 384 bits, that is 48 bytes, plus whatever bytes are required to align John.
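The arithmetic above can be checked with a small sketch. It is only an illustration, not VM code: the function name `object_size_bytes` is made up for this post, and it assumes objects with fewer than 255 slots (so a single 64-bit header) and ignores whatever padding may be needed before John.

```python
HEADER_BYTES = 8      # 64-bit header (objects with fewer than 255 slots)
ALIGNMENT_BYTES = 8   # objects are aligned on a 64-bit boundary

def object_size_bytes(num_slots, word_bytes=4):
    """Return (raw_size, padded_size) in bytes for a pointer object.

    word_bytes is 4 in 32 bits and 8 in 64 bits.
    """
    raw = HEADER_BYTES + num_slots * word_bytes
    # Round up to the next multiple of the alignment.
    padded = -(-raw // ALIGNMENT_BYTES) * ALIGNMENT_BYTES
    return raw, padded

john = object_size_bytes(5)      # 5 instance variables
siblings = object_size_bytes(2)  # array with 2 elements
total = john[1] + siblings[1]

print(john, siblings, total)  # (28, 32) (16, 16) 48
```

John takes 28 bytes (224 bits) and is padded to 32, the array takes 16 bytes (128 bits), which gives 48 bytes in total.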


Now what is the theoretical value of each bit inside the objects?

Let’s start with John.

In 32 bits, the second field of John is an immediate object (31-bit signed integers are encoded as immediate objects in 32 bits), while the other fields are just pointers to the actual objects.

Now let’s look at the value of each field in the 64 bits header of John:

  • bits 0 to 7: number of slots (in words) of the object: value 5 (00000101)
  • bit 8: isMarked bit: value 0 at allocation (0)
  • bit 9: unused: value 0 at allocation (0)
  • bits 10 to 31: identity hash: value 0 at allocation (0000000000000000000000)
  • bit 32: isGray: value 0 at allocation (0)
  • bit 33: isPinned: value 0 at allocation (0)
  • bit 34: isRemembered: value 0 at allocation (0)
  • bits 35 to 39: object format: fixed-sized pointer object, value 1 (00001)
  • bit 40: isReadOnly: value 0 at allocation (0)
  • bit 41: unused: value 0 at allocation (0)
  • bits 42 to 63: class index: the actual class index of Person, which we call K. It depends on many factors in the runtime; let’s say the value is 2048 (0000000000100000000000)

The header field of John is therefore (64 bits wide):
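As a rough sketch, the header can be assembled from the fields listed above. The snippet assumes the fields are laid out from the most significant bit down, with numSlots taking 8 bits (which is what the 255-slot limit implies) and the identity hash and class index taking 22 bits each. The names `HEADER_LAYOUT` and `pack_header` are made up for this illustration, the class index 2048 is the example value chosen above, and the array format value 2 is an assumption (the text only says the format differs).

```python
# (field name, width in bits), listed from the most significant bit down.
HEADER_LAYOUT = [
    ("numSlots", 8), ("isMarked", 1), ("unused1", 1), ("identityHash", 22),
    ("isGray", 1), ("isPinned", 1), ("isRemembered", 1), ("format", 5),
    ("isReadOnly", 1), ("unused2", 1), ("classIndex", 22),
]

def pack_header(**fields):
    """Pack the given fields into a 64-bit integer; missing fields are 0."""
    unknown = set(fields) - {name for name, _ in HEADER_LAYOUT}
    if unknown:
        raise ValueError(f"unknown header fields: {sorted(unknown)}")
    header = 0
    for name, width in HEADER_LAYOUT:
        value = fields.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        header = (header << width) | value
    return header

# John: 5 slots, fixed-sized pointer object (format 1), example class index.
john_header = pack_header(numSlots=5, format=1, classIndex=2048)
# The siblings array: 2 slots, class index 51; format 2 is assumed here.
array_header = pack_header(numSlots=2, format=2, classIndex=51)

print(format(john_header, "064b"))
print(hex(john_header))  # 0x500000001000800
```

Printing the value with `format(..., "064b")` gives the 64 bits in the same order as the list above.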

The figure below summarizes the value of the bits inside the object John:

The array is very similar, with a different object format. One exception is that Array is a class known to the VM, and its class index is fixed (51), present on the first page of the class table. The following figure details the value of each bit in the array:


Question 2:

Here is a representation of each object directly referenced by John:


The value of each bit in John and the array has been detailed before.

In the case of 1.83, BoxedFloats are represented as a 2-slot word object (the object format needs to be a word object) and the inner 64-bit value is the IEEE double representation of 1.83.
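To inspect that 64-bit value, here is a minimal sketch using Python's standard `struct` module. The helper name `double_bits` is made up for this post, and the split into a high and a low 32-bit word is only for display: the order in which the VM stores the two words in memory is not covered here.

```python
import struct

def double_bits(value):
    """Return the 64-bit IEEE 754 pattern of a float as an integer."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    return bits

bits = double_bits(1.83)
high_word, low_word = bits >> 32, bits & 0xFFFFFFFF

print(f"{bits:016x}")                      # the full 64-bit pattern
print(f"{high_word:08x} {low_word:08x}")   # as two 32-bit words
```

The two 32-bit words correspond to the 2 slots of the BoxedFloat in 32 bits.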

In the other objects, the main non-obvious thing is to correctly set the numSlots and object format fields of 189345472112 and ‘John’. NumSlots is always the number of words rounded up. 189345472112 needs 5 bytes, hence 2 words, and ‘John’ needs 4 bytes, hence 1 word. The object format encodes the number of bytes at the end of the object that cannot be accessed (in 32 bits, the object format of a byte object can be any number between 16 and 19). In 189345472112, the last 3 bytes of the object cannot be accessed, hence its object format is 19, as 19 – 16 (base byte object format value) = 3. In ‘John’, all the bytes in the last word can be accessed, hence the object format is 16, as 16 – 16 = 0.
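The rule for byte objects can be written down as a short sketch. The function names `byte_object_layout` and `byte_length` are made up for this illustration; the only inputs are the number of bytes in the object and the word size (4 bytes in 32 bits).

```python
BASE_BYTE_FORMAT = 16  # object format of a byte object with no unused bytes

def byte_object_layout(num_bytes, word_bytes=4):
    """Return (numSlots, object format) for a byte object of num_bytes bytes."""
    num_slots = -(-num_bytes // word_bytes)       # number of words, rounded up
    unused = num_slots * word_bytes - num_bytes   # inaccessible trailing bytes
    return num_slots, BASE_BYTE_FORMAT + unused

def byte_length(n):
    """Number of bytes needed to store the positive integer n."""
    return (n.bit_length() + 7) // 8

print(byte_length(189345472112))                        # 5
print(byte_object_layout(byte_length(189345472112)))    # (2, 19)
print(byte_object_layout(len("John".encode("ascii"))))  # (1, 16)
```

Passing `word_bytes=8` gives the 64-bit variant, where the format ranges from 16 to 23.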

Differences in 64 bits

The main differences are:

  • The header of the objects is still 64 bits wide, but each pointer is 64 bits long instead of 32 bits long.
  • 64-bit object alignment is easier to deal with in 64 bits (fewer bytes allocated to align the next object in question 1)
  • SmallIntegers are 61-bit signed integers instead of 31-bit signed integers, hence 189345472112 referred to by John is an immediate object. The representation of SmallIntegers is also different.
  • 1.83 referred by John is in the middle 8th of the double range, hence it is a SmallFloat (immediate object). The representation is detailed in the Spur object format blog post.
  • The object format of the byte objects is a bit different to deal with the number of bytes that cannot be accessed in the last word (a byte object can have an object format with any value between 16 and 23 instead of 16 and 19, since up to 7 bytes may be inaccessible).
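Two of these differences are easy to check numerically. The sketch below is only an illustration with made-up helper names (`fits_small_integer`, `pointer_object_bytes`); it tests whether a value fits in an n-bit signed integer and recomputes the sizes of John and the array with 64-bit slots.

```python
def fits_small_integer(value, bits):
    """True if value fits in a signed integer of the given bit width."""
    limit = 1 << (bits - 1)
    return -limit <= value < limit

def pointer_object_bytes(num_slots, word_bytes):
    """Header (8 bytes) plus slots, rounded up to a multiple of 8 bytes."""
    raw = 8 + num_slots * word_bytes
    return -(-raw // 8) * 8

n = 189345472112
print(fits_small_integer(n, 31), fits_small_integer(n, 61))   # False True

# In 64 bits every slot is 8 bytes, so no padding is needed between objects.
print(pointer_object_bytes(5, 8), pointer_object_bytes(2, 8))  # 48 24
```

The integer does not fit in 31 bits but fits easily in 61 bits, which is why it becomes an immediate object in 64 bits.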

I hope you liked the exercise.