The heap vs UninterpretedBytes vs ByteArray -- My misinterpretations of life -- Michael Lucas-Smith

2008-08-31

After my OpenGL Math experiment yesterday I found myself uneasy with the notion of converting large chunks of data back and forth between my own objects and OpenGL like I was doing. It seems obvious once you've made up your mind, but the general idea is "Don't do it".

I was doing it for a very specific and probably wrong reason. I wanted to have a generic Vector and Matrix class that I could share across frameworks. In fact, I've been using my Vector2 class in another framework I was building... I've since come to the conclusion that I should try to be as true to OpenGL as I possibly can.

So what does it mean to be true to OpenGL? Well the Matrix structure is an array of floats in memory, the vector is an array of floats in memory and when we start building up Buffer Objects, we have large arrays of floats in memory that represent all kinds of data interleaved together.

It's the last scenario that I cannot support nicely right now. So let me just explain it a bit more and then talk about how I want to change that. Say you have a million vector4s which define the geometry of a model. You may also have a million vector2s to describe the way textures attach to all those geometry vectors. You'll probably also have a million vector4s to describe the normals at those vertices, and you may even have other attributes.

Some people like to have a buffer object per 'type' of data, but more often you see all this data interleaved together into one buffer object. OpenGL is fine with either approach: all you have to do is tell it how your data is mapped and it's fine. An example of how this may be laid out in memory:

#( Vx Vy Vz Vw Cr Cg Cb Ca Nx Ny Nz Nw Tu Tv Ti ... )

This array describes one whole vertex. There may be more attributes we'll want to put in the array too, but for this example, we have V representing the Vector4, C representing the RGBA colour, N representing the normal Vector4 and T representing the texture attachment, with Ti representing the texture id. The texture id is something that we don't actually need at render time (one reason why you might want to split these things up into multiple buffers is to avoid sending data to the graphics memory that you don't need to).

This 15-component data structure is repeated a million times. It's important to remember that all of these values are probably either 32-bit integers or floats, in which case each individual "Vertex" takes up 15 x 4 = 60 bytes, or about 60 megabytes for the whole model.
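To make the arithmetic concrete, here is a minimal Python sketch of the interleaved layout above. The format string, the attribute names and the `buffer_size` helper are assumptions made for illustration (they are not from the OpenGL package); the sketch assumes four 32-bit floats each for V, C and N, two floats for Tu/Tv, and a 32-bit integer for Ti.

```python
import struct

# Assumed layout for one vertex, in memory order: V, C, N, T(u, v), Ti.
# '<' = little-endian with no padding, 'f' = 32-bit float, 'i' = 32-bit integer.
VERTEX_FORMAT = "<4f4f4f2fi"

# Bytes occupied by one vertex; this is the stride between consecutive vertices.
STRIDE = struct.calcsize(VERTEX_FORMAT)

# Byte offset of each attribute inside one vertex (names are hypothetical).
OFFSETS = {
    "position": 0,
    "colour": struct.calcsize("<4f"),
    "normal": struct.calcsize("<8f"),
    "texcoord": struct.calcsize("<12f"),
    "texture_id": struct.calcsize("<14f"),
}


def buffer_size(vertex_count):
    """Total bytes needed to hold vertex_count interleaved vertices."""
    return vertex_count * STRIDE
```

The stride and the per-attribute offsets are exactly the two numbers you hand to OpenGL when you tell it how the interleaved data is mapped.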

Now imagine the cost of this thing if we actually allocated objects for every single one of them? The tax on the garbage collector alone is enough to make you cower in fear. This is data and we don't need to treat each individual bit of it as an object unless we -want- to. So what I need is the idea of roving objects. Vector should point to a location in memory and interpret the next four floats accordingly. Colours are often represented as Vectors in OpenGL too. With this in mind, we can look at the objects we're representing here:

#( Vector4 Vector4 Vector4 Vector2 SmallInteger ... )

We can call this a Vertex object. Since the objects we have in the float array are accessed through a facade, why not group them together into a bigger facade object? This lets us see more easily how the Vertices fit together:

#( Vertex1 Vertex2 Vertex3 Vertex4 Vertex5 Vertex6 Vertex7 ... )
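The roving facade idea can be sketched as follows. This is a Python illustration rather than the Smalltalk classes discussed here; `Vector4View`, `VertexView` and `FLOATS_PER_VERTEX` are hypothetical names, and for simplicity the sketch assumes every component (including Ti) is stored as a 32-bit float.

```python
from array import array

FLOATS_PER_VERTEX = 15  # assumption: all 15 components stored as floats


class Vector4View:
    """Facade over four consecutive floats in a shared buffer; owns no data."""

    def __init__(self, floats, start):
        self.floats = floats
        self.start = start

    def move_to(self, start):
        # "Rove" to another location instead of allocating a new object.
        self.start = start
        return self

    @property
    def x(self):
        return self.floats[self.start]

    @x.setter
    def x(self, value):
        self.floats[self.start] = value

    def as_tuple(self):
        return tuple(self.floats[self.start:self.start + 4])


class VertexView:
    """Facade over one whole vertex; hands out vector facades on demand."""

    def __init__(self, floats, index=0):
        self.floats = floats
        self.index = index

    def _base(self):
        return self.index * FLOATS_PER_VERTEX

    @property
    def position(self):
        return Vector4View(self.floats, self._base())

    @property
    def colour(self):
        return Vector4View(self.floats, self._base() + 4)

    @property
    def normal(self):
        return Vector4View(self.floats, self._base() + 8)


# Two vertices' worth of data: the values 0.0 .. 29.0 in one flat float array.
floats = array("f", range(2 * FLOATS_PER_VERTEX))
second = VertexView(floats, 1)
```

Writing through a facade (for example `second.position.x = 99.0`) changes the underlying array directly, so nothing has to be copied back afterwards.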

A Matrix is not too different, though the ordering of its elements is back to front compared to the way you'd expect: it is stored column by column (column-major) rather than row by row. For animation it is common to have several matrices describing the range of motion for a "limb" in a "skeleton", so having one Matrix by itself may be common, but having an array of Matrices is also something we should account for. Similar to above, we can treat it as an array of floats. Here is an example of a 2x2 Matrix in array form:

#( f11 f21 f12 f22 ... ) and similarly we can pack it up as: #( Matrix2 ... )
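The "back to front" ordering means that element (row, column) lives at `column * rows + row` in the flat array. A small Python sketch of that lookup, with hypothetical helper names and zero-based indices:

```python
def column_major_index(row, column, rows):
    """Flat index of element (row, column), zero-based, in a column-major matrix."""
    return column * rows + row


def matrix_at(floats, base, row, column, rows=2):
    """Read one element of the matrix that starts at offset `base` in `floats`."""
    return floats[base + column_major_index(row, column, rows)]


# The 2x2 example from above, using the element names as placeholder values.
flat = ["f11", "f21", "f12", "f22"]
```

With this, `matrix_at(flat, 0, 0, 1)` picks out `f12` (first row, second column), and the `base` argument is what lets the same code walk an array of several matrices.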

So we're talking about very long arrays of floats, but we know we cannot afford to instantiate all these floats. So where do we store them, and how?

The title of this blog post should give you a hint. We have three options and I'll discuss the pros and cons of each:

The heap

Pros: When we receive a pointer back from OpenGL, we can apply it as a heap array of floats easily. Because the pointer is typed, the VM will convert values for us on access the fast way.

Cons: We have to remember the array lengths ourselves. We don't inherit the iteration API from Array.

UninterpretedBytes

Pros: Built for describing data structures and sharing them with the C world. We can allocate it in fixed space. We can use the VM accelerated floatAt: and floatAt:put: methods.

Cons: We can't make one from a pointer received from OpenGL without copying the data. We don't inherit the iteration API from Array. Allocation speed may be an issue.

ByteArray

Pros: We get all the Array APIs for iterating. We can copy the floatAt: and floatAt:put: primitives across.

Cons: We can't make one from a pointer received from OpenGL without copying the data. Allocation speed may be an issue.
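The difference between wrapping a pointer and copying from it can be illustrated by analogy in Python with `ctypes`. This is not VisualWorks code; the `source` array merely stands in for memory that somebody else owns, such as a pointer handed back by OpenGL.

```python
import ctypes
import struct

# Stand-in for externally owned memory (an assumption for the demo).
source = (ctypes.c_float * 4)(1.0, 2.0, 3.0, 4.0)
address = ctypes.addressof(source)

# "Heap" style: reinterpret the raw address as a float array. No copy is made,
# but we have to remember the length (4) ourselves.
view = (ctypes.c_float * 4).from_address(address)

# "ByteArray" style: copy the bytes into memory the runtime manages.
copied = bytearray(ctypes.string_at(address, ctypes.sizeof(source)))

# A later write to the original memory is visible through the view only.
source[0] = 42.0
first_in_view = view[0]
first_in_copy = struct.unpack_from("f", copied, 0)[0]
```

After the write, the view reports the new value while the copy still holds the old one, which is the trade-off in a nutshell: the view is cheap and always current, the copy is safe and self-describing but costs an allocation and a transfer.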

One of the main things I want to do is debug the vertex shaders. You can do that by looking at what you've rendered, but if you see nothing but black, it's not so useful. The feedback technique used in Lesson #7 - Math lets you see what the shaders are doing to all your numbers. It'd be nice if the resulting buffer object was as accessible as the input data was.

Because of this, I've decided to go with the heap approach. I'll have a FloatArray which will allocate its memory in the heap (or be given a pointer to it), holding on to a CPointer, as well as the facade classes which will sit over the top of the FloatArray. There'll be Vector, Matrix and some sort of Vertex to hold on to those things too as I evolve it further. I may make the Vertex class dynamic, in that you tell it what's at where when you initialize it, so that it's more general.
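A dynamic Vertex of the "tell it what's at where" kind might look something like the following Python sketch. `VertexLayout`, its `read` method and the field names are hypothetical and only illustrate the idea; the planned Smalltalk classes do not exist yet. As before, every component is assumed to be a float.

```python
from array import array


class VertexLayout:
    """Describes which attribute sits where inside one interleaved vertex."""

    def __init__(self, fields):
        # fields: list of (name, component_count) pairs in memory order.
        self.offsets = {}
        offset = 0
        for name, count in fields:
            self.offsets[name] = (offset, count)
            offset += count
        self.stride = offset  # components per vertex

    def read(self, floats, vertex_index, name):
        """Return the components of one attribute of one vertex as a tuple."""
        offset, count = self.offsets[name]
        start = vertex_index * self.stride + offset
        return tuple(floats[start:start + count])


layout = VertexLayout([
    ("position", 4),
    ("colour", 4),
    ("normal", 4),
    ("texcoord", 2),
    ("texture_id", 1),
])

# Two vertices' worth of placeholder data: 0.0 .. 29.0.
floats = array("f", range(2 * layout.stride))
```

Because the layout is data rather than code, the same facade could describe the output of a feedback pass just as easily as the input buffer.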

One outcome of this is that you should be able to throw a vertex shader at an array of data more easily and say "do this math" and get out a result as a general API of the OpenGL package.

I've yet to start on these changes, but I wanted to at least make up my mind on the approach I was going to take and talking it through on the blog certainly helps. Also thanks to Ken Causey and Travis Griggs for chatting with me about this at late hours in the night too.