DISCLAIMER: This article was migrated from the old blog thus may contain formatting and content differences compared to the original post. Additionally, it likely contains technical inaccuracies, opinions that I may no longer align with, and most certainly poor use of English (I was young and foolish :)). This article remains public for those who may find it useful despite its flaws.
OpenGL 3.1 introduced two new sources from where shaders can retrieve their data, namely uniform buffers and texture buffers. These can be used to accelerate rendering when heavy usage of application provided data happens like in case of skeletal animation, especially when combined with geometry instancing. However, even if the functionality is in the core specification for about a year now, there are few demos out there to show their usage, as so, there is a big confusion around when to use them and which one is more suitable for a particular use case.
Both AMD and NVIDIA have updated their GPU programming guides to present the latest facilities provided by both OpenGL and DirectX, however I still see that people don’t really understand how they work and that prevents them from effectively taking advantage of these features.
Once, at some online forum, I found somebody arguing why is this whole confusion introduced by the Khronos Group and why there is no general buffer type to use instead and the decision whether to use uniform or texture buffers should be a decision made by the driver. This particular post motivated me to write this article.
By the way, it seems suitable for the application to have such an abstraction, however, one should never forget that OpenGL is just a thin layer on top of any graphics capable hardware and as such, it should not hide such details that in the hand of a good programmer can provide added performance benefits.
When using as input to shaders, both uniform buffers and texture buffers have their strengths and weaknesses that are public to application developers, especially taking into account the detailed descriptions of each in the corresponding GPU programming guides of the vendors. It would be very difficult if not impossible for the driver to decide which particular buffer type to use based on shader source code and it would provide less flexibility to the programmer.
For the developer to decide which of the two should be used for a particular purpose one must investigate the characteristics of both and make the choice based on that. To ease this decision I will try to present the most important features of both. I will also talk about what I’ve used them for and what results I’ve achieved.
Maximum size: 64KByte (or more)
Memory access pattern: coherent access
Memory storage: usually local memory
Use case examples: geometry instancing, skeletal animation, etc.
Uniform buffers were introduced in OpenGL 3.1 but are available on driver implementations that don’t conform to the version 3.1 of the standard via the GL_ARB_uniform_buffer_object extension. As the specification says, uniform buffers provide a way to group GLSL uniforms into so called “uniform groups” and source their data from buffer objects to provide more streamlined access possibilities for the application.
As uniform buffers are relatively small they can easily fit in local memory. This makes data access instant thus provide optimum performance when the size constraints don’t prevent the application developer to use them. However, vendors also state that uniform buffers prefer a sequential memory access pattern. This means that it performs best when the data in the uniform buffer accesses are relative local, however, it does not necessarily mean that this sequential read must occur in one shader execution as, like in case of geometry instancing, subsequent shader executions can provide the desired access pattern.
Personally I use them for instanced rendering by storing the model-view matrix and related information of each and every instance in a common uniform buffer and use the instance id as an index to this combined data structure. This usage performs very well on my system.
Also uniform buffers can be used to store the matrices of bones and use them for implementing skeletal animation, however, I personally prefer using normal 2D textures for this purpose to take advantage of the free interpolation thanks to the dedicated texture fetching units but that’s another story.
Uniform buffers can also be used for other rendering techniques like skinned instancing or geometry deformation but the buffer size limitation may prevent such use case scenarios.
Maximum size: 128MByte (or more)
Memory access pattern: random access
Memory storage: global texture memory
Use case examples: skinned instancing, geometry tesselation etc.
Texture buffers were also became core OpenGL in version 3.1 of the specification but are available also via the GL_ARB_texture_buffer_object extension (or via the GL_EXT_texture_buffer_object extension on earlier implementations). Buffer textures are one-dimensional arrays of texels whose storage comes from an attached buffer object.
They provide the largest memory footprint for raw data access, much higher than equivalent 1D textures. However, they don’t provide texture filtering and other facilities that are usually available for other texture types. They represent formatted 1D data arrays rather than texture images. From some perspective, however, they are still textures that are resided in global memory so the access method is totally different than that of uniform buffers’. This has both advantages and disadvantages.
First, global texture memory access means texture fetching which involves the usage of a texture unit and possibly requires several clock cycles to complete. Anyway, thanks to the latency hiding mechanisms inside today commodity GPUs sometimes this can be as cheap as accessing uniform buffers. This part of the story is implementation dependent and is up to the hardware vendor. However, as stated in their programming guides, both AMD and NVIDIA have such latency hiding facilities and they also suggest that one should not expect a huge performance impact when using texture buffers.
Anyway, texture memory access provides a huge benefit compared to uniform buffers. Textures are more prone to scattered accesses and thus are more capable of dealing with random memory access. As the AMD HD2000 series programming guide says, if a certain set of data is accessed in a very random fashion it may be even faster to use texture fetches than indexed uniform access.
So even if texture buffers can be used in the same use case scenarios as uniform buffers, performance of either depends much more on the actual shader implementation rather than on the hardware implementation of the features.
Beside the aforementioned use cases, texture buffers can be used in more advanced techniques like instanced skeletal animation or even for implementing geometry tesselation, however I’m not convinced that it has any practical usage as it involves such tricks that don’t perform well on current hardware. Personally I use texture buffers for different geometry deformation techniques, to resolve batching issues when the size limitation of uniform buffers is a blocking factor, and for some inverse kinematics effects.
By the way, from now it’s your task to draw a conclusion based on the information read here but I recommend to read the mentioned programming guides to see a more accurate presentation of both methods. My personal conclusion is that there is no ultimate choice as both buffer types serve different purposes. Even if their possible use cases overlap, there are plenty of rendering techniques that would take advantage of the benefits of one but would suffer from the disadvantages of the other.
For further details on the topic, please refer to the OpenGL extension registry and the vendor supplied GPU programming guides: