Posts tagged uniform buffer
I’ve chosen the title based on the popular article that tries to prove that OpenGL lost the war against Direct3D. To be honest, I didn’t really like the article at all. First, because it compared OpenGL 3 which targeted Shader Model 4.0 hardware and DirectX 11 which targeted Shader Model 5.0 hardware. Besides that, as we will see, the war is really far from over… This article aims to list the most important features introduced by OpenGL 3.x, OpenGL 4.x, Direct3D 10, Direct3D 11 and we will also talk about the promised features of the upcoming Direct3D 11.1 to be fair with DirectX
After the release of the OpenGL 4.1 specification the Khronos Group slowed down the pace a little bit but they didn’t left OpenGL developers without a new specification version for too long as a few weeks ago they’ve released OpenGL 4.2. The new version of the specification brings several API improvements as well as exposes some important pieces of hardware functionality that makes OpenGL 4.x class hardware a great step forward in GPU history. This article aims to present the newly introduced features in the latest version of the OpenGL specification and, as a few months ago I wrote an article about Suggestions for OpenGL 4.2 and beyond, I will write a few words about how does the new specification reflect my forecast.
The Khronos Group did a great job in the last few years to once again prove that OpenGL is still in game and that it can become the ultimate graphics API of choice, if it is not that already. However, we must note that it is not quite yet true that OpenGL 4.1 is a superset of its competitor, DirectX 11. We still have some holes that still have to be filled and I think the ARB should not stop just there as there is much more potential in the current hardware architectures than that is currently exposed by any graphics API so establishing the future of OpenGL should start by going one step further than DX11. In this article I would like to present my vision of items of importance that should be included in the next revision of the specification and how I see the future of OpenGL.
Currently there are several ways to feed data to the GPU no matter of what API we use and what type of application we develop. In case of OpenGL we have uniform buffers, texture buffers, texture images, etc. The same is true for OpenCL and other compute APIs that even provide more fine-grained memory management taking advantage of the local data store (LDS) available on today’s hardware. In this article I’ll present the memory access performance characteristics of AMD’s Evergreen-class GPUs focusing on what this all means from OpenGL point of view. While most of the data is about the HD5870, the general principles and relative performance characteristics are valid for other GPUs, including ones from other vendors.
A few months ago I’ve presented an object culling mechanism that I’ve named Instance Cloud Reduction (ICR) in the article Instance culling using geometry shaders. The technique targets the first generation of OpenGL 3 capable cards and takes advantage of geometry shaders’ capability to reduce the emitted geometry amount in order to get to a fully GPU accelerated algorithm that performs view frustum culling on instanced geometry without the need of OpenCL or any other GPU compute API. After the culling step the reduced set of instance data is fed to the drawing pass in the form of a texture buffers. In this article I will present an improved version of the algorithm that exploits the use of instanced arrays introduced lately in OpenGL 3.3 to further optimize it.
The Khronos Group continues the progress of streamlining the OpenGL API. One very important step in this battle has been made just a few days ago by releasing two concurrent core releases of the OpenGL specification, namely version 3.3 and 4.0. This is a major update of the standard containing many revolutionary additions to the tool-set of OpenGL that need careful examination. In this article I would like to talk about these new features trying to point out their importance and touching also some practical use case scenarios.
Since the appearance of Shader Model 4.0 people wonder how to take advantage of the newly introduced programmable pipeline stage. The most important feature enabled by geometry shaders is that one can change the amount of emitted primitives inside the pipeline. The first thing that a naive developer would try to do with it is geometry tesselation. However, the new shader performs very bad when used for tesselation in a real life scenario even though there are demos show casting this possibility. If we take a closer look at the new feature we observe that the most revolutionary in it is not that it can raise the number of emitted primitives but that it can discard them. This article would like to present a rendering technique that takes advantage of this aspect of geometry shaders to enable the GPU accelerated culling of higher order primitives.
OpenGL 3.1 introduced two new sources from where shaders can retrieve their data, namely uniform buffers and texture buffers. These can be used to accelerate rendering when heavy usage of application provided data happens like in case of skeletal animation, especially when combined with geometry instancing. However, even if the functionality is in the core specification for about a year now, there are few demos out there to show their usage, as so, there is a big confusion around when to use them and which one is more suitable for a particular use case.