There are currently several ways to feed data to the GPU, regardless of which API we use and what type of application we develop. In the case of OpenGL we have uniform buffers, texture buffers, texture images, etc. The same is true for OpenCL and other compute APIs, which provide even more fine-grained memory management by taking advantage of the local data share (LDS) available on today's hardware. In this article I'll present the memory access performance characteristics of AMD's Evergreen-class GPUs, focusing on what this all means from an OpenGL point of view. While most of the data is about the HD5870, the general principles and relative performance characteristics are valid for other GPUs, including ones from other vendors.
Dynamic geometry level-of-detail (LOD) algorithms are popular and powerful techniques that substantially improve rendering performance while preserving detail by using less detailed geometry for objects that are far away, too small, or otherwise less significant to the quality of the final rendering. Many of these techniques have been in use since the very beginning of computer graphics and are present in some form in current CAD software, video games, and other graphics applications. While determining the appropriate geometry LOD was previously the task of the CPU, with today's hardware it is possible to offload this to the GPU, which excels at handling large numbers of objects in parallel.
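The basic selection step can be illustrated with a tiny sketch (hypothetical helper and parameter names; real implementations often select on projected screen-space size or an error metric rather than raw distance):

```python
def select_lod(distance, lod_ranges):
    """Pick the LOD index whose distance range contains the object.
    lod_ranges holds the upper distance bound of each LOD, sorted
    ascending; objects beyond the last bound get the coarsest LOD."""
    for lod, limit in enumerate(lod_ranges):
        if distance < limit:
            return lod
    return len(lod_ranges)

# Example: three detailed LODs up to 10, 50 and 200 units, then a
# billboard-style lowest-detail representation beyond that.
```

Because the selection is a pure function of per-object data, it maps naturally onto a GPU pass that processes all objects in parallel.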
Hierarchical-Z is a well-known, standard feature of modern GPUs that speeds up depth testing by rejecting large groups of incoming fragments using a reduced, compressed version of the depth buffer that resides in on-chip memory. The technique presented in this article uses the same basic idea to perform batched occlusion culling for a large number of individual objects using a geometry shader, without the CPU intervention that is unavoidable with traditional occlusion queries. The article also provides a reference implementation in the form of the OpenGL 4.0 Mountains demo, which uses the technique to cull thousands of object instances.
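To make the idea concrete, here is a minimal CPU-side sketch of such a hierarchical depth test (hypothetical names and a simplified square, power-of-two depth buffer; the actual technique runs in a geometry shader and samples a mipmapped depth texture holding the maximum depth of each tile):

```python
import math

def build_hiz_pyramid(depth, size):
    """Build a max-depth mip chain from a size x size depth buffer."""
    levels = [depth]
    while size > 1:
        size //= 2
        prev = levels[-1]
        level = [[max(prev[2*y][2*x],   prev[2*y][2*x+1],
                      prev[2*y+1][2*x], prev[2*y+1][2*x+1])
                  for x in range(size)] for y in range(size)]
        levels.append(level)
    return levels

def hiz_visible(levels, rect, nearest_depth):
    """rect = (x0, y0, x1, y1): screen-space bounds of the object in
    pixels. The object may be visible only if its nearest depth is not
    behind the conservative (max) depth stored for its tiles."""
    x0, y0, x1, y1 = rect
    # Pick the mip level where the rect spans at most ~2 texels per
    # axis, so only a handful of samples are needed per object.
    extent = max(x1 - x0, y1 - y0, 1)
    lod = min(int(math.ceil(math.log2(extent))), len(levels) - 1)
    scale = 2 ** lod
    for ty in range(y0 // scale, y1 // scale + 1):
        for tx in range(x0 // scale, x1 // scale + 1):
            if nearest_depth <= levels[lod][ty][tx]:
                return True   # some covered tile may contain the object
    return False              # occluded everywhere: cull it
```

The conservative max-reduction is what makes the test safe: it can only over-report visibility, never cull an object that would actually be seen.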
OpenGL 3.0 capable GPUs introduced a level of processing power and programming flexibility incomparable to that of any earlier generation. After that, OpenGL 4.0 and the hardware supporting it pushed the limits of what previously seemed impossible even further. Thanks to these features, graphics developers now have more and more options for implementing GPU-based scene management and culling algorithms. The Mountains demo showcases some rendering techniques that, as far as I know, have never before been implemented using OpenGL. In this article I will present the key features of the demo; these will be discussed in more detail in subsequent articles. Demo binaries with full source code are also published.
The introduction of Shader Model 5.0 hardware, together with the API support provided by OpenGL 4.0, made GPU-based geometry tessellation a first-class citizen in the latest graphics applications. While official support from all the commodity graphics card vendors and the relevant APIs is quite recent news, few people know that hardware tessellation has a long history in the world of consumer graphics cards. In this article I would like to present a brief introduction to tessellation and discuss the evolution that resulted in what we can see in the latest technology demos and game titles.
Gaussian blur is an image space effect that is used to create a softly blurred version of the original image. This image can then be used by more sophisticated algorithms to produce effects like bloom, depth-of-field, heat haze, or fuzzy glass. In this article I will present how to take advantage of the various properties of the Gaussian filter to create an efficient implementation, as well as a technique that can greatly improve the performance of a naive Gaussian blur by using bilinear texture filtering to reduce the number of necessary texture lookups. While the article focuses on the Gaussian blur filter, most of the principles presented apply to other convolution filters used in real-time graphics.
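The arithmetic behind the bilinear trick can be shown in a few lines (a plain Python sketch with illustrative names; in the shader the merged tap becomes a single texture fetch resolved by hardware bilinear filtering):

```python
import math

def gaussian_weights(sigma, radius):
    """Normalized discrete Gaussian weights for taps -radius..+radius."""
    w = [math.exp(-(i * i) / (2.0 * sigma * sigma))
         for i in range(-radius, radius + 1)]
    s = sum(w)
    return [x / s for x in w]

def linear_sample(row, x):
    """Emulate 1D bilinear filtering: lerp the two nearest texels."""
    i = int(math.floor(x))
    f = x - i
    return row[i] * (1.0 - f) + row[i + 1] * f

def merged_tap(offset, w0, w1):
    """Fold two adjacent taps (at offset and offset+1) into one
    'linear' tap: the fractional offset is weighted so that a single
    bilinear fetch returns exactly the sum of the two fetches."""
    return offset + w1 / (w0 + w1), w0 + w1
```

Applied to a separable blur this halves the lookups per pass, since every pair of discrete taps collapses into one filtered fetch.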
The Khronos Group keeps the pace it set for itself, delivering the latest OpenGL specification less than half a year after the revolutionary appearance of OpenGL 4. Setting aside the OpenGL 3.x line of the specification (at least for a while), the new update concentrates on Shader Model 5.0 class GPUs and on extensions heavily promoted by the community. Besides all this, the Khronos Group now openly moves toward convergence with OpenGL ES, making the desktop version of the specification compatible with its embedded sibling. In this article I would like to present the features introduced with the latest revision of the specification.
A few months ago I presented an object culling mechanism that I named Instance Cloud Reduction (ICR) in the article Instance culling using geometry shaders. The technique targets the first generation of OpenGL 3 capable cards and takes advantage of the geometry shader's ability to reduce the amount of emitted geometry, yielding a fully GPU-accelerated algorithm that performs view frustum culling on instanced geometry without the need for OpenCL or any other GPU compute API. After the culling step, the reduced set of instance data is fed to the drawing pass in the form of texture buffers. In this article I will present an improved version of the algorithm that exploits instanced arrays, recently introduced in OpenGL 3.3, to further optimize it.
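The per-instance frustum test at the heart of such a scheme is simple; here is a minimal sketch of the bounding-sphere check (hypothetical names; in the actual technique this logic runs in the geometry shader, which emits output only for visible instances):

```python
def sphere_outside_plane(plane, center, radius):
    """plane = (a, b, c, d) with unit-length normal (a, b, c); true if
    the sphere lies entirely on the negative side of the plane."""
    a, b, c, d = plane
    return a * center[0] + b * center[1] + c * center[2] + d < -radius

def cull_instances(planes, instances):
    """Keep instances whose bounding sphere (center, radius) touches
    the view frustum, given as six inward-facing planes."""
    return [inst for inst in instances
            if not any(sphere_outside_plane(p, *inst) for p in planes)]
```

An instance survives only if it is inside (or straddles) all six planes; the survivors' transforms are then written out for the drawing pass.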
I haven't written any posts lately because I dug into iPhone application development, which consumed most of my spare time. As you may remember, I've already mentioned that I would like to start targeting mobile platforms with my OpenGL related experiments and projects. After Android, this time I got my hands on a Mac mini and took a look at the currently most popular mobile gaming platform. Actually, these initial experiments wouldn't have taken so long if I had only had to deal with a new API rather than a brand new world with its own benefits and drawbacks.
Much has changed since the public first got its hands on a mobile phone: these days, end users rarely choose a mobile device based on its telephony capabilities. In fact, nowadays these devices are among the most popular entertainment platforms out there. The main problem for application developers is that these platforms have tended to be very heterogeneous in both hardware architecture and API support. Things have since changed: while the underlying hardware still varies a lot from device to device, the work of application developers has been eased by cross-platform mobile operating systems and open standards, in particular OpenGL ES, the embedded version of the popular graphics API. In this article I would like to talk about some of the big players of the mobile OS industry and about using OpenGL ES to create impressive mobile applications.