Posts tagged C++
In this article, I would like to present you an edge detection algorithm that shares similar performance characteristics like the well-known Sobel operator but provides slightly better edge detection and can be seamlessly extended with little to no performance overhead to also detect corners alongside with edges. The algorithm works on a 3×3 texel footprint similarly like the Sobel filter but applies a total of nine convolution masks over the image that can be used for either edge or corner detection. The article presents the mathematical background that is needed to implement the edge detector and provides a reference implementation written in C/C++ using OpenGL that showcases both the Frei-Chen and the Sobel edge detection filter applied to the same image.
Dynamic geometry level-of-detail (LOD) algorithms are very popular and powerful algorithms that provide a great level of rendering performance optimization while preserving detail by using less detailed geometry for objects that are far away, too small or otherwise less significant in the quality of the final rendering. Many of these are used since the very beginning of computer graphics technologies and are present in some form in current CAD softwares, video games and other graphics applications. While determining the appropriate geometry LOD was previously the task of the CPU, with todays hardware it is possible to also offload this to the GPU which excels at handling large amount of objects in parallel.
Hierarchical-Z is a well known and standard feature of modern GPUs that allows them to speed up depth testing by rejecting large group of incoming fragments using a reduced and compressed version of the depth buffer that resides in on-chip memory. The technique presented in this article uses the same basic idea to allow batched occlusion culling for large amount of individual objects using a geometry shader without the need of any CPU intervention that is unavoidable using traditional occlusion queries. The article also provides a reference implementation in the form of the OpenGL 4.0 Mountains demo that uses the technique for culling thousands of object instances.
OpenGL 3.0 capable GPUs introduced a level of processing power and programming flexibility that isn’t comparable with any earlier generations. After that, OpenGL 4.0 and the hardware supporting it even further pushed the limits of what previously seemed to be impossible. Thanks to these features nowadays more and more possibilities are available for the graphics developers to implement GPU based scene management and culling algorithms. The Mountains demo showcases some of these rendering techniques that, as far as I know, were never implemented so far using OpenGL. In this article I will present the key features of the demo that will be discussed in more detail in subsequent articles. Demo binaries with full source code are also published.
Gaussian blur is an image space effect that is used to create a softly blurred version of the original image. This image then can be used by more sophisticated algorithms to produce effects like bloom, depth-of-field, heat haze or fuzzy glass. In this article I will present how to take advantage of the various properties of the Gaussian filter to create an efficient implementation as well as a technique that can greatly improve the performance of a naive Gaussian blur filter implementation by taking advantage of bilinear texture filtering to reduce the number of necessary texture lookups. While the article focuses on the Gaussian blur filter, most of the principles presented are valid for most convolution filters used in real-time graphics.
A few months ago I’ve presented an object culling mechanism that I’ve named Instance Cloud Reduction (ICR) in the article Instance culling using geometry shaders. The technique targets the first generation of OpenGL 3 capable cards and takes advantage of geometry shaders’ capability to reduce the emitted geometry amount in order to get to a fully GPU accelerated algorithm that performs view frustum culling on instanced geometry without the need of OpenCL or any other GPU compute API. After the culling step the reduced set of instance data is fed to the drawing pass in the form of a texture buffers. In this article I will present an improved version of the algorithm that exploits the use of instanced arrays introduced lately in OpenGL 3.3 to further optimize it.
The importance of static code analysis is already a well known thing in the domain of software development. There are plenty of useful and less useful tools for the purpose, especially in the case of C++. However, even if in general the quality of these softwares is adequate they usually suffer from the inability for extending or customizing behavior. Also, a usual problem arises from the fact that the C++ language syntax is overwhelmingly complex and it makes the code parser of any static analysis tool a nightmare. In this article I would like to present a tool called CppDepend that gracefully solves the aforementioned problems primarily focusing on providing an interface that enables 100% adaptability and extensibility for creating customized metrics that are relevant or applicable in a particular domain.
Nowadays comprehensive testing is a must for any software product. However, it isn’t such a general rule when it comes to graphics applications. Many developers face difficulties when they have to test their rendering codes. Manual tests and visual feedback is sometimes satisfactory but if one would like to have automated regression tests usual approaches seem to fail. Even if at first sight unit testing of rendering code doesn’t look really straightforward, in fact it is. OpenGL is not an exception from this rule as well. In this article I would like to briefly present a few methods how to unit test OpenGL rendering code and also present my choice and the reasons behind the decision.
Those who worked enough with C or other procedure oriented languages know how much flexibility callbacks provide. The simplest example is the qsort function of the C standard library. It is also not unintentional that many libraries, windowing system APIs and operating system APIs also highly rely on callbacks to pass a particular task over to another program module and it is one of the fundamental tools needed to implement an event-driven application. At the same time, object oriented languages does not directly support the concept of callbacks as they don’t really fit into the paradigms used by these languages. Fortunately, even if not as a language feature, all object oriented languages support a similar facility like callbacks in the form of delegates.
Since the appearance of Shader Model 4.0 people wonder how to take advantage of the newly introduced programmable pipeline stage. The most important feature enabled by geometry shaders is that one can change the amount of emitted primitives inside the pipeline. The first thing that a naive developer would try to do with it is geometry tesselation. However, the new shader performs very bad when used for tesselation in a real life scenario even though there are demos show casting this possibility. If we take a closer look at the new feature we observe that the most revolutionary in it is not that it can raise the number of emitted primitives but that it can discard them. This article would like to present a rendering technique that takes advantage of this aspect of geometry shaders to enable the GPU accelerated culling of higher order primitives.