Most systems programmers are no strangers to writing and using custom memory allocators in performance- or memory-constrained projects. Even so, the benefits of purpose-built memory allocators are often underestimated or overlooked. This article provides an overview of the motivation for and advantages of custom memory allocation schemes, and presents a few common allocation strategies.
Previously we explored the different types of memory accessible to the GPU, but only barely touched on the topic of caches. In this article we will make up for that by looking at the different caches available on modern GPUs and the role they play in the system. A thorough understanding of GPU cache behavior enables developers to make better use of these caches and thus improve the performance of their graphics or compute applications.
With the recent announcement of AMD Smart Access Memory, it seemed like the right time to write about the different types of memory available to applications targeting dedicated GPUs. This article aims to provide an introduction to the different memory pools within such a system, their access characteristics, and why enabling access to the entire VRAM through the PCI Express bus could be a game changer.