<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>RasterGrid Blog &#187; Uncategorized</title>
	<atom:link href="http://rastergrid.com/blog/category/uncategorized/feed/" rel="self" type="application/rss+xml" />
	<link>http://rastergrid.com/blog</link>
	<description>A technical blog from Daniel Rákos (aka aqnuep)</description>
	<lastBuildDate>Tue, 24 Aug 2010 19:34:39 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Instance Cloud Reduction reloaded</title>
		<link>http://rastergrid.com/blog/2010/06/instance-cloud-reduction-reloaded/</link>
		<comments>http://rastergrid.com/blog/2010/06/instance-cloud-reduction-reloaded/#comments</comments>
		<pubDate>Wed, 30 Jun 2010 19:36:38 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[attribute divisor]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[culling]]></category>
		<category><![CDATA[geometry instancing]]></category>
		<category><![CDATA[geometry shader]]></category>
		<category><![CDATA[GLEW]]></category>
		<category><![CDATA[GLM]]></category>
		<category><![CDATA[GLSL]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[instanced array]]></category>
		<category><![CDATA[OpenGL]]></category>
		<category><![CDATA[SFML]]></category>
		<category><![CDATA[texture buffer]]></category>
		<category><![CDATA[transform feedback]]></category>
		<category><![CDATA[uniform buffer]]></category>
		<category><![CDATA[vertex buffer]]></category>
		<category><![CDATA[vertex shader]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=251</guid>
		<description><![CDATA[

A few months ago I presented an object culling mechanism that I named Instance Cloud Reduction (ICR) in the article Instance culling using geometry shaders. The technique targets the first generation of OpenGL 3 capable cards and takes advantage of geometry shaders&#8217; capability to reduce the emitted geometry amount in order to get to a [...]]]></description>
			<content:encoded><![CDATA[
<div class="wp-caption alignleft" style="width: 160px"><img src="http://rastergrid.com/blog/wp-content/uploads/2010/02/Nature-2010-02-08-20-20-36-24-150x150.png" alt="" width="150" height="150" /><p class="wp-caption-text">OpenGL 3.3 - Nature</p></div>
<p>A few months ago I presented an object culling mechanism that I named Instance Cloud Reduction (ICR) in the article <a title="Instance culling using geometry shaders" href="http://rastergrid.com/blog/2010/02/instance-culling-using-geometry-shaders/">Instance culling using geometry shaders</a>. The technique targets the first generation of OpenGL 3 capable cards and takes advantage of the geometry shaders&#8217; ability to reduce the amount of emitted geometry. The result is a fully GPU accelerated algorithm that performs view frustum culling on instanced geometry without the need for OpenCL or any other GPU compute API. After the culling step the reduced set of instance data is fed to the drawing pass in the form of a texture buffer. In this article I will present an improved version of the algorithm that uses instanced arrays, recently introduced in OpenGL 3.3, to optimize it further.</p>
<p><span id="more-251"></span>Let&#8217;s recap the basics of the algorithm before I present the improved technique. Geometry shaders have a very nice feature: they can not only emit a modified version of the input geometry but can also alter the number of emitted primitives compared to the number of received ones. This works both ways, which means that we can decrease the number of primitives as well as increase it. That is what the technique takes advantage of.</p>
<p>In the first pass we feed a simple vertex shader &#8211; geometry shader pair with the instance data of the geometries as if it were the data of point primitives. The vertex shader checks whether the current instance is inside the view frustum and sends the result to the geometry shader. If the instance is visible, the geometry shader outputs the instance data, otherwise it discards it. The primitives emitted by the geometry shader are then captured into a buffer object using transform feedback. A query object is also needed in order to get the number of instances that passed the view frustum culling. In the drawing pass we use the result of the query to decide how many instances we have to draw, and the captured feedback buffer is used as instance data.</p>
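<p>To make the data flow of the first pass easier to follow, here is a minimal CPU-side sketch of the same logic in C++. It is not the code of the demo, which does this work in GLSL. The names <code>Vec3</code>, <code>Plane</code>, <code>Frustum</code>, <code>isVisible</code>, <code>cullInstances</code> and <code>makeBoxFrustum</code> are hypothetical, and I assume here that each instance is represented by a bounding sphere and that the frustum is given as six inward-facing planes.</p>

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical types for illustration only; the demo does this work in GLSL.
struct Vec3 { float x, y, z; };

// Plane in the form dot(n, p) + d = 0, with the normal pointing into the frustum.
struct Plane { Vec3 n; float d; };

using Frustum = std::array<Plane, 6>;

// Signed distance of p from the plane (positive on the inner side,
// assuming a unit-length normal).
inline float signedDistance(const Plane& plane, const Vec3& p) {
    return plane.n.x * p.x + plane.n.y * p.y + plane.n.z * p.z + plane.d;
}

// Vertex shader analog: conservative bounding sphere versus frustum test.
inline bool isVisible(const Frustum& frustum, const Vec3& center, float radius) {
    for (const Plane& plane : frustum) {
        if (signedDistance(plane, center) < -radius) {
            return false; // completely behind one plane: outside the frustum
        }
    }
    return true;
}

// Geometry shader + transform feedback analog: visible instances are appended
// to the output, culled ones are dropped. The size of the returned vector
// plays the role of the query result.
inline std::vector<Vec3> cullInstances(const Frustum& frustum,
                                       const std::vector<Vec3>& instances,
                                       float radius) {
    assert(radius >= 0.0f);
    std::vector<Vec3> visible;
    visible.reserve(instances.size());
    for (const Vec3& position : instances) {
        if (isVisible(frustum, position, radius)) {
            visible.push_back(position);
        }
    }
    return visible;
}

// Helper for experiments: an axis-aligned box of half size h used as a
// stand-in for a real view frustum.
inline Frustum makeBoxFrustum(float h) {
    return Frustum{{
        Plane{Vec3{ 1.0f,  0.0f,  0.0f}, h}, Plane{Vec3{-1.0f,  0.0f,  0.0f}, h},
        Plane{Vec3{ 0.0f,  1.0f,  0.0f}, h}, Plane{Vec3{ 0.0f, -1.0f,  0.0f}, h},
        Plane{Vec3{ 0.0f,  0.0f,  1.0f}, h}, Plane{Vec3{ 0.0f,  0.0f, -1.0f}, h},
    }};
}
```

<p>The drawing pass then only needs the compacted array and its element count, which is exactly what the feedback buffer and the query object provide on the GPU.</p>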
<div class="wp-caption aligncenter" style="width: 660px"><img src="http://rastergrid.com/blog/wp-content/uploads/2010/02/icr_combined.png" alt="" width="650" height="347" /><p class="wp-caption-text">Instance Cloud Reduction - Combined view of Pass 1 + Pass 2</p></div>
<p>This is a very brief description of the culling mechanism so for a complete specification please read the <a title="Instance culling using geometry shaders" href="http://rastergrid.com/blog/2010/02/instance-culling-using-geometry-shaders/">original article</a>.</p>
<h3>Motivation</h3>
<p>While Instance Cloud Reduction is a quite robust technique that can significantly simplify and speed up the rendering of large amounts of instanced geometry, its performance is limited by some hardware and API restrictions. The most important ones are the following:</p>
<ul>
<li>Needs an extra rendering pass to perform the culling.</li>
<li>Requires the usage of asynchronous queries to determine the number of visible instances.</li>
<li>Uses texture fetching in the vertex shader of the actual drawing pass.</li>
</ul>
<p>The first drawback means that more draw commands are required, and the later ones use the output of the first pass as input. This and the second disadvantage may cause stalls, because the CPU has to wait for the data to be ready before issuing the second pass, so the GPU is not used effectively.</p>
<p>What this improvement tries to solve is the third problem. Texture fetching itself is quite fast on the latest generation of hardware, but the latency of the fetches still causes some slowdown, even though GPUs use latency hiding techniques.</p>
<p>Instanced arrays provide us a way to replace texture fetching with vertex fetching, which is usually done by a different hardware element that works synchronously with the execution of vertex shaders. I expected quite a reasonable speedup from taking advantage of instanced arrays, however we will see that the actual results were far from my initial expectations.</p>
<h3>Implementation</h3>
<p>Traditional vertex fetching happens in a way that one element is fetched from each enabled input attribute buffer and the vertex shader is issued with these values. One element in a vertex attribute buffer can mean up to four floating point or integer values and for each execution of the vertex shader one set of these elements is used. There is an internal counter that is increased after each fetch and the next vertex attribute fetch will use this counter as an index into the buffer object.</p>
<p>While this mechanism is satisfactory for most attributes of a vertex, it is not practical for instance data, as such data belongs to an instance rather than a vertex. To source instance data from vertex attributes with traditional vertex fetching, a large amount of redundant storage is required, because the same information has to be repeated for every vertex of a particular instance. This is not just a waste of memory but also a waste of bandwidth, and it defeats the goal of Instance Cloud Reduction.</p>
<p>Compared to traditional vertex fetching, instanced arrays provide a different way to advance the internal counter used as the index into the vertex attribute buffer. In particular, one can set the frequency of the increase using a vertex attribute divisor that specifies after how many instances the counter shall be increased. This is a per-attribute property, and by setting it to one we end up with exactly what we need: one vertex fetch per instance.</p>
<p>This means that actually we need just a very minor change compared to the original technique, more precisely we replace our texture buffer with a vertex attribute buffer that has a divisor of one and use it as the source of instance data in the vertex shader of the drawing pass.</p>
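<p>The effect of the divisor on the fetch index can be modelled with a few lines of C++. This is only a sketch of the indexing rule, not driver or demo code, and the function names <code>fetchIndex</code>, <code>elementsPerVertexFetch</code> and <code>elementsPerInstanceFetch</code> are hypothetical. In the actual OpenGL code the divisor is set per attribute with <code>glVertexAttribDivisor(index, 1)</code> (or <code>glVertexAttribDivisorARB</code> when using the extension) after the attribute pointer has been set up as usual.</p>

```cpp
#include <cassert>
#include <cstddef>

// Model of which buffer element an attribute reads for a given vertex and
// instance. A divisor of 0 is the traditional per-vertex behavior; a divisor
// of N > 0 advances the element once every N instances.
inline std::size_t fetchIndex(unsigned divisor,
                              std::size_t vertexIndex,
                              std::size_t instanceIndex) {
    return divisor == 0 ? vertexIndex : instanceIndex / divisor;
}

// Elements needed to supply per-instance data through a traditional
// (divisor 0) attribute: the value is repeated for every vertex.
inline std::size_t elementsPerVertexFetch(std::size_t verticesPerInstance,
                                          std::size_t instanceCount) {
    return verticesPerInstance * instanceCount;
}

// Elements needed with an instanced array using the given non-zero divisor
// (rounded up so that the last, partially filled group is covered).
inline std::size_t elementsPerInstanceFetch(std::size_t instanceCount,
                                            unsigned divisor) {
    assert(divisor > 0);
    return (instanceCount + divisor - 1) / divisor;
}
```

<p>For example, 1000 instances of a 100 vertex mesh need 100000 elements with traditional fetching but only 1000 with a divisor of one, which is the storage and bandwidth saving described above.</p>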
<h3>Execution results</h3>
<p>As we are not talking about a new technique but just an optimized implementation of the same method, the best way to evaluate it is by comparing the performance of the new version with the original one.</p>
<p>As I mentioned earlier, I expected a reasonable performance increase from replacing texture fetches with vertex fetches, but in practice the difference was not so significant. However, the performance difference between the two implementations can depend heavily on the underlying hardware, so cards from different vendors and GPU generations may behave quite differently. In fact, even driver versions may have an effect on the results.</p>
<div class="wp-caption aligncenter" style="width: 620px"><img class="  " src="http://rastergrid.com/blog/wp-content/uploads/2010/06/comparison.png" alt="" width="610" height="139" /><p class="wp-caption-text">Performance comparison of the old implementation and the presented one on an AMD Radeon HD5770. Scale is in frames per second (higher value is better).</p></div>
<p>Due to the lack of hardware for testing, I checked only one card, namely a Radeon HD5770 with Catalyst 10.6 drivers. I noticed roughly a 10% speedup, as the new version of the Nature demo showed 100 FPS compared to the 90 FPS observed with the old implementation.</p>
<p>Even though this was not exactly the outcome I expected from the new implementation, the assumption may still hold for older generations of GPUs or for NVIDIA cards. I suspect so because on Shader Model 4.0 cards the hardware implementations of the texture fetching unit and the vertex fetching unit were most probably more differentiated than on the latest GPUs. My guess is also that the difference may be higher on NVIDIA cards, as the vertex fetching hardware in SM 4.0 GeForce cards is less flexible than AMD&#8217;s, considering that the first HD series Radeons already had some form of tessellation functionality that requires more freedom from the vertex pushing hardware.</p>
<p>In order to get a better picture about how effective the presented optimization is, I would like to ask all the visitors of this post to try the two releases and send me feedback about it.</p>
<h3>Conclusion</h3>
<p>We&#8217;ve seen how easy it is to take advantage of instanced arrays in an existing implementation of the ICR technique and how it performs on the latest generation of GPUs compared to the previous version. While this small addition provides some benefits, it also comes at a cost and we have to talk about that as well.</p>
<p><strong>Advantages:</strong></p>
<ul>
<li>Eliminates the need for texture fetching in the vertex shader thus improving performance.</li>
<li>Does not compromise the goal and the implementation architecture of the original method.</li>
<li>Frees up one texture unit that was previously reserved for the texture buffer containing the instance data.</li>
</ul>
<p><strong>Disadvantages:</strong></p>
<ul>
<li>Requires OpenGL 3.3 or the <a title="GL_ARB_instanced_arrays" href="http://www.opengl.org/registry/specs/ARB/instanced_arrays.txt" target="_blank" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/instanced_arrays.txt?referer=');">GL_ARB_instanced_arrays</a> extension in addition to the OpenGL 3.2 features.</li>
<li>We may have to sacrifice multiple vertex input attributes to feed the instance data to the shaders.</li>
</ul>
<p>Most of the mentioned benefits and drawbacks are self-explanatory, however I would like to say a few words about the last mentioned one&#8230;</p>
<p>For the purposes of the demo I used a simple translation as instance data, that is, a single vector of floats. In real life one may need more complex transformation data that can only be stored in a matrix. While in the demo the instance data consumed only one vertex attribute slot, a full transformation matrix would require four of them (not to mention other possible instance attributes). As the maximum number of input attributes is severely limited, usually to 16, the optimization can only be applied when all the vertex and instance attributes fit into this limit.</p>
<p>With the original implementation, where a texture buffer was used as input, this did not cause any problem, as the vertex shader is free to fetch any number of texels from it (though performance can still be a concern in this case). To help in situations where input attribute slots are at a premium, in real life scenarios it is recommended to use quaternions instead of transformation matrices: a rotation quaternion plus a translation vector fits into two attribute slots instead of four. Actually this can be a general recommendation, as using quaternions decreases the bandwidth requirements of the instance data fetch and thus increases performance even when there are enough input attribute slots available.</p>
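<p>As a rough illustration of the quaternion alternative, the sketch below packs an instance transformation into two four-component vectors and applies it to a point on the CPU. It is not part of the Nature demo. The <code>InstanceTransform</code> layout and the names <code>Quat</code>, <code>rotate</code> and <code>transformPoint</code> are assumptions made for this example, and the rotation is assumed to be a unit quaternion.</p>

```cpp
#include <cassert>

// Hypothetical types for illustration only.
struct Vec3 { float x, y, z; };
struct Quat { float x, y, z, w; }; // assumed to be unit length

// Two vec4-sized attributes instead of the four needed by a 4x4 matrix.
struct InstanceTransform {
    Quat rotation;      // attribute slot 1
    Vec3 translation;   // attribute slot 2 (xyz)
    float uniformScale; // attribute slot 2 (w), optional extra
};
static_assert(sizeof(InstanceTransform) == 8 * sizeof(float),
              "expected exactly two vec4-sized slots");

inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x};
}

// Rotates v by the unit quaternion q using
// v' = v + w * t + cross(u, t), where u = q.xyz and t = 2 * cross(u, v).
inline Vec3 rotate(const Quat& q, const Vec3& v) {
    const float lengthSquared = q.x * q.x + q.y * q.y + q.z * q.z + q.w * q.w;
    assert(lengthSquared > 0.999f && lengthSquared < 1.001f);
    const Vec3 u{q.x, q.y, q.z};
    const Vec3 c = cross(u, v);
    const Vec3 t{2.0f * c.x, 2.0f * c.y, 2.0f * c.z};
    const Vec3 ut = cross(u, t);
    return Vec3{v.x + q.w * t.x + ut.x,
                v.y + q.w * t.y + ut.y,
                v.z + q.w * t.z + ut.z};
}

// Applies scale, then rotation, then translation to a model space point.
inline Vec3 transformPoint(const InstanceTransform& instance, const Vec3& p) {
    const Vec3 scaled{p.x * instance.uniformScale,
                      p.y * instance.uniformScale,
                      p.z * instance.uniformScale};
    const Vec3 r = rotate(instance.rotation, scaled);
    return Vec3{r.x + instance.translation.x,
                r.y + instance.translation.y,
                r.z + instance.translation.z};
}
```

<p>The same few multiply-adds translate directly to GLSL, so the vertex shader pays a handful of extra instructions in exchange for halving the instance data it has to fetch.</p>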
<p>In order to ease the performance comparison for you, you can find download links for both versions of the Nature demo.</p>
<h3>Old version binary release</h3>
<p><strong>Platform:</strong> Windows<br />
<strong>Dependency:</strong> OpenGL 3.2 capable graphics driver<br />
<strong>Download link:</strong> <a href="http://rastergrid.com/blog/wp-content/uploads/2010/06/nature12_win32.zip">nature12_win32.zip (3.58MB)</a><br />
<strong>Comments:</strong> This version does <strong>NOT </strong>include the optimization presented in this article.</p>
<h3>Old version source code</h3>
<p><strong>Language:</strong> C++<br />
<strong>Platform:</strong> cross-platform<br />
<strong>Dependency:</strong> GLEW, SFML, GLM<br />
<strong>Download link:</strong> <a href="http://rastergrid.com/blog/wp-content/uploads/2010/06/nature12_src.zip">nature12_src.zip (12.6KB)</a><br />
<strong>Comments:</strong> This version does <strong>NOT </strong>include the optimization presented in this article.</p>
<h3>New version binary release</h3>
<p><strong>Platform:</strong> Windows<br />
<strong>Dependency:</strong> OpenGL 3.3 capable graphics driver<br />
<strong>Download link:</strong> <a href="http://rastergrid.com/blog/wp-content/uploads/2010/06/nature20_win32.zip">nature20_win32.zip (3.58MB)</a><br />
<strong>Comments:</strong> This version includes the optimization presented in this article.</p>
<h3>New version source code</h3>
<p><strong>Language:</strong> C++<br />
<strong>Platform:</strong> cross-platform<br />
<strong>Dependency:</strong> GLEW, SFML, GLM<br />
<strong>Download link:</strong> <a href="http://rastergrid.com/blog/wp-content/uploads/2010/06/nature20_src.zip">nature20_src.zip (12.8KB)</a><br />
<strong>Comments:</strong> This version includes the optimization presented in this article.</p>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2010/06/instance-cloud-reduction-reloaded/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>RasterGrid Blog crossed the 10000 threshold</title>
		<link>http://rastergrid.com/blog/2010/03/rastergrid-blog-crossed-the-10000-threshold/</link>
		<comments>http://rastergrid.com/blog/2010/03/rastergrid-blog-crossed-the-10000-threshold/#comments</comments>
		<pubDate>Tue, 09 Mar 2010 20:35:41 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=204</guid>
		<description><![CDATA[

I am proud to announce that the number of visits has just gone over 10000. I would like to share that I did not expect such a great success in less than two months. I can hardly thank all my occasional, and especially my returning, visitors enough!
When I started to write this blog [...]]]></description>
			<content:encoded><![CDATA[
<p>I am proud to announce that the number of visits has just gone over 10000. I would like to share that I did not expect such a great success in less than two months. I can hardly thank all my occasional, and especially my returning, visitors enough!</p>
<p>When I started to write this blog my primary intention was to share my knowledge and ideas, no matter whether they are legitimate enough or not. I did this in the hope that the articles on this blog may help others. In the end, it turned out that I learned a lot from it myself as, thanks to You, I got great improvement ideas and feedback about my writings.</p>
<p><span id="more-204"></span>During the last two months I came up with articles that brought various feelings out of the Reader. I received encouraging comments that gave me great power to continue. Sometimes I also faced conflicts due to different points of view and opinions, but I think those were very edifying too, both for me and for others. One of my best experiences was when You came up with excellent improvement ideas regarding the presented source code and other things. The only thing I feel sorry for is that I haven&#8217;t had sufficient time to write further articles. Unfortunately, I cannot promise that this will change in the future, but I will do my best, and I hope the quality and the utility of my articles will improve over time.</p>
<p>As a schedule of 2-3 articles per week turned out to be impossible to fit into my time-frame, I will most probably try to stick to one post per week. As a foretaste, here are some topics that I would like to talk about in the near future:</p>
<ul>
<li>Further application of geometry shaders (cube rendering and some more nifty tricks)</li>
<li>AMD tessellation demo with practical use cases</li>
<li>WebGL, COLLADA and other good to know stuff regarding portable graphics</li>
<li>Physics and rigid body dynamics</li>
<li>More info about some of the best unit test practices</li>
<li>C++ messaging and state machines</li>
<li>Maybe some further development software reviews</li>
</ul>
<p>As the final word: thanks for being interested!</p>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2010/03/rastergrid-blog-crossed-the-10000-threshold/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Site successfully migrated</title>
		<link>http://rastergrid.com/blog/2010/01/site-successfully-migrated/</link>
		<comments>http://rastergrid.com/blog/2010/01/site-successfully-migrated/#comments</comments>
		<pubDate>Thu, 28 Jan 2010 17:24:45 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=117</guid>
		<description><![CDATA[

Almost a week ago I was talking about the problems with the server on which my blog is stored and I promised that it would be moved to another server by Tuesday at the latest in order to not have any troubles with the reachability of the site. Unfortunately, due to environment problems this actually happened [...]]]></description>
			<content:encoded><![CDATA[
<p>Almost a week ago I was talking about the problems with the server on which my blog is stored and I promised that it would be moved to another server by Tuesday at the latest in order to not have any troubles with the reachability of the site. Unfortunately, due to environment problems this actually happened only last night. Anyway, I am happy that the migration is over now and I hope it won&#8217;t happen again.</p>
<p><span id="more-117"></span>Since my last post on the topic things got even worse as in the last few days I observed several hours of outage. This was partially due to the fact that on Tuesday, on the day when the hosting service provider originally planned the migration of my site, one of the hard drives of the old server crashed. It seems that I don&#8217;t have luck with these kinds of things. Nevertheless, I am pleased to see that the popularity of my posts didn&#8217;t suffer that much because of the former instability of the web-server behind the blog.</p>
<p>As an update about what articles you can expect in the near future, I already mentioned geometry shader based culling methods earlier. I have been working on the demo application for this upcoming topic in the last few days. It still needs some time, but next week, or the week after at the latest, I will post the article about the technique and also publish the demo application with full source code.</p>
<p>Till then I will present some other technologies that I would like to cover during the lifetime of this blog, like WebGL, OpenCL and others. These will be more like introductions to the different domains, and more specific subjects will most probably be presented later. I would also like to return to the topic of managed languages and introduce some more advanced C++ quasi language extensions.</p>
<p>Finally, I am still looking forward to hearing what you would like to read about.</p>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2010/01/site-successfully-migrated/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Problems with the availability of the site</title>
		<link>http://rastergrid.com/blog/2010/01/problems-with-the-availability-of-the-site/</link>
		<comments>http://rastergrid.com/blog/2010/01/problems-with-the-availability-of-the-site/#comments</comments>
		<pubDate>Sat, 23 Jan 2010 15:30:50 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=104</guid>
		<description><![CDATA[

I started my blog about two weeks ago but already met some problems regarding the availability of the site. It happened a few times that there was some problem with the Apache Server at the hosting company. This resulted in some short down times of the blog itself. I was saying that &#8220;okay, this [...]]]></description>
			<content:encoded><![CDATA[
<p>I started my blog about two weeks ago but already met some problems regarding the availability of the site. It happened a few times that there was some problem with the Apache Server at the hosting company. This resulted in some short down times of the blog itself. I was saying that &#8220;okay, this is not that big a problem&#8221;, so I accepted the somewhat limited up-time of the server. But today something much worse happened. The server was down for hours, I don&#8217;t even know for how long, so I decided to take action to prevent such problems in the future.</p>
<p><span id="more-104"></span>First of all, I would like to apologize to all of my well respected readers for the inconvenience, as it probably annoyed many of you to get an error message instead of the content of the page. I hope that, irrespective of the observed down times, most of my old and new visitors have successfully reached my blog.</p>
<p>Anyway, I took the needed actions to prevent such problems in the future. The site will be moved to a different server by Tuesday and I hope that will solve all the technical troubles. From then on the only problems a reader of my blog can meet will be related to the content of the page, but I hope that most of you can find some useful information in the articles anyway.</p>
<p>By the way, this is maybe the best time to encourage you to send me ideas and topics that you would like to see on the blog. Of course, if you come up with cooking ideas then I am probably not the most competent person to talk about them, but everything related to programming, graphics and related topics is welcome!</p>
<p>Just as a foretaste, I will talk in my upcoming articles about the following topics:</p>
<ul>
<li>Delegates, signals, messages and state-machines in C++</li>
<li>Advanced culling methods on the GPU using geometry shaders</li>
<li>My first impressions about WebGL</li>
<li>GLSL shader development</li>
<li>Some words about physics libraries</li>
</ul>
<p>You can extend this list easily with your own list if you send me an e-mail to daniel.rakos@rastergrid.com. I am willing to share as much information as I can.</p>
<p>And one more thing: I would also like to encourage you to write comments on the articles even if your perspective is different from mine. We don&#8217;t have to agree. To be honest, I like to argue about any subject because I believe that it is the most natural way to improve both the knowledge of others and our own.</p>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2010/01/problems-with-the-availability-of-the-site/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>
