<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>RasterGrid Blog &#187; vertex buffer</title>
	<atom:link href="http://rastergrid.com/blog/tag/vertex-buffer/feed/" rel="self" type="application/rss+xml" />
	<link>http://rastergrid.com/blog</link>
	<description>A technical blog from Daniel Rákos (aka aqnuep)</description>
	<lastBuildDate>Fri, 04 Nov 2011 18:10:15 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>OpenGL vs DirectX: The War Is Far From Over</title>
		<link>http://rastergrid.com/blog/2011/10/opengl-vs-directx-the-war-is-far-from-over/</link>
		<comments>http://rastergrid.com/blog/2011/10/opengl-vs-directx-the-war-is-far-from-over/#comments</comments>
		<pubDate>Fri, 07 Oct 2011 19:02:12 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Graphics]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Direct3D]]></category>
		<category><![CDATA[DirectX]]></category>
		<category><![CDATA[fragment shader]]></category>
		<category><![CDATA[geometry instancing]]></category>
		<category><![CDATA[geometry shader]]></category>
		<category><![CDATA[GLSL]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[occlusion culling]]></category>
		<category><![CDATA[OpenGL]]></category>
		<category><![CDATA[tessellation control shader]]></category>
		<category><![CDATA[tessellation evaluation shader]]></category>
		<category><![CDATA[transform feedback]]></category>
		<category><![CDATA[uniform buffer]]></category>
		<category><![CDATA[vertex buffer]]></category>
		<category><![CDATA[vertex shader]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=652</guid>
		<description><![CDATA[I&#8217;ve chosen the title based on the popular article that tries to prove that OpenGL lost the war against Direct3D. To be honest, I didn&#8217;t really like the article at all. First, it compared OpenGL 3, which targeted Shader Model 4.0 hardware, with DirectX 11, which targeted Shader Model 5.0 hardware. Besides that, as we]]></description>
			<content:encoded><![CDATA[
<div class="wp-caption alignleft" style="width: 260px"><img title="OpenGL vs DirectX" src="http://rastergrid.com/blog/wp-content/uploads/2011/10/opengl-vs-directx-250x138.jpg" alt="OpenGL vs DirectX" width="250" height="138" /><p class="wp-caption-text">The War Is Far From Over</p></div>
<p>I&#8217;ve chosen the title based on the <a title="OpenGL 3 &amp; DirectX 11: The War Is Over" href="http://www.tomshardware.com/reviews/opengl-directx,2019.html" target="_blank" onclick="pageTracker._trackPageview('/outgoing/www.tomshardware.com/reviews/opengl-directx_2019.html?referer=');">popular article</a> that tries to prove that OpenGL lost the war against Direct3D. To be honest, I didn&#8217;t really like the article at all. First, it compared OpenGL 3, which targeted Shader Model 4.0 hardware, with DirectX 11, which targeted Shader Model 5.0 hardware. Besides that, as we will see, the war is really far from over&#8230; This article lists the most important features introduced by OpenGL 3.x, OpenGL 4.x, Direct3D 10 and Direct3D 11, and, to be fair to DirectX, it also covers the promised features of the upcoming Direct3D 11.1 <img src='http://rastergrid.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><span id="more-652"></span></p>
<p>After I wrote <a title="An introduction to OpenGL 4.2" href="http://rastergrid.com/blog/2011/08/an-introduction-to-opengl-4-2/">my article about the latest features introduced in OpenGL</a>, someone asked me whether I could write an article comparing the hardware features exposed by OpenGL and Direct3D. Instead of a long explanation, I decided to simply create a table of the features introduced by each API. Please note that the list focuses on hardware features and does not discuss API-level differences between the two. The list is probably far from complete, and I&#8217;m happy to get feedback about what is missing so that I can extend it. There are also features for which I could not determine whether an equivalent exists in D3D; these are marked with a question mark. If anybody can point me to the answer, I would be grateful, as I could not find a specification for the HLSL versions.</p>
<table style="width: 100%;" border="0">
<tbody>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>HARDWARE FEATURES EXPOSED</strong></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Draw command related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Conditional/predicated rendering based on the result of occlusion queries (<a href="http://www.opengl.org/registry/specs/NV/conditional_render.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/NV/conditional_render.txt?referer=');">NV_conditional_render</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Basic geometry instancing support and instanced draw commands (<a href="http://www.opengl.org/registry/specs/ARB/draw_instanced.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/draw_instanced.txt?referer=');">ARB_draw_instanced</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Geometry instancing with the ability to specify instanced vertex attributes (<a href="http://www.opengl.org/registry/specs/ARB/instanced_arrays.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/instanced_arrays.txt?referer=');">ARB_instanced_arrays</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Primitive restart (cut index) feature for batching multiple strips together (<a href="http://www.opengl.org/registry/specs/NV/primitive_restart.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/NV/primitive_restart.txt?referer=');">NV_primitive_restart</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Draw commands allowing modification of the base vertex index (<a href="http://www.opengl.org/registry/specs/ARB/draw_elements_base_vertex.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/draw_elements_base_vertex.txt?referer=');">ARB_draw_elements_base_vertex</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Indirect draw commands that source their parameters from server-side buffers (<a href="http://www.opengl.org/registry/specs/ARB/draw_indirect.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/draw_indirect.txt?referer=');">ARB_draw_indirect</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>New shader type related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Geometry shader support and adjacency primitive support (<a href="http://www.opengl.org/registry/specs/ARB/geometry_shader4.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/geometry_shader4.txt?referer=');">ARB_geometry_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Instanced geometry shader support with fixed number of invocations (<a href="http://www.opengl.org/registry/specs/ARB/gpu_shader5.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/gpu_shader5.txt?referer=');">ARB_gpu_shader5</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Tessellation control and evaluation (hull and domain) shader support (<a href="http://www.opengl.org/registry/specs/ARB/tessellation_shader.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/tessellation_shader.txt?referer=');">ARB_tessellation_shader</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Transform feedback (stream-output) related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Basic transform feedback (stream-output) support (<a href="http://www.opengl.org/registry/specs/EXT/transform_feedback.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/transform_feedback.txt?referer=');">EXT_transform_feedback</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Transform feedback support without a geometry shader being active (<a href="http://www.opengl.org/registry/specs/EXT/transform_feedback.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/transform_feedback.txt?referer=');">EXT_transform_feedback</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for pausing and resuming transform feedback (stream-output) (<a href="http://www.opengl.org/registry/specs/ARB/transform_feedback2.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/transform_feedback2.txt?referer=');">ARB_transform_feedback2</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Auto-draw support (feed back the contents of the transform feedback buffer) (<a href="http://www.opengl.org/registry/specs/ARB/transform_feedback2.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/transform_feedback2.txt?referer=');">ARB_transform_feedback2</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Instanced auto-draw support (transform feedback buffer drawing with instancing support) (<a href="http://www.opengl.org/registry/specs/ARB/transform_feedback_instanced.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/transform_feedback_instanced.txt?referer=');">ARB_transform_feedback_instanced</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for outputting multiple primitive streams using transform feedback (stream-output) (<a href="http://www.opengl.org/registry/specs/ARB/transform_feedback3.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/transform_feedback3.txt?referer=');">ARB_transform_feedback3</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Asynchronous queries and related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Support for occlusion queries that return the number of samples passed (<a href="http://www.opengl.org/registry/specs/ARB/occlusion_query.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/occlusion_query.txt?referer=');">ARB_occlusion_query</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for occlusion queries that return only a boolean visibility result (<a href="http://www.opengl.org/registry/specs/ARB/occlusion_query2.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/occlusion_query2.txt?referer=');">ARB_occlusion_query2</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the number of vertices processed and the number of vertex shader invocations</td>
<td style="background-color: #cc5555"></td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt1">[1]</a></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the number of geometry shader invocations in case a geometry shader is active</td>
<td style="background-color: #cc5555"></td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt1">[1]</a></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the number of primitives output by the geometry shader (<a href="http://www.opengl.org/registry/specs/EXT/transform_feedback.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/transform_feedback.txt?referer=');">EXT_transform_feedback</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the number of primitives that were sent to the rasterizer (<a href="http://www.opengl.org/registry/specs/EXT/transform_feedback.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/transform_feedback.txt?referer=');">EXT_transform_feedback</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the number of primitives that passed clipping and were actually rendered</td>
<td style="background-color: #cc5555"></td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt1">[1]</a></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the number of times a fragment/pixel shader was invoked</td>
<td style="background-color: #cc5555"></td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt1">[1]</a></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the number of primitives written during transform feedback (stream-output) (<a href="http://www.opengl.org/registry/specs/EXT/transform_feedback.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/transform_feedback.txt?referer=');">EXT_transform_feedback</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the number of primitives generated during transform feedback (stream-output) (<a href="http://www.opengl.org/registry/specs/EXT/transform_feedback.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/transform_feedback.txt?referer=');">EXT_transform_feedback</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query a server-side high-resolution timestamp (<a href="http://www.opengl.org/registry/specs/ARB/timer_query.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/timer_query.txt?referer=');">ARB_timer_query</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the completion of rendering commands (<a href="http://www.opengl.org/registry/specs/ARB/sync.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/sync.txt?referer=');">ARB_sync</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Texture, vertex and renderbuffer format related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Floating point color and depth formats for textures and render buffers (various extensions)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Cube map textures with depth component internal format (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/gpu_shader4.txt?referer=');">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Half-float (16-bit) vertex and pixel data support (<a href="http://www.opengl.org/registry/specs/NV/half_float.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/NV/half_float.txt?referer=');">NV_half_float</a>, <a href="http://www.opengl.org/registry/specs/ARB/half_float_pixel.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/half_float_pixel.txt?referer=');">ARB_half_float_pixel</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Non-normalized integer color formats for textures and renderbuffers (<a href="http://www.opengl.org/registry/specs/EXT/texture_integer.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/texture_integer.txt?referer=');">EXT_texture_integer</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Packed depth/stencil texture and renderbuffer formats (<a href="http://www.opengl.org/registry/specs/EXT/packed_depth_stencil.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/packed_depth_stencil.txt?referer=');">EXT_packed_depth_stencil</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">RGTC texture compression for two-component textures (<a href="http://www.opengl.org/registry/specs/EXT/texture_compression_rgtc.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/texture_compression_rgtc.txt?referer=');">EXT_texture_compression_rgtc</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Signed normalized texture component formats (<a href="http://www.opengl.org/registry/specs/EXT/texture_snorm.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/texture_snorm.txt?referer=');">EXT_texture_snorm</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Seamless cube map filtering support (to hide artifacts at cube map edges) (<a href="http://www.opengl.org/registry/specs/ARB/seamless_cube_map.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/seamless_cube_map.txt?referer=');">ARB_seamless_cube_map</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for swizzling the components of a texture (<a href="http://www.opengl.org/registry/specs/ARB/texture_swizzle.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/texture_swizzle.txt?referer=');">ARB_texture_swizzle</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
</tr>
<tr>
<td style="padding: 0px">BPTC texture compression for floating point and unsigned normalized textures (<a href="http://www.opengl.org/registry/specs/ARB/texture_compression_bptc.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/texture_compression_bptc.txt?referer=');">ARB_texture_compression_bptc</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">64-bit floating point vertex attribute formats (<a href="http://www.opengl.org/registry/specs/ARB/vertex_attrib_64bit.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/vertex_attrib_64bit.txt?referer=');">ARB_vertex_attrib_64bit</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>New texture type related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">One- and two-dimensional layered array textures (<a href="http://www.opengl.org/registry/specs/EXT/texture_array.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/texture_array.txt?referer=');">EXT_texture_array</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Cube map array textures as special two-dimensional array textures (<a href="http://www.opengl.org/registry/specs/ARB/texture_cube_map_array.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/texture_cube_map_array.txt?referer=');">ARB_texture_cube_map_array</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Rectangular textures with no mipmap support and that are accessed with integer coordinates (<a href="http://www.opengl.org/registry/specs/ARB/texture_rectangle.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/texture_rectangle.txt?referer=');">ARB_texture_rectangle</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Multisampled textures and support for fetching specific sample locations (<a href="http://www.opengl.org/registry/specs/ARB/texture_multisample.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/texture_multisample.txt?referer=');">ARB_texture_multisample</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Casting a texture&#8217;s interpreted internal format to another internal format</td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt4">[4]</a></td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt4">[4]</a></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Uniform buffer (constant buffer) related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Basic uniform buffer (constant buffer) support (<a href="http://www.opengl.org/registry/specs/ARB/uniform_buffer_object.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/uniform_buffer_object.txt?referer=');">ARB_uniform_buffer_object</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for large uniform buffers and binding subranges (<a href="http://www.opengl.org/registry/specs/ARB/uniform_buffer_object.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/uniform_buffer_object.txt?referer=');">ARB_uniform_buffer_object</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Framebuffer and texture rendering related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Rendering to textures and renderbuffers (<a href="http://www.opengl.org/registry/specs/EXT/framebuffer_object.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/framebuffer_object.txt?referer=');">EXT_framebuffer_object</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Multisample stretch blit functionality (<a href="http://www.opengl.org/registry/specs/EXT/framebuffer_multisample.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/framebuffer_multisample.txt?referer=');">EXT_framebuffer_multisample</a>, <a href="http://www.opengl.org/registry/specs/EXT/framebuffer_blit.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/framebuffer_blit.txt?referer=');">EXT_framebuffer_blit</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">sRGB rendering and blending support for framebuffers (<a href="http://www.opengl.org/registry/specs/EXT/framebuffer_sRGB.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/framebuffer_sRGB.txt?referer=');">EXT_framebuffer_sRGB</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for enabling or disabling clamping of the depth of fragments (<a href="http://www.opengl.org/registry/specs/ARB/depth_clamp.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/depth_clamp.txt?referer=');">ARB_depth_clamp</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for logical operations on integer render targets (supported for a decade in OpenGL)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Blending related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Support for alpha-to-coverage when using multisampling (<a href="http://www.opengl.org/registry/specs/ARB/multisample.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/multisample.txt?referer=');">ARB_multisample</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Per-color-buffer blend enables and color writemasks (<a href="http://www.opengl.org/registry/specs/EXT/draw_buffers2.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/draw_buffers2.txt?referer=');">EXT_draw_buffers2</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Dual-source color blending support based on a secondary output of the fragment shader (<a href="http://www.opengl.org/registry/specs/ARB/blend_func_extended.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/blend_func_extended.txt?referer=');">ARB_blend_func_extended</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Individual blend equations and blend functions support for each color output (<a href="http://www.opengl.org/registry/specs/ARB/draw_buffers_blend.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/draw_buffers_blend.txt?referer=');">ARB_draw_buffers_blend</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Shader related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Texture lookup functions to access individual texels of a LOD using integer coordinates (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/gpu_shader4.txt?referer=');">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Query the dimensions of a specific LOD of a texture in shaders (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/gpu_shader4.txt?referer=');">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Ability to apply integer offsets to the texel location during texture lookup (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/gpu_shader4.txt?referer=');">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Ability to explicitly pass in derivative values that are used to compute LOD during texture lookup (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/gpu_shader4.txt?referer=');">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Control over varying variable interpolation: non-perspective, flat, centroid sampling, etc. (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/gpu_shader4.txt?referer=');">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Full signed and unsigned integer support in shaders (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/gpu_shader4.txt?referer=');">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Vertex ID built-in variable available in vertex shader (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/gpu_shader4.txt?referer=');">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Primitive ID built-in variable available in geometry and fragment shader (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/EXT/gpu_shader4.txt?referer=');">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Instance ID built-in variable available in vertex shader (<a href="http://www.opengl.org/registry/specs/ARB/draw_instanced.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/draw_instanced.txt?referer=');">ARB_draw_instanced</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Shader fragment coordinate convention control (<a href="http://www.opengl.org/registry/specs/ARB/fragment_coord_conventions.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/fragment_coord_conventions.txt?referer=');">ARB_fragment_coord_conventions</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
</tr>
<tr>
<td style="padding: 0px">Provoking vertex control (for flat shaded varying value selection) (<a href="http://www.opengl.org/registry/specs/ARB/provoking_vertex.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/provoking_vertex.txt?referer=');">ARB_provoking_vertex</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for encoding and decoding floating point values from and to integers (<a href="http://www.opengl.org/registry/specs/ARB/shader_bit_encoding.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/shader_bit_encoding.txt?referer=');">ARB_shader_bit_encoding</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for getting the results of the automatic LOD computations in shaders (<a href="http://www.opengl.org/registry/specs/ARB/texture_query_lod.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/texture_query_lod.txt?referer=');">ARB_texture_query_lod</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for coherent indexing into arrays of samplers using non-constant indices (addressable samplers) (<a href="http://www.opengl.org/registry/specs/ARB/gpu_shader5.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/gpu_shader5.txt?referer=');">ARB_gpu_shader5</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for indexing into arrays of uniform blocks (addressable constant buffers) (<a href="http://www.opengl.org/registry/specs/ARB/gpu_shader5.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/gpu_shader5.txt?referer=');">ARB_gpu_shader5</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Gathered texture fetches over a 2&#215;2 footprint (with custom offsets) (<a href="http://www.opengl.org/registry/specs/ARB/texture_gather.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/texture_gather.txt?referer=');">ARB_texture_gather</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Invocation ID built-in variable available in geometry shader (<a href="http://www.opengl.org/registry/specs/ARB/gpu_shader5.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/gpu_shader5.txt?referer=');">ARB_gpu_shader5</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for double-precision floating-point data types in shaders (<a href="http://www.opengl.org/registry/specs/ARB/gpu_shader_fp64.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/gpu_shader_fp64.txt?referer=');">ARB_gpu_shader_fp64</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for sample-frequency fragment shader execution (<a href="http://www.opengl.org/registry/specs/ARB/sample_shading.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/sample_shading.txt?referer=');">ARB_sample_shading</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for indirect subroutine calls in all shader stages (<a href="http://www.opengl.org/registry/specs/ARB/shader_subroutine.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/shader_subroutine.txt?referer=');">ARB_shader_subroutine</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for selecting from multiple viewports using a geometry shader (<a href="http://www.opengl.org/registry/specs/ARB/viewport_array.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/viewport_array.txt?referer=');">ARB_viewport_array</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for dedicated atomic counters in shaders (<a href="http://www.opengl.org/registry/specs/ARB/shader_atomic_counters.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/shader_atomic_counters.txt?referer=');">ARB_shader_atomic_counters</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55; text-align: center;"><a href="#tblcmt2">[2]</a></td>
<td style="background-color: #55cc55; text-align: center;"><a href="#tblcmt2">[2]</a></td>
</tr>
<tr>
<td style="padding: 0px">Support for backing up dedicated atomic counters with buffers (<a href="http://www.opengl.org/registry/specs/ARB/shader_atomic_counters.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/shader_atomic_counters.txt?referer=');">ARB_shader_atomic_counters</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt5">[5]</a></td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt5">[5]</a></td>
</tr>
<tr>
<td style="padding: 0px">Support for load/store (read/write) buffers and textures in shaders (<a href="http://www.opengl.org/registry/specs/ARB/shader_image_load_store.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/shader_image_load_store.txt?referer=');">ARB_shader_image_load_store</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt3">[3]</a></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for atomic operations on load/store buffers and textures (<a href="http://www.opengl.org/registry/specs/ARB/shader_image_load_store.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/shader_image_load_store.txt?referer=');">ARB_shader_image_load_store</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for disabling or forcing early depth test (<a href="http://www.opengl.org/registry/specs/ARB/shader_image_load_store.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/shader_image_load_store.txt?referer=');">ARB_shader_image_load_store</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for conservative depth (enabling safe early tests even when modifying depth) (<a href="http://www.opengl.org/registry/specs/ARB/conservative_depth.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/conservative_depth.txt?referer=');">ARB_conservative_depth</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for coverage as input to the fragment shader (<a href="http://www.opengl.org/registry/specs/ARB/gpu_shader5.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/gpu_shader5.txt?referer=');">ARB_gpu_shader5</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Miscellaneous features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Support for floating point viewport specification (<a href="http://www.opengl.org/registry/specs/ARB/viewport_array.txt" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/viewport_array.txt?referer=');">ARB_viewport_array</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Per-texture mipmap clamping (supported since the very early versions of OpenGL)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for using a single depth texture both for depth testing and as texture input (when depth writes are disabled)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
</tbody>
</table>
<p><a name="tblcmt1">[1]</a> There is no support for these counters in OpenGL; however, they can be implemented with the help of shader atomic counters.<br />
<a name="tblcmt2">[2]</a> Direct3D can use the dedicated atomic counter hardware (currently supported only by AMD GPUs) only through an append/consume buffer. However, as atomic counters are part of UAVs and an arbitrary number of UAVs can be attached to a single resource, the same functionality is supported indirectly.<br />
<a name="tblcmt3">[3]</a> There is read/write buffer and texture support in Direct3D 11; however, it is available only in the fragment (pixel) shader. Direct3D 11.1 plans to remove this restriction.<br />
<a name="tblcmt4">[4]</a> There is no support for texture format casting in OpenGL; however, the conversion can be done with a copy, preferably using pixel buffer objects.<br />
<a name="tblcmt5">[5]</a> There is no support for automatic storage of atomic counter values in buffers in Direct3D; however, their values can be manually copied to arbitrary resources.</p>
<p>In conclusion, I would like to say just one thing: even though there are some features that are missing from one API or the other, we really can say that OpenGL and Direct3D are on par in the number of hardware features they expose.</p>
<p>(Sorry in advance for any mistakes; it took quite some time to create this table and I may have become too tired by the end.)</p>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2011/10/opengl-vs-directx-the-war-is-far-from-over/feed/</wfw:commentRss>
		<slash:comments>70</slash:comments>
		</item>
		<item>
		<title>GPU based dynamic geometry LOD</title>
		<link>http://rastergrid.com/blog/2010/10/gpu-based-dynamic-geometry-lod/</link>
		<comments>http://rastergrid.com/blog/2010/10/gpu-based-dynamic-geometry-lod/#comments</comments>
		<pubDate>Mon, 25 Oct 2010 19:35:13 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Graphics]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Samples]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[culling]]></category>
		<category><![CDATA[geometry instancing]]></category>
		<category><![CDATA[geometry shader]]></category>
		<category><![CDATA[GLSL]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[LOD]]></category>
		<category><![CDATA[occlusion culling]]></category>
		<category><![CDATA[OpenGL]]></category>
		<category><![CDATA[tessellation]]></category>
		<category><![CDATA[vertex buffer]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=428</guid>
		<description><![CDATA[Dynamic geometry level-of-detail (LOD) algorithms are very popular and powerful algorithms that provide a great level of rendering performance optimization while preserving detail by using less detailed geometry for objects that are far away, too small or otherwise less significant in the quality of the final rendering. Many of these are used since the very]]></description>
			<content:encoded><![CDATA[
<div class="wp-caption alignleft" style="width: 210px"><a href="http://rastergrid.com/blog/wp-content/uploads/2010/10/mountains.png"><img class="  " title="Click to enlarge" src="http://www.rastergrid.com/blog/wp-content/uploads/2010/10/mountains-thumb.png" alt="OpenGL 4.0 - Mountains demo" width="200" height="150" /></a><p class="wp-caption-text">OpenGL 4.0 - Mountains demo</p></div>
<p>Dynamic geometry level-of-detail (LOD) algorithms are popular and powerful: they improve rendering performance while preserving perceived detail by using less detailed geometry for objects that are far away, too small or otherwise less significant to the quality of the final rendering. Many of them have been in use since the very beginning of computer graphics and are present in some form in current CAD software, video games and other graphics applications. While determining the appropriate geometry LOD was previously the task of the CPU, with today&#8217;s hardware it is possible to offload this to the GPU as well, which excels at handling large numbers of objects in parallel.<br />
<span id="more-428"></span></p>
<h2>Introduction</h2>
<p>With the advent of Shader Model 5.0 GPUs and the appearance of programmable tessellation hardware it may seem like the geometry LOD problem is solved once and for all. However, in many cases tessellation alone is simply not enough: for far away objects even a patch pass-through tessellation shader produces more geometry than the added detail is worth. As a result, classic geometry LOD algorithms are still a good-to-have feature in the developer&#8217;s toolbox. Not to mention that all vendors recommend disabling tessellation shaders altogether when no geometry amplification is needed, as even a pass-through tessellation shader has its cost.</p>
<p>This means that there still has to be a conventional rendering path for geometry that should not be tessellated. Then why not try offloading the geometry LOD determination to the GPU if possible?</p>
<p>This article presents a technique that was already demonstrated by AMD&#8217;s <a title="March of the Froblins" href="http://developer.amd.com/samples/demos/pages/froblins.aspx" target="_blank" onclick="pageTracker._trackPageview('/outgoing/developer.amd.com/samples/demos/pages/froblins.aspx?referer=');">March of the Froblins</a> demo and by NVIDIA&#8217;s <a title="NVIDIA DX10 Samples" href="http://developer.download.nvidia.com/SDK/10/direct3d/samples.html" target="_blank" onclick="pageTracker._trackPageview('/outgoing/developer.download.nvidia.com/SDK/10/direct3d/samples.html?referer=');">Skinned Instancing</a> demo. It allows GPU based dynamic geometry LOD determination using a geometry shader that selects the most appropriate LOD from a group of geometry LODs based on the object&#8217;s distance from the camera. While this article and the reference implementation (<a title="OpenGL 4.0 - Mountains demo released" href="http://rastergrid.com/blog/2010/10/opengl-4-0-mountains-demo-released/">OpenGL 4.0 &#8211; Mountains demo</a>) present the application of the technique only for instanced geometry, the same method can be easily extended to support heterogeneous objects by taking advantage of the latest functionality introduced in OpenGL 4.</p>
<h2>The algorithm</h2>
<p>The technique is based on the geometry shader&#8217;s ability to emit or suppress the emission of primitives into a transform feedback buffer, as done in the mentioned DX based implementations. One major improvement compared to earlier approaches is that the LOD determination is done in a single pass rather than requiring a separate pass for each geometry LOD. Additionally, this LOD determination pass can also be merged with other visibility determination passes like <a title="Instance culling using geometry shaders" href="http://rastergrid.com/blog/2010/02/instance-culling-using-geometry-shaders/">Instance Cloud Reduction</a> or <a title="Hierarchical-Z map based occlusion culling" href="http://rastergrid.com/blog/2010/10/hierarchical-z-map-based-occlusion-culling/">Hierarchical-Z map based occlusion culling</a>, as it is done in the reference implementation. This was made possible by the latest transform feedback capabilities introduced in OpenGL 4.0 (see the extension <a title="GL_ARB_transform_feedback3" href="http://www.opengl.org/registry/specs/ARB/transform_feedback3.txt" target="_blank" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/transform_feedback3.txt?referer=');">ARB_transform_feedback3</a>) that enable the geometry shader to output data to separate primitive streams.</p>
<div class="wp-caption aligncenter" style="width: 660px"><img class="    " title="Culling and dynamic LOD in the March of the Froblins demo" src="http://www.rastergrid.com/blog/wp-content/uploads/2010/10/froblin-lod.png" alt="Culling and dynamic LOD in the March of the Froblins demo" width="650" height="340" /><p class="wp-caption-text">Flow-chart presenting the culling and dynamic LOD algorithms used in AMD&#39;s March of the Froblins demo. The implementation needs five passes for culling and separating three detail levels and performs two asynchronous queries meanwhile. Requires OpenGL 3 compliant hardware.</p></div>
<div class="wp-caption aligncenter" style="width: 660px"><img title="Culling and dynamic LOD in the Mountains demo" src="http://www.rastergrid.com/blog/wp-content/uploads/2010/10/mountains-lod.png" alt="Culling and dynamic LOD in the Mountains demo" width="650" height="281" /><p class="wp-caption-text">Flow-chart presenting the culling and dynamic LOD algorithm used in our Mountains demo. The implementation requires only one pass for culling and separating three detail levels without the need to use asynchronous queries. Requires OpenGL 4 compliant hardware.</p></div>
<p>The algorithm itself is very simple and straightforward. For each object instance, determine the appropriate geometry LOD based on its distance from the camera and the LOD distances passed as a uniform to the shader. Then emit the instance&#8217;s data to the output stream whose ID corresponds to the index of the determined LOD. Here you can see a GLSL implementation of the algorithm:</p>
<pre class="brush:c">#version 400 core

uniform mat4 ModelViewMatrix;
uniform vec2 LodDistance;

layout(points) in;
layout(points, max_vertices = 1) out;

in vec3 InstancePosition[1];

layout(stream=0) out vec3 InstPosLOD0;
layout(stream=1) out vec3 InstPosLOD1;
layout(stream=2) out vec3 InstPosLOD2;

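void main() {
  // View-space distance of the instance from the camera. Only xyz is used:
  // the length of the full vec4 would also include the w component.
  float distance = length((ModelViewMatrix * vec4(InstancePosition[0], 1.0)).xyz);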
  if ( distance &lt; LodDistance.x ) {
    InstPosLOD0 = InstancePosition[0];
    EmitStreamVertex(0);
  } else
  if ( distance &lt; LodDistance.y ) {
    InstPosLOD1 = InstancePosition[0];
    EmitStreamVertex(1);
  } else {
    InstPosLOD2 = InstancePosition[0];
    EmitStreamVertex(2);
  }
}</pre>
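<p>For debugging it can help to have a CPU-side mirror of the selection logic, for example to verify the per-LOD instance counts reported by the queries. The following is only a minimal sketch under stated assumptions: <code>Vec3</code>, <code>selectLod</code> and <code>bucketByLod</code> are hypothetical names that are not part of the demo, and the positions are assumed to be already transformed to view space.</p>

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper type; the demo itself uses its own math types.
struct Vec3 { float x, y, z; };

// Mirrors the branch structure of the geometry shader: two ascending
// distance thresholds split the instances into three LOD indices.
inline int selectLod(float dist, float lod0Max, float lod1Max) {
    assert(lod0Max <= lod1Max);
    if (dist < lod0Max) return 0;
    if (dist < lod1Max) return 1;
    return 2;
}

// Buckets view-space instance positions per LOD, playing the role of the
// three transform feedback streams. The bucket sizes correspond to the
// values the primitive queries would return.
inline std::array<std::vector<Vec3>, 3> bucketByLod(
        const std::vector<Vec3>& viewSpacePositions,
        float lod0Max, float lod1Max) {
    std::array<std::vector<Vec3>, 3> buckets;
    for (const Vec3& p : viewSpacePositions) {
        const float dist = std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z);
        buckets[selectLod(dist, lod0Max, lod1Max)].push_back(p);
    }
    return buckets;
}
```

<p>Note that an instance lying exactly on a threshold falls into the coarser LOD, just as with the strict comparisons in the shader.</p>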
<p>Additionally, the geometry LOD determination pass has to be executed with primitive queries enabled for all the relevant output streams to acquire the number of instances for each geometry LOD index:</p>
<pre class="brush:cpp">for (int i=0; i&lt;NUM_LOD; i++)
  glBeginQueryIndexed(GL_PRIMITIVES_GENERATED, i, lodQuery[i]);

glBeginTransformFeedback(GL_POINTS);
  glDrawArrays(GL_POINTS, 0, instanceCount);
glEndTransformFeedback();

for (int i=0; i&lt;NUM_LOD; i++)
  glEndQueryIndexed(GL_PRIMITIVES_GENERATED, i);</pre>
<p>Finally, the only thing left is to issue an instanced draw call for each geometry LOD index to draw all the instances:</p>
<pre class="brush:cpp">for (int i=0; i&lt;NUM_LOD; i++) {
  glGetQueryObjectiv(lodQuery[i], GL_QUERY_RESULT, &amp;instanceCountLOD[i]);
  if ( instanceCountLOD[i] &gt; 0 )
    glDrawElementsInstanced(..., instanceCountLOD[i]);
}</pre>
<p>That&#8217;s all, and what you get as a result is a fully GPU based geometry LOD selection algorithm.</p>
<h2>The Mountains demo</h2>
<p>The reference implementation is provided as part of the <a title="OpenGL 4.0 - Mountains demo" href="http://rastergrid.com/blog/2010/10/opengl-4-0-mountains-demo-released/">OpenGL 4.0 &#8211; Mountains demo</a>, which is available with full source code and a Windows executable in the <a title="Mountains Demo download" href="http://rastergrid.com/blog/downloads/mountains-demo/">downloads section</a>. In a single pass, the demo application implements the same visibility determination algorithms that were presented in the <a title="SIGGRAPH 2008 Course Notes about the March of the Froblins" href="http://developer.amd.com/documentation/presentations/legacy/Chapter03-SBOT-March_of_The_Froblins.pdf" target="_blank" onclick="pageTracker._trackPageview('/outgoing/developer.amd.com/documentation/presentations/legacy/Chapter03-SBOT-March_of_The_Froblins.pdf?referer=');">SIGGRAPH 2008 Course Notes</a> along with the dynamic geometry LOD algorithm presented here.</p>
<p>Dynamic LOD can be enabled in the demo with the F3 key. Once enabled, the demo separates the various geometry detail levels according to the configured LOD distances. As can be seen, there is almost no visible difference between the scene rendered with dynamic geometry LOD enabled and disabled. Also, by setting the LOD distances appropriately, the algorithm provides a seamless transition between subsequent geometry detail levels as the camera moves.</p>
<table style="width: 100%;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #ffffff;" align="center">
<div class="wp-caption alignnone" style="width: 338px"><a href="http://www.rastergrid.com/blog/wp-content/uploads/2010/10/lod-comp.png" onclick="pageTracker._trackPageview('/outgoing/www.rastergrid.com/blog/wp-content/uploads/2010/10/lod-comp.png?referer=');"><img title="Click to enlarge" src="http://www.rastergrid.com/blog/wp-content/uploads/2010/10/lod-comp-thumb.png" alt="Close-up view to compare image quality without and with dynamic LOD" width="328" height="160" /></a><p class="wp-caption-text">Close-up view of distant objects to compare the image quality without (left) and with (right) dynamic LOD.</p></div></td>
<td style="background-color: #ffffff;" align="center">
<p><div class="wp-caption alignnone" style="width: 223px"><a href="http://www.rastergrid.com/blog/wp-content/uploads/2010/10/visual-lod.png" onclick="pageTracker._trackPageview('/outgoing/www.rastergrid.com/blog/wp-content/uploads/2010/10/visual-lod.png?referer=');"><img title="Click to enlarge" src="http://www.rastergrid.com/blog/wp-content/uploads/2010/10/visual-lod-thumb.png" alt="LOD visualization" width="213" height="160" /></a><p class="wp-caption-text">Geometry LOD visualization: LOD 0 (red), LOD 1 (green), LOD 2 (blue).</p></div></td>
</tr>
</tbody>
</table>
<p>When dynamic LOD is enabled, the demo also makes it possible to visualize the various geometry detail levels by pressing the F4 key. The highest detail LOD is marked with red, the mid-level with green and the lowest detail geometry with blue. It can be seen that as the camera moves the renderer automatically adjusts the detail of each individual instance.</p>
<p>Besides maintaining constant quality, with no transitions between the detail levels observable by the viewer, the algorithm provides a huge performance gain in case of complex geometry, as can be seen in the figure below:</p>
<p><div class="wp-caption aligncenter" style="width: 654px"><img class="   " src="http://www.rastergrid.com/blog/wp-content/uploads/2010/10/mountains-fps.png" alt="Performance comparison of various culling and LOD techniques in frames per second on a Radeon HD5770 (higher is better)" width="644" height="224" /><p class="wp-caption-text">Performance comparison of the demo in frames per second on a Radeon HD5770 (higher is better): no culling (bottom), instance cloud reduction (middle), ICR + Hi-Z map based occlusion culling (top), no geometry LOD (blue), dynamic geometry LOD (red).</p></div>
<h2>Conclusion</h2>
<p>We&#8217;ve seen how straightforward it is to implement GPU based dynamic geometry LOD determination using geometry shaders on OpenGL 4.0 compliant hardware, and provided a reference implementation that uses the algorithm to efficiently determine detail levels for a large number of instanced geometries. We also briefly mentioned that the algorithm can be extended to handle arbitrary object sets. We discussed a possible OpenGL 3 based implementation but did not provide one, as it requires several rendering passes to perform all the operations that can be implemented in a single pass on Shader Model 5.0 hardware.</p>
<p>Even though the algorithm is already extremely efficient, it still involves the use of asynchronous primitive queries that may induce some latency. Of course, this latency can be easily hidden by performing other operations on the CPU/GPU until the results are available.</p>
<p>Furthermore, by taking full advantage of Shader Model 5.0 GPUs it would be possible to eliminate the need for asynchronous queries by using atomic counters and indirect rendering. However, the core OpenGL specification does not yet expose such functionality, so this improvement is left for a future release of the demo.</p>
<p>Classic dynamic geometry LOD algorithms are still first class citizens of every rendering system. Even though the introduction of hardware tessellation somewhat reduces the need for these classic techniques, practice shows that the best way to implement a full-fledged dynamic LOD system is to use geometry LOD selection and tessellation together rather than one instead of the other.</p>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2010/10/gpu-based-dynamic-geometry-lod/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Instance Cloud Reduction reloaded</title>
		<link>http://rastergrid.com/blog/2010/06/instance-cloud-reduction-reloaded/</link>
		<comments>http://rastergrid.com/blog/2010/06/instance-cloud-reduction-reloaded/#comments</comments>
		<pubDate>Wed, 30 Jun 2010 19:36:38 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Graphics]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Samples]]></category>
		<category><![CDATA[attribute divisor]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[culling]]></category>
		<category><![CDATA[geometry instancing]]></category>
		<category><![CDATA[geometry shader]]></category>
		<category><![CDATA[GLEW]]></category>
		<category><![CDATA[GLM]]></category>
		<category><![CDATA[GLSL]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[instanced array]]></category>
		<category><![CDATA[OpenGL]]></category>
		<category><![CDATA[SFML]]></category>
		<category><![CDATA[texture buffer]]></category>
		<category><![CDATA[transform feedback]]></category>
		<category><![CDATA[uniform buffer]]></category>
		<category><![CDATA[vertex buffer]]></category>
		<category><![CDATA[vertex shader]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=251</guid>
		<description><![CDATA[A few months ago I&#8217;ve presented an object culling mechanism that I&#8217;ve named Instance Cloud Reduction (ICR) in the article Instance culling using geometry shaders. The technique targets the first generation of OpenGL 3 capable cards and takes advantage of geometry shaders&#8217; capability to reduce the emitted geometry amount in order to get to a]]></description>
			<content:encoded><![CDATA[
<div class="wp-caption alignleft" style="width: 160px"><img src="http://rastergrid.com/blog/wp-content/uploads/2010/02/Nature-2010-02-08-20-20-36-24-150x150.png" alt="" width="150" height="150" /><p class="wp-caption-text">OpenGL 3.3 - Nature</p></div>
<p>A few months ago I presented an object culling mechanism that I named Instance Cloud Reduction (ICR) in the article <a title="Instance culling using geometry shaders" href="http://rastergrid.com/blog/2010/02/instance-culling-using-geometry-shaders/">Instance culling using geometry shaders</a>. The technique targets the first generation of OpenGL 3 capable cards and takes advantage of the geometry shader&#8217;s ability to reduce the amount of emitted geometry, resulting in a fully GPU accelerated algorithm that performs view frustum culling on instanced geometry without the need for OpenCL or any other GPU compute API. After the culling step the reduced set of instance data is fed to the drawing pass in the form of a texture buffer. In this article I will present an improved version of the algorithm that uses instanced arrays, recently introduced in OpenGL 3.3, to optimize it further.</p>
<p><span id="more-251"></span>Let&#8217;s recap the basics of the algorithm before I present the improved technique. Geometry shaders have a very nice feature: they can not only emit a modified version of the input geometry but also alter the number of emitted primitives compared to the number of received ones. This works both ways, which means that we can not only increase but also decrease the number of primitives. That is what the technique takes advantage of.</p>
<p>In the first pass we feed a simple vertex shader &#8211; geometry shader pair with the instance data of the geometry as if it were the data of point primitives. The vertex shader checks whether the current instance is inside the view frustum and sends the result to the geometry shader. If the instance is visible, the geometry shader outputs the instance data; otherwise it discards it. The primitives emitted by the geometry shader are then captured into a buffer object using transform feedback. A query object is also needed to get the number of instances that passed the view frustum culling. In the drawing pass we use the result of the query to decide how many instances we have to draw, and the captured feedback buffer is used as instance data.</p>
<div class="wp-caption aligncenter" style="width: 660px"><img src="http://rastergrid.com/blog/wp-content/uploads/2010/02/icr_combined.png" alt="" width="650" height="347" /><p class="wp-caption-text">Instance Cloud Reduction - Combined view of Pass 1 + Pass 2</p></div>
<p>This is a very brief description of the culling mechanism so for a complete specification please read the <a title="Instance culling using geometry shaders" href="http://rastergrid.com/blog/2010/02/instance-culling-using-geometry-shaders/">original article</a>.</p>
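<p>As a rough CPU-side illustration of what the culling pass recapped above computes, the sketch below compacts a list of instance positions down to those inside the view frustum. It is not the demo&#8217;s code: the names <code>Vec4</code>, <code>transform</code>, <code>insideFrustum</code> and <code>cullInstances</code> are hypothetical, and testing only the instance position in clip space (rather than a bounding volume) is a simplifying assumption.</p>

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper type for this sketch.
struct Vec4 { float x, y, z, w; };

// Column-major 4x4 matrix times vector, following the OpenGL convention.
inline Vec4 transform(const float m[16], const Vec4& v) {
    return {
        m[0] * v.x + m[4] * v.y + m[8]  * v.z + m[12] * v.w,
        m[1] * v.x + m[5] * v.y + m[9]  * v.z + m[13] * v.w,
        m[2] * v.x + m[6] * v.y + m[10] * v.z + m[14] * v.w,
        m[3] * v.x + m[7] * v.y + m[11] * v.z + m[15] * v.w
    };
}

// Clip-space test: a point is inside the frustum if -w <= x, y, z <= w.
inline bool insideFrustum(const Vec4& c) {
    return -c.w <= c.x && c.x <= c.w &&
           -c.w <= c.y && c.y <= c.w &&
           -c.w <= c.z && c.z <= c.w;
}

// Keeps only the visible instances. The returned vector plays the role of
// the transform feedback buffer and its size that of the query result.
inline std::vector<Vec4> cullInstances(const float mvp[16],
                                       const std::vector<Vec4>& instances) {
    assert(mvp != nullptr);
    std::vector<Vec4> visible;
    for (const Vec4& p : instances)
        if (insideFrustum(transform(mvp, p)))
            visible.push_back(p);
    return visible;
}
```

<p>On the GPU the loop body runs once per point primitive, and skipping <code>push_back</code> corresponds to the geometry shader simply not emitting a vertex.</p>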
<h3>Motivation</h3>
<p>While Instance Cloud Reduction is a quite robust technique that can greatly simplify and speed up the rendering of a large amount of instanced geometry, its performance is also limited by some hardware and API restrictions. The most important ones are the following:</p>
<ul>
<li>Needs an extra rendering pass to perform the culling.</li>
<li>Requires the usage of asynchronous queries to determine the number of visible instances.</li>
<li>Uses texture fetching in the vertex shader of the actual drawing pass.</li>
</ul>
<p>The first mentioned drawback means that more draw commands are required that use the output of the first pass as input. This and the second disadvantage may cause stalls due to the fact that the CPU has to wait for the data to be ready before issuing the second pass thus the GPU is not used effectively.</p>
<p>What this improvement tries to solve is the third problem. Texture fetching itself is quite fast in the latest generation of hardware, however it causes some slowdowns anyway due to the latency introduced by texture fetches even though GPUs use some latency hiding techniques.</p>
<p>Instanced arrays provide a way to replace texture fetching with vertex fetching, which is usually done by a different hardware unit that works in sync with the execution of vertex shaders. I expected quite a reasonable speedup from taking advantage of instanced arrays, however we will see that the actual results were far from my initial expectations.</p>
<h3>Implementation</h3>
<p>Traditional vertex fetching happens in a way that one element is fetched from each enabled input attribute buffer and the vertex shader is issued with these values. One element in a vertex attribute buffer can mean up to four floating point or integer values and for each execution of the vertex shader one set of these elements is used. There is an internal counter that is increased after each fetch and the next vertex attribute fetch will use this counter as an index into the buffer object.</p>
<p>While this mechanism is satisfactory for most attributes of a vertex, it is not practical for instance data, as such data belongs to an instance rather than a vertex. Sourcing instance data from vertex attributes with traditional vertex fetching requires a large amount of redundant storage, because the same information has to be repeated for all the vertices belonging to a particular instance. This is not just a waste of memory but also a waste of bandwidth, and it defeats the goal of Instance Cloud Reduction.</p>
<p>Compared to traditional vertex fetching, instanced arrays provide a way to increase the internal counter used as the index into the vertex attribute buffer in a different way, in particular one can set the frequency of increase using a vertex attribute divisor that specifies after how many instances the counter shall be increased. This is a per-attribute property and by setting it to one we end up with exactly what we need: one vertex fetch per instance.</p>
<p>This means that actually we need just a very minor change compared to the original technique, more precisely we replace our texture buffer with a vertex attribute buffer that has a divisor of one and use it as the source of instance data in the vertex shader of the drawing pass.</p>
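<p>The indexing rule described above can be sketched on the CPU as follows. This is only an illustration of the fetch arithmetic, not code from the demo: <code>fetchIndex</code>, <code>replicatedBytes</code> and <code>instancedBytes</code> are hypothetical helpers, while the divisor itself is what OpenGL 3.3 lets you set per attribute with <code>glVertexAttribDivisor</code>.</p>

```cpp
#include <cassert>
#include <cstddef>

// Index of the element fetched from an attribute buffer.
// divisor == 0: regular per-vertex attribute, indexed by the vertex.
// divisor  > 0: instanced attribute, advances once every `divisor` instances.
inline std::size_t fetchIndex(std::size_t vertexIndex,
                              std::size_t instanceIndex,
                              unsigned divisor) {
    return divisor == 0 ? vertexIndex : instanceIndex / divisor;
}

// Storage needed when instance data is replicated for every vertex
// (traditional vertex fetching).
inline std::size_t replicatedBytes(std::size_t instances,
                                   std::size_t verticesPerInstance,
                                   std::size_t bytesPerInstance) {
    return instances * verticesPerInstance * bytesPerInstance;
}

// Storage needed with an instanced array: one element per `divisor` instances.
inline std::size_t instancedBytes(std::size_t instances,
                                  std::size_t bytesPerInstance,
                                  unsigned divisor) {
    assert(divisor > 0);
    return ((instances + divisor - 1) / divisor) * bytesPerInstance;
}
```

<p>For example, 1000 instances of a 300-vertex mesh with a 12-byte position per instance need 3,600,000 bytes when replicated per vertex, but only 12,000 bytes with a divisor of one.</p>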
<h3>Execution results</h3>
<p>As we are not talking about a new technique but just an optimized implementation of the same method, the best way to evaluate it is by comparing the performance of the new version with the original one.</p>
<p>As I mentioned earlier, I expected a reasonable performance increase from replacing texture fetches with vertex fetches, but in practice the difference was not so significant. However, the performance difference between the two implementations can heavily depend on the underlying hardware, so cards from different vendors and GPU generations may behave quite differently. In fact, even driver versions may have an effect on the results.</p>
<div class="wp-caption aligncenter" style="width: 620px"><img class="  " src="http://rastergrid.com/blog/wp-content/uploads/2010/06/comparison.png" alt="" width="610" height="139" /><p class="wp-caption-text">Performance comparison of the old implementation and the presented one on an AMD Radeon HD5770. Scale is in frames per second (higher value is better).</p></div>
<p>Due to a lack of hardware for testing, I checked only one card, namely a Radeon HD5770 with Catalyst 10.6 drivers. I noticed roughly a 10% speedup, as the new version of the Nature demo ran at 100 FPS compared to the 90 FPS observed with the old implementation.</p>
<p>Even though this was not exactly the outcome I expected from the new implementation, the assumption may still hold for older generations of GPUs or for NVIDIA cards. I suspect so because on Shader Model 4.0 cards the hardware implementations of the texture fetching unit and the vertex fetching unit were most probably more distinct than on the latest GPUs. My guess is also that the difference may be larger on NVIDIA cards, as the vertex fetching hardware in SM 4.0 GeForce cards is less flexible than AMD&#8217;s, considering that the first HD series Radeons already had some form of tessellation functionality that requires more freedom from the vertex pushing hardware.</p>
<p>In order to get a better picture about how effective the presented optimization is, I would like to ask all the visitors of this post to try the two releases and send me feedback about it.</p>
<h3>Conclusion</h3>
<p>We&#8217;ve seen how easy it was to take advantage of instanced arrays in an existing implementation of the ICR technique and how it performs on the latest generation of GPUs compared to the previous version. While this small addition provides some benefits, it also comes at a cost and we have to talk about that as well.</p>
<p><strong>Advantages:</strong></p>
<ul>
<li>Eliminates the need for texture fetching in the vertex shader thus improving performance.</li>
<li>Does not compromise the goal and the implementation architecture of the original method.</li>
<li>Frees up one texture unit that was previously reserved for the texture buffer containing the instance data.</li>
</ul>
<p><strong>Disadvantages:</strong></p>
<ul>
<li>Requires OpenGL 3.3 or the <a title="GL_ARB_instanced_arrays" href="http://www.opengl.org/registry/specs/ARB/instanced_arrays.txt" target="_blank" onclick="pageTracker._trackPageview('/outgoing/www.opengl.org/registry/specs/ARB/instanced_arrays.txt?referer=');">GL_ARB_instanced_arrays</a> extension in addition to the OpenGL 3.2 features.</li>
<li>We have to possibly sacrifice multiple vertex input attributes to feed the instance data to the shaders.</li>
</ul>
<p>Most of the mentioned benefits and drawbacks are self-explanatory, however I would like to say a few words about the last mentioned one&#8230;</p>
<p>For the purpose of the showcase I used a simple translation as instance data, which means a single vector of floats. In a real life situation one may need more complex transformation data that can only be stored in a matrix. While in the demo the instance data consumed only one vertex attribute slot, a full transformation matrix would require four of them (not to mention other possible instance attributes). As the maximum number of input attributes is severely limited, usually to 16, the optimization can be applied only when all the vertex and instance attributes fit into this limit.</p>
<p>In the original implementation, where a texture buffer was used as input, this did not cause any problem, as the vertex shader is free to fetch any number of texels from it (still, performance can be a concern in this case). To help in situations when input attribute slots are at a premium, in real life scenarios it is recommended to use quaternions instead of transformation matrices, as they consume half as many attribute slots. Actually this can be a general recommendation, as using quaternions decreases the bandwidth requirements of the instance data fetch, thus increasing performance even in situations when there are enough input attribute slots available.</p>
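<p>To make the attribute budget concrete, here is a minimal sketch of a quaternion based instance transform that fits into two four-component attributes instead of the four needed by a 4x4 matrix. The layout (rotation quaternion, translation and a uniform scale packed into the spare component) is an assumption made for this example, and <code>InstanceTransform</code>, <code>rotate</code> and <code>applyTransform</code> are hypothetical names that do not come from the Nature demo.</p>

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper types for this sketch.
struct Vec3 { float x, y, z; };
struct Quat { float x, y, z, w; };  // unit quaternion, w is the scalar part

// Two vec4 attribute slots: (rotation) and (translation, uniform scale).
struct InstanceTransform {
    Quat  rotation;
    Vec3  translation;
    float scale;
};
static_assert(sizeof(InstanceTransform) == 8 * sizeof(float),
              "two vec4 slots instead of the four a mat4 would need");

inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// Rotates v by the unit quaternion q: v' = v + w*t + u x t, t = 2 * (u x v),
// where u is the vector part of q.
inline Vec3 rotate(const Quat& q, const Vec3& v) {
    assert(std::fabs(q.x * q.x + q.y * q.y + q.z * q.z + q.w * q.w - 1.0f) < 1e-4f);
    const Vec3 u{ q.x, q.y, q.z };
    const Vec3 c = cross(u, v);
    const Vec3 t{ 2.0f * c.x, 2.0f * c.y, 2.0f * c.z };
    const Vec3 ut = cross(u, t);
    return { v.x + q.w * t.x + ut.x,
             v.y + q.w * t.y + ut.y,
             v.z + q.w * t.z + ut.z };
}

// Scale, then rotate, then translate; the same math a vertex shader would do.
inline Vec3 applyTransform(const InstanceTransform& t, const Vec3& v) {
    const Vec3 r = rotate(t.rotation, Vec3{ v.x * t.scale, v.y * t.scale, v.z * t.scale });
    return { r.x + t.translation.x, r.y + t.translation.y, r.z + t.translation.z };
}
```

<p>The price of the smaller footprint is a few extra arithmetic instructions per vertex and the loss of non-uniform scale and shear, which a full matrix can express.</p>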
<p>In order to ease the performance comparison for you, you can find download links for both versions of the Nature demo.</p>
<h3>Old version binary release</h3>
<p><strong>Platform:</strong> Windows<br />
<strong>Dependency:</strong> OpenGL 3.2 capable graphics driver<br />
<strong>Download link:</strong> <a href="http://rastergrid.com/blog/wp-content/uploads/2010/06/nature12_win32.zip">nature12_win32.zip (3.58MB)</a><br />
<strong>Comments:</strong> This version does <strong>NOT </strong>include the optimization presented in this article.</p>
<h3>Old version source code</h3>
<p><strong>Language: <span style="font-weight: normal;">C++</span><br />
Platform:</strong> cross-platform<br />
<strong>Dependency:</strong> GLEW, SFML, GLM<br />
<strong>Download link:</strong> <a href="http://rastergrid.com/blog/wp-content/uploads/2010/06/nature12_src.zip">nature12_src.zip (12.6KB)</a><br />
<strong>Comments:</strong> This version does <strong>NOT </strong>include the optimization presented in this article.</p>
<h3>New version binary release</h3>
<p><strong>Platform:</strong> Windows<br />
<strong>Dependency:</strong> OpenGL 3.3 capable graphics driver<br />
<strong>Download link:</strong> <a href="http://rastergrid.com/blog/wp-content/uploads/2010/06/nature20_win32.zip">nature20_win32.zip (3.58MB)</a><br />
<strong>Comments:</strong> This version includes the optimization presented in this article.</p>
<h3>New version source code</h3>
<p><strong>Language:</strong> C++<br />
<strong>Platform:</strong> cross-platform<br />
<strong>Dependency:</strong> GLEW, SFML, GLM<br />
<strong>Download link:</strong> <a href="http://rastergrid.com/blog/wp-content/uploads/2010/06/nature20_src.zip">nature20_src.zip (12.8KB)</a><br />
<strong>Comments:</strong> This version includes the optimization presented in this article.</p>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2010/06/instance-cloud-reduction-reloaded/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Instance culling using geometry shaders</title>
		<link>http://rastergrid.com/blog/2010/02/instance-culling-using-geometry-shaders/</link>
		<comments>http://rastergrid.com/blog/2010/02/instance-culling-using-geometry-shaders/#comments</comments>
		<pubDate>Mon, 08 Feb 2010 22:58:53 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Graphics]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Samples]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[culling]]></category>
		<category><![CDATA[fragment shader]]></category>
		<category><![CDATA[geometry instancing]]></category>
		<category><![CDATA[geometry shader]]></category>
		<category><![CDATA[GLEW]]></category>
		<category><![CDATA[GLM]]></category>
		<category><![CDATA[GLSL]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[OpenGL]]></category>
		<category><![CDATA[SFML]]></category>
		<category><![CDATA[texture buffer]]></category>
		<category><![CDATA[transform feedback]]></category>
		<category><![CDATA[uniform buffer]]></category>
		<category><![CDATA[vertex buffer]]></category>
		<category><![CDATA[vertex shader]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=135</guid>
		<description><![CDATA[Since the appearance of Shader Model 4.0 people wonder how to take advantage of the newly introduced programmable pipeline stage. The most important feature enabled by geometry shaders is that one can change the amount of emitted primitives inside the pipeline. The first thing that a naive developer would try to do with it is]]></description>
			<content:encoded><![CDATA[
<div id="attachment_136" class="wp-caption alignleft" style="width: 160px"><a href="http://rastergrid.com/blog/wp-content/uploads/2010/02/Nature-2010-02-08-20-20-36-24.png"><img class="size-thumbnail wp-image-136  " title="Nature demo screenshot" src="http://rastergrid.com/blog/wp-content/uploads/2010/02/Nature-2010-02-08-20-20-36-24-150x150.png" alt="Nature demo screenshot" width="150" height="150" /></a><p class="wp-caption-text">OpenGL 3.2 - Nature</p></div>
<p>Since the appearance of Shader Model 4.0, people have wondered how to take advantage of the newly introduced programmable pipeline stage. The most important feature enabled by geometry shaders is that one can change the number of emitted primitives inside the pipeline. The first thing a naive developer would try to do with it is geometry tessellation. However, the new shader stage performs very badly when used for tessellation in a real-life scenario, even though there are demos showcasing this possibility. If we take a closer look at the new feature, we see that the most revolutionary thing about it is not that it can raise the number of emitted primitives but that it can discard them. This article presents a rendering technique that takes advantage of this aspect of geometry shaders to enable GPU-accelerated culling of higher-order primitives.</p>
<p><span id="more-135"></span>Geometry shaders can be used for many advanced rendering techniques that were impossible before the introduction of this flexible programmable shader stage. In this article I would like to present one use case that seems to me to be one of the most practical applications of the primitive manipulation possibilities introduced by geometry shaders. As I haven&#8217;t seen any whitepaper that talks specifically about this technique, even though some of them implicitly use it, I dare to name it myself: <strong>Instance Cloud Reduction</strong>. I will also present a demo program that shows how to take advantage of the technique in a heavy workload situation.</p>
<p>The idea itself was inspired by AMD&#8217;s tech demo for the Radeon 4800 series cards called <a title="March of the Froblins" href="http://developer.amd.com/samples/demos/pages/froblins.aspx" target="_blank" onclick="pageTracker._trackPageview('/outgoing/developer.amd.com/samples/demos/pages/froblins.aspx?referer=');">March of the Froblins</a>. A technique almost identical to the one presented in this article is used in that demo to cull a large number of animated creatures against the view frustum. A somewhat similar technique is also used in NVIDIA&#8217;s <a title="Skinned Instancing" href="http://developer.download.nvidia.com/SDK/10/direct3d/samples.html" target="_blank" onclick="pageTracker._trackPageview('/outgoing/developer.download.nvidia.com/SDK/10/direct3d/samples.html?referer=');">Skinned Instancing</a> demo for determining LOD instance sets. Unfortunately, both demos are for DirectX only and, as far as I can tell, there is no OpenGL demo showing any of the aforementioned rendering techniques.</p>
<h3>Motivation</h3>
<p>Nowadays, as the computational capabilities of GPUs are growing at a much faster pace than those of CPUs, graphics developers meet more and more optimization problems related to CPU bound applications. More and more focus is on minimizing the number of driver invocations; in fact, that&#8217;s what motivated the restructuring of the two most commonly used graphics APIs. As a result we now have DirectX 10+ and OpenGL 3+. However, even with the introduction of geometry instancing, texture arrays and local memory buffer storage for the most important rendering inputs, graphics programmers still need to make wise decisions to take full advantage of the horsepower of the latest GPUs.</p>
<p>Earlier graphics applications strongly relied on CPU based culling techniques, whether it be the usage of the quite outdated BSPs or the more generic and still heavily applied hierarchical culling techniques. We&#8217;ve already reached the point where sometimes even the most efficient CPU based culling techniques seem to be too expensive and usually introduce the small batch problem. Instanced rendering is no exception.</p>
<p>The applicability of geometry instancing is strongly limited by several factors. One of the most important is the culling of instanced geometry. One may choose to cull these objects in the same fashion as others, using the CPU, but that usually breaks the batch and we may lose the benefits of geometry instancing. A GPU based alternative is therefore becoming more and more necessary. Without CPU based culling, sending the whole set of instances down the graphics pipeline may choke the vertex processor when we have high-poly geometry and a large number of instances of it.</p>
<p>The rendering technique presented in this article tries to provide such an alternative. We will use a multi-pass technique that in the first pass culls the object instances against the view frustum using the GPU and in the second pass renders only those instances that are likely to be visible in the final scene. This way we can greatly reduce the amount of vertex data sent through the graphics pipeline.</p>
<h3>Implementation</h3>
<p>To some people the promise of such a technique might seem too naive, and they might expect it to rely on very exotic OpenGL features, heavy misuse of some basic features, or data conversions during frame rendering. Fortunately, this is not the case, as OpenGL 3.2 has everything we need to implement the object culling method sketched above. All we need is the following:</p>
<ul>
<li>instanced rendering (core since OpenGL 3.1)</li>
<li>geometry shaders (core since OpenGL 3.2)</li>
<li>transform feedback (core since OpenGL 3.0)</li>
<li>uniform or texture buffers (core since OpenGL 3.1)</li>
</ul>
<p>The method itself is a multi-pass rendering technique. However, unlike other multi-pass rendering techniques, it does not produce any fragments in the first pass. Instead, the first pass performs the view frustum culling and processes data entirely inside buffer objects.</p>
<h3>Culling pass</h3>
<p>In the first pass we feed the graphics pipeline with the per-instance information needed to perform the view frustum culling. The executed shaders need two inputs to be able to perform the required calculations:</p>
<ol>
<li><strong>Instance transformation data</strong> (whether it be a simple transformation matrix or quaternions or whatever) -- This preferably comes from one or more buffer objects that are bound as vertex buffers to the context.</li>
<li><strong>Object extents information</strong> -- Besides the instance positions we have to know the extents of an instance in order to perform correct culling. This can be either a single float representing the object radius, if we choose to use bounding spheres for the culling, or a three-dimensional extent vector, if we would like to use bounding boxes.</li>
</ol>
<p>With these inputs we can feed the instance transformation data to our culling shader as attributes of point primitives. The culling shader is composed of a vertex and a geometry shader. In a typical setup their roles are the following: the vertex shader determines whether the bounding volume of the current object instance is inside the view frustum and passes a visibility flag to the geometry shader. The geometry shader emits the instance data to the destination buffer if the flag says that the instance is likely to be visible, and emits nothing if the object instance was found to be out of view.</p>
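<p>To make the vertex shader&#8217;s job more concrete, here is a minimal CPU-side sketch in C++ of a bounding sphere test against six frustum planes. It is not taken from the demo: the <code>Vec3</code>, <code>Plane</code> and <code>Frustum</code> types, the function names and the box-shaped test frustum are all assumptions made for illustration, and the planes are assumed to have unit-length normals pointing into the frustum. A shader version would perform the same arithmetic per instance.</p>

```cpp
#include <array>
#include <cassert>

// Hypothetical types for illustration only; not taken from the demo source.
struct Vec3 { float x, y, z; };

// Plane n.p + d = 0 with a unit-length normal pointing into the frustum.
struct Plane { Vec3 n; float d; };

using Frustum = std::array<Plane, 6>;

// Signed distance of point p from the plane; positive means "inside".
inline float signedDistance(const Plane& pl, const Vec3& p) {
    return pl.n.x * p.x + pl.n.y * p.y + pl.n.z * p.z + pl.d;
}

// Returns 1 if the bounding sphere may intersect the frustum and 0 if it lies
// completely behind at least one plane. The test is conservative: a sphere
// near a frustum corner can be kept although it is outside, but a visible
// sphere is never rejected.
inline int objectVisible(const Frustum& frustum, const Vec3& center, float radius) {
    assert(radius >= 0.0f);
    for (const Plane& pl : frustum) {
        if (signedDistance(pl, center) < -radius) {
            return 0;
        }
    }
    return 1;
}

// Builds an axis-aligned box "frustum" spanning [-h, h] on every axis, which
// is enough to exercise the test without deriving planes from a projection
// matrix.
inline Frustum makeBoxFrustum(float h) {
    return Frustum{{
        {{ 1.0f,  0.0f,  0.0f}, h}, {{-1.0f,  0.0f,  0.0f}, h},
        {{ 0.0f,  1.0f,  0.0f}, h}, {{ 0.0f, -1.0f,  0.0f}, h},
        {{ 0.0f,  0.0f,  1.0f}, h}, {{ 0.0f,  0.0f, -1.0f}, h},
    }};
}
```

<p>The return value corresponds to the visibility flag that the vertex shader hands to the geometry shader. For bounding boxes the same loop could be used with the sphere radius replaced by the projection of the extent vector onto each plane normal.</p>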
<p>Next, transform feedback is used to capture the primitives emitted by the geometry shader into another buffer object that will be used in the actual rendering pass to source instance transformation data. Besides this, we also need an asynchronous query that counts the primitives generated, so that we know how many instances of the object we actually need to render. The following figure shows the workflow of the first pass:</p>
<div id="attachment_146" class="wp-caption aligncenter" style="width: 460px"><a href="http://rastergrid.com/blog/wp-content/uploads/2010/02/icr_pass1.png"><img class="size-full wp-image-146" title="Culling pass" src="http://rastergrid.com/blog/wp-content/uploads/2010/02/icr_pass1.png" alt="Culling pass" width="450" height="200" /></a><p class="wp-caption-text">Instance Cloud Reduction - Pass 1: Culling</p></div>
<p>A geometry shader that performs the culling based on the view frustum check done by the vertex shader could look like the following:</p>
<pre class="brush: c">#version 150 core

layout(points) in;
layout(points, max_vertices = 1) out;

in vec4 OrigPosition[1];
flat in int objectVisible[1];

out vec4 CulledPosition;

void main() {

	/* only emit primitive if the object is visible */
	if ( objectVisible[0] == 1 )
	{
		CulledPosition = OrigPosition[0];
		EmitVertex();
		EndPrimitive();
	}
}</pre>
<p>In this example we used only a simple four-component position vector for the instance transformation data, but the technique works just as well for transformation matrices and quaternions.</p>
<p>One more thing: besides setting up transform feedback to write into the buffer object dedicated to the culled instance data and starting an asynchronous query to determine the number of primitives written into that buffer, it is also useful to turn off rasterization, as we don&#8217;t want to produce any fragments in the first pass.</p>
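<p>The net effect of the culling pass can be emulated on the CPU, which may help when validating a GPU implementation. The C++ sketch below is only such an emulation under assumed names (<code>Vec4</code>, <code>cullInstances</code>); it is not the demo&#8217;s code. On the GPU the same result would typically be obtained by drawing the instances as points between <code>glBeginTransformFeedback</code> and <code>glEndTransformFeedback</code>, with a <code>GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN</code> query active and <code>GL_RASTERIZER_DISCARD</code> enabled.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instance record: a four-component position, as in the article.
struct Vec4 { float x, y, z, w; };

// CPU stand-in for pass 1. visible[i] plays the role of the objectVisible
// flag computed by the vertex shader, and `culled` plays the role of the
// buffer object that transform feedback writes into. The return value is the
// number of records written, which is what the asynchronous query reports.
inline std::size_t cullInstances(const std::vector<Vec4>& instances,
                                 const std::vector<int>& visible,
                                 std::vector<Vec4>& culled) {
    assert(instances.size() == visible.size());
    culled.clear();
    for (std::size_t i = 0; i < instances.size(); ++i) {
        if (visible[i] == 1) {
            // Corresponds to EmitVertex()/EndPrimitive() in the geometry
            // shader: only visible instances reach the output buffer.
            culled.push_back(instances[i]);
        }
    }
    return culled.size();
}
```

<p>The returned count is the value that the second pass would hand to the instanced draw call as its instance count, with <code>culled</code> serving as the instance data source.</p>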
<h3>Rendering pass</h3>
<p>In the second pass there is nothing special to do. Simply use whatever rendering setup you would like to use. The only things that need to change compared to your existing rendering path are that the instance data must be sourced from the generated culled instance data buffer and, as a result, the number of instances passed to the instanced drawing functions must be changed so that only the visible instances are rendered. This number can be read from the result of the asynchronous query that we started in the first pass.</p>
<p>The instance data in the rendering pass can be, of course, sourced from either a uniform or a texture buffer object. This depends on the actual use case and is more clearly explained in the article <a href="http://rastergrid.com/blog/2010/01/uniform-buffers-vs-texture-buffers/">Uniform Buffers VS Texture Buffers</a>.</p>
<p>An important note: when one has to deal with several instanced geometries, it is recommended to do the culling phase for all of them prior to rendering any instanced primitives, for the following reasons:</p>
<ul>
<li>The result of the first instance cloud&#8217;s culling is more likely to be finished on the GPU so no sync issues arise from reading the asynchronous query result to determine the number of visible instances.</li>
<li>Fewer state changes are probably needed, as the two passes require very different setups.</li>
<li>Results in tidier renderer design as culling is clearly separated from actual rendering.</li>
</ul>
<p>Putting everything together, the application of the presented technique would result in the following workflow on the GPU:</p>
<div id="attachment_150" class="wp-caption aligncenter" style="width: 660px"><a href="http://rastergrid.com/blog/wp-content/uploads/2010/02/icr_combined.png"><img class="size-full wp-image-150" title="Instance Cloud Reduction" src="http://rastergrid.com/blog/wp-content/uploads/2010/02/icr_combined.png" alt="Instance Cloud Reduction" width="650" height="347" /></a><p class="wp-caption-text">Instance Cloud Reduction - Combined view of Pass 1 + Pass 2</p></div>
<h3>Conclusion</h3>
<p>We&#8217;ve seen that the presented rendering technique can help in situations where we have to deal with a large number of instanced geometries, and how to take advantage of the latest features of graphics cards and OpenGL to perform view frustum culling calculations on the GPU. This saves us from having to deal with complicated and expensive CPU based object culling methods that break the drawing batches, especially when dealing with dynamic objects. To ease the decision whether to incorporate this technique in your rendering engine, I would like to present its advantages and disadvantages.</p>
<p><strong>Advantages:</strong></p>
<ul>
<li>Heavily reduces the amount of data processed compared to a naive implementation.</li>
<li>No need for any space partitioning methods in the host application to handle the culling of dynamic objects.</li>
<li>Can handle a huge number of instanced objects due to the enormous horsepower of today&#8217;s GPUs.</li>
<li>Scales well with an increased number of instances as the per-instance calculation cost is relatively low.</li>
<li>Relies strictly on OpenGL 3.2 core features.</li>
<li>No need for OpenCL capable hardware.</li>
</ul>
<p><strong>Disadvantages:</strong></p>
<ul>
<li>Needs an extra rendering pass to perform the culling.</li>
<li>Requires the usage of asynchronous queries to determine the number of visible instances.</li>
</ul>
<p>I hope you agree with me and see this technique as one more step towards fully GPU based scene management. If you have any remarks or improvement ideas regarding the rendering technique itself, feel free to tell me.</p>
<h3>The Demo</h3>
<p>As I promised, the technique presented above comes with a live demo that actually took most of the time I dedicated to this blog in the last two weeks. The demo itself is more of a technical showcase than a presentation of a real-life use case scenario.</p>
<p>First of all, I used high polygon count models for the rendering to emphasize how much valuable GPU time the culling phase saves. In a real-world application one would never do something like this. As a result, the demo is more like a benchmark than an interactive application. However, on high-end graphics cards it may perform quite well.</p>
<p>The demo scene consists of two object types: trees and grass blocks. The tree model is further divided into two parts as they need different textures: the tree trunk and the tree foliage. Obviously, this additional burden could be avoided by using texture arrays, which would remove the need for separate draw calls to render the trunk and the foliage.</p>
<p>The tree trunk consists of 33138 triangles, the tree foliage has 16069 triangles and the fakery-free grass block consists of 8961 triangles, which I had to model myself as I didn&#8217;t find any suitable model. Actually this modeling step consumed a considerable part of the time I spent on the demo, as I&#8217;m not an expert in this domain. As you can see, these models are not the ones that one might use in an interactive real-time application like a game. However, they seemed to be very suitable for the purpose of the demonstration.</p>
<p>What really pushes the limits of GPUs is that the demo renders 10,000 trees and 250,000 grass blocks using instancing. This ends up in more than <strong>2.7 billion triangles</strong> in the scene. This is far more than a GPU can handle without the aid of some scene management and culling. However, we will use no scene management at all and the only culling method that we will use is the one presented in this article.</p>
<p>The actual results are quite promising. The view frustum culling step usually saves more than <strong>99.9%</strong> of the GPU horsepower, as the number of triangles actually rendered after the culling step is far below 2 million. This is still quite a lot, but as we use high polygon count models and we don&#8217;t use any LOD techniques this seems reasonable.</p>
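<p>The scene totals quoted above can be checked with a few lines of C++. The snippet only restates the arithmetic from the numbers given in this article; the constant and function names are made up for illustration, and the 2 million figure is the upper bound mentioned above, not a measured value.</p>

```cpp
#include <cassert>
#include <cstdint>

// Per-model triangle counts and instance counts as stated in the article.
constexpr std::uint64_t kTrunkTriangles   = 33138;
constexpr std::uint64_t kFoliageTriangles = 16069;
constexpr std::uint64_t kGrassTriangles   = 8961;
constexpr std::uint64_t kTreeInstances    = 10000;
constexpr std::uint64_t kGrassInstances   = 250000;

// Total number of triangles submitted if nothing were culled.
inline std::uint64_t sceneTriangles() {
    return kTreeInstances * (kTrunkTriangles + kFoliageTriangles)
         + kGrassInstances * kGrassTriangles;
}

// Fraction of the scene's triangles removed by culling, given how many
// triangles are still rendered afterwards.
inline double culledFraction(std::uint64_t rendered, std::uint64_t total) {
    assert(total > 0 && rendered <= total);
    return 1.0 - static_cast<double>(rendered) / static_cast<double>(total);
}
```

<p>With these numbers the scene contains 2,732,320,000 triangles, and rendering at most 2 million of them means that more than 99.9% of the triangles are culled.</p>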
<p>Even if the demo scene statistics don&#8217;t look like a typical use case scenario, the ease of the implementation and the compelling visual results pleased me anyway:</p>
<p style="text-align: center;"><span class="youtube">
<object width="640" height="480">
<param name="movie" value="http://www.youtube.com/v/srbOFTLTe8k?color1=3a3a3a&amp;color2=999999&amp;border=0&amp;fs=1&amp;hl=en&amp;modestbranding=1&amp;loop=&amp;showinfo=0&amp;iv_load_policy=3&amp;showsearch=0&amp;rel=1&amp;hd=1" />
<param name="allowFullScreen" value="true" />
<embed wmode="opaque" src="http://www.youtube.com/v/srbOFTLTe8k?color1=3a3a3a&amp;color2=999999&amp;border=0&amp;fs=1&amp;hl=en&amp;modestbranding=1&amp;loop=&amp;showinfo=0&amp;iv_load_policy=3&amp;showsearch=0&amp;rel=1&amp;hd=1" type="application/x-shockwave-flash" allowfullscreen="true" width="640" height="480"></embed>
<param name="wmode" value="opaque" />
</object>
</span><p><a href="http://www.youtube.com/watch?v=srbOFTLTe8k&fmt=18" onclick="pageTracker._trackPageview('/outgoing/www.youtube.com/watch?v=srbOFTLTe8k_fmt=18&amp;referer=');">www.youtube.com/watch?v=srbOFTLTe8k</a></p></p>
<p>On my Radeon HD2600XT I achieved 6-7 frames per second, which is acceptable taking into consideration the huge amount of geometry data still passed to the graphics card. On more recent cards I suppose it should run at good frame rates; however, due to the lack of hardware to test on, these are my only results. If anybody manages to take a better screen capture than mine above, please let me know.</p>
<h3>Implementation details</h3>
<p>To say a few words about the techniques and tricks I used during the creation of the demo, here is a list of the most important ones:</p>
<ul>
<li>As mentioned previously, three models are used with high instance counts, giving a total of over 2.7 billion triangles in the scene.</li>
<li>Three 512x512 RGBA textures are used for the models. They are partially handmade and, again, I&#8217;m not a texture artist, so sorry if they don&#8217;t look flawless.</li>
<li>The wavefront model and TGA image loaders that accompany the demo are very roughly implemented only for the demo, so I would strongly encourage you not to use them for any other purpose, as they handle only a subset of the possibilities of the file formats.</li>
<li>The vertex data from the wavefront model files is transferred in a very naive way so vertex reuse isn&#8217;t taken into account.</li>
<li>The instance data consists of simple four-component vectors representing the world-space position of the instance. This seemed to be the simplest choice for demonstration purposes.</li>
<li>In the second pass, the instance data is sourced from a texture buffer, though not because the visible instance count exceeded what would fit in a uniform buffer. I used texture buffers because, for this simple demonstration, they seemed a little easier to integrate.</li>
<li>The morphing effect that simulates wind is done using hard-coded geometry deformation in the vertex shader. It is not physically correct but visually compelling.</li>
<li>The lighting is a simple directional light using the Phong shading and reflection model.</li>
<li>Simple fog is simulated with some awkward formula that I&#8217;ve chosen after a few test runs.</li>
<li>Alpha testing is achieved by using the discard operation in the fragment shader.</li>
</ul>
<h3>Driver issues</h3>
<p>During the development of the demonstration program I met several driver related problems, as I had never used the latest OpenGL features so heavily before. I worked with Catalyst 9.12 and 10.1 but both seemed to lack a proper GLSL compiler. Here are some of the issues I met:</p>
<ul>
<li>When I forgot to declare the varyings in the geometry shader as arrays, as the standard requires, the driver didn&#8217;t complain about any syntax error, but the program crashed when it tried to execute the code.</li>
<li>Except for the texture sampler uniform, all other uniforms failed to work when used in the fragment shader only, so I&#8217;ve put them all in the vertex shader.</li>
<li>For loops seemed not to work when used inside the geometry shader, that&#8217;s why the culling itself is done in the vertex shader in the demo.</li>
</ul>
<p>All these problems resulted in nasty tricks to make things work and ended up in awful shader code. Sorry for that. At least now it works on my configuration, but I&#8217;m pretty unsure whether it will work on other graphics card and driver combos. Please report any success or failure when trying out the demo. Anyway, be sure to have the latest graphics drivers installed as, at least in the case of AMD, OpenGL 3.2 drivers came out only in the fall of 2009.</p>
<p><em><strong>Edit:</strong></em></p>
<p><em>Thanks to the information from Pierre Boudier from AMD I&#8217;ve updated both the source and binary releases to support the latest drivers properly. The problem was that I didn&#8217;t use attribute location binding as specified in the standard.</em></p>
<p><em>I also have to mention that with my new Radeon HD5770 I managed to achieve over 90 frames per second, which shows that this technique can in fact be used for games and other interactive applications.</em></p>
<p><em>One more thing in the end. As you know, this version of the Nature demo uses a texture buffer to source instance positions. I plan to create another version that will take advantage of the instanced arrays introduced in core with OpenGL 3.3. I expect quite a reasonable speedup, as that would eliminate the need for texture fetches in the vertex shader by dedicating a vertex fetcher to the purpose, thus increasing the overall performance of the technique.</em></p>
<h3>Binary release</h3>
<p><strong>Platform:</strong> Windows<br />
<strong>Dependency:</strong> OpenGL 3.2 capable graphics driver<br />
<strong>Download link:</strong> <a href="http://rastergrid.com/blog/wp-content/uploads/2010/06/nature12_win32.zip" target="_blank">nature12_win32.zip (3.58MB)<br />
</a><strong>Comments:</strong> Includes the update that makes it work even with the latest drivers.</p>
<h3>Full source code</h3>
<p><strong>Language:</strong> C++<br />
<strong>Platform:</strong> cross-platform<br />
<strong>Dependency:</strong> GLEW, SFML, GLM<br />
<strong>Download link:</strong> <a href="http://rastergrid.com/blog/wp-content/uploads/2010/06/nature12_src.zip" target="_blank">nature12_src.zip (12.6KB)<br />
</a><strong>Comments:</strong> Sorry for the many dependencies, however, I would recommend the mentioned libraries for everybody who is doing OpenGL development.</p>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2010/02/instance-culling-using-geometry-shaders/feed/</wfw:commentRss>
		<slash:comments>43</slash:comments>
		</item>
	</channel>
</rss>

