<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>RasterGrid Blog</title>
	<atom:link href="http://rastergrid.com/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://rastergrid.com/blog</link>
	<description>A technical blog from Daniel Rákos (aka aqnuep)</description>
	<lastBuildDate>Sun, 24 Mar 2013 20:31:22 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>AdMob extension for cocos2d-x (Android)</title>
		<link>http://rastergrid.com/blog/2013/03/admob-extension-for-cocos2d-x-android/</link>
		<comments>http://rastergrid.com/blog/2013/03/admob-extension-for-cocos2d-x-android/#comments</comments>
		<pubDate>Sun, 24 Mar 2013 20:31:22 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Games]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[AdMob]]></category>
		<category><![CDATA[Android]]></category>
		<category><![CDATA[cocos2d-x]]></category>
		<category><![CDATA[extension]]></category>
		<category><![CDATA[JNI]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=789</guid>
		<description><![CDATA[Lately I switched from Java-based Android development to native C++ code and started using the famous cocos2d-x framework to implement my second Android game, Henhouse Trouble, which I released almost a year ago. At that time I had quite some trouble interfacing with third-party Android libraries like the AdMob SDK. Finally, I managed&#8230;]]></description>
				<content:encoded><![CDATA[
<p>Lately I switched from Java-based Android development to native C++ code and started using the famous <a title="cocos2d-x" href="http://www.cocos2d-x.org/" target="_blank">cocos2d-x</a> framework to implement my second Android game, <a title="Henhouse Trouble on Google Play" href="https://play.google.com/store/apps/details?id=com.rastergrid.game.henhousetrouble&amp;hl=en" target="_blank">Henhouse Trouble</a>, which I released almost a year ago. At that time I had quite some trouble interfacing with third-party Android libraries like the AdMob SDK. Eventually I got a working solution based on <a title="OpenFeint and Admob integrated with cocos2d-x" href="http://blog.molioapp.com/2011/11/openfeint-and-admob-integrated-with.html" target="_blank">this article</a>. While the article was extremely helpful in getting my own solution up and running, given my minimal experience with JNI, I thought that AdMob SDK integration is a common enough task to deserve its own easy-to-integrate cocos2d-x extension. In this article I&#8217;ll present a step-by-step guide on how to integrate AdMob advertisements into your cocos2d-x 2.x.x application using the extension.</p>
<p><span id="more-789"></span></p>
<p>I&#8217;ll first go through the generic AdMob integration steps, which are also prerequisites for setting up AdMob with cocos2d-x.</p>
<h2>Step 1 &#8211; Adding the AdMob SDK library to your project</h2>
<p>First, you have to add GoogleAdMobAdsSdk-x.x.x.jar to your project. In Eclipse, go to <em>Project -&gt; Properties -&gt; Java Build Path</em>, click the <em>Add JARs&#8230;</em> button on the Libraries tab, and select the jar file in the file browser. I like to keep the jar file inside my source tree, so you probably want to copy the file to your project&#8217;s <em>libs</em> folder first, before adding it in the Eclipse GUI.</p>
<h2>Step 2 &#8211; Setting up your AndroidManifest.xml</h2>
<p>The second step is to set up your Android manifest file. In order for the AdMob SDK to be able to present the ads you have to add the following activity definition under your <em>manifest/application</em> element in the XML:</p>
<pre class="brush:xml">&lt;activity android:name="com.google.ads.AdActivity" android:configChanges="keyboard|keyboardHidden|orientation"/&gt;</pre>
<p>Next, you have to declare the Android permissions that the SDK needs to work. At a bare minimum, the <em>INTERNET</em> and <em>ACCESS_NETWORK_STATE</em> permissions are required. You can request them by adding the following two items under the <em>manifest</em> element in your XML:</p>
<pre class="brush:xml">&lt;uses-permission android:name="android.permission.INTERNET"/&gt;
&lt;uses-permission android:name="android.permission.ACCESS_NETWORK_STATE"/&gt;</pre>
<p>For some configurations you might need to add further permissions to your application; I&#8217;ll come back to those when discussing location based ads.</p>
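<p>Putting the pieces of this step together, a minimal manifest could look like the sketch below. The package name, the application label and the <em>MyGameActivity</em> launcher activity are placeholders chosen for illustration, not something the AdMob SDK or the extension requires; only the <em>AdActivity</em> declaration and the two permissions come from the steps above.</p>
<pre class="brush:xml">&lt;?xml version="1.0" encoding="utf-8"?&gt;
&lt;manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.mygame"&gt; &lt;!-- placeholder package name --&gt;

    &lt;!-- permissions go directly under the manifest element --&gt;
    &lt;uses-permission android:name="android.permission.INTERNET"/&gt;
    &lt;uses-permission android:name="android.permission.ACCESS_NETWORK_STATE"/&gt;

    &lt;application android:label="My Game"&gt;
        &lt;!-- your own game activity (placeholder name) --&gt;
        &lt;activity android:name=".MyGameActivity"&gt;
            &lt;intent-filter&gt;
                &lt;action android:name="android.intent.action.MAIN"/&gt;
                &lt;category android:name="android.intent.category.LAUNCHER"/&gt;
            &lt;/intent-filter&gt;
        &lt;/activity&gt;

        &lt;!-- activity used by the AdMob SDK to present ads --&gt;
        &lt;activity android:name="com.google.ads.AdActivity"
            android:configChanges="keyboard|keyboardHidden|orientation"/&gt;
    &lt;/application&gt;
&lt;/manifest&gt;</pre>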
<h2>Step 3 &#8211; Adding extension files to your project</h2>
<p>The source tree of the extension contains two directories: <a href="https://code.google.com/p/cocos2d-x-admob/source/browse/#hg%2Ftrunk%2Fjava" target="_blank">java</a> and <a href="https://code.google.com/p/cocos2d-x-admob/source/browse/#hg%2Ftrunk%2Fcpp" target="_blank">cpp</a>. They contain the files needed for your Eclipse project and for your NDK build, respectively.</p>
<p>First, you have to copy the contents of the <em>java</em> folder to the <em>src</em> folder of your Eclipse project. The only modification you need to make to your Java Android project is to instantiate an <a href="https://code.google.com/p/cocos2d-x-admob/source/browse/trunk/java/com/rastergrid/AdMobHelper.java" target="_blank">AdMobHelper</a> object in your activity&#8217;s <em>onCreate</em> method. After that your activity&#8217;s class should look something like the following example:</p>
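<pre class="brush:java">import android.os.Bundle;

import org.cocos2dx.lib.Cocos2dxActivity;

// package inferred from the source path of the extension
import com.rastergrid.AdMobHelper;

public class MyGameActivity extends Cocos2dxActivity {

    // keeps the helper alive for the lifetime of the activity
    protected AdMobHelper mAdMobHelper;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        // sets up the Java side of the extension for this activity
        mAdMobHelper = new AdMobHelper(this);
    }

    static {
        // load the native library built by the NDK
        System.loadLibrary("myNativeGameLib");
    }
}</pre>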
<p>That&#8217;s all for the Java part. Now you have to copy the contents of the <em>cpp</em> folder to your C++ source tree (e.g. directly under your <em>Classes</em> folder). If you are using Visual Studio to test your application under Windows, make sure you also add the files to your Visual Studio project.</p>
<p>In order for the new C++ files to get compiled into your native library, add the source file to the list of files to compile in your NDK makefile, which you can find in your Eclipse project at <em>jni/Android.mk</em>. After adding the file to the list, your makefile should look something like this:</p>
<pre class="brush:text">...
LOCAL_SRC_FILES := hellocpp/main.cpp \
                   ../../Classes/AppDelegate.cpp \
                   ../../Classes/CCAdView.cpp \
...</pre>
<p>That&#8217;s all for the generic integration steps of the extension. Now let&#8217;s see how to use it.</p>
<h2>Basic usage</h2>
<p>The extension adds a new node type to the <em>cocos2d</em> namespace called <a href="https://code.google.com/p/cocos2d-x-admob/source/browse/trunk/cpp/CCAdView.h" target="_blank">CCAdView</a>. To present your ad, you just have to create an instance of this node and add it to the cocos2d-x layer in which you want to show it. To do so, you need two important pieces of information from the time you created your ad unit in the AdMob web interface: the ad type you&#8217;ve created and the identifier of the ad unit. With this information at hand, you just have to add the following code to the <em>init</em> method of your layer:</p>
<pre class="brush:cpp">CCAdView* adView = CCAdView::create(AD_SIZE, AD_UNIT_ID);
this-&gt;addChild(adView, AD_VIEW_Z_ORDER);
adView-&gt;loadAd();</pre>
<p>Here <em>AD_UNIT_ID</em> is the string identifier of the ad unit, while <em>AD_SIZE</em> specifies the type (or size) of the ad unit you&#8217;ve created and can be one of the constants of the following enum (equivalent to the corresponding Java constants of the <em>AdSize</em> class):</p>
<pre class="brush:cpp">typedef enum _CCAdSize
{
    kCCAdSizeSmartBanner,
    kCCAdSizeBanner,
    kCCAdSizeMediumRectangle,
    kCCAdSizeFullBanner,
    kCCAdSizeLeaderboard,
    kCCAdSizeWideSkyscraper
} CCAdSize;</pre>
<p>Remember, this information comes from the AdMob web interface. For example, if you created a simple banner then you should use the <em>kCCAdSizeBanner</em> constant, and the ad unit id will be the 15-character string identifier the AdMob web interface allocated for it.</p>
<p>The value of <em>AD_VIEW_Z_ORDER</em> is actually irrelevant on Android: there the visual item showing the ad is an Android View, not a real cocos2d-x node, so no matter what Z order you specify, the ad will always appear on top of everything rendered by cocos2d-x. However, the extension also comes with a stub Windows implementation that shows a white outlined black rectangle as a placeholder where your ad will appear on an Android device. As that placeholder is a regular cocos2d-x node on Windows, it is recommended to specify a Z order higher than that of any other displayed node, so that what you see on Windows matches what you&#8217;ll see on the actual device.</p>
<p>You probably also noticed that we call the <em>loadAd</em> method of the <em>CCAdView</em> node. It is equivalent to the Java interface&#8217;s <em>loadAd</em> method and tells the AdMob SDK to load a new ad into the ad unit. You can call this method any time you want a new ad to be presented (however, to avoid potential abuse, the SDK doesn&#8217;t guarantee that a new ad will be shown each time you call it).</p>
<h2>Ad placement and visibility</h2>
<p>By default, your ad will be shown in the top-left corner of your application. However, you can change its location by specifying its horizontal and vertical alignment with the <em>setAlignment</em> method. For example, in order to place your ad horizontally centered at the bottom of the screen you have to add the following line to your code:</p>
<pre class="brush:cpp">adView-&gt;setAlignment(kCCHorizontalAlignmentCenter, kCCVerticalAlignmentBottom);</pre>
<p>Currently there is no way to set an explicit position for the ad, but I think for most use cases setting the screen alignment should be sufficient (at least it was for me).<br />
Besides that, you might also want to hide your ad unit from time to time and show it again later. You can use the <em>setVisible</em> method of the <em>CCAdView</em> node just as you would with any other cocos2d-x node. In general, it is good practice to just hide and then show your ad again instead of deleting and re-creating the whole <em>CCAdView</em> node every time. Also, you probably want to load a new ad each time you show your ad unit again. Both can be easily implemented as follows:</p>
<pre class="brush:cpp">// show new ad
adView-&gt;setVisible(true);
adView-&gt;loadAd();
...
// hide ad
adView-&gt;setVisible(false);</pre>
<h2>Location based ads</h2>
<p>The AdMob SDK also allows you to present targeted ads to the users of your application based on their location. This lets you show more relevant ads, which can result in both a better user experience and higher ad revenue. The location used for determining what ads to show can be either the user&#8217;s coarse-grain location (simple mobile network provided data) or fine-grain location (GPS provided location). In general, the former is perfectly acceptable and is probably less intrusive from the user&#8217;s point of view, but for the sake of completeness the extension supports both types.</p>
<p>By default, the user&#8217;s location is not used by AdMob or by this extension. You can check that by using the <em>getUsedLocation</em> method that will return the value <em>kCCLocationNone</em> by default.</p>
<p>If you want to use the user&#8217;s coarse-grain location to increase the relevance of ads you have to add the following line to your code:</p>
<pre class="brush:cpp">adView-&gt;useLocation(kCCLocationCoarse);</pre>
<p>The change made by the <em>useLocation</em> method takes effect the next time you call <em>loadAd</em>.</p>
<p>However, in order to use the coarse-grain location in your application, you also have to request permission to do so in your Android manifest by adding the following line under the <em>manifest</em> element of your XML:</p>
<pre class="brush:xml">&lt;uses-permission android:name="android.permission.ACCESS_COARSE_LOCATION"/&gt;</pre>
<p>In a similar fashion, if you want to use the user&#8217;s fine-grain location, add the following line to your C++ code:</p>
<pre class="brush:cpp">adView-&gt;useLocation(kCCLocationFine);</pre>
<p>In your Android manifest, request the <em>ACCESS_FINE_LOCATION</em> permission instead:</p>
<pre class="brush:xml">&lt;uses-permission android:name="android.permission.ACCESS_FINE_LOCATION"/&gt;</pre>
<p>Note, however, that both of these permissions are optional, and you need at most one of them, depending on the policy you use for location based ads.</p>
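<p>As a sketch, the permission block of a manifest that opts in to coarse location would then contain three entries. The fine location line is shown commented out only to stress that you pick one of the two, matching the constant you pass to <em>useLocation</em>:</p>
<pre class="brush:xml">&lt;!-- always required by the AdMob SDK --&gt;
&lt;uses-permission android:name="android.permission.INTERNET"/&gt;
&lt;uses-permission android:name="android.permission.ACCESS_NETWORK_STATE"/&gt;

&lt;!-- optional: matches useLocation(kCCLocationCoarse) --&gt;
&lt;uses-permission android:name="android.permission.ACCESS_COARSE_LOCATION"/&gt;

&lt;!-- optional alternative: matches useLocation(kCCLocationFine) --&gt;
&lt;!-- &lt;uses-permission android:name="android.permission.ACCESS_FINE_LOCATION"/&gt; --&gt;</pre>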
<h2>Summary</h2>
<p>As you can see, integrating AdMob into cocos2d-x projects is easy with the extension, and the functionality provided is probably enough for most users; at least it was for me. However, I&#8217;m pretty new to writing Android JNI code, so people who are more experienced in the field will probably find my implementation kind of hacky. That&#8217;s perfectly fine: the extension is open source and I&#8217;d really like to see people improve it, so feel free to contact me if you have any improvement ideas or patches. The source code is available as a Google Code project here: <a title="Cocos2d-x AdMob" href="https://code.google.com/p/cocos2d-x-admob/" target="_blank">https://code.google.com/p/cocos2d-x-admob/</a></p>
<p>Also, the current implementation has the following limitations:</p>
<ul>
<li>Only supports Android (and Windows, through a stub that is meant only for development purposes). It would be great if somebody could provide ports to other platforms.</li>
<li>Only a single <em>CCAdView</em> node is going to work. If you create an additional node, it will practically overwrite the state of the previous one. I don&#8217;t consider this a practical limitation, though, as AdMob doesn&#8217;t allow you to display more than one ad at a time anyway.</li>
<li>Only cocos2d-x 2.x.x is supported, but if you&#8217;re still using cocos2d-x 1.x.x I would recommend upgrading anyway.</li>
</ul>
<p>So while there is a lot of room for improvement, I hope the extension will spare people the problems I had and let them concentrate on their actual application.</p>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2013/03/admob-extension-for-cocos2d-x-android/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>OpenGL vs DirectX: The War Is Far From Over</title>
		<link>http://rastergrid.com/blog/2011/10/opengl-vs-directx-the-war-is-far-from-over/</link>
		<comments>http://rastergrid.com/blog/2011/10/opengl-vs-directx-the-war-is-far-from-over/#comments</comments>
		<pubDate>Fri, 07 Oct 2011 19:02:12 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Graphics]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Direct3D]]></category>
		<category><![CDATA[DirectX]]></category>
		<category><![CDATA[fragment shader]]></category>
		<category><![CDATA[geometry instancing]]></category>
		<category><![CDATA[geometry shader]]></category>
		<category><![CDATA[GLSL]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[occlusion culling]]></category>
		<category><![CDATA[OpenGL]]></category>
		<category><![CDATA[tessellation control shader]]></category>
		<category><![CDATA[tessellation evaluation shader]]></category>
		<category><![CDATA[transform feedback]]></category>
		<category><![CDATA[uniform buffer]]></category>
		<category><![CDATA[vertex buffer]]></category>
		<category><![CDATA[vertex shader]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=652</guid>
		<description><![CDATA[I&#8217;ve chosen the title based on the popular article that tries to prove that OpenGL lost the war against Direct3D. To be honest, I didn&#8217;t really like the article at all. First, because it compared OpenGL 3 which targeted Shader Model 4.0 hardware and DirectX 11 which targeted Shader Model 5.0 hardware. Besides that, as we&#8230;]]></description>
				<content:encoded><![CDATA[
<div class="wp-caption alignleft" style="width: 260px"><img title="OpenGL vs DirectX" src="http://rastergrid.com/blog/wp-content/uploads/2011/10/opengl-vs-directx-250x138.jpg" alt="OpenGL vs DirectX" width="250" height="138" /><p class="wp-caption-text">The War Is Far From Over</p></div>
<p>I&#8217;ve chosen the title based on the <a title="OpenGL 3 &amp; DirectX 11: The War Is Over" href="http://www.tomshardware.com/reviews/opengl-directx,2019.html" target="_blank">popular article</a> that tries to prove that OpenGL lost the war against Direct3D. To be honest, I didn&#8217;t really like the article at all. First, because it compared OpenGL 3, which targeted Shader Model 4.0 hardware, with DirectX 11, which targeted Shader Model 5.0 hardware. Besides that, as we will see, the war is really far from over&#8230; This article aims to list the most important features introduced by OpenGL 3.x, OpenGL 4.x, Direct3D 10 and Direct3D 11, and we will also talk about the promised features of the upcoming Direct3D 11.1 to be fair to DirectX <img src='http://rastergrid.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><span id="more-652"></span></p>
<p>After I wrote <a title="An introduction to OpenGL 4.2" href="http://rastergrid.com/blog/2011/08/an-introduction-to-opengl-4-2/">my article about the latest features introduced in OpenGL</a>, someone asked me whether I could write an article comparing the hardware features exposed by OpenGL and Direct3D. Instead of a long explanation, I decided to simply create a table of the features introduced by the APIs. Please note that the list focuses on hardware features and does not discuss API feature differences between the two APIs. The list may be far from complete and I&#8217;m happy to get feedback about what is missing from the table so that I can extend it. Also, there are features for which I could not find out whether an equivalent exists in D3D, as I did not find a specification of the HLSL versions; these are marked with a question mark. If anybody can point me to the answer, I would be happy.</p>
<table style="width: 100%;" border="0">
<tbody>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>HARDWARE FEATURES EXPOSED</strong></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Draw command related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Conditional/predicated rendering based on the result of occlusion queries (<a href="http://www.opengl.org/registry/specs/NV/conditional_render.txt">NV_conditional_render</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Basic geometry instancing support and instanced draw commands (<a href="http://www.opengl.org/registry/specs/ARB/draw_instanced.txt">ARB_draw_instanced</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Geometry instancing with the ability to specify instanced vertex attributes (<a href="http://www.opengl.org/registry/specs/ARB/instanced_arrays.txt">ARB_instanced_arrays</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Primitive restart (cut index) feature for batching multiple strips together (<a href="http://www.opengl.org/registry/specs/NV/primitive_restart.txt">NV_primitive_restart</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Draw commands allowing modification of the base vertex index (<a href="http://www.opengl.org/registry/specs/ARB/draw_elements_base_vertex.txt">ARB_draw_elements_base_vertex</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Indirect draw commands that source their parameters from server side buffers (<a href="http://www.opengl.org/registry/specs/ARB/draw_indirect.txt">ARB_draw_indirect</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>New shader type related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Geometry shader support and adjacency primitive support (<a href="http://www.opengl.org/registry/specs/ARB/geometry_shader4.txt">ARB_geometry_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Instanced geometry shader support with fixed number of invocations (<a href="http://www.opengl.org/registry/specs/ARB/gpu_shader5.txt">ARB_gpu_shader5</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Tessellation control and evaluation (hull and domain) shader support (<a href="http://www.opengl.org/registry/specs/ARB/tessellation_shader.txt">ARB_tessellation_shader</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Transform feedback (stream-output) related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Basic transform feedback (stream-output) support (<a href="http://www.opengl.org/registry/specs/EXT/transform_feedback.txt">EXT_transform_feedback</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Transform feedback support without a geometry shader being active (<a href="http://www.opengl.org/registry/specs/EXT/transform_feedback.txt">EXT_transform_feedback</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for pausing and resuming transform feedback (stream-output) (<a href="http://www.opengl.org/registry/specs/ARB/transform_feedback2.txt">ARB_transform_feedback2</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Auto-draw support (feed back the contents of the transform feedback buffer) (<a href="http://www.opengl.org/registry/specs/ARB/transform_feedback2.txt">ARB_transform_feedback2</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Instanced auto-draw support (transform feedback buffer drawing with instancing support) (<a href="http://www.opengl.org/registry/specs/ARB/transform_feedback_instanced.txt">ARB_transform_feedback_instanced</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for outputting multiple primitive streams using transform feedback (stream-output) (<a href="http://www.opengl.org/registry/specs/ARB/transform_feedback3.txt">ARB_transform_feedback3</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Asynchronous queries and related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Support for occlusion query for getting number of samples passed (<a href="http://www.opengl.org/registry/specs/ARB/occlusion_query.txt">ARB_occlusion_query</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for occlusion query for getting only a boolean value about visibility (<a href="http://www.opengl.org/registry/specs/ARB/occlusion_query2.txt">ARB_occlusion_query2</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the number of vertices processed and the number of vertex shader invocations</td>
<td style="background-color: #cc5555"></td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt1">[1]</a></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the number of geometry shader invocations in case a geometry shader is active</td>
<td style="background-color: #cc5555"></td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt1">[1]</a></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the number of primitives output by the geometry shader (<a href="http://www.opengl.org/registry/specs/EXT/transform_feedback.txt">EXT_transform_feedback</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the number of primitives that were sent to the rasterizer (<a href="http://www.opengl.org/registry/specs/EXT/transform_feedback.txt">EXT_transform_feedback</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the number of primitives that passed clipping and were actually rendered</td>
<td style="background-color: #cc5555"></td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt1">[1]</a></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the number of times a fragment/pixel shader was invoked</td>
<td style="background-color: #cc5555"></td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt1">[1]</a></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the number of primitives written during transform feedback (stream-output) (<a href="http://www.opengl.org/registry/specs/EXT/transform_feedback.txt">EXT_transform_feedback</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the number of primitives generated during transform feedback (stream-output) (<a href="http://www.opengl.org/registry/specs/EXT/transform_feedback.txt">EXT_transform_feedback</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query a server side high resolution timestamp (<a href="http://www.opengl.org/registry/specs/ARB/timer_query.txt">ARB_timer_query</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to query the completion of rendering commands (<a href="http://www.opengl.org/registry/specs/ARB/sync.txt">ARB_sync</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Texture, vertex and renderbuffer format related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Floating point color and depth formats for textures and render buffers (various extensions)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Cube map textures with depth component internal format (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Half-float (16-bit) vertex and pixel data support (<a href="http://www.opengl.org/registry/specs/NV/half_float.txt">NV_half_float</a>, <a href="http://www.opengl.org/registry/specs/ARB/half_float_pixel.txt">ARB_half_float_pixel</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Non-normalized integer color formats for textures and renderbuffers (<a href="http://www.opengl.org/registry/specs/EXT/texture_integer.txt">EXT_texture_integer</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Packed depth/stencil texture and renderbuffer formats (<a href="http://www.opengl.org/registry/specs/EXT/packed_depth_stencil.txt">EXT_packed_depth_stencil</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">RGTC texture compression for one- and two-component textures (<a href="http://www.opengl.org/registry/specs/EXT/texture_compression_rgtc.txt">EXT_texture_compression_rgtc</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Signed normalized texture component formats (<a href="http://www.opengl.org/registry/specs/EXT/texture_snorm.txt">EXT_texture_snorm</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Seamless cube map filtering support (to hide artifacts at cube map edges) (<a href="http://www.opengl.org/registry/specs/ARB/seamless_cube_map.txt">ARB_seamless_cube_map</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for swizzling the components of a texture (<a href="http://www.opengl.org/registry/specs/ARB/texture_swizzle.txt">ARB_texture_swizzle</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
</tr>
<tr>
<td style="padding: 0px">BPTC texture compression for floating point and unsigned normalized textures (<a href="http://www.opengl.org/registry/specs/ARB/texture_compression_bptc.txt">ARB_texture_compression_bptc</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">64-bit floating point vertex attribute formats (<a href="http://www.opengl.org/registry/specs/ARB/vertex_attrib_64bit.txt">ARB_vertex_attrib_64bit</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>New texture type related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">One- and two-dimensional layered array textures (<a href="http://www.opengl.org/registry/specs/EXT/texture_array.txt">EXT_texture_array</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Cube map array textures as special two-dimensional array textures (<a href="http://www.opengl.org/registry/specs/ARB/texture_cube_map_array.txt">ARB_texture_cube_map_array</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Rectangle textures without mipmap support that are accessed with non-normalized coordinates (<a href="http://www.opengl.org/registry/specs/ARB/texture_rectangle.txt">ARB_texture_rectangle</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Multisampled textures and support for fetching specific sample locations (<a href="http://www.opengl.org/registry/specs/ARB/texture_multisample.txt">ARB_texture_multisample</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Casting a texture&#8217;s interpreted internal format to another internal format</td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt4">[4]</a></td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt4">[4]</a></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Uniform buffer (constant buffer) related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Basic uniform buffer (constant buffer) support (<a href="http://www.opengl.org/registry/specs/ARB/uniform_buffer_object.txt">ARB_uniform_buffer_object</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for large uniform buffers and binding subranges (<a href="http://www.opengl.org/registry/specs/ARB/uniform_buffer_object.txt">ARB_uniform_buffer_object</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Framebuffer and texture rendering related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Rendering to textures and renderbuffers (<a href="http://www.opengl.org/registry/specs/EXT/framebuffer_object.txt">EXT_framebuffer_object</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Multisample stretch blit functionality (<a href="http://www.opengl.org/registry/specs/EXT/framebuffer_multisample.txt">EXT_framebuffer_multisample</a>, <a href="http://www.opengl.org/registry/specs/EXT/framebuffer_blit.txt">EXT_framebuffer_blit</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">sRGB rendering and blending support for framebuffers (<a href="http://www.opengl.org/registry/specs/EXT/framebuffer_sRGB.txt">EXT_framebuffer_sRGB</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for enabling or disabling clamping of the depth of fragments (<a href="http://www.opengl.org/registry/specs/ARB/depth_clamp.txt">ARB_depth_clamp</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for logical operations on integer render targets (supported for a decade in OpenGL)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Blending related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Support for alpha-to-coverage when using multisampling (<a href="http://www.opengl.org/registry/specs/ARB/multisample.txt">ARB_multisample</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Per-color-buffer blend enables and color writemasks (<a href="http://www.opengl.org/registry/specs/EXT/draw_buffers2.txt">EXT_draw_buffers2</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Dual-source color blending support based on a secondary output of the fragment shader (<a href="http://www.opengl.org/registry/specs/ARB/blend_func_extended.txt">ARB_blend_func_extended</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Individual blend equations and blend functions support for each color output (<a href="http://www.opengl.org/registry/specs/ARB/draw_buffers_blend.txt">ARB_draw_buffers_blend</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Shader related features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Texture lookup functions to access individual texels of a LOD using integer coordinates (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Query the dimensions of a specific LOD of a texture in shaders (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Ability to apply integer offsets to the texel location during texture lookup (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Ability to explicitly pass in derivative values that are used to compute LOD during texture lookup (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Control over varying variable interpolation: non-perspective, flat, centroid sampling, etc. (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Full signed and unsigned integer support in shaders (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Vertex ID built-in variable available in vertex shader (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Primitive ID built-in variable available in geometry and fragment shader (<a href="http://www.opengl.org/registry/specs/EXT/gpu_shader4.txt">EXT_gpu_shader4</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Instance ID built-in variable available in vertex shader (<a href="http://www.opengl.org/registry/specs/ARB/draw_instanced.txt">ARB_draw_instanced</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Shader fragment coordinate convention control (<a href="http://www.opengl.org/registry/specs/ARB/fragment_coord_conventions.txt">ARB_fragment_coord_conventions</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
</tr>
<tr>
<td style="padding: 0px">Provoking vertex control (for flat shaded varying value selection) (<a href="http://www.opengl.org/registry/specs/ARB/provoking_vertex.txt">ARB_provoking_vertex</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cc5555;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for encoding and decoding floating point values from and to integers (<a href="http://www.opengl.org/registry/specs/ARB/shader_bit_encoding.txt">ARB_shader_bit_encoding</a>)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for getting the results of the automatic LOD computations in shaders (<a href="http://www.opengl.org/registry/specs/ARB/texture_query_lod.txt">ARB_texture_query_lod</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for coherent indexing into arrays of samplers using non-constant indices (addressable samplers) (<a href="http://www.opengl.org/registry/specs/ARB/gpu_shader5.txt">ARB_gpu_shader5</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for indexing into arrays of uniform blocks (addressable constant buffers) (<a href="http://www.opengl.org/registry/specs/ARB/gpu_shader5.txt">ARB_gpu_shader5</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Gathered texture fetches over a 2&#215;2 footprint (with custom offsets) (<a href="http://www.opengl.org/registry/specs/ARB/texture_gather.txt">ARB_texture_gather</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Invocation ID built-in variable available in geometry shader (<a href="http://www.opengl.org/registry/specs/ARB/gpu_shader5.txt">ARB_gpu_shader5</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for double-precision floating-point data types in shaders (<a href="http://www.opengl.org/registry/specs/ARB/gpu_shader_fp64.txt">ARB_gpu_shader_fp64</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for sample-frequency fragment shader execution (<a href="http://www.opengl.org/registry/specs/ARB/sample_shading.txt">ARB_sample_shading</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for indirect subroutine calls in all shader stages (<a href="http://www.opengl.org/registry/specs/ARB/shader_subroutine.txt">ARB_shader_subroutine</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for selecting from multiple viewports using a geometry shader (<a href="http://www.opengl.org/registry/specs/ARB/viewport_array.txt">ARB_viewport_array</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for dedicated atomic counters in shaders (<a href="http://www.opengl.org/registry/specs/ARB/shader_atomic_counters.txt">ARB_shader_atomic_counters</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55; text-align: center;"><a href="#tblcmt2">[2]</a></td>
<td style="background-color: #55cc55; text-align: center;"><a href="#tblcmt2">[2]</a></td>
</tr>
<tr>
<td style="padding: 0px">Support for backing up dedicated atomic counters with buffers (<a href="http://www.opengl.org/registry/specs/ARB/shader_atomic_counters.txt">ARB_shader_atomic_counters</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt5">[5]</a></td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt5">[5]</a></td>
</tr>
<tr>
<td style="padding: 0px">Support for load/store (read/write) buffers and textures in shaders (<a href="http://www.opengl.org/registry/specs/ARB/shader_image_load_store.txt">ARB_shader_image_load_store</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #cccc55; text-align: center;"><a href="#tblcmt3">[3]</a></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for atomic operations on load/store buffers and textures (<a href="http://www.opengl.org/registry/specs/ARB/shader_image_load_store.txt">ARB_shader_image_load_store</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for disabling or forcing early depth test (<a href="http://www.opengl.org/registry/specs/ARB/shader_image_load_store.txt">ARB_shader_image_load_store</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for conservative depth (enabling safe early tests even when modifying depth) (<a href="http://www.opengl.org/registry/specs/ARB/conservative_depth.txt">ARB_conservative_depth</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support for coverage as input to the fragment shader (<a href="http://www.opengl.org/registry/specs/ARB/gpu_shader5.txt">ARB_gpu_shader5</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="text-align: center; background-color: #c5e526;" colspan="6"><strong>Miscellaneous features</strong></td>
</tr>
<tr style="height: 20px">
<td style="background-color: #aaaaaa;"></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 3.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">GL 4.x</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 10</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11</span></strong></td>
<td style="text-align: center; width: 50px; background-color: #aaaaaa; padding: 0px;"><strong><span style="color: #ffffff;">DX 11.1</span></strong></td>
</tr>
<tr>
<td style="padding: 0px">Support for floating point viewport specification (<a href="http://www.opengl.org/registry/specs/ARB/viewport_array.txt">ARB_viewport_array</a>)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Per-texture mipmap clamping (supported since the very early versions of OpenGL)</td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
<tr>
<td style="padding: 0px">Support to use a single depth texture for depth testing and as texture input (when depth writes are disabled)</td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #cc5555;"></td>
<td style="background-color: #55cc55;"></td>
<td style="background-color: #55cc55;"></td>
</tr>
</tbody>
</table>
<p><a name="tblcmt1">[1]</a> There is no support for these counters in OpenGL, however they can be implemented with the help of shader atomic counters.<br />
<a name="tblcmt2">[2]</a> Direct3D can use the dedicated atomic counter hardware (currently available only on AMD GPUs) only through append/consume buffers. However, as atomic counters are part of UAVs and an arbitrary number of UAVs can be attached to a single resource, the same functionality is supported indirectly.<br />
<a name="tblcmt3">[3]</a> There is read/write buffer and texture support in Direct3D 11, however it is available only in the fragment (pixel) shader. Direct3D 11.1 plans to remove this restriction.<br />
<a name="tblcmt4">[4]</a> There is no support for texture format casting in OpenGL; conversion, however, can be done with a copy, preferably using pixel buffer objects.<br />
<a name="tblcmt5">[5]</a> There is no support for automatic storage of atomic counter values in buffers in Direct3D, however, their value can be manually copied to arbitrary resources.</p>
<p>In conclusion, I would like to say just one thing: even though there are some features that are supported by only one of OpenGL and Direct3D, we really can say that the two APIs are on par in the number of hardware features they expose.</p>
<p>(Sorry in advance for any mistakes; it took quite some time to create this table and I may have become too tired by the end.)</p>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2011/10/opengl-vs-directx-the-war-is-far-from-over/feed/</wfw:commentRss>
		<slash:comments>80</slash:comments>
		</item>
		<item>
		<title>An introduction to OpenGL 4.2</title>
		<link>http://rastergrid.com/blog/2011/08/an-introduction-to-opengl-4-2/</link>
		<comments>http://rastergrid.com/blog/2011/08/an-introduction-to-opengl-4-2/#comments</comments>
		<pubDate>Sun, 28 Aug 2011 14:25:25 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Graphics]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[atomic counter]]></category>
		<category><![CDATA[fragment shader]]></category>
		<category><![CDATA[geometry instancing]]></category>
		<category><![CDATA[GLSL]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[image load store]]></category>
		<category><![CDATA[OpenGL]]></category>
		<category><![CDATA[texture buffer]]></category>
		<category><![CDATA[transform feedback]]></category>
		<category><![CDATA[uniform buffer]]></category>
		<category><![CDATA[vertex shader]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=611</guid>
		<description><![CDATA[After the release of the OpenGL 4.1 specification the Khronos Group slowed down the pace a little bit, but they didn&#8217;t leave OpenGL developers without a new specification version for too long, as a few weeks ago they released OpenGL 4.2. The new version of the specification brings several API improvements as well as exposes&#8230;]]></description>
				<content:encoded><![CDATA[
<p>After the release of the OpenGL 4.1 specification the Khronos Group slowed down the pace a little bit, but they didn&#8217;t leave OpenGL developers without a new specification version for too long, as a few weeks ago they released OpenGL 4.2. The new version of the specification brings several API improvements and exposes some important pieces of hardware functionality that make OpenGL 4.x class hardware a great step forward in GPU history. This article aims to present the newly introduced features in the latest version of the OpenGL specification and, as a few months ago I wrote an article about <a title="Suggestion for OpenGL 4.2 and beyond" href="http://rastergrid.com/blog/2010/11/suggestions-for-opengl-4-2-and-beyond/">Suggestions for OpenGL 4.2 and beyond</a>, I will write a few words about how the new specification reflects my forecast.</p>
<p><span id="more-611"></span></p>
<h2>New features in OpenGL 4.2</h2>
<p>OpenGL 4.2 finally filled the holes in the capability matrix of Shader Model 5.0 hardware with some long-awaited extensions, some of whose functionality was actually already accessible through cross-vendor and vendor-specific extensions. Also, the new version of the specification brings some important API improvement extensions and GLSL constructs that continue the transition to easier-to-use state and shader management.</p>
<h3><a title="GL_ARB_texture_compression_bptc" href="http://www.opengl.org/registry/specs/ARB/texture_compression_bptc.txt" target="_blank">ARB_texture_compression_bptc</a></h3>
<p>This extension adds the new block compression texture formats called BC7 and BC6H in Direct3D terminology. The extension has actually been available for quite some time, since the release of OpenGL 4.0, but it has now become core. The formats provide high quality block compression for fixed point RGBA and sRGB textures as well as two floating point texture compression formats for signed and unsigned data.</p>
<p>Traditional block compression methods (such as S3TC or RGTC) use the gradients in a block of pixels, which works fine for smooth images but provides poor results in case of sharp edges. BPTC solves the issue by dividing blocks into multiple partitions which are compressed using independent gradients, thus providing better overall quality.</p>
<p>In terms of compression efficiency, BPTC has a compression ratio of 3:1, compared to the 6:1, 4:1 and 2:1 compression ratios of the S3TC DXT1, S3TC DXT5 and RGTC formats respectively.</p>
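<p>The ratios quoted above can be reproduced with a little arithmetic. The sketch below is only an illustration: the block sizes are the standard bytes-per-4x4-block figures of each format, while the choice of uncompressed reference data (8-bit RGB, RGBA or single channel) is my assumption about what each ratio is measured against.</p>

```python
# Bytes used by one compressed 4x4 texel block in each format.
BLOCK_BYTES = {
    "S3TC DXT1": 8,
    "S3TC DXT5": 16,
    "RGTC1": 8,
    "BPTC": 16,
}

# Assumed uncompressed reference data, in bytes per texel.
SOURCE_BYTES_PER_TEXEL = {
    "S3TC DXT1": 3,  # 8-bit RGB
    "S3TC DXT5": 4,  # 8-bit RGBA
    "RGTC1": 1,      # 8-bit single channel
    "BPTC": 3,       # 8-bit RGB (RGBA source data would give 4:1)
}


def compression_ratio(fmt):
    """Uncompressed size of a 4x4 block divided by its compressed size."""
    texels_per_block = 4 * 4
    uncompressed_bytes = texels_per_block * SOURCE_BYTES_PER_TEXEL[fmt]
    return uncompressed_bytes / BLOCK_BYTES[fmt]


for fmt in BLOCK_BYTES:
    print(fmt, f"{compression_ratio(fmt):g}:1")
```

<p>Under these assumptions BPTC stores twice as many bits per texel as DXT1, which is the price paid for the better handling of sharp edges.</p>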
<h3><a title="GL_ARB_compressed_texture_pixel_storage" href="http://www.opengl.org/registry/specs/ARB/compressed_texture_pixel_storage.txt" target="_blank">ARB_compressed_texture_pixel_storage</a></h3>
<p>This is an interesting extension that solves a problem that I didn&#8217;t even know was such a big issue. The extension is designed primarily to support compressed image formats with fixed-size blocks, such as BPTC. The application can use this extension to configure pixel store parameters so that subtexture operations provide consistent results in all cases.</p>
<h3><a title="GL_ARB_texture_storage" href="http://www.opengl.org/registry/specs/ARB/texture_storage.txt" target="_blank">ARB_texture_storage</a></h3>
<p>This is again an interesting extension that provides an API improvement over how texture storage is allocated in classic OpenGL. As we all know, OpenGL was always too ad hoc about resource management, from the point of view of when actual resources are allocated for a particular API primitive. This is especially a problem in case of textures, where we potentially talk about large amounts of data. In classic OpenGL the driver could not know from the beginning, for example, whether the application will need mipmaps for the texture or how many levels are required. This could easily result in bad allocation patterns and/or large reallocations. This extension introduces the concept of immutable texture images where all the levels are allocated up-front for a texture object.</p>
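<p>To give an idea of what &#8220;all the levels up-front&#8221; means, here is a minimal sketch that lists the dimensions of a full 2D mipmap chain, which is the information a call like <strong>glTexStorage2D</strong> (taking the level count together with the base width and height) hands to the driver in one go. The helper name mip_chain is hypothetical and the code only models the sizes, it does not talk to OpenGL.</p>

```python
def mip_chain(width, height):
    """Return the (width, height) of every level of a full 2D mipmap chain."""
    levels = [(width, height)]
    while width != 1 or height != 1:
        # Each level halves both dimensions, rounding down, but never below 1.
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels.append((width, height))
    return levels


chain = mip_chain(256, 64)
print(len(chain), "levels")
for level, size in enumerate(chain):
    print(level, size)
```

<p>With immutable storage the driver knows this whole list at allocation time, instead of discovering it level by level as the application uploads images.</p>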
<h3><a title="GL_ARB_transform_feedback_instanced" href="http://www.opengl.org/registry/specs/ARB/transform_feedback_instanced.txt" target="_blank">ARB_transform_feedback_instanced</a></h3>
<p>This extension extends the so called &#8220;AutoDraw&#8221; feature by providing instanced &#8220;AutoDraw&#8221;. This means that geometry captured using transform feedback can be rendered multiple times using geometry instancing. This is actually a feature that even D3D11 does not provide and, as such, I didn&#8217;t even think that hardware supports it, even though I think the list of usage patterns for the extension is most probably pretty narrow.</p>
<h3><a title="GL_ARB_base_instance" href="http://www.opengl.org/registry/specs/ARB/base_instance.txt" target="_blank">ARB_base_instance</a></h3>
<p>This extension is actually the feature I called <strong>ARB_instanced_arrays2</strong> in my <a title="Suggestions for OpenGL 4.2 and beyond." href="http://rastergrid.com/blog/2010/11/suggestions-for-opengl-4-2-and-beyond/" target="_blank">suggestion list</a>. The extension provides three new draw commands; one is rather awkwardly named <strong>DrawElementsInstancedBaseVertexBaseInstance</strong>, even though this command can be called the &#8220;basic&#8221; indexed draw command that specifies all parameters. Also, the parameter list of the indirect indexed draw command is extended with the base instance parameter. Fortunately, however, the ARB chose to add new commands rather than a <strong>SetBaseInstance</strong>-style state specifier command to introduce the new concept. Funnily enough, this feature was missing for a long time even though, as far as I know, it is supported by all GPUs capable of doing instanced drawing, and is available in D3D as well.</p>
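<p>The effect of the base instance parameter is easiest to see on the index used to fetch instanced vertex attributes. The following sketch models that rule as I understand it from the specification (instance number divided by the attribute divisor, plus the base instance); the function name is hypothetical and the code is a CPU-side illustration only.</p>

```python
def instanced_attribute_index(instance_id, divisor, base_instance):
    """Index of the element fetched from an instanced vertex attribute array."""
    return instance_id // divisor + base_instance


instance_count, divisor, base_instance = 4, 1, 10
fetched = [
    instanced_attribute_index(i, divisor, base_instance)
    for i in range(instance_count)
]
print(fetched)
```

<p>Note that only the attribute fetch is offset; the instance ID seen by the shader still starts from zero, so per-instance data for many objects can live in one buffer and be selected per draw call.</p>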
<h3><a title="GL_ARB_shader_image_load_store" href="http://www.opengl.org/registry/specs/ARB/shader_image_load_store.txt" target="_blank">ARB_shader_image_load_store</a></h3>
<p>This is where things start to get really interesting. This new extension is the ARBified version of the extension <a title="GL_EXT_shader_image_load_store" href="http://www.opengl.org/registry/specs/EXT/shader_image_load_store.txt" target="_blank">EXT_shader_image_load_store</a> which fortunately didn&#8217;t make it into core in its current form.</p>
<p>The extension provides GLSL built-in functions allowing shaders to load from, store to, and perform atomic read-modify-write operations on a single level of a texture, called an image, from any shader stage. Also, the extension indirectly enables the same set of operations for buffer objects by using buffer textures. This enables developers to implement more sophisticated algorithms using shaders that require more complex data structures than just plain arrays.</p>
<p>This, together with atomic counters that we will talk about later, enables the possibility to implement append/consume buffers and rendering techniques like AMD&#8217;s Order-Independent Transparency (OIT) algorithm as <a title="OIT and Indirect Illumination  Using DX11 Linked Lists" href="http://www.slideshare.net/hgruen/oit-and-indirect-illumination-using-dx11-linked-lists" target="_blank">presented at GDC10</a>.</p>
<p>The new write operations in fragment shaders, which come in addition to the traditional framebuffer writes, give the execution of the shader side effects and thus make it sensitive to whether early-Z is used by the hardware. For this reason the extension also provides a mechanism to force or disable early-Z in the fragment shader.</p>
<p>A similar issue arises in case of vertex shaders, as the post-transform cache may no longer be valid with certain usage patterns of load/store images. Depending on how smart the shader compiler is, the post-transform cache could easily be disabled when a vertex shader uses load/store images, resulting in degraded performance. Care must therefore be taken when using read/write images in vertex shaders, as OpenGL does not have any mechanism to help with these issues (but I actually have a proposal that I&#8217;ll talk about in a future article).</p>
<p>The API of this extension is greatly improved compared to the EXT version, especially when dealing with various texture image formats. The extension also provides a future-proof DSA-style API. Further, the ARB version of the extension supports loads from any texture format and corrects some specification bugs of the EXT version.</p>
<p>From a hardware implementation point of view, it must be noted that if a shader contains atomic operations applied to a particular read/write image, the driver uses a different hardware path, as required by atomic read-modify-writes, so care must be taken to use atomic operations only when necessary. Also note that this decision is made statically at compile time by the driver, so even a single atomic operation in a rarely taken branch will result in degraded performance. This is another reason to use atomic counters to implement append/consume buffers instead of read/write image atomics.</p>
<h3><a title="GL_ARB_shader_atomic_counters" href="http://www.opengl.org/registry/specs/ARB/shader_atomic_counters.txt" target="_blank">ARB_shader_atomic_counters</a></h3>
<p>This is the other long-awaited feature that I also suggested and that was still missing from OpenGL but was available in D3D11. Work on the specification had actually been ongoing for a long time (about a year) and it even appeared for a while in AMD&#8217;s OpenGL drivers, sometimes as an EXT, sometimes as an ARB extension. The extension provides an API to access a number of hardware atomic counters that provide efficient counter operations on a GPU global scale. Atomic counters come in handy in many cases like append/consume buffers or indirect draw buffer construction.</p>
<p>The extension provides access to these atomic counters from GLSL and also makes it possible to back them up with buffer objects so after OpenGL draw calls the value of the counters is preserved in these buffers for later use.</p>
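<p>The append buffer pattern mentioned above can be sketched on the CPU as follows. This is only a model under stated assumptions: the AtomicCounter class is a hypothetical stand-in for a GLSL atomic counter (its increment returns the value before the increment, as atomicCounterIncrement does), Python threads play the role of shader invocations, and a plain list plays the role of the buffer written through image stores.</p>

```python
import threading


class AtomicCounter:
    """CPU-side stand-in for a GLSL atomic counter (hypothetical helper)."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Bump the counter atomically and return the value it had before.
        with self._lock:
            old = self._value
            self._value += 1
            return old

    def value(self):
        with self._lock:
            return self._value


INVOCATIONS = 64
counter = AtomicCounter()
append_buffer = [None] * INVOCATIONS


def shader_invocation(invocation_id):
    # Only some invocations produce output, as in a typical filtering pass.
    if invocation_id % 2 == 0:
        slot = counter.increment()  # unique slot, no two invocations collide
        append_buffer[slot] = invocation_id


threads = [
    threading.Thread(target=shader_invocation, args=(i,))
    for i in range(INVOCATIONS)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

count = counter.value()
print(count, "elements appended")
```

<p>The order of the appended elements depends on scheduling, but the output is always densely packed, and the final counter value (preserved in the backing buffer object in the real API) tells later passes how many elements there are.</p>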
<p>The OpenGL implementation is superior to D3D&#8217;s as it provides access to atomic counters from all shader stages. There are caveats, of course: as mentioned in the previous section, the side effects made possible by read/write images and atomic counters require special care in case of fragment and vertex shaders as they may result in invalid rendering and/or lower performance.</p>
<p>Regarding hardware vendor implementations, it must be noted that atomic counters are much, much faster than read/write image atomics, at least on AMD hardware, which has dedicated hardware for atomic counters. On NVIDIA hardware, though, it seems that there is no separate hardware path for atomic counters, as their performance is roughly the same as that of read/write image atomics.</p>
<p>The dedicated hardware implementation of atomic counters, however, comes with a trade-off, as the number of atomic counters is severely limited on AMD hardware, but one can still fall back to read/write image atomics after running out of atomic counters.</p>
<h3><a title="GL_ARB_conservative_depth" href="http://www.opengl.org/registry/specs/ARB/conservative_depth.txt" target="_blank">ARB_conservative_depth</a></h3>
<p>This is another extension I&#8217;ve suggested and that fills another functionality hole compared to D3D11. The extension is actually an ARBified version of <a title="GL_AMD_conservative_depth" href="http://www.opengl.org/registry/specs/AMD/conservative_depth.txt" target="_blank">AMD_conservative_depth</a> that extends the application developer&#8217;s control over early depth and stencil tests. <a title="GL_ARB_shader_image_load_store" href="http://www.opengl.org/registry/specs/ARB/shader_image_load_store.txt" target="_blank">ARB_shader_image_load_store</a> already provides a way to force or disable early-Z and this extension provides further modes that give a hint to the driver about how depth is modified in a fragment shader that outputs depth. This passes enough information to the GL implementation to activate some early depth test optimizations safely while still preserving the ability to take the final depth value into account in the depth test.</p>
<p>The extension exposes the new capability in the form of fragment shader layout qualifiers, applied to a redeclaration of gl_FragDepth, called &#8220;depth_any&#8221;, &#8220;depth_greater&#8221;, &#8220;depth_less&#8221; and &#8220;depth_unchanged&#8221;. The interesting ones are those that assume a greater or lesser depth value as output and provide the ability to early reject groups of fragments using Hi-Z and early-Z even when depth is modified. This technique can greatly improve the rendering performance of volumetric particles, decals and billboards.</p>
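<p>The reasoning behind the safe early rejection can be written down as a small decision function. This is a simplified model of my own, not a description of how any particular GPU implements it: it only considers the &#8220;less&#8221; and &#8220;greater&#8221; depth functions, and the function name and string constants are hypothetical.</p>

```python
def can_reject_early(qualifier, depth_func, interpolated_z, stored_z):
    """True if the fragment is guaranteed to fail the depth test no matter
    what depth the shader writes within the bounds promised by the qualifier."""
    if depth_func == "less":
        # The test passes only if the final depth is below the stored depth.
        fails_now = interpolated_z >= stored_z
        # If the shader can only keep or increase depth, the test still fails.
        return fails_now and qualifier in ("depth_greater", "depth_unchanged")
    if depth_func == "greater":
        # The test passes only if the final depth is above the stored depth.
        fails_now = stored_z >= interpolated_z
        # If the shader can only keep or decrease depth, the test still fails.
        return fails_now and qualifier in ("depth_less", "depth_unchanged")
    # Other depth functions are outside the scope of this sketch.
    return False


print(can_reject_early("depth_greater", "less", 0.7, 0.5))
print(can_reject_early("depth_any", "less", 0.7, 0.5))
```

<p>With &#8220;depth_any&#8221; nothing is known about the final depth, so the fragment shader has to run before the test can be decided, which is exactly the case the other qualifiers are meant to avoid.</p>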
<p>As far as I can tell, though, the extension currently provides performance benefits only on AMD hardware, as NVIDIA hardware does not have such functionality; using the extension would thus still force NVIDIA GPUs to disable early-Z in case the fragment shader outputs a depth value, but future hardware may change this.</p>
<h3><a title="GL_ARB_shading_language_420pack" href="http://www.opengl.org/registry/specs/ARB/shading_language_420pack.txt" target="_blank">ARB_shading_language_420pack</a></h3>
<p>This is a strangely named extension that provides a lot of improvements to GLSL. These are mostly API improvements only, but have a great value when looking at source code maintainability and resource management.</p>
<p>I think the most useful addition of the extension is the &#8220;binding&#8221; layout qualifier that I referred to as ARB_explicit_sampler_location and ARB_explicit_uniform_block_index in my <a title="Suggestions for OpenGL 4.2 and beyond." href="http://rastergrid.com/blog/2010/11/suggestions-for-opengl-4-2-and-beyond/" target="_blank">suggestion list</a>. This enables shader writers to explicitly bind a uniform block binding index to a uniform block as well as explicitly bind sampler, texture and image binding points to a sampler or image variable.</p>
<p>Besides that, the extension adds other minor improvements, like implicit conversion of return values of functions, UTF-8 character set support, C-style initializer list support and scalar swizzle operators.</p>
<h3><a title="GL_ARB_internalformat_query" href="http://www.opengl.org/registry/specs/ARB/internalformat_query.txt" target="_blank">ARB_internalformat_query</a></h3>
<p>This is another rather strangely named extension that was meant to provide the possibility to query information about the internal format of textures. However, it falls short of that goal, as it provides only the ability to query the maximum number of samples available for different texture formats.</p>
<p>The extension was ambitious, as it planned to provide internal format information like the ability to query the actual internal format used, whether the format is renderable, whether it is accessible in a particular shader stage, whether it can be used as a read/write image, and even to provide performance hints about using a particular texture internal format. Unfortunately all these were left for a future extension.</p>
<h3><a title="GL_ARB_map_buffer_alignment" href="http://www.opengl.org/registry/specs/ARB/map_buffer_alignment.txt" target="_blank">ARB_map_buffer_alignment</a></h3>
<p>This is the last new extension introduced in OpenGL 4.2. It simply adds the requirement that pointers returned by buffer mapping commands have a minimum alignment of 64 bytes, to support processing the data directly with special CPU instructions like SSE or AVX. This can provide a further performance increase when the client is modifying buffer data.</p>
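<p>Why 64 bytes is enough for both instruction sets is simple modular arithmetic: aligned SSE accesses need 16-byte alignment and aligned AVX accesses need 32-byte alignment, and both divide 64. The sketch below checks this on a made-up address; the address value and the helper name are purely illustrative.</p>

```python
MIN_MAP_BUFFER_ALIGNMENT = 64  # minimum alignment guaranteed by the extension


def is_aligned(address, alignment):
    """True if the address is a multiple of the given alignment."""
    return address % alignment == 0


# Hypothetical mapped pointer value, chosen to be a multiple of 64.
mapped_pointer = 0x7F0000001040

for name, alignment in (("SSE", 16), ("AVX", 32), ("guaranteed", 64)):
    print(name, alignment, is_aligned(mapped_pointer, alignment))
```

<p>Anything stricter than 64 bytes is not promised, so code relying on larger alignments still has to check the pointer it gets back.</p>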
<h2>Conclusion</h2>
<p>OpenGL 4.2 has again proven that OpenGL is not dead, but in fact plans to be again the ultimate choice of 3D API by pushing the exposed hardware capabilities over the line set by D3D11. When thinking about the list of expected extensions I presented in my earlier article, <a title="Suggestions for OpenGL 4.2 and beyond" href="http://rastergrid.com/blog/2010/11/suggestions-for-opengl-4-2-and-beyond/" target="_blank">Suggestions for OpenGL 4.2 and beyond</a>, we can see that OpenGL 4.2 fulfilled all my expectations and even my wish list was partly fulfilled, but here&#8217;s the list for a better overview:</p>
<p><strong>My expectations for OpenGL 4.2:</strong></p>
<pre style="background-color: #ccffcc;"><strong>GL_EXT_shader_image_load_store</strong>
<span>- added in the form of GL_ARB_shader_image_load_store</span></pre>
<pre style="background-color: #ccffcc;"><strong>GL_ARB_shader_atomic_counters</strong>
<span>- added as is</span></pre>
<pre style="background-color: #ccffcc;"><strong>GL_ARB_instanced_arrays2</strong>
<span>- added in the form of GL_ARB_base_instance</span></pre>
<pre style="background-color: #ccffcc;"><strong>GL_ARB_explicit_sampler_location</strong>
<span>- added in the form of GL_ARB_shading_language_420pack</span></pre>
<pre style="background-color: #ccffcc;"><strong>GL_ARB_explicit_uniform_block_index</strong>
<span>- added in the form of GL_ARB_shading_language_420pack</span></pre>
<p><strong>My personal wish-list for OpenGL 4.2:</strong></p>
<pre style="background-color: #ffcccc;"><strong>GL_ARB_draw_indirect2</strong>
<span>- still missing, though partly available through <a title="GL_AMD_multi_draw_indirect" href="http://www.opengl.org/registry/specs/AMD/multi_draw_indirect.txt" target="_blank">GL_AMD_multi_draw_indirect</a></span></pre>
<pre style="background-color: #ffcccc;"><strong>GL_ARB_direct_state_access</strong>
<span>- still missing, however, there is hope that it will be included in the next release where the ARB plans to rewrite the whole structure of the core specification</span></pre>
<pre style="background-color: #ccffcc;"><strong>GL_NV_texture_barrier</strong>
<span>- not in core but it is implicitly subsumed by GL_ARB_shader_image_load_store, they say</span></pre>
<pre style="background-color: #ccffcc;"><strong>GL_AMD_conservative_depth</strong>
<span>- added in the form of GL_ARB_conservative_depth, despite lack of NVIDIA support</span></pre>
<pre style="background-color: #ffcccc;"><strong>GL_ARB_texture_gather_lod</strong>
<span>- still missing, because of lack of supporting hardware</span></pre>
<pre style="background-color: #ffcccc;"><strong>GL_NV_copy_image</strong>
<span>- still missing, even though it could be a good API improvement</span></pre>
<pre style="background-color: #ffcccc;"><strong>GL_EXT_texture_filter_anisotropic</strong>
<span>- still missing, as I was informed, because of patent issues</span></pre>
<pre style="background-color: #ffcccc;"><strong>GL_ARB_shader_stencil_export</strong>
<span>- still missing, most probably because of lack of NVIDIA hardware support</span></pre>
<pre style="background-color: #ffcccc;"><strong>GL_AMD_depth_clamp_separate</strong>
<span>- still missing, most probably because of lack of NVIDIA hardware support</span></pre>
<pre style="background-color: #ffcccc;"><strong>GL_AMD_transform_feedback3_lines_triangles</strong>
<span>- still missing, most probably because of lack of NVIDIA hardware support</span></pre>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2011/08/an-introduction-to-opengl-4-2/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Pocket Soccer&#8217;s story so far&#8230;</title>
		<link>http://rastergrid.com/blog/2011/07/pocket-soccers-story-so-far/</link>
		<comments>http://rastergrid.com/blog/2011/07/pocket-soccers-story-so-far/#comments</comments>
		<pubDate>Wed, 06 Jul 2011 18:32:16 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Games]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Android]]></category>
		<category><![CDATA[Android Market]]></category>
		<category><![CDATA[button]]></category>
		<category><![CDATA[football]]></category>
		<category><![CDATA[game]]></category>
		<category><![CDATA[market]]></category>
		<category><![CDATA[mobile]]></category>
		<category><![CDATA[soccer]]></category>
		<category><![CDATA[store]]></category>
		<category><![CDATA[video]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=585</guid>
		<description><![CDATA[Almost four months have passed since I released my first Android game called Pocket Soccer. The game was very well received, even though its popularity has shown some decline lately. In this post I would like to present some data about the lifecycle of Pocket Soccer so far, including my experience with alternative markets. Also, I will&#8230;]]></description>
				<content:encoded><![CDATA[
<p><img class="alignleft" src="http://www.rastergrid.com/blog/wp-content/uploads/2011/03/pocket-soccer-promo-graph.png" alt="" width="180" height="120" />Almost four months have passed since I released my first Android game called Pocket Soccer. The game was very well received, even though its popularity has shown some decline lately. In this post I would like to present some data about the lifecycle of Pocket Soccer so far, including my experience with alternative markets. Also, I will present some of the achievements it got. Finally, I would like to talk about the future development of Pocket Soccer that many people were interested in.<br />
<span id="more-585"></span></p>
<h2>The Evolution</h2>
<p><a href="http://www.babble.com/products/kids-products/android-apps-for-kids-2011"><img class="alignright" src="http://www.babble.com/badges/images/badge_droidkids.png" alt="We were chosen as one of Babble.com's top 25 droid apps for kids! Click this badge to learn more." width="234" height="333" /></a>I started the development sometime in February, Pocket Soccer being my first Android game and, in general, my first real Android application. Actually, I had already been experimenting with very simple Android demo applications using the emulator earlier; however, real development was made possible only after January, when I bought my Galaxy S. After roughly a month, I published the first version of Pocket Soccer on the Android Market.</p>
<p>At that time, Pocket Soccer was a bit far from a polished, competitive game: there was no social network integration, no leaderboard, even the training mode was not available yet and, of course, it contained quite some bugs. The reason behind the latter was that I didn&#8217;t have enough experience with the platform yet and I was also unable to test it on many devices as I only own a single Android device, so I had to ask my friends from time to time to give it a try (the latter is still true, as I have not yet had the chance to invest in further devices).</p>
<p>Shortly after, Ben Camenker from Scoreloop contacted me with the offer of integrating the Scoreloop social networking framework into Pocket Soccer. Of course, I was interested and, thanks to Ben, not long after the first release I integrated Scoreloop into Pocket Soccer, bringing the still great performing rating system and online leaderboard to the game.</p>
<p>Pocket Soccer continued its evolution with the addition of the training mode, that I actually wanted to include from the beginning, but didn&#8217;t have the necessary time to do it in the timeframe I wanted to keep. Meanwhile, the game also became much more mature as I was able to track down most of the issues the players reported.</p>
<p>While there are still some bugs hanging around (many of them deeply lying inside the 3rd party libraries and frameworks the game uses) I think the game reached some sort of completeness, even though I&#8217;m still planning to add new features to it (however, I also want to allocate some time for a next game as well).</p>
<h2>The Numbers</h2>
<p>Now, I would like to talk a little bit about the numbers: download counts, active installs and leaderboard members. The game started with very moderate download counts at the beginning. Actually this is the most stressful period for a mobile game developer, as this is when it turns out whether the game will be noticed or will go down the drain in the huge pool of apps available for such a popular platform as Android.</p>
<p>For Pocket Soccer, the great entry happened at the beginning of May, roughly eight weeks after the first release. Since then, more than 350.000 people downloaded Pocket Soccer and there are currently over 160.000 active installs and around 10.000 active players each day. This result exceeded all of my expectations and I would like to say thanks to all the people who downloaded the game, and I hope that at least most of you have or had a great time playing it. If you don&#8217;t mind, I would also like to share some figures with you:</p>
<div id="attachment_598" class="wp-caption aligncenter" style="width: 569px"><a href="http://rastergrid.com/blog/wp-content/uploads/2011/07/download_stats.png"><img class="size-full wp-image-598  " src="http://rastergrid.com/blog/wp-content/uploads/2011/07/download_stats.png" alt="Download statistics" width="559" height="249" /></a><p class="wp-caption-text">Pocket Soccer total downloads (blue) and active installs (red) over time.</p></div>
<p>Here, I would like to take the chance and talk a little bit about the alternative Android market places. This may be interesting mostly to fellow developers who are planning to submit their applications to alternative markets. Well, if you ask me, I would say that it is not worth the effort, but don&#8217;t believe me, believe the numbers&#8230;</p>
<p>Shortly after releasing Pocket Soccer on the Android Market, I gave a few alternative markets and download sites a try, namely AndroidPIT, SlideME, AppsLib and the Amazon AppStore, with Pocket Soccer appearing on the latter from the market&#8217;s launch. At the beginning even those few downloads meant a lot to me, but after almost four months, the picture is quite disappointing. Here you can see the number of downloads Pocket Soccer acquired at each market so far:</p>
<table style="width: 100%;" border="0">
<tbody>
<tr>
<td style="text-align: center; background-color: #82b747;"><strong>Distribution Network Name</strong></td>
<td style="text-align: center; background-color: #82b747;"><strong>Download count</strong></td>
</tr>
<tr>
<td style="text-align: center;">Android Market</td>
<td style="text-align: right; padding-right: 100px;">352 260</td>
</tr>
<tr>
<td style="text-align: center;">SlideME</td>
<td style="text-align: right; padding-right: 100px;">6 363</td>
</tr>
<tr>
<td style="text-align: center;">AndroidPIT</td>
<td style="text-align: right; padding-right: 100px;">5 117</td>
</tr>
<tr>
<td style="text-align: center;">AppsLib</td>
<td style="text-align: right; padding-right: 100px;">1 660</td>
</tr>
<tr>
<td style="text-align: center;">Amazon AppStore</td>
<td style="text-align: right; padding-right: 100px;">1 301</td>
</tr>
</tbody>
</table>
<p>As you can see, the four alternative markets together contribute barely 4% of the total number of downloads, even though SlideME and the Amazon AppStore can really be thought of as big players in the alternative market business.</p>
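<p>For those who want to check the arithmetic, here is a minimal sketch in Python that recomputes the share from the table. The numbers are copied from the table; the variable names are only illustrative.</p>

```python
# Download counts copied from the table above.
downloads = {
    "Android Market": 352260,
    "SlideME": 6363,
    "AndroidPIT": 5117,
    "AppsLib": 1660,
    "Amazon AppStore": 1301,
}

total = sum(downloads.values())
alternative = total - downloads["Android Market"]
alternative_share = 100.0 * alternative / total

print(f"total downloads: {total}")
print(f"alternative markets: {alternative} ({alternative_share:.1f}%)")
```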
<p>I don&#8217;t want to discourage any developer from giving alternative markets a try, as others may have much more success with them; I just wanted to share my personal experience. If anybody has very different figures, I would be really interested, so please take the time to write a comment about it.</p>
<h2>The Achievements</h2>
<p>Besides the spectacular number of downloads and active players, Pocket Soccer received a lot of other achievements, including many positive ratings and comments, as well as several great reviews, including some video reviews that I would like to share with you (my personal favorite is the second one as this was the first time I&#8217;ve seen the game running on a tablet):</p>
<p style="text-align: center;"><a href="http://www.youtube.com/watch?v=-qqoutmSaKs&#038;fmt=18">http://www.youtube.com/watch?v=-qqoutmSaKs</a></p>
<p style="text-align: center;"><a href="http://www.youtube.com/watch?v=wkgei2yxTP8&#038;fmt=18">http://www.youtube.com/watch?v=wkgei2yxTP8</a></p>
<p>Another thing that I&#8217;m very proud of is that Pocket Soccer has been in the top 100 free games of the arcade &amp; action category for months now, peaking around #50, which I take as a great compliment, being a hobbyist and a rookie in the Android world.</p>
<p>Finally, I would like to mention one particular achievement that I&#8217;m really proud of, namely that Pocket Soccer was ranked #12 in the Top 25 Droid Apps for Kids in 2011 at Babble (the promotion badge you&#8217;ve seen at the beginning of this article).</p>
<h2>The Future</h2>
<p>While the story of Pocket Soccer already has quite a lot to tell, it is far from over. I&#8217;ll continue its development, even if only with limited time, as there are a lot of requests coming from the players and I will keep trying my best to please them, at least as much as my time allows.</p>
<p>Here, I would like to take the chance to talk about the most requested features and try to explain why they are not available yet and/or when they can be expected:</p>
<ul>
<li>Tournament Mode (or World Cup, Campaign or Career mode if you wish)</li>
<li>Networked Multiplayer Mode (bluetooth, WiFi and/or online)</li>
</ul>
<p>Implementing a tournament mode may not seem like a complex task, and it may or may not be one in practice either. However, please consider something: I&#8217;m not a full-time game developer, I work at a multinational company 40 hours a week, and lately I was also occupied with my exams and master&#8217;s thesis, as I have just received my master&#8217;s degree. Spare time is really scarce in my life, so please forgive me for not having made it already. I&#8217;m seriously willing to make it, I just have my own limits as well.</p>
<p>About the networked multiplayer mode, well, I have problems with making that. I don&#8217;t think I&#8217;ll make an online multiplayer mode because, besides that being the biggest effort, I&#8217;m afraid the latency, especially in case of mobile internet, would be highly prohibitive: for a game as fast-paced as Pocket Soccer, even a small delay could be a show stopper. So I would rather go in the direction of a bluetooth or WiFi based multiplayer mode.</p>
<p>Finally, I would like to make some further games that I need to spend time on as well, so I hope you now understand why those much-requested features haven&#8217;t made their way into the game yet. This won&#8217;t stay so forever and I&#8217;m really willing to implement all the features you request.</p>
<p>Summing up everything, Pocket Soccer brought a great amount of achievements to me and, hopefully, a great amount of fun to the players! And, of course, the story does not end here as I will continue the development&#8230;</p>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2011/07/pocket-soccers-story-so-far/feed/</wfw:commentRss>
		<slash:comments>64</slash:comments>
		</item>
		<item>
		<title>Multi-Draw-Indirect is here</title>
		<link>http://rastergrid.com/blog/2011/06/multi-draw-indirect-is-here/</link>
		<comments>http://rastergrid.com/blog/2011/06/multi-draw-indirect-is-here/#comments</comments>
		<pubDate>Sun, 19 Jun 2011 15:04:12 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Graphics]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[atomic counter]]></category>
		<category><![CDATA[culling]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[indirect draw]]></category>
		<category><![CDATA[OpenGL]]></category>
		<category><![CDATA[synchronization]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=578</guid>
		<description><![CDATA[You might remember that I wrote an article about my suggestions for OpenGL 4.2 and beyond. One of the features that I recommended to be added to OpenGL was a yet non-existent extension called GL_ARB_draw_indirect2 which suggested the addition of new draw commands that are similar in fashion to the ancient MultiDraw* commands but they&#8230;]]></description>
				<content:encoded><![CDATA[
<p>You might remember that I wrote an article about my <a href="http://rastergrid.com/blog/2010/11/suggestions-for-opengl-4-2-and-beyond/">suggestions for OpenGL 4.2 and beyond</a>. One of the features that I recommended to be added to OpenGL was a yet non-existent extension called GL_ARB_draw_indirect2 which suggested the addition of new draw commands that are similar in fashion to the ancient MultiDraw* commands but they are meant to build on top of the indirect drawing mechanism introduced by the <a title="GL_ARB_draw_indirect" href="http://www.opengl.org/registry/specs/ARB/draw_indirect.txt" target="_blank">GL_ARB_draw_indirect</a> extension and OpenGL 4.0. I contacted both AMD and NVIDIA with my idea with different levels of success, but AMD saw the potential in the functionality and they actually implemented it in the form of <a title="GL_AMD_multi_draw_indirect" href="http://www.opengl.org/registry/specs/AMD/multi_draw_indirect.txt" target="_blank">GL_AMD_multi_draw_indirect</a>, well at least partially&#8230;</p>
<p><span id="more-578"></span></p>
<h2>The proposition</h2>
<p>First of all, let&#8217;s recap what exactly <a title="GL_ARB_draw_indirect" href="http://www.opengl.org/registry/specs/ARB/draw_indirect.txt" target="_blank">GL_ARB_draw_indirect</a> brought us:</p>
<blockquote><p>This extension provides a mechanism for supplying the arguments to a DrawArraysInstanced or DrawElementsInstancedBaseVertex from buffer object memory. This is not particularly useful for applications where the CPU knows the values of the arguments beforehand, but is helpful when the values will be generated on the GPU through any mechanism that can write to a buffer object including image stores, atomic counters, or compute interop. This allows the GPU to consume these arguments without a round-trip to the CPU or the expensive synchronization that would involve. This is similar to the DrawTransformFeedbackEXT command from EXT_transform_feedback2, but offers much more flexibility in both generating the arguments and in the type of Draws that can be accomplished.</p></blockquote>
<p>If you know my <a href="http://rastergrid.com/blog/downloads/nature-demo/">Nature</a> or <a href="http://rastergrid.com/blog/downloads/mountains-demo/">Mountains</a> demo you know that I have dug deeply into the domain of GPU based culling algorithms. In case of these algorithms, the GPU consumes the scene data and performs visibility determination over a list of objects and writes out the culled data into a buffer object. The problem is that those algorithms that I&#8217;ve implemented in the aforementioned demo applications work only for instanced objects. In order to make it possible for the algorithms to be able to efficiently work with arbitrary object sets we still need a lot of new features (some of them may even require newer GPU generations). The most important ones are discussed in detail in the following sections.</p>
<h4>Atomic counters</h4>
<p>This feature enables us to use the global atomic counters present on the GPU, which have, at least on the AMD implementation, dedicated hardware to provide efficient chip-wide access to these counters from any shader. This can be expected in the near future in the form of the yet not published GL_ARB_shader_atomic_counter extension. The extension also provides a way to back up the atomic counter values in buffer object memory.</p>
<p>The currently available GPU based culling algorithms, including those presented in my demos, work around the lack of this feature by using transform feedback to capture the culled data, as transform feedback has implicit atomic counters associated with each output stream. However, this has a few drawbacks. First of all, transform feedback is not as efficient as using atomic counters together with the random memory read/write mechanism exposed by the <a title="GL_EXT_shader_image_load_store" href="http://www.opengl.org/registry/specs/EXT/shader_image_load_store.txt" target="_blank">GL_EXT_shader_image_load_store</a> extension. This is because, by their nature, geometry shaders and thus transform feedback have to preserve the original order of the incoming primitives. This is why the first GPU generation with geometry shader support had so many performance problems, as the use of geometry shaders easily became the bottleneck of the rendering. Besides the performance benefits of having our own atomic counters, there are a lot of other reasons to want them, like the ability to implement an append/consume buffer, if I&#8217;m allowed to use the D3D terminology.</p>
<p>It may seem that I went a bit off-topic, but just think about how nicely atomic counters can interact with indirect drawing. The indirect draw commands have an instance count field, so what if we bind that address as the back-up buffer memory for the atomic counter? Yes, we can save the costly asynchronous query that we would otherwise need to get the number of visible objects when applying ICR or Hi-Z map based occlusion culling. You may say that you can achieve the same thing with the atomic read/writes provided by <a title="GL_EXT_shader_image_load_store" href="http://www.opengl.org/registry/specs/EXT/shader_image_load_store.txt" target="_blank">GL_EXT_shader_image_load_store</a>. Well, that&#8217;s true, as long as the additional performance hit of doing atomic memory writes is acceptable (atomic counters are much, much faster, although in case of a GPU based culling algorithm those few writes shouldn&#8217;t be the bottleneck). But now let us think more deeply about the problem. If we can use atomic operations to count the instances, since the instance count is part of the indirect draw command stored in the buffer object, then what if we also count the number of draw commands written into the indirect draw buffer using atomic counters? And here we are: we have the first building block of a GPU based culling algorithm that can handle arbitrary data sets.</p>
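<p>To make the idea more tangible, here is a small CPU-side sketch in Python. It is only a simulation under stated assumptions, not GPU code: the record layout follows the DrawArraysIndirect command structure of GL_ARB_draw_indirect, while <code>cull_into_command</code> and the toy visibility test are hypothetical stand-ins for a culling shader that increments a counter backed by the instance count field.</p>

```python
import struct

# Layout of one DrawArraysIndirect record as defined by GL_ARB_draw_indirect:
# four unsigned 32-bit integers (count, primCount, first, reservedMustBeZero).
COMMAND = struct.Struct("=4I")
PRIM_COUNT_OFFSET = 4  # byte offset of the primCount (instance count) field


def cull_into_command(objects, is_visible, vertex_count):
    """CPU stand-in for a culling shader pass (hypothetical helper)."""
    # The command starts with zero instances; the counter lives in primCount.
    command = bytearray(COMMAND.pack(vertex_count, 0, 0, 0))
    visible = []
    for obj in objects:
        if is_visible(obj):
            # On the GPU this read-modify-write would be one atomic increment.
            (slot,) = struct.unpack_from("=I", command, PRIM_COUNT_OFFSET)
            struct.pack_into("=I", command, PRIM_COUNT_OFFSET, slot + 1)
            # The old counter value (slot) is the index of the new instance.
            visible.append(obj)
    return command, visible


# Toy visibility test: keep only objects with a positive depth value.
command, visible = cull_into_command([-2.0, 1.5, 3.0, -0.5, 7.0],
                                     lambda depth: depth > 0.0, 36)
count, prim_count, first, reserved = COMMAND.unpack(command)
print(count, prim_count, visible)
```

<p>After the pass, the buffer already holds a ready-to-use draw command, so in this model nothing has to be read back to find out how many instances survived.</p>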
<h4>Multi-Draw-Indirect phase 1</h4>
<p>Now let&#8217;s say we somehow managed to generate an indirect draw buffer object with the list of the instanced draw command arguments necessary to render the visible objects, no matter whether we used the OpenGL toolset as in my demos or some compute API like OpenCL. Now we have to initiate the drawing somehow. We can do this by issuing several DrawArraysIndirect or DrawElementsIndirect commands, depending on how many instanced draw command arguments we&#8217;ve generated.</p>
<p>But what if we could do this with a single command? This is where <a title="GL_AMD_multi_draw_indirect" href="http://www.opengl.org/registry/specs/AMD/multi_draw_indirect.txt" target="_blank">GL_AMD_multi_draw_indirect</a> comes into the picture, and that&#8217;s what AMD implemented for us. We can actually do this by using one of the MultiDraw*Indirect commands introduced by the extension.</p>
<p>The best thing about it is that, in case of lack of hardware support, the driver can still implement it by simply making a loop that calls the appropriate Draw*Indirect commands. So every piece of hardware that supports <a title="GL_ARB_draw_indirect" href="http://www.opengl.org/registry/specs/ARB/draw_indirect.txt" target="_blank">GL_ARB_draw_indirect</a> can support <a title="GL_AMD_multi_draw_indirect" href="http://www.opengl.org/registry/specs/AMD/multi_draw_indirect.txt" target="_blank">GL_AMD_multi_draw_indirect</a>, and if the hardware actually supports the functionality, then we can get a slight performance increase for free.</p>
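<p>The loop-based fallback described above can be sketched roughly as follows. This is a Python simulation and not real driver code: both functions are hypothetical stand-ins that only decode the records and log them, and I assume tightly packed DrawArraysIndirect records when the stride is zero, as the extension specifies.</p>

```python
import struct

COMMAND = struct.Struct("=4I")  # count, primCount, first, reservedMustBeZero


def draw_arrays_indirect(buffer, offset, log):
    """Stand-in for DrawArraysIndirect: decodes one record and logs it."""
    count, prim_count, first, _reserved = COMMAND.unpack_from(buffer, offset)
    log.append((count, prim_count, first))


def multi_draw_arrays_indirect(buffer, offset, primcount, stride, log):
    """Loop-based emulation a driver could fall back to (an assumption)."""
    # A stride of zero means the records are tightly packed.
    step = stride if stride else COMMAND.size
    for i in range(primcount):
        draw_arrays_indirect(buffer, offset + i * step, log)


records = [(36, 10, 0), (24, 3, 36), (6, 1, 60)]
buffer = b"".join(COMMAND.pack(c, p, f, 0) for c, p, f in records)
log = []
multi_draw_arrays_indirect(buffer, 0, len(records), 0, log)
print(log)
```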
<h4>Multi-Draw-Indirect phase 2</h4>
<p>While the new extension adds quite some flexibility to the existing indirect drawing mechanism, it still lacks an important feature to become the Holy Grail of GPU based culling and scene management algorithms. We still have to perform an asynchronous query or otherwise determine the number of records written into the indirect draw buffer.</p>
<p>Of course, we can alleviate the problem by always initializing the indirect draw buffer with zero values (so that issuing an indirect draw command using any of the untouched data in the buffer results in no actual rendering) and then simply using a MultiDraw*Indirect command with a primcount argument equal to the theoretical maximum number of generated records. However, this might result in a performance decrease, especially if this theoretical maximum is much bigger than the actual number of draw commands present in the buffer.</p>
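<p>A toy model of this workaround, again as a Python simulation with made-up numbers: <code>MAX_RECORDS</code> and the two written commands are purely illustrative. It shows that the untouched, zero-filled records contribute no vertices, yet every one of them still has to be processed.</p>

```python
import struct

COMMAND = struct.Struct("=4I")  # count, primCount, first, reservedMustBeZero
MAX_RECORDS = 8  # hypothetical theoretical maximum for this toy scene

# Zero-initialized buffer: every record has count == 0 and primCount == 0.
buffer = bytearray(COMMAND.size * MAX_RECORDS)

# Pretend the culling pass produced only two real commands.
COMMAND.pack_into(buffer, 0 * COMMAND.size, 36, 10, 0, 0)
COMMAND.pack_into(buffer, 1 * COMMAND.size, 24, 3, 36, 0)

processed = 0
vertices_drawn = 0
for i in range(MAX_RECORDS):
    count, prim_count, _first, _reserved = COMMAND.unpack_from(
        buffer, i * COMMAND.size)
    processed += 1
    vertices_drawn += count * prim_count  # zero for the untouched records

wasted = processed - 2
print(processed, wasted, vertices_drawn)
```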
<p>In order to circumvent this problem, we need some mechanism that allows us to also source the primcount argument of the MultiDraw*Indirect commands from buffer object memory. While such functionality is not exposed yet by any of the major graphics APIs (and may not be supported by current hardware) this could be the next major step towards a fully self-feeding renderer that handles graphics related data on a much higher level beyond triangles and pixels.</p>
<h2>Conclusion</h2>
<p>While the indirect drawing mechanism introduced with OpenGL 4.0 is just a very small part of the feature set introduced by Shader Model 5.0 GPUs, it still has a lot of room for improvement and evolution ahead. AMD made the first step with <a title="GL_AMD_multi_draw_indirect" href="http://www.opengl.org/registry/specs/AMD/multi_draw_indirect.txt" target="_blank">GL_AMD_multi_draw_indirect</a> and I really hope that indirect drawing and other GPU self-feed mechanisms will gain more developer attention in the near future.</p>
<p>Finally, I would like to thank Graham Sellers, the creator of the extension, Pierre Bourdier for his support in promoting the new functionality, and all the engineers at AMD who have contributed to the specification and the implementation work behind it. I&#8217;m really glad to see that they listen to developers when deciding in which direction to improve their OpenGL support.</p>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2011/06/multi-draw-indirect-is-here/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Just released my first Android game</title>
		<link>http://rastergrid.com/blog/2011/03/just-released-my-first-android-game/</link>
		<comments>http://rastergrid.com/blog/2011/03/just-released-my-first-android-game/#comments</comments>
		<pubDate>Mon, 14 Mar 2011 10:46:31 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Games]]></category>
		<category><![CDATA[AndEngine]]></category>
		<category><![CDATA[Android]]></category>
		<category><![CDATA[Android Market]]></category>
		<category><![CDATA[Box2D]]></category>
		<category><![CDATA[button]]></category>
		<category><![CDATA[football]]></category>
		<category><![CDATA[game]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[market]]></category>
		<category><![CDATA[mobile]]></category>
		<category><![CDATA[soccer]]></category>
		<category><![CDATA[store]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=564</guid>
		<description><![CDATA[I am happy to announce that I&#8217;ve just published my first mobile game on the Android Market. I have experimented with creating games earlier, especially targeting the PC platform, however I never accomplished to release such one due to lack of resources, especially in the domain of artwork. Hence I turned to mobile platforms as&#8230;]]></description>
				<content:encoded><![CDATA[
<p><img class="alignleft" src="http://www.rastergrid.com/blog/wp-content/uploads/2011/03/pocket-soccer-promo-graph.png" alt="" width="180" height="120" /> I am happy to announce that I&#8217;ve just published my first mobile game on the Android Market. I have experimented with creating games earlier, especially targeting the PC platform, however I never managed to release one due to lack of resources, especially in the domain of artwork. Hence I turned to mobile platforms, as there even a one-man-show game can bring loads of fun to the players. So here we are now: after loads of abandoned PC projects, I have my first published game, called &#8220;Pocket Soccer&#8221;.</p>
<p><span id="more-564"></span>The game itself is a reinterpretation of a classic board game called button football that is very popular in my home country. The key difference is that the game drops many of the original&#8217;s rules to provide smoother and more fast-paced game-play. Each player has three buttons that they control by grabbing and throwing them in the desired direction. If one manages to push the soccer ball into the opposing player&#8217;s goal then he or she gets one point. The first one to reach ten points wins the match.</p>
<p>The game is turn based, so each player has five seconds to move one of his/her buttons. While, in my opinion, the game is more fun in two-player mode when two buddies can play against each other on the same device, the game also features a pretty smart AI with three difficulty levels. But that&#8217;s enough talk, maybe some screenshots say more:</p>
<div class="wp-caption aligncenter" style="width: 610px"><img src="http://www.rastergrid.com/blog/wp-content/uploads/2011/03/pocket-soccer-screenshot1.jpg" alt="" width="600" height="360" /><p class="wp-caption-text">Starting lineup in a match between Spain and Portugal.</p></div>
<p>Besides the possibility to choose between more than sixty countries to play with, the game also has other changeable assets like different soccer fields and balls. These also come with different physical properties that slightly change the game-play. While some of these assets come out-of-the-box, others are only accessible if you unlock them. You can do so by playing and/or winning a number of matches in the various game modes. The prerequisites of each asset can be checked in the appropriate menu and you can also check your current accomplishments by tapping the statistics button in the main menu.</p>
<div class="wp-caption aligncenter" style="width: 610px"><img src="http://www.rastergrid.com/blog/wp-content/uploads/2011/03/pocket-soccer-screenshot2.jpg" alt="" width="600" height="360" /><p class="wp-caption-text">Another match between Peru and Uruguay. The player with Uruguay is about to move.</p></div>
<p>The game should work well on most Android devices. It requires only API level 4 (Android 1.6). I&#8217;ve mainly tested it on my Samsung Galaxy S, which of course runs it smoothly, but I also tested it on other devices like the Motorola Droid, ZTE Blade (San Francisco) and Samsung Galaxy Spica. The game worked well on the Droid and especially smoothly on the Blade, which surprised me a little bit for such a cheap phone. In case of the Spica, it was noticeable that the phone was not made for gaming, however, in the end I managed to optimize the game enough so that it provides a good user experience on that phone as well.</p>
<div class="wp-caption aligncenter" style="width: 610px"><img src="http://www.rastergrid.com/blog/wp-content/uploads/2011/03/pocket-soccer-menu.jpg" alt="" width="600" height="360" /><p class="wp-caption-text">The main menu. You can scroll left and right to access the additional menu items and you can check your statistics anytime by tapping its icon in the bottom-right corner.</p></div>
<p>I tried to make the game look as little like &#8220;programmer art&#8221; as possible and I hope I managed to do so. In order to have a fast time-to-market with my first game, I chose to use a game engine framework rather than writing my own. Given the lack of native game engines for Android, I settled on <a title="AndEngine" href="http://www.andengine.org/" target="_blank">AndEngine</a> as it looked easy to learn, and it actually is (the other option was <a title="libgdx" href="http://code.google.com/p/libgdx/" target="_blank">libgdx</a>). While I&#8217;m not a great fan of pre-cooked solutions, AndEngine worked out pretty well with its native <a title="Box2D" href="http://www.box2d.org/" target="_blank">Box2D</a> accessible over JNI, however, I also had some bad experiences. I will write another post about my development experiences with Android and AndEngine.</p>
<h2>Summary</h2>
<p>To sum it up, I managed to publish my first game and I hope you&#8217;ll like it. The game is ad supported, so you can download it for FREE from the Android Market:</p>
<p style="text-align: center;"><a href="http://market.android.com/details?id=com.rastergrid.game.pocketsoccer"><img class="aligncenter" title="http://market.android.com/details?id=com.rastergrid.game.pocketsoccer" src="http://www.rastergrid.com/blog/wp-content/uploads/2011/03/pocket-soccer-qr.png" alt="" width="258" height="258" /></a></p>
<p style="text-align: center;"><a href="http://market.android.com/details?id=com.rastergrid.game.pocketsoccer"><img class="aligncenter" title="http://market.android.com/details?id=com.rastergrid.game.pocketsoccer" src="http://www.rastergrid.com/blog/wp-content/uploads/2011/03/download-icon.png" alt="" width="258" height="55" /></a></p>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2011/03/just-released-my-first-android-game/feed/</wfw:commentRss>
		<slash:comments>34</slash:comments>
		</item>
		<item>
		<title>The blog reached 100.000 visitors!</title>
		<link>http://rastergrid.com/blog/2011/03/the-blog-reached-100-000-visitors/</link>
		<comments>http://rastergrid.com/blog/2011/03/the-blog-reached-100-000-visitors/#comments</comments>
		<pubDate>Sun, 06 Mar 2011 17:42:06 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=558</guid>
		<description><![CDATA[I am happy to announce that today, little more than a year after the start, the blog reached 100.000 visitors. I would like to thank you for all the people who visited, commented and helped the evolution of the site! Again, there was quite a pause in the flow of articles. This is because I&#8217;m&#8230;]]></description>
				<content:encoded><![CDATA[
<p>I am happy to announce that today, a little more than a year after the start, the blog reached 100.000 visitors. I would like to thank all the people who visited, commented and helped the evolution of the site!</p>
<p><span id="more-558"></span>Again, there was quite a pause in the flow of articles. This is because I&#8217;m currently working on a project that consumes most of my spare time and unfortunately takes away time from the update of the blog as well. I would like to take the opportunity and say a few words about this project&#8230;</p>
<p>During the last month I worked on the development of an Android game that I plan to release in the near future. It is my first serious try in developing games for handhelds and I hope people will like it. Unfortunately, I&#8217;m also one of those fanatic developers who were always interested in game development and always targeted the high end PC gaming industry.</p>
<p>Like many fellow developers, despite my knowledge and enthusiasm, I also struggled with lack of resources, especially considering non-programming related stuff like artwork, which resulted in several partly finished but abandoned projects. This is the reason why I finally settled on concentrating first on a less demanding yet interesting portion of the industry: mobile game development.</p>
<p>While making games for mobile platforms is usually less ground-breaking and revolutionary, at least I can finally publish something to the end-users in the hope that the gathered experience and reputation may enable me to target PC gaming as well in the future. But that is not all there is to it&#8230; In fact, despite the difficulty of targeting an embedded platform, developing games for phones is great fun and I would advise every developer to at least try it out.</p>
<p>Stay tuned for news about the project, especially if you own an Android phone!</p>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2011/03/the-blog-reached-100-000-visitors/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Frei-Chen edge detector</title>
		<link>http://rastergrid.com/blog/2011/01/frei-chen-edge-detector/</link>
		<comments>http://rastergrid.com/blog/2011/01/frei-chen-edge-detector/#comments</comments>
		<pubDate>Sun, 30 Jan 2011 15:27:43 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Graphics]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Samples]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[detection]]></category>
		<category><![CDATA[edge]]></category>
		<category><![CDATA[filter]]></category>
		<category><![CDATA[fragment shader]]></category>
		<category><![CDATA[GLSL]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[OpenGL]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=532</guid>
		<description><![CDATA[In this article, I would like to present you an edge detection algorithm that shares similar performance characteristics like the well-known Sobel operator but provides slightly better edge detection and can be seamlessly extended with little to no performance overhead to also detect corners alongside with edges. The algorithm works on a 3&#215;3 texel footprint&#8230;]]></description>
				<content:encoded><![CDATA[
<div class="wp-caption alignleft" style="width: 160px"><img title="Frei-Chen edge detector" src="http://www.rastergrid.com/blog/wp-content/uploads/2011/01/frei-chen.png" alt="Frei-Chen edge detector" width="150" height="150" /><p class="wp-caption-text">Frei-Chen edge detector</p></div>
<p>In this article, I would like to present an edge detection algorithm that has performance characteristics similar to the well-known Sobel operator but provides slightly better edge detection and can be seamlessly extended, with little to no performance overhead, to also detect corners alongside edges. Like the Sobel filter, the algorithm works on a 3&#215;3 texel footprint, but it applies a total of nine convolution masks over the image that can be used for either edge or corner detection. The article presents the mathematical background that is needed to implement the edge detector and provides a reference implementation written in C/C++ using OpenGL that showcases both the Frei-Chen and the Sobel edge detection filter applied to the same image.</p>
<p><span id="more-532"></span>I came across the algorithm during my computer graphics studies, when one of my homework assignments was to implement the Frei-Chen edge detector. As I already mentioned in an earlier post, I am willing to provide source code for more basic graphics algorithms after seeing the success of <a title="Efficient Gaussian blur with linear sampling" href="http://rastergrid.com/blog/2010/09/efficient-gaussian-blur-with-linear-sampling/">my former post</a> about the Gaussian blur filter. This is a similarly basic article, considering that it only shows how to apply a particular convolution filter based algorithm to a still image, while the possibilities this edge detection algorithm brings are a more complex topic that is out of the scope of this article.</p>
<p>As the provided reference implementation also showcases applying the Sobel operator to an image, I would like to present that first and then continue with the presentation of the Frei-Chen masking set. Those who are already familiar with edge detection and the Sobel operator can skip the following two sections.</p>
<h2>Edge detection</h2>
<p>Before getting deep into how to implement edge detectors, let&#8217;s first talk about what an edge detector is and why we need it.</p>
<p>In general, edge detection is one of the most fundamental image processing tools, particularly used in the areas of feature detection and feature extraction. The aim of the technique is to identify points of a digital image at which the intensity changes sharply. These intensity changes can be caused by discontinuities in depth or surface orientation, changes in lighting conditions and many other factors. In the ideal case, applying an edge detector to an image gives us a set of connected lines or curves that indicate the boundaries of objects.</p>
<p>Not going that far, what an edge detector gives us in the first place is a gray-scale image where each pixel intensity approximates the likelihood that the pixel belongs to an object boundary. How well a particular algorithm can detect such pixels depends on many factors, and it is usually better to try multiple edge detectors in order to choose the one that best fits the particular use case.</p>
<p>After we have this gray-scale image, we usually have to define a threshold value that will be used as an acceptance criterion for edge pixels. If the previously calculated intensity value is above this threshold then we accept the pixel as an edge, otherwise we don&#8217;t. This part is the so-called binarization stage. Additionally, subsequent image processing algorithms can be used to further interpret the edge image.</p>
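<p>The binarization stage can be sketched in a few lines of Python. This is only an illustration: the edge intensities and the threshold of 0.5 are made-up values, and <code>binarize</code> is a hypothetical helper name.</p>

```python
def binarize(edge_image, threshold):
    """Mark a pixel as an edge when its intensity exceeds the threshold."""
    return [[1 if value > threshold else 0 for value in row]
            for row in edge_image]


# Hypothetical gray-scale edge intensities in the range [0, 1].
edge_image = [
    [0.05, 0.10, 0.80],
    [0.07, 0.65, 0.90],
    [0.02, 0.30, 0.85],
]
edges = binarize(edge_image, 0.5)
print(edges)
```

<p>Choosing the threshold is use case dependent: a lower value accepts more pixels as edges but also lets more noise through.</p>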
<p>In computer graphics, edge detection is usually used to implement various image decoration algorithms. Maybe the most popular applications of edge detectors nowadays are non-photorealistic rendering (NPR) and screen-space anti-aliasing techniques.</p>
<h2>Sobel filter</h2>
<p>The Sobel edge detection filter works on a 3&#215;3 texel footprint and applies two convolution masks to the image that are intended to detect horizontal and vertical gradients of the image. The filter weights are shown in the figure below:</p>
<p style="text-align: center;"><img class="   aligncenter" title="Sobel masks" src="http://www.rastergrid.com/blog/wp-content/uploads/2011/01/sobel-masks.png" alt="Sobel masks" width="457" height="119" /></p>
<p>These masks are applied to the intensities gathered from the 3&#215;3 footprint of the image and then are accumulated to produce the final gradient value in the following way:</p>
<p style="text-align: center;"><img class="aligncenter" title="Sobel gradient" src="http://www.rastergrid.com/blog/wp-content/uploads/2011/01/sobel-grad.png" alt="Sobel gradient" width="321" height="84" /></p>
<p>The actual algorithm can be seen in the accompanying demo that provides a GLSL based implementation. The algorithm is defined to work on a single-channel image, however, it can easily be extended to a usual three-channel RGB image either by applying it separately to each channel or by first calculating a gray-scale value from the color components. The former is more computationally intensive but usually provides better results if the threshold criterion is defined so that a pixel is accepted as a boundary point when the gradient value is larger than the threshold for any of the color channels. The reference implementation, however, is based on the latter approach for the sake of simplicity, so for each pixel an intensity value is first calculated simply by taking the length of the vector comprised of the RGB components.</p>
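<p>To make the steps above more concrete, here is a small CPU-side sketch in C. It is not the GLSL code of the demo: it assumes the commonly used Sobel mask values, assumes that the nine intensities of the footprint were already computed, and returns the squared gradient magnitude (the sum of the squared horizontal and vertical responses). The function name is hypothetical.</p>

```c
#include <assert.h>

/* Assumed Sobel masks in row-major order. The sign convention only flips
   the gradient direction and does not change the squared magnitude. */
static const float SOBEL_X[9] = { -1, 0, 1,   -2, 0, 2,   -1, 0, 1 };
static const float SOBEL_Y[9] = { -1, -2, -1,   0, 0, 0,   1, 2, 1 };

/* Applies both masks to a 3x3 footprint of intensities (row-major) and
   returns gx*gx + gy*gy, i.e. the squared gradient magnitude. */
float sobel_gradient_squared(const float footprint[9])
{
    assert(footprint != 0);
    float gx = 0.0f;
    float gy = 0.0f;
    for (int i = 0; i < 9; ++i) {
        gx += SOBEL_X[i] * footprint[i];
        gy += SOBEL_Y[i] * footprint[i];
    }
    return gx * gx + gy * gy;
}
```

<p>Taking the square root of the result gives the gradient magnitude. Alternatively, the squared value can be compared directly against a squared threshold.</p>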
<h2>Frei-Chen filter</h2>
<p>The Frei-Chen edge detector also works on a 3&#215;3 texel footprint but applies a total of nine convolution masks to the image. The nine Frei-Chen masks together form a complete basis of the space of 3&#215;3 footprints. This implies that any 3&#215;3 image area can be represented as a weighted sum of the nine Frei-Chen masks that can be seen below:</p>
<p style="text-align: center;"><img class="aligncenter" title="Frei-Chen masks" src="http://www.rastergrid.com/blog/wp-content/uploads/2011/01/frei-chen-masks.png" alt="Frei-Chen masks" width="650" height="237" /></p>
<p>The first four Frei-Chen masks above are used for edges, the next four are used for lines and the last mask is used to compute averages. For edge detection, the appropriate masks are chosen and the image is projected onto them. The projection equation is given below:</p>
<p style="text-align: center;"><img class="aligncenter" title="Frei-Chen equation" src="http://www.rastergrid.com/blog/wp-content/uploads/2011/01/frei-chen-eq.png" alt="Frei-Chen equation" width="631" height="108" /></p>
<p>When we are using the Frei-Chen masks for edge detection we are searching for the cosine defined above and we use the first four masks as the elements of importance so the first sum above goes from one to four.</p>
<p>Applying a threshold and applying the filter to multi-channel images work exactly the same way as in the case of the Sobel filter. Similarly, the reference implementation applies the filter to the image as if it were a single-channel image by first calculating the intensity value for each texel in the same fashion as with the previously presented filter.</p>
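<p>The projection described in this section can be sketched on the CPU as follows. This is only an illustration under a few assumptions: the mask values are the commonly published Frei-Chen masks (which I assume match the figure above), the footprint already contains intensities, and the function returns the squared cosine, that is, the energy of the four edge projections divided by the energy of all nine projections. The names used are hypothetical.</p>

```c
#include <assert.h>

#define SQRT2 1.41421356f

/* Assumed Frei-Chen masks (unscaled, row-major): 4 edge masks, 4 line
   masks and the averaging mask. */
static const float FC_MASKS[9][9] = {
    {  1,  SQRT2,  1,    0, 0, 0,   -1, -SQRT2, -1 },
    {  1,  0, -1,    SQRT2, 0, -SQRT2,    1, 0, -1 },
    {  0, -1,  SQRT2,    1, 0, -1,   -SQRT2, 1, 0 },
    {  SQRT2, -1, 0,   -1, 0, 1,    0, 1, -SQRT2 },
    {  0, 1, 0,   -1, 0, -1,    0, 1, 0 },
    { -1, 0, 1,    0, 0, 0,    1, 0, -1 },
    {  1, -2, 1,   -2, 4, -2,    1, -2, 1 },
    { -2, 1, -2,    1, 4, 1,   -2, 1, -2 },
    {  1, 1, 1,    1, 1, 1,    1, 1, 1 }
};

/* Normalization factor of each mask: 1/(2*sqrt(2)), 1/2, 1/6 and 1/3. */
static const float FC_SCALE[9] = {
    0.35355339f, 0.35355339f, 0.35355339f, 0.35355339f,
    0.5f, 0.5f, 1.0f / 6.0f, 1.0f / 6.0f, 1.0f / 3.0f
};

/* Returns M / S, where M sums the squared projections onto the four edge
   masks and S sums the squared projections onto all nine masks. */
float frei_chen_cos_squared(const float footprint[9])
{
    assert(footprint != 0);
    float edge_energy = 0.0f;
    float total_energy = 0.0f;
    for (int k = 0; k < 9; ++k) {
        float projection = 0.0f;
        for (int i = 0; i < 9; ++i) {
            projection += FC_MASKS[k][i] * footprint[i];
        }
        projection *= FC_SCALE[k];
        total_energy += projection * projection;
        if (k < 4) {
            edge_energy += projection * projection;
        }
    }
    /* A completely black footprint has no energy at all: report no edge. */
    return (total_energy > 0.0f) ? edge_energy / total_energy : 0.0f;
}
```

<p>Because the result is a ratio, a uniformly brighter or darker footprint gives the same value, which is one way to see the normalization that the plain Sobel gradient lacks.</p>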
<h2>Comparison</h2>
<p>Based on my experience, the Frei-Chen edge detector looks better than the Sobel filter as it is less sensitive to noise and is able to detect edges that have small gradients and thus are not found by the basic Sobel filter. For a comparison, you can check the figure below:</p>
<div class="wp-caption aligncenter" style="width: 610px"><a href="http://www.rastergrid.com/blog/wp-content/uploads/2011/01/ed-comparison.png"><img title="Comparison of edge detectors" src="http://www.rastergrid.com/blog/wp-content/uploads/2011/01/ed-comparison-thumb.png" alt="Comparison of edge detectors" width="600" height="200" /></a><p class="wp-caption-text">Comparison of edge detectors: original image (left), Sobel filter (middle), Frei-Chen filter (right).</p></div>
<p>The reason why the Frei-Chen edge detector seems to work better is that its construction includes a normalization factor as well as other factors that are meant to exclude all features except edges. A normalization factor can also be added to the Sobel filter by having a third mask that is equivalent to the ninth Frei-Chen mask and is used to normalize the gradients. This could help in reducing the number of undetected edges and the amount of noise that arises from the fact that the Sobel filter calculates absolute gradients rather than relative ones.</p>
<p>From a performance point of view, the Frei-Chen edge detector is much more heavyweight as it uses nine masks instead of two. In practice, however, the performance difference between the two is much smaller, taking into consideration that both use the same sized texel footprint and that the computational performance of today&#8217;s GPUs is usually much higher than their texture fetching performance.</p>
<h2>Conclusion</h2>
<p>We presented an alternative to the Sobel filter in the form of the Frei-Chen edge detector which, while having little impact on performance compared to the Sobel operator, provides better edge detection quality. As there is little to no difference in how the input data has to be organized and how the result is output, the Frei-Chen edge detector can easily be used as a drop-in replacement in implementations that previously used the Sobel filter.</p>
<p><strong>Source code</strong> and <strong>Win32 binary</strong> can be acquired in the <a title="Frei-Chen Edge Detector" href="http://rastergrid.com/blog/downloads/frei-chen-edge-detector/">downloads section</a>.</p>
<p>I would like to encourage those who read this article to add the Frei-Chen edge detector to their software and compare whether it yields better results than the Sobel filter for applications that rely on the output of the edge detection filter. I would be interested to hear how the filter works in real-life computer graphics scenarios.</p>
<p>Thanks in advance and hope you enjoyed the article!</p>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2011/01/frei-chen-edge-detector/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
		</item>
		<item>
		<title>Happy New Year!</title>
		<link>http://rastergrid.com/blog/2011/01/happy-new-year/</link>
		<comments>http://rastergrid.com/blog/2011/01/happy-new-year/#comments</comments>
		<pubDate>Mon, 17 Jan 2011 18:18:26 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=522</guid>
		<description><![CDATA[After a quite long, about two months intermission, let me celebrate for a moment the birthday of the blog. It was roughly a year ago that I started the RasterGrid Blogosphere with the wish to have a forum where I can share my experience and ideas with others. Meanwhile, I posted several programming related articles&#8230;]]></description>
				<content:encoded><![CDATA[
<div class="wp-caption alignleft" style="width: 160px"><img class=" " title="Happy Birthday" src="http://www.rastergrid.com/blog/wp-content/uploads/2011/01/cake.png" alt="Happy Birthday" width="150" height="150" /><p class="wp-caption-text">Happy Birthday !!</p></div>
<p>After a quite long intermission of about two months, let me celebrate for a moment the birthday of the blog. It was roughly a year ago that I started the RasterGrid Blogosphere with the wish to have a forum where I can share my experience and ideas with others. Meanwhile, I posted several programming related articles about graphics and multiprocessing, and published a few demo applications. However, this is not the best part of it&#8230; The thing that I&#8217;m most proud of is that the blog now has many returning visitors whom I was somehow able to help and that I&#8217;ve received a lot of feedback and ideas from you. I&#8217;m especially pleased because the list of people who contacted me via the comments section or via e-mail contains several great personalities of the industry whom I would not have been able to meet had I not started this project. Thank you all for this!</p>
<p><span id="more-522"></span></p>
<p>I&#8217;m really happy to announce that I will continue to bring you topics in the second season of the RasterGrid Blogosphere.</p>
<p>Most probably you&#8217;ve noticed that I haven&#8217;t written for a while. Sorry for that, sometimes it is really difficult to allocate time to sit down and write, especially because I have always tried to focus on quality, which I hope I more or less achieved. I also wanted to publish many more demonstration applications but making these takes even more time than just writing. Actually, many topics were dropped from the content of the previous year just because they would have been valuable only if accompanied by source code.</p>
<p>Now let&#8217;s not talk anymore about the past, let&#8217;s concentrate on what will come in 2011. While I don&#8217;t want to make any more promises about the average number of posts per month, as that didn&#8217;t work in the past either, I would still like to indicate what I will write about.</p>
<p>As you may have observed, even though the blog was started as a general programming blog, in the end I could not resist and it ended up more like a graphics programming blog. Sorry if you expected more general programming articles, as I will most probably continue in this fashion in the new year with the following key topics:</p>
<ul>
<li>More advanced graphics programming articles with focus on OpenGL 3+ and possibly on OpenGL ES and DX11.</li>
<li>More graphics demos showcasing advanced rendering techniques, focusing more on OpenGL 4+ functionality.</li>
<li><strong><span style="color: #800000;">New!</span></strong> More tutorial-style articles about less-complicated graphics topics after seeing the popularity of <a title="Efficient Gaussian blur with linear sampling" href="http://rastergrid.com/blog/2010/09/efficient-gaussian-blur-with-linear-sampling/">my article</a> about Gaussian blur. Those who read my <a title="Curriculum Vitae" href="http://rastergrid.com/blog/curriculum-vitae/">curriculum vitae</a> know that besides working at Nokia Siemens Networks, I&#8217;m doing my master&#8217;s degree in computer graphics and image processing. Meanwhile, I had to complete a few programming exercises related to computer geometry and signal processing and, as they are quite basic ones that each showcase a particular topic, I would like to share this code with you as part of the tutorial sessions.</li>
<li><strong><span style="color: #800000;">New!</span></strong> Another new thing coming with the new year, based on your requests as readers, is the new <strong><em>Tools</em></strong> category that will contain articles about various development tools, especially focusing on applications and libraries that ease and assist OpenGL development.</li>
<li>As time allows and inspiration strikes, I will also write general programming related articles. The following domains are on my wish-list: OO patterns &#8220;why&#8221; and &#8220;when&#8221;s, more unit testing, asynchronous pipelines and other multiprocessing models, etc.</li>
</ul>
<p>Of course, the list is not written in stone, every idea is welcome!</p>
<p>Thanks again for your attendance and interest so far!</p>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2011/01/happy-new-year/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Suggestions for OpenGL 4.2 and beyond</title>
		<link>http://rastergrid.com/blog/2010/11/suggestions-for-opengl-4-2-and-beyond/</link>
		<comments>http://rastergrid.com/blog/2010/11/suggestions-for-opengl-4-2-and-beyond/#comments</comments>
		<pubDate>Sun, 14 Nov 2010 17:15:23 +0000</pubDate>
		<dc:creator>Daniel Rákos</dc:creator>
				<category><![CDATA[Graphics]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[callback]]></category>
		<category><![CDATA[fragment shader]]></category>
		<category><![CDATA[geometry instancing]]></category>
		<category><![CDATA[GLSL]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[OpenGL]]></category>
		<category><![CDATA[texture buffer]]></category>
		<category><![CDATA[transform feedback]]></category>
		<category><![CDATA[uniform buffer]]></category>

		<guid isPermaLink="false">http://rastergrid.com/blog/?p=504</guid>
		<description><![CDATA[The Khronos Group did a great job in the last few years to once again prove that OpenGL is still in game and that it can become the ultimate graphics API of choice, if it is not that already. However, we must note that it is not quite yet true that OpenGL 4.1 is a&#8230;]]></description>
				<content:encoded><![CDATA[
<p>The Khronos Group did a great job in the last few years to once again prove that OpenGL is still in the game and that it can become the ultimate graphics API of choice, if it is not that already. However, we must note that it is not quite yet true that OpenGL 4.1 is a superset of its competitor, DirectX 11. There are still some holes that have to be filled, and I think the ARB should not stop there, as there is much more potential in the current hardware architectures than what is currently exposed by any graphics API, so establishing the future of OpenGL should start by going one step further than DX11. In this article I would like to present my vision of the important items that should be included in the next revision of the specification and how I see the future of OpenGL.</p>
<p><span id="more-504"></span>Since the original OpenGL Longs Peak announcement, graphics developers were really excited to get their hands on the completely revised OpenGL 3 specification. Due to severe backward compatibility and portability issues, however, the original plan seemed to have failed, and developers expressed great disappointment about the ARB&#8217;s decision to choose a more evolutionary move away from the legacy API instead of the radical rewrite. Still, the Khronos Group has proved that the decision was not necessarily bad for OpenGL and in fact we now have a pretty powerful API, even though the coexistence of the legacy and the new design greatly increased the complexity of the specification.</p>
<p>What we have now is an API that can really compete with DirectX 11 but I strongly believe that this is not the end of the story as we still have a lot of things to do ahead of us. I mean this both from the point of view of exposing more hardware capabilities and of streamlining the API itself to increase the productivity of the developers who use it. My plan is to target both of these issues in this article, also trying to focus on hardware functionality that is not even exposed by other graphics APIs yet.</p>
<h2>Exposing more hardware capabilities</h2>
<p>In this chapter of the article I will talk about some familiar and some not so familiar hardware features and corresponding OpenGL extensions that should be included in the next revision of the specification in order to be able to confidently say that OpenGL is a strict superset of the competing graphics APIs. The extensions listed here are not in any particular priority order, they are just listed in a way that eases the discussion about their functionality.</p>
<h3><a title="GL_EXT_shader_image_load_store" href="http://www.opengl.org/registry/specs/EXT/shader_image_load_store.txt" target="_blank">GL_EXT_shader_image_load_store</a></h3>
<p>This extension provides GLSL built-in functions allowing shaders to load from, store to, and perform atomic read-modify-write operations to a single level of a texture from any shader stage. The extension also indirectly enables the same operations for buffer objects by using texture buffers. This enables developers to implement more sophisticated shader algorithms that require more complex data structures than just plain arrays.</p>
<p>An example use case can be the implementation of Order-Independent Transparency (OIT) using fragment linked lists as presented by <a title="OIT And Indirect Illumination Using Dx11 Linked Lists" href="http://www.slideshare.net/hgruen/oit-and-indirect-illumination-using-dx11-linked-lists" target="_blank">AMD at GDC10</a>. Of course, there are a lot of other techniques that could benefit from hardware accelerated random access images (called UAV textures/buffers in DX11 terminology) including algorithms related to global illumination, ray tracing, and my personal favorite: scene management.</p>
<p>The introduction of new write operations to fragment shaders besides the traditional framebuffer writes makes the execution of the shaders sensitive to whether early-Z is used by the hardware. For this reason, the extension also introduces a new fragment shader input layout qualifier called &#8220;early_fragment_tests&#8221; to force OpenGL to use early depth and stencil tests. Otherwise, the existing specification language remains valid, which states that the depth and stencil tests are performed after fragment shader execution.</p>
<p>Finally, the extension enables some form of control over the order of image loads, stores, and atomics relative to other pipeline operations accessing the same memory region both using the OpenGL API and from within shaders.</p>
<p>The API itself provides a DSA-style binding mechanism that enables binding to so-called &#8220;image units&#8221; that are separate from the texture image units. In the same style, the specification language and GLSL refer to the newly introduced read-write textures with the term &#8220;image&#8221;.</p>
<p>In my opinion this is one of the most important extensions that should be made core with OpenGL 4.2 and I&#8217;m pretty sure this will actually happen.</p>
<h3><a title="GL_NV_texture_barrier" href="http://www.opengl.org/registry/specs/NV/texture_barrier.txt" target="_blank">GL_NV_texture_barrier</a></h3>
<p>This extension relaxes the restrictions of OpenGL on rendering to a currently bound texture and provides a mechanism to avoid read-after-write problems. More precisely, the extension allows rendering to a currently bound texture in the following cases:</p>
<ul>
<li>If the reads and writes are from/to disjoint sets of texels (after accounting for texture filtering rules) so it should work unless the drawn areas overlap, or</li>
<li>If there is only a single read and write of each texel, and the read is in the fragment shader invocation that writes the same texel (e.g. using texelFetch2D).</li>
</ul>
<p>Some of these situations were already supported implicitly like rendering to a texture level and fetching from another texture level. But the extension goes further and provides an API function to put an explicit barrier between draw calls to ensure proper rendering.</p>
<p>The extension can be used to accomplish a limited form of programmable blending and can eliminate the need for any image or buffer data copy in case we can live with the restrictions mentioned above.</p>
<p>One may ask why we need this extension if we have the <a title="GL_EXT_shader_image_load_store" href="http://www.opengl.org/registry/specs/EXT/shader_image_load_store.txt" target="_blank">GL_EXT_shader_image_load_store</a> extension, as this one is just a subset of the functionality provided by that. The answer is simple: performance. While read-write textures can mimic the same functionality, they usually use different hardware paths that are slower than regular read-only texture accesses. So it would be a definite benefit to also have this extension in core OpenGL.</p>
<h3>GL_ARB_shader_atomic_counters</h3>
<p>This extension does not have a public specification yet, however, it can be found in the extension lists of the latest Catalyst driver releases, sometimes with the EXT and sometimes with the ARB prefix. The extension itself provides an API to access a number of hardware atomic counters that provide efficient counter operations on a GPU global scale.</p>
<p>Atomic counters come in handy when one has to read or write individual elements of a buffer or texture. As an example, this extension is needed to efficiently implement the OIT algorithm mentioned earlier as, when constructing the fragment linked list, we need unique offsets into the linked list buffer. Such a unique offset can, of course, be acquired by using atomic read-modify-write operations but those perform much slower than hardware atomic counters.</p>
<p>Besides the mentioned example, atomic counters are useful in many algorithms from many domains. One important use case is to perform feedback operations similar to those provided by transform feedback. Such feedback operations can be used to implement various scene management or culling mechanisms.</p>
<p>The extension provides access to these atomic counters from GLSL and also makes it possible to back them up with buffer objects so after OpenGL draw calls the value of the counters is conserved in these buffers for subsequent use.</p>
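<p>As the extension has no public specification yet, I cannot show its GLSL interface. The following C11 sketch is only a CPU-side analogy of the fragment linked list use case mentioned above: a single atomic counter hands out unique node offsets and an atomic exchange links the new node into a per-pixel list. All type and function names are hypothetical.</p>

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NODE_POOL_SIZE 8u
#define INVALID_NODE 0xFFFFFFFFu

typedef struct {
    uint32_t color;  /* packed fragment color */
    float depth;     /* fragment depth used for later sorting */
    uint32_t next;   /* index of the next node or INVALID_NODE */
} FragmentNode;

typedef struct {
    atomic_uint node_counter;            /* plays the role of the atomic counter */
    FragmentNode nodes[NODE_POOL_SIZE];  /* plays the role of the linked list buffer */
} FragmentPool;

/* Reserves a unique node slot, returns INVALID_NODE if the pool is full. */
uint32_t allocate_node(FragmentPool *pool)
{
    assert(pool != 0);
    uint32_t slot = atomic_fetch_add(&pool->node_counter, 1u);
    return (slot < NODE_POOL_SIZE) ? slot : INVALID_NODE;
}

/* Stores a fragment in a freshly allocated node and makes that node the
   new head of the per-pixel list. Returns the node index used. */
uint32_t push_fragment(FragmentPool *pool, atomic_uint *head,
                       uint32_t color, float depth)
{
    uint32_t slot = allocate_node(pool);
    if (slot == INVALID_NODE) {
        return INVALID_NODE;
    }
    pool->nodes[slot].color = color;
    pool->nodes[slot].depth = depth;
    pool->nodes[slot].next = atomic_exchange(head, slot);
    return slot;
}
```

<p>On the GPU, the counter increment would be served by the hardware atomic counter and the head exchange by an image atomic operation, but that mapping is my assumption rather than something stated by a specification.</p>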
<h3><a title="GL_AMD_conservative_depth" href="http://www.opengl.org/registry/specs/AMD/conservative_depth.txt" target="_blank">GL_AMD_conservative_depth</a></h3>
<p>Early depth test is a common optimization for hardware accelerated graphics that can skip the evaluation of fragment shaders for fragments that end up being discarded because they don&#8217;t pass the depth test. The problem is that in case the fragment shader modifies the depth value of the fragment then the early depth test is disabled. One can force early depth test with the functionality introduced by the extension <a title="GL_EXT_shader_image_load_store" href="http://www.opengl.org/registry/specs/EXT/shader_image_load_store.txt" target="_blank">GL_EXT_shader_image_load_store</a> but that can lead to some rendering artifacts as the modified depth value output by the fragment shader is not taken into account.</p>
<p>This extension allows the application to pass enough information to the GL implementation to activate some early depth test optimizations safely while still preserving the ability to account for the final depth value in the depth test. In order to solve this, the extension introduces four new fragment shader layout qualifiers (applied when redeclaring gl_FragDepth) called &#8220;depth_unchanged&#8221;, &#8220;depth_any&#8221;, &#8220;depth_greater&#8221; and &#8220;depth_less&#8221;. The most interesting ones are the last two, which provide the ability to do early-Z and hierarchical-Z tests from one direction to discard some groups of fragments while still allowing the fragment shader to safely modify the depth value.</p>
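<p>The reasoning behind the one-directional test can be sketched with a few lines of C. This is a simplified model of my own and not a description of how any GPU implements it: it only covers the GL_LESS and GL_GREATER depth functions and answers the question whether a fragment can be thrown away before its shader runs. The enum and function names are hypothetical.</p>

```c
#include <assert.h>

typedef enum { DEPTH_ANY, DEPTH_GREATER, DEPTH_LESS, DEPTH_UNCHANGED } DepthLayout;
typedef enum { DEPTH_FUNC_LESS, DEPTH_FUNC_GREATER } DepthFunc;

/* Returns 1 if the fragment can safely be rejected before shading.
   That is the case when the interpolated depth already fails the test
   and the declared layout guarantees that the shader can only move the
   depth further in the failing direction. */
int can_reject_early(DepthLayout layout, DepthFunc func,
                     float interpolated_depth, float stored_depth)
{
    assert(func == DEPTH_FUNC_LESS || func == DEPTH_FUNC_GREATER);
    if (func == DEPTH_FUNC_LESS) {
        /* Fails when depth >= stored, a larger final depth fails as well. */
        return interpolated_depth >= stored_depth &&
               (layout == DEPTH_GREATER || layout == DEPTH_UNCHANGED);
    }
    /* Fails when depth <= stored, a smaller final depth fails as well. */
    return interpolated_depth <= stored_depth &&
           (layout == DEPTH_LESS || layout == DEPTH_UNCHANGED);
}
```

<p>With &#8220;depth_any&#8221; the function never allows early rejection, which corresponds to the traditional behavior of shaders that write the depth value.</p>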
<p>This technique comes in very handy when rendering volumetric particles, decals or billboards. Without this extension, one has to sacrifice early rejection of fragments in order to be able to create the volumetric primitives mentioned.</p>
<p>As far as I know this feature is also present in DirectX 11 so it should be a must for OpenGL 4.x also. As the extension is an AMD one, I don&#8217;t know whether NVIDIA GPUs do support anything like this in hardware but even if not, they can simply ignore the new layout qualifiers and do late depth test instead. Of course, it would result in lower performance but if only functionality is concerned it should be just okay.</p>
<h3>GL_ARB_instanced_arrays2</h3>
<p>OpenGL provides two means to perform geometry instancing via the extensions <a title="GL_ARB_draw_instanced" href="http://www.opengl.org/registry/specs/ARB/draw_instanced.txt" target="_blank">GL_ARB_draw_instanced</a> and <a title="GL_ARB_instanced_arrays" href="http://www.opengl.org/registry/specs/ARB/instanced_arrays.txt" target="_blank">GL_ARB_instanced_arrays</a>. While this (yet non-existent) extension would extend both, it is more relevant in case of the latter extension so I named it accordingly.</p>
<p>The extension should trivially add the possibility to specify a &#8220;first instance&#8221; parameter for the instanced draw commands. Whether this is accomplished by introducing new variants of the glDrawElements* and glDrawArrays* draw commands or by having a separate command for specifying the new parameter is up to the ARB. The extension should also interact with <a title="GL_ARB_draw_indirect" href="http://www.opengl.org/registry/specs/ARB/draw_indirect.txt" target="_blank">GL_ARB_draw_indirect</a> which already mentions the lack of the parameter in GL and already reserves a field in the indirect draw command structure for specifying the &#8220;first instance&#8221; parameter.</p>
<p>This extension would be more of a bug fix than a completely new feature as this functionality should have been exposed when instancing was first introduced to OpenGL.</p>
<h3>GL_ARB_draw_indirect2</h3>
<p>This is one of the extensions I would be the most happy to see in the next release of the OpenGL specification. It would be a functional addition to the <a title="GL_ARB_draw_indirect" href="http://www.opengl.org/registry/specs/ARB/draw_indirect.txt" target="_blank">GL_ARB_draw_indirect</a> extension that currently only allows the execution of a single instanced draw command that sources its parameters from a buffer object.</p>
<p>The new extension would add a new buffer binding point called e.g. GL_DRAW_INDIRECT_PRIMITIVE_COUNT that would specify the source of the &#8220;primcount&#8221; parameter to the following newly introduced draw commands:</p>
<pre>    void <strong>MultiDrawArraysIndirect</strong>( enum <em>mode</em>, sizei stride,
                                  const void *<em>indirect</em>,
                                  const void *<em>primcount</em> );
    void <strong>MultiDrawElementsIndirect</strong>( enum <em>mode</em>, enum <em>type</em>, sizei stride,
                                    const void *<em>indirect</em>,
                                    const void *<em>primcount</em> );</pre>
<p>This would not just allow executing multiple indirect draw commands at once, without further CPU action, but would also source the &#8220;primcount&#8221; parameter from a buffer object. Thus, if the draw commands are generated using transform feedback, read-write buffers or OpenCL (e.g. based on some GPU based scene management algorithm), the application does not have to use asynchronous queries or other means that may introduce sync points in the rendering in order to feed the &#8220;primcount&#8221; parameter.</p>
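<p>To illustrate how such a command could walk the indirect buffer, here is a small CPU-side sketch in C. The structure mirrors the layout of the indirect draw arrays command as I recall it from the GL_ARB_draw_indirect specification, with uint32_t used in place of GLuint to keep the example self-contained. The function total_vertices and its draw_count parameter (which stands for the proposed &#8220;primcount&#8221;, not for the primCount field of the structure) are hypothetical.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One indirect draw arrays command as stored in the buffer object. */
typedef struct {
    uint32_t count;               /* number of vertices per instance */
    uint32_t primCount;           /* number of instances */
    uint32_t first;               /* index of the first vertex */
    uint32_t reservedMustBeZero;  /* the field reserved for "first instance" */
} DrawArraysIndirectCommand;

/* Walks draw_count commands placed stride bytes apart, the way the
   proposed multi-draw command would, and sums up the vertices that the
   GPU would have to process. */
uint64_t total_vertices(const void *indirect, size_t stride, uint32_t draw_count)
{
    assert(indirect != NULL);
    assert(stride >= sizeof(DrawArraysIndirectCommand));
    const unsigned char *bytes = (const unsigned char *)indirect;
    uint64_t total = 0;
    for (uint32_t i = 0; i < draw_count; ++i) {
        DrawArraysIndirectCommand cmd;
        /* memcpy avoids alignment problems with arbitrary strides. */
        memcpy(&cmd, bytes + (size_t)i * stride, sizeof cmd);
        total += (uint64_t)cmd.count * cmd.primCount;
    }
    return total;
}
```

<p>In the proposed extension both the commands and the number of commands would live in buffer objects, so a loop like this would run entirely on the GPU side without any CPU readback.</p>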
<p>Some people said that this is quite a futuristic feature to expect and most probably such functionality will be available only on a newer generation of GPUs and maybe with OpenGL 5. I was not that pessimistic so I decided to raise my question to the relevant ARB members of NVIDIA and AMD. While I did not receive any answer from NVIDIA, I did receive some good news from AMD as they said that this functionality can be implemented for Shader Model 5.0 level hardware.</p>
<p>What this extension would give developers is a way to efficiently implement GPU based scene management where the GPU bakes together all the rendering commands for the current frame using atomic counters and buffer writes, and the CPU just has to issue a few or maybe just a single MultiDraw*Indirect command to render the whole scene. But of course, the feature can increase draw command throughput also in case of CPU based scene management.</p>
<p>So my message to the Khronos Group is please, start working on such an extension as this would not just make developers happy, but you can also strengthen OpenGL&#8217;s position in the industry by putting something into the specification that even DirectX 11 cannot do.</p>
<h3><a title="GL_AMD_transform_feedback3_lines_triangles" href="http://www.opengl.org/registry/specs/AMD/transform_feedback3_lines_triangles.txt" target="_blank">GL_AMD_transform_feedback3_lines_triangles</a></h3>
<p>OpenGL 4.0 introduced the extension <a title="GL_ARB_transform_feedback3" href="http://www.opengl.org/registry/specs/ARB/transform_feedback3.txt" target="_blank">GL_ARB_transform_feedback3</a> that further extended the transform feedback capabilities provided by earlier extensions to allow output to separate vertex streams. However, there is one caveat: separate vertex streams are only supported for point primitives.</p>
<p>This new AMD extension does nothing more than remove that restriction for separate output streams, allowing the same set of primitive types to be used with multiple transform feedback streams as with a single stream, as long as the primitive types are the same for all output streams.</p>
<p>Limiting the possible output primitive types for transform feedback into multiple streams should not be a problem unless you also want to rasterize some triangles at the same time. Without relaxing this restriction, one can do this only by issuing two separate draw commands, which incurs a performance hit.</p>
<p>I don&#8217;t know if the restriction is present in the ARB extension because NVIDIA does not support this in hardware but if this is not the case then I think this extension should be included in the next release of the specification. Otherwise, please NVIDIA include this feature in your next GPU generation.</p>
<h3><a title="GL_NV_copy_image" href="http://www.opengl.org/registry/specs/NV/copy_image.txt" target="_blank">GL_NV_copy_image</a></h3>
<p>OpenGL 3.1 already introduced a method to provide GPU accelerated copy of buffer data. This NVIDIA extension provides a similar functionality that can be used to execute efficient image data transfer between image objects (i.e. textures and renderbuffers).</p>
<p>While there are already methods to perform image data copies between textures e.g. using the <a title="GL_EXT_framebuffer_blit" href="http://www.opengl.org/registry/specs/EXT/framebuffer_blit.txt" target="_blank">GL_EXT_framebuffer_blit</a> extension promoted to core with OpenGL 3.0 these require expensive framebuffer object operations and they also lack direct support for transferring 3D image data.</p>
<p>This extension simply introduces a single command that allows such image data copies for every type of texture (including cube maps, 3D textures and array textures) without the need to bind the image objects or otherwise configure the rendering.</p>
<h3><a title="GL_AMD_depth_clamp_separate" href="http://www.opengl.org/registry/specs/AMD/depth_clamp_separate.txt" target="_blank">GL_AMD_depth_clamp_separate</a></h3>
<p>The extension <a title="GL_ARB_depth_clamp" href="http://www.opengl.org/registry/specs/ARB/depth_clamp.txt" target="_blank">GL_ARB_depth_clamp</a> promoted to core with OpenGL 3.2 introduced the ability to control the clamping of the depth value for both the near and far clip planes. This eliminates artifacts like seeing inside an object happening when the object&#8217;s geometry is clipped by the near clip plane.</p>
<p>This new extension provides a means for the application to enable depth clamp separately for the near and the far clip plane. This increases the flexibility of depth clamping and can save some fill-rate in certain situations.</p>
<h3><a title="GL_EXT_texture_filter_anisotropic" href="http://www.opengl.org/registry/specs/EXT/texture_filter_anisotropic.txt" target="_blank">GL_EXT_texture_filter_anisotropic</a></h3>
<p>I don&#8217;t think that I have to talk too much about this extension as it should be familiar to all of you. It simply enables the possibility to use anisotropic filtering on a per-texture basis. I really wonder how this extension didn&#8217;t make its way into core as it has been supported by hardware for more than a decade.</p>
<p>I know that the extension itself is supported by all relevant graphics driver vendors but really, why can&#8217;t we simply include it in the core specification?</p>
<h3>GL_ARB_texture_gather_lod</h3>
<p>This is another yet non-existent extension that would extend <a title="GL_ARB_texture_gather" href="http://www.opengl.org/registry/specs/ARB/texture_gather.txt" target="_blank">GL_ARB_texture_gather</a> by adding GLSL built-in functions called textureGatherLod that would allow gathered fetches with explicit LOD. I&#8217;m not sure if these functions are missing from the specification because of lack of hardware support or just because the ARB thought they might not be of any use. Anyway, if the hardware supports it then OpenGL should expose it to developers as there are certain situations when one has to use explicit LOD and could benefit from the increased fetching performance enabled by gathered fetches.</p>
<h3><a title="GL_ARB_shader_stencil_export" href="http://www.opengl.org/registry/specs/ARB/shader_stencil_export.txt" target="_blank">GL_ARB_shader_stencil_export</a></h3>
<p>This extension was published at the time the OpenGL 4.1 specification came out and provides the ability for the fragment shader to output the stencil reference value, which was otherwise configurable only using API calls. This gives a great deal of flexibility to existing and future stencil buffer based algorithms, also making it possible to write independent values directly to the stencil buffer on a per-fragment basis.</p>
<p>The predecessor of the extension is <a title="GL_AMD_shader_stencil_export" href="http://www.opengl.org/registry/specs/AMD/shader_stencil_export.txt" target="_blank">GL_AMD_shader_stencil_export</a>, which suggests that it may be supported in hardware only on AMD GPUs. However, if this is not the case and NVIDIA could support it as well, then I think it is worth promoting this feature to core OpenGL too.</p>
<h2>Streamlining the API</h2>
<p>After discussing the long list of functional features that would be nice to have in the next release of OpenGL, let&#8217;s focus on the extensions and ideas that are necessary to improve the usability of the API itself. Actually, this part could be much longer than what I&#8217;ll discuss, because as more and more features are added to OpenGL, developers struggle with the increased complexity of the API. I&#8217;ll try to focus on the most crucial issues.</p>
<h3><a title="GL_EXT_direct_state_access" href="http://www.opengl.org/registry/specs/EXT/direct_state_access.txt" target="_blank">GL_EXT_direct_state_access</a></h3>
<p>This is the extension that all OpenGL developers have been waiting for for a long time now. Direct state access eliminates the OpenGL API&#8217;s stupid &#8220;bind-to-modify&#8221; nature.</p>
<p>For a very long time the only vendor supporting the extension was NVIDIA. Fortunately, since Catalyst 10.7 AMD also exposes the extension to developers. Still, I have one problem: this extension is very poorly designed.</p>
<p>The main problem with the extension is that the functions were designed in a way that a naive implementation could be done by simply using &#8220;bind-to-modify&#8221; under the hood. That&#8217;s what resulted in crazy API functions like MultiTexParameter* and friends. Also, enabling DSA for all of the deprecated functionality would cause an explosion of the API and, as a consequence, bloated specification language. Finally, I would also like to object somewhat to the contributors&#8217; lack of creativity regarding the awkward naming conventions present in the current DSA extension.</p>
<p>In my opinion the Khronos Group has to address the issue by creating a new ARB version of the DSA extension that focuses strictly on core functionality, throws away DSA support for deprecated features (if somebody needs to use deprecated features they can still use the EXT version) and provides a naming convention that fits much better into the current API language.</p>
<p>Anyway, I completely agree with the other developers out there and scream for DSA. I think the Khronos Group has to eliminate the problem of the &#8220;bind-to-modify&#8221; semantics as soon as possible; otherwise, even though the core specification exposes more and more hardware features, developers will not be attracted to OpenGL.</p>
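<p>For readers who have not run into the problem yet, the following is a minimal sketch of why &#8220;bind-to-modify&#8221; is so annoying. It is a toy model in plain C++: <code>ToyContext</code>, <code>setFilterBindToModify</code> and <code>setFilterDirect</code> are made-up names and not OpenGL functions. They only mimic the two styles of editing an object&#8217;s state.</p>

```cpp
#include <map>

// Toy stand-in for a GL context. None of these names are real OpenGL
// API; they only model a binding point plus per-object state.
struct ToyContext {
    unsigned boundTexture = 0;          // the current texture binding
    std::map<unsigned, int> minFilter;  // per-texture filter state
};

// "Bind-to-modify": the object has to be bound before it can be edited,
// so the previous binding is silently replaced as a side effect.
void setFilterBindToModify(ToyContext& ctx, unsigned tex, int filter) {
    ctx.boundTexture = tex;                     // disturbs context state
    ctx.minFilter[ctx.boundTexture] = filter;   // edits "whatever is bound"
}

// Direct state access: the object is named in the call itself and the
// current binding is left untouched.
void setFilterDirect(ToyContext& ctx, unsigned tex, int filter) {
    ctx.minFilter[tex] = filter;
}
```

<p>In the first style every piece of code that edits an object either has to save and restore the binding or has to assume that nobody else relies on it. In the second style the question does not even come up.</p>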
<h3>GL_ARB_explicit_sampler_location</h3>
<p>The ARB moved in the right direction when they introduced the <a title="GL_ARB_explicit_attrib_location" href="http://www.opengl.org/registry/specs/ARB/explicit_attrib_location.txt" target="_blank">GL_ARB_explicit_attrib_location</a> extension, which eliminates the need to use dummy API calls to bind vertex attributes and output buffers to shader variables, but they should not stop here. One of the most important additions would be a similar GLSL syntax that allows us to bind sampler uniforms to texture image units. Obviously, the same goes for read-write images if <a title="GL_EXT_shader_image_load_store" href="http://www.opengl.org/registry/specs/EXT/shader_image_load_store.txt" target="_blank">GL_EXT_shader_image_load_store</a> is included.</p>
<h3>GL_ARB_explicit_uniform_block_index</h3>
<p>Similar to the previous request, uniform block indices should also be explicitly specifiable in the shaders themselves. This extension would add exactly that functionality. The implementation is also straightforward: only a simple uniform block layout qualifier has to be added.</p>
<h3>Other API clarifications</h3>
<p>Besides the major issues the current specification language also has some bugs and unclear parts that should be addressed as well:</p>
<ul>
<li>Program pipeline objects are created by binding the object name, which is not in line with the rest of the API language.</li>
<li>There is no language about whether program pipeline objects are shared among contexts or not. This suggests that they aren&#8217;t, which is not in line with the fact that program and shader objects are shared.</li>
</ul>
<p>Most probably there are a lot more issues with the specification language, but for now only these came to mind. Maybe some of you can extend the list with tons of other specification mistakes.</p>
<h2>OpenGL 4.2 and beyond</h2>
<p>While my feature requests cover most of the functionality that should be included in the next revision of the OpenGL specification, there are a lot of other things that could be very useful for developers but are very unlikely to make their way into the specification any time soon. I will talk about these features in this section of the article, as they raise many more questions than can be answered by simply including them in OpenGL 4.2.</p>
<h3>Affinity contexts</h3>
<p>We have had multi-GPU designs like SLI and CrossFire for a long time now. Fortunately, we also have vendor specific extensions to create affinity contexts that are associated with a single GPU of a multi-GPU configuration. We have <a title="WGL_AMD_gpu_association" href="http://www.opengl.org/registry/specs/AMD/wgl_gpu_association.txt" target="_blank">WGL_AMD_gpu_association</a> and <a title="WGL_NV_gpu_affinity" href="http://www.opengl.org/registry/specs/NV/gpu_affinity.txt" target="_blank">WGL_NV_gpu_affinity</a> for Windows and <a title="GLX_AMD_gpu_association" href="http://www.opengl.org/registry/specs/AMD/glx_gpu_association.txt" target="_blank">GLX_AMD_gpu_association</a> on GLX based platforms. I have just two problems with this:</p>
<ul>
<li>First, these are vendor specific extensions.</li>
<li>Second, NVIDIA exposes its affinity context support only on Windows and just for their professional cards, leaving consumer hardware owners without affinity context support.</li>
</ul>
<p>In the future I would be pleased to see extensions like <span style="text-decoration: underline;">WGL_ARB_gpu_affinity_context</span> and <span style="text-decoration: underline;">GLX_ARB_gpu_affinity_context</span> that are supported by both NVIDIA and AMD, on both professional and consumer hardware.</p>
<h3>Command buffers</h3>
<p>I would like to see something in OpenGL similar to what we have in OpenCL. Having several separate command buffers for a single OpenGL context can have its performance benefits, as some of the implicit sync points that are otherwise present in OpenGL draw commands could be eliminated. Another solution would be to simply use multiple GL contexts, but that is much more complicated and context switches are quite heavy-weight operations. This would be similar to how framebuffer objects replaced pbuffers.</p>
<p>This could even go as far as encapsulating state manipulation data into command buffers, similar to what display lists allowed in many cases, just in a more efficient and hardware centric manner.</p>
<h3>Immutable state objects</h3>
<p>Another thing strongly related to the previous idea would be immutable state objects. If state management data could not be stored efficiently in such a command buffer, we could instead use immutable state objects that would be very similar in nature to display lists in that they hide the underlying representation of the commands.</p>
<p>Display lists are deprecated and I don&#8217;t think that was a wrong decision. They made the API language complex and you never knew which commands compile into display lists and how. I remember the time I was making an OpenGL app on my GeForce2 and used DrawElements calls inside display lists that referenced buffer object data. Funnily, it was working on NVIDIA hardware, even though the specification says otherwise, and I was wondering why my app crashed on ATI cards.</p>
<p>Anyway, display lists are gone, but we need some complex state objects that could fill those holes that were left after them.</p>
<h3>More callbacks</h3>
<p>I was very happy to see the appearance of an extension that introduced the callback concept into OpenGL (<a title="GL_AMD_debug_output" href="http://www.opengl.org/registry/specs/AMD/debug_output.txt" target="_blank">GL_AMD_debug_output</a>). Since then, the functionality has been promoted to an ARB extension, meaning that the ARB has accepted the fact that we need callbacks.</p>
<p>What I would like to see in the future is more OpenGL callbacks. One of the most obvious candidates I can think of is asynchronous queries. It would be so much easier if we were able to receive a callback from OpenGL when the results of our asynchronous queries are available, rather than having to manually poll for the result in various phases of the rendering.</p>
<p>Actually, I could imagine callbacks for every rendering command issued, called by the driver as soon as the actual rendering is complete on the GPU side.</p>
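<p>Just to illustrate the difference between the two styles, here is a small sketch in plain C++. <code>ToyQuery</code> is a hypothetical class and not an OpenGL object, and its <code>complete</code> method merely stands in for the GPU finishing the work. No such query callback exists in OpenGL as far as this article is concerned; this is only what the idea could feel like from the application side.</p>

```cpp
#include <functional>
#include <utility>
#include <vector>

// Toy asynchronous query. Hypothetical, not an OpenGL object.
class ToyQuery {
public:
    // Polling style: the application has to keep asking.
    bool resultAvailable() const { return available_; }
    unsigned result() const { return result_; }

    // Callback style: register once and get notified on completion.
    void onResult(std::function<void(unsigned)> callback) {
        callbacks_.push_back(std::move(callback));
    }

    // Stands in for the GPU finishing the query.
    void complete(unsigned value) {
        result_ = value;
        available_ = true;
        for (auto& callback : callbacks_) callback(value);
    }

private:
    bool available_ = false;
    unsigned result_ = 0;
    std::vector<std::function<void(unsigned)>> callbacks_;
};
```

<p>With polling, the application has to sprinkle <code>resultAvailable()</code> checks throughout the frame. With a callback, the code that consumes the result lives in one place and runs only when there is something to consume.</p>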
<h3>Programmable blending</h3>
<p>This is yet another thing that developers are screaming for. Fortunately, we now have indirect methods to solve most of the issues of programmable blending via the extensions <a title="GL_EXT_shader_image_load_store" href="http://www.opengl.org/registry/specs/EXT/shader_image_load_store.txt" target="_blank">GL_EXT_shader_image_load_store</a> and <a title="GL_NV_texture_barrier" href="http://www.opengl.org/registry/specs/NV/texture_barrier.txt" target="_blank">GL_NV_texture_barrier</a>; however, a more general solution would be welcome.</p>
<p>I don&#8217;t know whether this would be actually possible on current hardware but if not, then this is a message to hardware vendors to solve the issue in the near future.</p>
<h2>Summary</h2>
<p>We&#8217;ve seen that even though OpenGL is on track and the Khronos Group is keeping pace with its competitors, there is still a lot of room for improvement in the OpenGL specification, both from a functional point of view and from an API design point of view.</p>
<p>I would like to end the article with a summary of what I expect to be part of the OpenGL 4.2 specification and my personal wish-list beyond that, in rough priority order.</p>
<p><strong>My expectations for OpenGL 4.2:</strong></p>
<ul>
<li>GL_EXT_shader_image_load_store</li>
<li>GL_ARB_shader_atomic_counters</li>
<li>GL_ARB_instanced_arrays2</li>
<li>GL_ARB_explicit_sampler_location</li>
<li>GL_ARB_explicit_uniform_block_index</li>
</ul>
<p><strong>My personal wish-list for OpenGL 4.2:</strong></p>
<ul>
<li>GL_ARB_draw_indirect2</li>
<li>GL_ARB_direct_state_access</li>
<li>GL_NV_texture_barrier</li>
<li>GL_AMD_conservative_depth</li>
<li>GL_ARB_texture_gather_lod</li>
<li>GL_NV_copy_image</li>
<li>GL_EXT_texture_filter_anisotropic</li>
<li>GL_ARB_shader_stencil_export</li>
<li>GL_AMD_depth_clamp_separate</li>
<li>GL_AMD_transform_feedback3_lines_triangles</li>
</ul>

]]></content:encoded>
			<wfw:commentRss>http://rastergrid.com/blog/2010/11/suggestions-for-opengl-4-2-and-beyond/feed/</wfw:commentRss>
		<slash:comments>29</slash:comments>
		</item>
	</channel>
</rss>
