Are you a passionate learner, with some beginner’s knowledge of OpenGL retained mode, eager to dive into some foreign material and implement some fascinating graphics for your own purposes? If so, then this is the guide for you! We’re going to implement a wrapper around OpenGL 3.1! Be forewarned, implementing this rendering layer is no simple task, but with the help of this guide, it should be much easier, quicker, less involving of manual learning, and less involving of fruitless Googling than it otherwise would be. This comes from someone with experience implementing his own wrapper around OpenGL 3.1+ via Java and LWJGL, and some experience utilizing the professional rendering engine used for EverQuest: Landmark at Daybreak Game Company, formerly Sony Online Entertainment. (Note: the thoughts and opinions contained in this blog do not represent those of my former employer!) Hopefully you can learn from my experiences, and the failures I’ve both experienced and witnessed. Now, without further ado…
Some Useful Resources
The Fundamental Objects
DisplayManager
Make a class that encapsulates the “display”, and is responsible for flipping the framebuffer, assuming that it’s double-buffered. (It probably should be.) It should encapsulate anything OS dependent about your approach. (LWJGL is already OS independent-ish, fortunately for me.) In C++, you’d probably want to have different implementation files (*.cpp) depending on the target OS. Let it accept interfaces to handle things like gain/loss of the display’s focus. Give it methods for locking the mouse cursor, changing the window title, the current display mode, etc.
State Caching
Note: you might be able to skip this for now, but it’d probably be easier if you went ahead and knocked this out.
Make a file or an object with a group of functions for internal use that directly substitute the majority of important CPU-to-GPU functions provided by OpenGL. The important bit is that these functions should cache data submitted to the GPU on the CPU side, so that redundant data submissions can be avoided. (No, the driver doesn’t do this for you.) Functions like glEnable, glBlendFunc, glDepthFunc, glColorMask/glDepthMask, glBindVertexArray, glBindBuffer, glUseProgram, etc. should all be accounted for. Note that the binding of VAOs and VBOs can be tricky, as the meaning of glBindBuffer in particular is highly state dependent. (For example, the GL_ELEMENT_ARRAY_BUFFER binding is stored in whichever VAO is currently bound.) You’ll likely be expanding and refining this set of functions as you go (unless you’re really hardcore and want to go ahead and knock out the entire API). Refer to the manual and the specification, in that order, when looking up what particular functions mean and what the default values of the various setters are.
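To make the caching idea concrete, here’s a minimal sketch in Java. The GlBackend interface and the StateCache class are names I’m making up purely for illustration: in a real wrapper the backend would forward to glEnable, glDisable, and glUseProgram through your binding library. Only two kinds of state are covered here, and the rest would follow the same pattern.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical seam over the raw GL entry points. A real implementation would
// forward to glEnable/glDisable/glUseProgram through your binding of choice.
interface GlBackend {
    void enable(int cap);
    void disable(int cap);
    void useProgram(int program);
}

final class StateCache {
    static final int GL_DEPTH_TEST = 0x0B71;
    static final int GL_BLEND = 0x0BE2;

    private final GlBackend gl;
    // Capabilities we have never touched are absent from the map, so the first
    // call always goes through (some capabilities, like GL_DITHER, start enabled).
    private final Map<Integer, Boolean> capabilities = new HashMap<>();
    private int currentProgram = 0; // 0 means "no program", which is GL's initial state

    StateCache(GlBackend gl) {
        this.gl = gl;
    }

    void setEnabled(int cap, boolean enabled) {
        Boolean cached = capabilities.get(cap);
        if (cached != null && cached == enabled) {
            return; // redundant submission, skip the driver call
        }
        capabilities.put(cap, enabled);
        if (enabled) {
            gl.enable(cap);
        } else {
            gl.disable(cap);
        }
    }

    void useProgram(int program) {
        if (program == currentProgram) {
            return; // already bound
        }
        currentProgram = program;
        gl.useProgram(program);
    }
}
```

Keeping the GL calls behind an interface is just one way to do it, but it does let you exercise the cache with a fake backend that counts calls, no GL context required.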
OpenGL Data Structures
Make some basic objects that directly reflect the data structures on the GPU. Do not try to do anything smart around OpenGL’s API here: let your classes directly mimic the low-level structure of OpenGL’s data structures, associations, getters, and setters, and let the fanciness occur at a higher level of abstraction that you can entirely dictate, not the API. Give each object a set of bind/unbind functions as appropriate. Inject dependencies as closely as possible to the place that they’re used (e.g. as method parameters), not simply in the objects’ constructors.
VAOs and VBOs
Make classes for VAOs and VBOs (which are optionally contained by a VAO), and some sort of VAOLayout class with a spiffy interface that allows its clients to specify exactly how they’d like their attribute data to be laid out for any VAOs that the object is passed to, in a flexible fashion. Remember that it’s possible for a VAO to have multiple streams of interleaved attributes! If using C++, passing in some user-defined structs as template params may be valid (though you’d need some sort of Run-Time Type Information (RTTI) or separate specification of the attributes, so your objects can submit that info to the GPU). Next, give the VAO functionality that optionally associates it with an index buffer. (An index buffer is effectively just a special use case of a VBO.) Finally, add methods for the various draw routines supported by OpenGL. (E.g. glDrawArrays, glDrawElements, glDrawArraysInstanced, etc.) You can swap between arrays/elements in each function depending on whether the VAO has an index buffer.
- OpenGL Wiki: Buffer Objects
- OpenGL Wiki: Vertex Specification (VAOs and relation to VBOs)
- OpenGL Wiki: Vertex Rendering
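As a rough illustration of what a VAOLayout might track, here’s a sketch that computes the stride and byte offsets for a single interleaved stream of float attributes. The class and method names are hypothetical, and it assumes float-only attributes with locations assigned in declaration order. A real layout would also handle other component types, normalization, and multiple streams.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class VertexAttribute {
    final String name;
    final int location;       // attribute index passed to the GL
    final int componentCount; // 1 to 4 floats
    final int offsetBytes;    // byte offset within one interleaved vertex

    VertexAttribute(String name, int location, int componentCount, int offsetBytes) {
        this.name = name;
        this.location = location;
        this.componentCount = componentCount;
        this.offsetBytes = offsetBytes;
    }
}

final class VaoLayout {
    private static final int FLOAT_BYTES = 4;

    private final List<VertexAttribute> attributes = new ArrayList<>();
    private int strideBytes = 0;

    // Appends a float attribute to the end of the interleaved vertex.
    VaoLayout addFloats(String name, int componentCount) {
        if (componentCount < 1 || componentCount > 4) {
            throw new IllegalArgumentException("componentCount must be 1..4");
        }
        attributes.add(new VertexAttribute(name, attributes.size(), componentCount, strideBytes));
        strideBytes += componentCount * FLOAT_BYTES;
        return this;
    }

    int strideBytes() {
        return strideBytes;
    }

    // Each entry carries what a glVertexAttribPointer call needs:
    // (location, componentCount, GL_FLOAT, false, strideBytes(), offsetBytes).
    List<VertexAttribute> attributes() {
        return Collections.unmodifiableList(attributes);
    }
}
```

With this, a client could write something like new VaoLayout().addFloats("position", 3).addFloats("uv", 2) and hand the result to a VAO, which would then walk the attribute list while its VBO is bound.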
Texture Unit Manager
Make a TextureUnitManager class, that handles the allocation of texture units as a resource to those that request it, and again attempts to minimize the number of state changes. Learn everything about the distinction between texture units and texture image units. Don’t ever confuse them!
- StackOverflow (answer from Nicol Bolas): Relationship between glActiveTexture and glBindTexture
- OpenGL Wiki: Texture Image Units
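Here’s one possible shape for the allocation logic, sketched under some simplifying assumptions: every texture uses the same target, nobody explicitly releases a unit, and when all units are taken the least recently used one is evicted. The eviction policy is my own choice for the example, not something OpenGL dictates, and the bind counter stands in for the real glActiveTexture/glBindTexture pair.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TextureUnitManager {
    private final int unitCount;
    // texture id -> unit index, kept in access order so the eldest entry
    // is always the least recently used texture.
    private final LinkedHashMap<Integer, Integer> unitByTexture =
            new LinkedHashMap<>(16, 0.75f, true);
    private int bindCount = 0;

    TextureUnitManager(int unitCount) {
        if (unitCount < 1) {
            throw new IllegalArgumentException("need at least one texture unit");
        }
        this.unitCount = unitCount;
    }

    // Returns the unit the texture is bound to, binding it first if necessary.
    int acquire(int textureId) {
        Integer existing = unitByTexture.get(textureId); // also refreshes its LRU position
        if (existing != null) {
            return existing; // already bound, no state change needed
        }
        int chosen;
        if (unitByTexture.size() < unitCount) {
            chosen = unitByTexture.size(); // next never-used unit
        } else {
            Map.Entry<Integer, Integer> eldest = unitByTexture.entrySet().iterator().next();
            chosen = eldest.getValue();
            unitByTexture.remove(eldest.getKey()); // evict the least recently used texture
        }
        unitByTexture.put(textureId, chosen);
        // Real code would call glActiveTexture(GL_TEXTURE0 + chosen) and then
        // glBindTexture(target, textureId) here.
        bindCount++;
        return chosen;
    }

    int bindCount() {
        return bindCount;
    }
}
```

The returned unit index is what you’d feed to the sampler uniform in your program.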
Texture Objects
Make a set of TextureObject classes, instances of which wrap individual texture objects on the GPU. Let each archetype of texture supported by OpenGL be reflected by a distinct class. Provide getters and setters for all the variables that are associated with a texture, and maybe one method that resubmits it all to the GPU when a change is detected. If you want to unify the functionality of the different TextureObjects in some way (readbacks, data submission, etc.), accomplish it by letting each class implement interfaces/pure-virtual-base-classes exposing that similar functionality. You can even let 3D texture objects provide getters for objects that implement a 2D interface exposing sub-slices of its data, do the same for 2D to 1D, etc. This can take some time to implement, though.
Programs and Shaders
Make some objects that closely mimic OpenGL’s Programs and Shaders. This part may be a little tricky…
First off, make the Shader class, with code that submits source code to the GPU and compiles it. You’ll probably want a simple macro-substitution system involving regex, so you can pass in constants from the CPU side. (Note that GLSL has a preprocessor system in place similar to that of C++, so don’t reimplement it, and utilize it with constant injection as appropriate.) You’ll additionally want thorough logging so you can see what’s wrong with your non-compiling shaders. Execute this well, as you’ll be relying on it heavily.
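For the constant injection part, one simple approach is to lean on GLSL’s own preprocessor: generate a block of #define lines and splice it in right after the #version directive (which has to stay first in the source). The sketch below does just that. ShaderSource and injectDefines are hypothetical names, and the regex assumes the #version directive sits on its own line.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ShaderSource {
    // Matches a whole "#version ..." line, including its trailing newline if present.
    private static final Pattern VERSION_LINE =
            Pattern.compile("^[ \\t]*#version[^\\n]*\\n?", Pattern.MULTILINE);

    // Returns the source with one "#define NAME VALUE" line per map entry,
    // placed after the #version line, or at the very top if there is none.
    static String injectDefines(String source, Map<String, String> defines) {
        StringBuilder block = new StringBuilder();
        for (Map.Entry<String, String> define : defines.entrySet()) {
            block.append("#define ")
                 .append(define.getKey())
                 .append(' ')
                 .append(define.getValue())
                 .append('\n');
        }
        Matcher matcher = VERSION_LINE.matcher(source);
        if (!matcher.find()) {
            return block + source;
        }
        String head = source.substring(0, matcher.end());
        if (!head.endsWith("\n")) {
            head += "\n"; // the #version line was the last line of the file
        }
        return head + block + source.substring(matcher.end());
    }
}
```

Note that every injected line shifts the compiler’s reported line numbers down by one, so your logging will want to know how many defines were added.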
Next, implement the Program class, which accepts a set of shaders and links them on the GPU. (There are different kinds of shaders, some of which you’ll need defined for every program, so plan accordingly.) Again, log the linking process thoroughly. Now most importantly, your Program object needs getters and setters for its shader-dependent “uniforms” and “uniform blocks”. Read up on these. Ideally, your class will cache submitted uniforms and uniform block indices on the CPU side, so that A) redundant data submissions can be avoided, and B) you can resubmit the uniforms after recompiling the program and all of its shaders at run-time. Speaking of which…
Now you should add a routine that iterates through all of your programs and shaders at run-time and recompiles them if a source change is detected! Or maybe just recompiles one if you don’t mind providing a program name every time you call it. Assign this routine to some key combination or console command in your game engine and this will help you tremendously in terms of iteration time… so do it.
Finally, once you’re done with all of this, polish polish polish! Improve your shader compilation logging if you have time. E.g., you can add accurate line number output to your shader compilation output (necessary if you’re manipulating the source code before compiling), which can be difficult to implement but well worthwhile. Also, it can be invaluable to use regex to detect an expression like “layout(location=X)”, manually applying the expression’s intent if it’s unsupported by the target GLSL version, then removing the targeted text if so. You can get pretty sophisticated with your shader compilation layer, but my advice would be not to overdo it: only add features on top of the language if they’re truly valuable.
- OpenGL Wiki: Programs and Shaders (Ignore the bits about pipeline objects, that’s for a later version of OpenGL than we’re targeting.)
- OpenGL Wiki: Uniforms in GLSL
- OpenGL Example: Brief but effective overview of uniform variables in GLSL and how to set them.
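The uniform caching described for the Program class could look roughly like the following. This is only a sketch: UniformSink is a made-up stand-in for the glUniform* calls, values are limited to float arrays, and uniform blocks are left out entirely.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the real glUniform* submission.
interface UniformSink {
    void submit(String name, float[] value);
}

final class UniformCache {
    private final Map<String, float[]> values = new LinkedHashMap<>();
    private final UniformSink sink;

    UniformCache(UniformSink sink) {
        this.sink = sink;
    }

    // Submits the value only if it differs from what was last sent.
    // Returns true if a submission actually happened.
    boolean set(String name, float... value) {
        float[] cached = values.get(name);
        if (cached != null && Arrays.equals(cached, value)) {
            return false;
        }
        values.put(name, value.clone()); // copy so later caller edits cannot corrupt the cache
        sink.submit(name, value);
        return true;
    }

    // Call after relinking the program at run-time: the new program object
    // starts with default uniform values, so everything is sent again.
    void resubmitAll() {
        for (Map.Entry<String, float[]> entry : values.entrySet()) {
            sink.submit(entry.getKey(), entry.getValue().clone());
        }
    }
}
```

Keying by name rather than by location is deliberate here, since uniform locations may change when a program is relinked.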
Frame Buffer Objects
(Note: you can skip this for the moment if you don’t need any render targets besides the default framebuffer… but eventually, you’re going to want to render-to-texture, at which point, this will be required! This is especially true if you plan on implementing a deferred shading path.)
Make a Frame Buffer Object (FBO) class, that encapsulates everything you’d need to know about, well, FBOs, and optionally, the default framebuffer (which is actually somewhat different). Add methods for binding textures and renderbuffers as color or depth objects, binding and unbinding, etc. Also, somewhere, you should have code that ensures that an FBO is never bound while any of its textures are bound, and logs or throws an exception if so. (A good time to check this is at both FBO and TextureObject binding time. Your suite of state-caching functions may need to get involved here.)
- OpenGL Wiki: Framebuffers
- OpenGL Wiki: Frame Buffer Objects (Yup, it’s a different but related concept.)
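The “never bind an FBO while its textures are bound” check can be sketched as a small bookkeeping object that your state-caching functions notify. FeedbackGuard and its method names are hypothetical, and this version throws where you might prefer to log.

```java
import java.util.HashSet;
import java.util.Set;

final class FeedbackGuard {
    private final Set<Integer> boundTextures = new HashSet<>();
    private Set<Integer> boundFboAttachments = new HashSet<>();

    // Call whenever a texture is bound for sampling.
    void onTextureBound(int textureId) {
        if (boundFboAttachments.contains(textureId)) {
            throw new IllegalStateException(
                    "texture " + textureId + " is attached to the currently bound FBO");
        }
        boundTextures.add(textureId);
    }

    void onTextureUnbound(int textureId) {
        boundTextures.remove(textureId);
    }

    // Call whenever an FBO is bound, passing the ids of its texture attachments.
    void onFboBound(Set<Integer> attachmentTextureIds) {
        for (int textureId : attachmentTextureIds) {
            if (boundTextures.contains(textureId)) {
                throw new IllegalStateException(
                        "FBO bound while its texture " + textureId + " is still bound for sampling");
            }
        }
        boundFboAttachments = new HashSet<>(attachmentTextureIds);
    }

    // The default framebuffer has no texture attachments to conflict with.
    void onDefaultFramebufferBound() {
        boundFboAttachments = new HashSet<>();
    }
}
```

This is stricter than what the GL itself requires, but it catches feedback loops early, which is the point.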
Resource Organization
Now that we have the basic data structures taken care of, we need to think about how we’re going to organize and manage them as resources. I’ll leave this open-ended, but for me, I did something similar to what the Android SDK does automatically, by declaring and/or loading my resources as public fields on mass declaration objects, which I then pass around to anyone needing resources. (Note: using an object is almost always better than doing things statically!) I could envision a way of doing this where, instead, individual Techniques (to be discussed) load and store references to the resources themselves, but this would probably complicate our implementation quite a bit.
Useful Abstractions
Let’s build some useful, highly reusable abstractions around our new GL data structure objects!
Vertex Accumulator
Design a class that accumulates vertex data in a CPU-side buffer, and has a method which accepts a VBO and submits the buffered vertex data to it. Ensure that it doesn’t care about VAO attributes or anything. (I.e., make it format independent.) Don’t get too fancy: just expose low-level functionality, like clearing the accumulated vertices.
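A bare-bones accumulator might look like this. It’s a sketch that only handles floats and grows by doubling. The view() method is my stand-in for “submit to a VBO”: it returns the bytes you would pass along to something like glBufferData.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class VertexAccumulator {
    private ByteBuffer buffer;

    VertexAccumulator(int initialCapacityBytes) {
        // Direct, native-order memory is what GL bindings generally expect.
        buffer = ByteBuffer.allocateDirect(initialCapacityBytes).order(ByteOrder.nativeOrder());
    }

    VertexAccumulator putFloat(float value) {
        ensureRoom(Float.BYTES);
        buffer.putFloat(value);
        return this;
    }

    int sizeBytes() {
        return buffer.position();
    }

    // Forgets the accumulated vertices but keeps the allocation for reuse.
    void clear() {
        buffer.clear();
    }

    // A view over exactly the accumulated bytes, ready to hand to a VBO upload.
    ByteBuffer view() {
        ByteBuffer copy = buffer.duplicate().order(buffer.order()); // duplicate() resets byte order
        copy.flip();
        return copy;
    }

    private void ensureRoom(int extraBytes) {
        if (buffer.remaining() >= extraBytes) {
            return;
        }
        int newCapacity = Math.max(buffer.capacity() * 2, buffer.position() + extraBytes);
        ByteBuffer bigger = ByteBuffer.allocateDirect(newCapacity).order(ByteOrder.nativeOrder());
        buffer.flip();
        bigger.put(buffer); // copy what has been accumulated so far
        buffer = bigger;
    }
}
```

Notice that nothing here knows what a position or a texture coordinate is. That knowledge stays with the layout.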
Techniques
If you know your stuff, you can always draw everything manually by binding and unbinding each of the necessary resources and calling the appropriate draw routines for your VAOs. However, this is pretty redundant and labor intensive, and there’s a much better approach to accomplishing the same thing. The solution is to design a Technique class that represents how something is drawn, and optionally, what gets drawn (though it might be helpful to separate that into a different object). The Technique should have setters of variables and possibly lambda expressions to assign the appropriate resources for the impending draw call. Clustering resources into Techniques is how almost all professional game engines do things, and there’s good reason for it, so let’s mimic them! To draw something, we need to gather a lot of information:
The How: What program are we using? What FBO are we rendering to? What state should be bound? (I.e., glEnable, glDepthFunc, etc.) What information should be constant across all uses of this Technique?
The What: What textures are we sampling? What uniforms and uniform block indices are we passing to the program? What VAOs are we drawing?
Now that we have all this information in a tidy package, we can basically draw anything in a highly legible manner, provided that we have the resources available. Wonderful!
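To give a feel for what such a package could look like in code, here’s a deliberately tiny sketch. RenderContext is a hypothetical interface standing in for the data structure objects from earlier, resources are referred to by name to keep things short, and uniforms and GL state are omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the real binding layer.
interface RenderContext {
    void bindFramebuffer(String fbo);
    void bindProgram(String program);
    void bindTexture(int unit, String texture);
    void draw(String vao);
}

final class Technique {
    // The How: fixed for every use of this Technique.
    private final String program;
    private final String fbo;
    // The What: assigned before each draw.
    private final List<String> textures = new ArrayList<>();
    private String vao;

    Technique(String program, String fbo) {
        this.program = program;
        this.fbo = fbo;
    }

    Technique texture(String name) {
        textures.add(name);
        return this;
    }

    Technique vao(String name) {
        this.vao = name;
        return this;
    }

    // Binds everything in a fixed order, then issues the draw call.
    void draw(RenderContext context) {
        if (vao == null) {
            throw new IllegalStateException("no VAO assigned");
        }
        context.bindFramebuffer(fbo);
        context.bindProgram(program);
        for (int unit = 0; unit < textures.size(); unit++) {
            context.bindTexture(unit, textures.get(unit));
        }
        context.draw(vao);
    }
}
```

A call site then reads something like new Technique("lit", "gbuffer").texture("albedo").vao("crate").draw(context), which is about as legible as it gets.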
Many game companies actually specify this information in an XML format or in an in-house language, but for now, a highly legible in-code solution is more than sufficient. Additionally, if you’re working in a relatively dynamic language, like Java, C#, or Python, this is likely a viable long-term solution. The crucial thing is that we’re able to quickly iterate on this data, without burning a lot of time re-compiling our code/information.
An in-code or otherwise existing-language-based solution is actually ideal from a maintenance standpoint, as we get all the advantages of static-type checking, IDE syntax checking and semantic analysis, etc, in addition to relatively little implementation effort. Otherwise, if we decided to write the language ourselves, it wouldn’t be nearly as ideal as existing solutions, in addition to being horribly bug-prone. I honestly think a lot of AAA companies shoot themselves in the foot by implementing their own language or using a purely data-driven format, so… let’s not repeat the mistakes of the past! Use a real programming or scripting language, dang it!
At a later point, if you’re feeling ambitious, you may decide to sort your Techniques in order to minimize state changes, in addition to other “meta-language” level manipulations. While there are some complicated algorithms to achieve minimization of state changes, they’re honestly not worth the effort, since the performance advantage they offer relative to naive sorting methods is negligible, and their maintenance overhead is high. Essentially, you want to perform a stable sort by each resource type in sequence, from least to most costly to swap (so that the most costly resource ends up as the primary sort key), then perform a topological sort based on the order of operations executed against the various render targets (FBOs) to ensure correctness. The order of resources, from most to least costly to swap, to my knowledge, is as follows:
Programs -> FBOs -> Texture Image Units -> Uniforms -> VAOs
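The naive sort is short enough to sketch. Chaining comparators from most to least costly gives the same result as stable-sorting by each key from least to most costly. DrawKey is a hypothetical record of the resource ids a Technique uses. Uniforms and the topological pass over render-target dependencies are left out of this sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class DrawKey {
    final String label;
    final int programId;
    final int fboId;
    final int textureId;
    final int vaoId;

    DrawKey(String label, int programId, int fboId, int textureId, int vaoId) {
        this.label = label;
        this.programId = programId;
        this.fboId = fboId;
        this.textureId = textureId;
        this.vaoId = vaoId;
    }
}

final class TechniqueSorter {
    // Most costly resource first, so draws sharing a program end up adjacent,
    // then draws sharing an FBO within that group, and so on.
    static final Comparator<DrawKey> BY_SWAP_COST =
            Comparator.comparingInt((DrawKey key) -> key.programId)
                      .thenComparingInt(key -> key.fboId)
                      .thenComparingInt(key -> key.textureId)
                      .thenComparingInt(key -> key.vaoId);

    // Returns a sorted copy. List.sort is stable, so draws with identical
    // keys keep their submission order.
    static List<DrawKey> sorted(List<DrawKey> draws) {
        List<DrawKey> copy = new ArrayList<>(draws);
        copy.sort(BY_SWAP_COST);
        return copy;
    }
}
```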
Techniques are an important part of the engine that will affect your everyday quality of life, so work hard on their implementation!
Epilogue
By this point, you’ll have implemented a pretty solid rendering engine, so congratulations to you! Now you can start thinking about how to generalize things, so you could swap graphics APIs between OpenGL and, say, one of the major console APIs. (There’s no strong reason to support DirectX if you’re already supporting OpenGL, though, in my opinion.) I’ll be honest, I haven’t actually tackled this problem myself as I haven’t needed to, OpenGL being quite ubiquitous, but it’s a good one to think about.
The most important thing to do now is to make use of your new engine! You should have been testing its functionality with some simple test cases so far, but now you can really start fleshing things out and experimenting with graphics algorithms that pique your interest. You’ll certainly discover some bugs and issues with your implementation, and I encourage you to take the time to address them and really polish your engine to perfection.
So, now that you’re ready to start expanding your engine, here are some somewhat optional, useful features to consider adding as you go.
Useful Additional Features
Text Rendering
One of the most ubiquitous features a good rendering layer needs is straightforward text rendering. Getting this nice and extensible with regard to various text rendering methods can be surprisingly hard, but you’ll be pleased if you manage to achieve this. Otherwise, just having some form of simple debug text rendering is invaluable for profiling and diagnosing issues at run-time (especially if you’re using a release build of C++ and can’t debug), so it’d be wise to go ahead and tackle this.
In my experience, there are two real use cases for text rendering: static text and dynamic text. Static text applies to text that rarely if ever changes, like text on the labels and buttons of UI elements. Meanwhile, dynamic text applies to text that changes almost every frame, like text presented on a debug screen. For static text, you could simply implement every separate text object as a VAO with one interleaved VBO which you both create and destroy with the object, whereas for dynamic text, you’ll likely want to clear a large buffer every frame and append every string’s vertices to the buffer with each iteration of the game loop. In Java or C#, you’ll definitely run into garbage collection issues with dynamic strings without some custom string formatting implementation, so that’s certainly something to consider. Anyhow, this roughly covers the low level details, but how do we convert a string of text into vertices to begin with?
Generally, what engines do is pre-allocate a font palette texture for every type and size of font, and produce the palette by using an external library like stb_truetype. You’ll probably want to allocate a set of character objects indexed by the character somehow, where each object knows attributes of its pre-rendered character (e.g. its dimensions, texture coordinates, x and y render offsets, and x and y kernings relative to the other characters). You may also want to support signed distance fields for text rendered in 3D and possibly projected onto surfaces (if you at some point implement deferred shading), which I’ll include a link for if you’re interested.
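Converting a string into vertices then boils down to walking a pen position along the baseline and emitting one textured quad per character. Here’s a sketch of that loop. The Glyph fields mirror the attributes listed above, but the names are mine, kerning is ignored, and characters missing from the palette are simply skipped.

```java
import java.util.Arrays;
import java.util.Map;

final class Glyph {
    final float width, height;    // quad size in pixels
    final float xOffset, yOffset; // offset from the pen position to the quad's first corner
    final float xAdvance;         // how far the pen moves after this glyph
    final float u0, v0, u1, v1;   // texture coordinates within the font palette

    Glyph(float width, float height, float xOffset, float yOffset, float xAdvance,
          float u0, float v0, float u1, float v1) {
        this.width = width;
        this.height = height;
        this.xOffset = xOffset;
        this.yOffset = yOffset;
        this.xAdvance = xAdvance;
        this.u0 = u0;
        this.v0 = v0;
        this.u1 = u1;
        this.v1 = v1;
    }
}

final class TextLayout {
    static final int FLOATS_PER_GLYPH = 4 * 4; // four corners, each (x, y, u, v)

    // Returns interleaved (x, y, u, v) corners, four per drawable character.
    static float[] buildQuads(String text, Map<Character, Glyph> font, float startX, float baselineY) {
        float[] out = new float[text.length() * FLOATS_PER_GLYPH];
        int count = 0;
        float penX = startX;
        for (int i = 0; i < text.length(); i++) {
            Glyph glyph = font.get(text.charAt(i));
            if (glyph == null) {
                continue; // not in the palette, skip it
            }
            float x0 = penX + glyph.xOffset;
            float y0 = baselineY + glyph.yOffset;
            float x1 = x0 + glyph.width;
            float y1 = y0 + glyph.height;
            float[] corners = {
                x0, y0, glyph.u0, glyph.v0,
                x1, y0, glyph.u1, glyph.v0,
                x1, y1, glyph.u1, glyph.v1,
                x0, y1, glyph.u0, glyph.v1,
            };
            System.arraycopy(corners, 0, out, count, corners.length);
            count += corners.length;
            penX += glyph.xAdvance;
        }
        return Arrays.copyOf(out, count); // trim the space reserved for skipped characters
    }
}
```

For dynamic text you’d want to write straight into your per-frame buffer instead of allocating arrays like this, for exactly the garbage collection reasons mentioned earlier.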
Another feature to consider supporting is markup commands for your text, so you can label a section of a string with some tags to change color, boldness, etc. (Though targeting different texture palettes would likely complicate your implementation substantially.)
Finally, it’s worth mentioning that there’s such a thing as ClearType font rendering, which is ultra-patented by Microsoft, but is responsible for how crisp text appears on modern display devices! I’m not sure what the legalities are, but if it’s possible to incorporate it, then you should! It may only be possible to achieve this by utilizing an existing font rendering engine with permission to use the algorithm… but I’m not certain. There’s an open-license variant available, but I’m not so sure that it’s comparable in quality. I’ll include links below.
- The amazing, complete stb library. Of particular interest are stb_truetype.h and stb_easy_font.h. (The latter is great to get some quick debug text rendering up and running!)
- BMFont. If you don’t want to include a library for font palette creation with your build, this popular application will generate a palette and meta-information for you. It might be quicker to get something up and running with this, but in the long term I don’t recommend it over something like stb_truetype.
- PixFont. A very simple $20 application that will convert bitmap fonts into TrueType fonts. Definitely worth the price!
- Signed Distance Fields Paper from Valve, by Chris Green. This rendering method is now ubiquitous across the games industry for 3D, in-world text.
- ClearType on Wikipedia. Great description of the algorithm and why it works.
- Subpixel Rendering on Wikipedia. Describes a history of the patent wars and some free alternatives to ClearType.
GPU Debugging Info
One great extension to support in your application is KHR_debug, for a number of reasons. It’ll make the use of external applications (like Nvidia PerfKit) more useful by allowing you to assign names to your objects on the GPU. Additionally, it will allow you to log important events on the GPU in a very extensible and elegant fashion. Incorporating it isn’t hard, so get going! Remember to check whether the extension is supported before issuing any commands.
Another useful piece of OpenGL functionality is the Timer object (a timer query, which is core as of OpenGL 3.3 and exposed before that through the ARB_timer_query extension). These are an absolute must if you want to quickly and easily profile the performance of the various Techniques in your application! Just record the timings and spit out the result to a debug screen, simple as that.
- KHR_debug
- OpenGL Wiki: Query Objects (Includes information on timer objects!)
The Last Epilogue (I Swear!)
That’s it! I hope you’ve enjoyed this guide and found it to be useful. Stay tuned for more guides, references, and knowledge dumps…