Monday 9 September 2013

Project Post: Tyro
Here's a project that I worked on for a while back in 2011. If you're wondering why it's called something strange like Tyro, it's because a tyro is a beginner or novice, so the name fit nicely. It's a deferred renderer written in OpenGL 2.0 with GLSL. It also had an almost-complete, super-basic rigid body physics library on a branch that I started working on as well... but things got in the way and I never finished it. I was reluctant to put up an incomplete project, but there's plenty of cool stuff in there and hopefully someone can still learn from it or use something.

Still, the cool stuff is the renderer. So I'll post some screenshots, write a bit about it and then post the source code for you all to browse through if you're interested. 

Some Cool Things It Had
The Deferred Renderer
The project initially started as a deferred renderer experiment. After watching the Killzone 2 tech demos from 2005 to 2009, I was thoroughly enamored with the concept. For those of you who don't know what deferred rendering is, here's the short version.

Traditionally, in game engines up to that time, lighting was performed on geometry as the geometry itself was rendered. So (this will be a VERY basic representation), something similar to this would occur:


Render Function
Initialize renderer, do pre-draw stuff;

Clear back buffer(s);

Set back buffer mode to accumulate;

foreach light do
    find geometry that intersects light volume;
    render geometry lit with that light into back buffer;
end;

Swap back buffer;

You can see that if an object in the world was lit by multiple lights, you would have to re-render the geometry every time, for each light. You could do some things to mitigate this, like batching multiple lights into a single run of the shader for the object... but the core problem still holds: there was a dependency between performing the lighting calculation and transforming and rendering the actual geometry itself. So, as your light count increased, your draw call count and your triangle count increased as well. Enter deferred shading.

Deferred shading is called deferred shading because it defers shading calculations until after the geometry itself has been drawn. The core concept is that you write the 'properties' of a scene to various offscreen buffers (also called Geometry Buffers or collectively, the G-Buffer) during what is called a material pass. After this pass is done you perform shading passes using the properties stored in the buffers. This shading could be anything you want it to be but it has predominantly been associated with the lighting calculation(s).
G-Buffer visualization. View-space normal (TL). Texture albedo (TR). Depth buffer (BL). Specular reflectivity (BR).

Oh, and before the terminology becomes too weird: a renderer that uses deferred shading tends to be called a deferred renderer, not a deferred shader. Yeah, I know.

This approach is fundamentally cool because it clearly and cleanly separates lighting calculations from geometry and material calculations. They become two distinct parts of the pipeline. This helps not only performance but architectural cleanliness as well.
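
To make that separation concrete, here's a very rough sketch of what a deferred frame looks like structurally. The helper functions and types are made-up stand-ins for illustration, not Tyro's actual code:

// Very rough sketch of the two-pass structure described above. The helper
// functions and Scene/Camera types are invented stand-ins, not Tyro's API.
void renderFrame(const Scene& scene, const Camera& cam)
{
    // 1. Material pass: write per-pixel properties (view-space normal, albedo,
    //    depth, specular) into the G-Buffer. Geometry is transformed once.
    bindGBuffer();
    for (size_t i = 0; i < scene.meshCount(); ++i)
        writeMaterialProperties(scene.mesh(i), cam);

    // 2. Shading pass: no scene geometry is re-drawn. Each light rasterizes a
    //    cheap screen-space volume that samples the G-Buffer and accumulates
    //    its contribution into the back buffer.
    bindBackBuffer();
    for (size_t i = 0; i < scene.lightCount(); ++i)
        accumulateLight(scene.light(i), cam);

    swapBuffers();
}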

I'm going to digress here and add that I think this is one of the core technological approaches that really saved Sony's bacon this generation. The PlayStation 3 had really poor geometry throughput compared to its contemporary, the Xbox 360. Traditionally architected engines pretty much without fail performed much better on Microsoft's console, with its more PC-like CPU, unified memory, and (for the time) powerful, unified-shader-based GPU. Deferred shading helped level the graphical playing field because it removes a large portion of geometry transformations from the pipeline. Not only that, but the fact that you had all these material properties written to memory buffers meant that they were conveniently in a format that the PS3's SPUs could operate on efficiently. This meant that the wily developer could alleviate the pressure on the weak GPU by moving entire portions of the graphics pipeline over to the satellite processors. Examples of this were titles like Split Second and Battlefield 3, which performed the entirety of their lighting calculations on SPUs, and titles like Killzone 2 and Uncharted 2, which performed the majority (or all) of their image post-processing on the SPUs. Later titles used the G-Buffer to perform post-process anti-aliasing, removing that from the geometry pipeline as well. So we can see the benefits that deferred shading brings to the table.

It's not all great, though. The very same characteristics that make deferred shading a good thing introduce some problems too. For one, the memory footprint and bandwidth required for the G-Buffer can get quite large, and if you want sub-pixel AA it gets even scarier.
Transparencies become a problem too: you generally end up needing a fallback forward renderer to draw the transparent elements of the scene.
Introducing new material properties means introducing new offscreen buffers. And so on.

A thorough introduction and/or overview of Deferred Shading is way beyond the scope of this post. I'd advise any interested readers to check out the myriad sources available. Some useful links are:

http://www.cg.tuwien.ac.at/courses/Seminar/WS2010/deferred_shading.pdf

http://www.cs.cmu.edu/afs/cs/academic/class/15869-f11/www/lectures/12_deferred_shading.pdf

http://www.guerrilla-games.com/publications/dr_kz2_rsx_dev07.pdf

http://www.dennisfx.com/wp-content/uploads/2013/02/Report_DRendering_Toufexis_D.pdf

http://developer.amd.com/wordpress/media/2012/10/D3DTutorial_DeferredShading.pdf

http://www.slideshare.net/guerrillagames/the-rendering-technology-of-killzone-2

Normal mapping
Tyro also supported tangent-space normal mapping. I wasn't then, and am still not, aware of any tools that generate a tangent space for your meshes (links, anyone??), so I ended up writing my own tangent-space mesh parser. This turns out to be a very tricky thing to get right, and that part of Tyro's code base is something I don't really want to look at ever again. In addition, it really does require an artist's touch to go through and make sure that you don't average away facet detail when you shouldn't, etc... the things an algorithm tends not to get perfectly right for every kind of mesh.

Also, it's very annoying when you have to wait for the .exe to parse a mesh every single time you run the damn thing. Lesson learned: the importance of a good back-end that hands your front-end pre-parsed meshes in a format it can read quickly...
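
For the curious, the per-triangle step of that parser boils down to something like this. It's a minimal sketch assuming simple Vec2/Vec3 maths types, not Tyro's actual code, and the genuinely painful part (averaging and splitting tangents across shared vertices) is left out:

// Minimal sketch of the per-triangle tangent computation, assuming Vec2/Vec3
// types with the usual operators. Not Tyro's actual code; averaging/splitting
// tangents across shared vertices (the hard part) is omitted.
Vec3 triangleTangent(const Vec3& p0, const Vec3& p1, const Vec3& p2,
                     const Vec2& uv0, const Vec2& uv1, const Vec2& uv2)
{
    Vec3 e1 = p1 - p0;                  // triangle edges in model space
    Vec3 e2 = p2 - p0;
    Vec2 dUV1 = uv1 - uv0;              // corresponding edges in UV space
    Vec2 dUV2 = uv2 - uv0;

    // Solve for the model-space direction along which U increases.
    float det = dUV1.x * dUV2.y - dUV2.x * dUV1.y;
    float r = (det != 0.0f) ? 1.0f / det : 0.0f;
    return (e1 * dUV2.y - e2 * dUV1.y) * r;   // un-normalized tangent
}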

Here are some links for tangent space normal mapping:


http://crytek.com/download/Triangle_mesh_tangent_space_calculation.pdf

http://www.ozone3d.net/tutorials/bump_mapping.php

Bloom
I added a bloom effect to the result when the light values reached a certain threshold and above. This is accomplished by taking the lighting result buffer, examining the values and, if they're above your threshold, copying them over to another buffer: call it the bright-pass buffer. Then you downsample that buffer and blur it a few times using a separable filter or something. To make it a little more effective, I think I had the hardware generate some mipmaps from the bright-pass buffer and then blurred those as well to get a wider kernel effect.
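
In spirit, the per-pixel bright-pass test is just this. It's written as plain C++ here for clarity; in the project it lives in a fragment shader, and the threshold is just whatever value you decide counts as "bright":

// The per-pixel bright-pass described above, written as plain C++ for clarity.
// In the project this runs in a fragment shader over the lighting buffer.
struct Color { float r, g, b; };

Color brightPass(const Color& lit, float threshold)
{
    // Approximate luminance of the lit pixel.
    float luma = 0.2126f * lit.r + 0.7152f * lit.g + 0.0722f * lit.b;
    if (luma <= threshold)
    {
        Color black = { 0.0f, 0.0f, 0.0f };   // not bright enough: contributes no bloom
        return black;
    }
    // Keep only the energy above the threshold.
    float scale = (luma - threshold) / luma;
    Color out = { lit.r * scale, lit.g * scale, lit.b * scale };
    return out;
}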

I got the idea from here:
http://kalogirou.net/2006/05/20/how-to-do-good-bloom-for-hdr-rendering/

The standard lighting output.

Bright-pass result: pixels above a certain brightness threshold selected.

Bright-pass buffer downsampled and blurred repeatedly.

Final composite result. Add the blurred bright-pass buffer to the final image.

Some Things It Should Have Had, But Didn't

Anti-Aliasing

Running off a DirectX 9-level setup, you can't get any kind of sub-pixel anti-aliasing out of a deferred renderer, I believe. So it would have had to be a post-process method, something along the lines of MLAA or FXAA.

Gamma-Correction

A no-brainer. I wrote this before I had even heard the term. But gamma correction is an important topic all on its own and vital for any proper lighting/rendering solution.
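
For reference, the missing step itself is tiny; the work is in making sure every input and output goes through it consistently. A minimal sketch, using the common ~2.2 approximation (not something Tyro actually implemented):

// Minimal sketch of gamma correction using the common ~2.2 approximation.
// Not something Tyro actually implemented: inputs (textures) get decoded to
// linear space, all lighting maths happens in linear space, and only the
// final value written to the screen gets re-encoded.
#include <cmath>

float srgbToLinear(float c) { return std::pow(c, 2.2f); }        // decode inputs
float linearToSrgb(float c) { return std::pow(c, 1.0f / 2.2f); } // encode final output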

Shadows

I was planning on this, but ran out of time to finish it before other things got in the way. Again, shadows are a huge topic all on their own.

In Conclusion

I think I'll leave it there. But I'll leave a couple more screenshots of it in action below, and I'll post the whole project online so that you can all download it and try it out.
The project was written in Code::Blocks, runs on Windows only, and needs the DirectX SDK set up to compile properly :)

I used the SOIL library for loading PNGs into OpenGL textures:
http://www.lonesock.net/soil.html

Here is the URL to download the zipped project:
https://docs.google.com/file/d/0B7HWlmfutdmxY3B4WDJvNko5UXc/edit?usp=sharing

Extract the folder to your C drive and open up the Code::Blocks workspace. It should compile and run from there :)

As a disclaimer, I take no responsibility for how you all use the code.
Also, if there are resources (textures, models, libraries) that I've used that you own the rights to, and you want me to remove them and/or give you due credit, by all means let me know and I'll sort it out.


Monday 29 April 2013

Indie Game...Of Thrones

Three others from Celestial and I entered the Indie Speed Run late last year. We managed to hack out a pretty cool project by the end of it. Our starting keywords were "afterlife" and "throne", so we decided to make a crazy MMO hack-and-slash isometric pen-art-style game. Or, as it will be known in the future, a CMMOHSIPASG. *Cough*

Here's the link to it on the indiedb site: http://www.indiedb.com/games/throne

The team was:
- Alison McAlinden, who did all the awesome pen art and animations for the characters, as well as the story.
- Travis Bulford, who did the network programming, UI programming, maze generation code and the AI.
- Cobus Saunderson, who did the user interface graphics (more work than it sounds like!)

My contribution was the isometric engine that powered the display, the particle system engine as well as the particle system scripts (no, the quality was higher than just "programmer art", thank you), and the animation engine code and tie-ins for the characters.

It was an awesome experience, albeit a brutal one that I'm not too eager to try again until my degree is finished!

I'm going to talk a little bit about the engine, seeing as that's the component I spent the most time with.
The isometric component of the engine runs on top of the j4game framework (which handles all of the nasty setup tasks), but it's actually almost a completely separate entity. Aside from the framework providing the graphics context and input, the two are fairly detached from one another. It's better to think of it as a plug-in for the j4game framework itself.

The engine had to be able to render the huge maps in the game, and had to have sprites that could shear (trees blowing in the wind!) and enter a transparent state when an object of importance moved behind them (the player, for example). To that end, it was a quadtree-based design with a tricky sprite sorting and merging algorithm in the core loop that would dynamically check which sprites fell in front of priority sprites and do the appropriate blending, etc.

The particle system represented particles in true three-dimensional space (in fact, all coordinates in the engine were true three dimensions: the engine would cull the quadtree nodes, transform visible sprites into camera space, and then sort and draw), so particles had x, y, and z position components, plus velocity and acceleration which would get integrated each frame. The particle engine was flexible and allowed custom code to be scripted in. There was no scripting language, though; the "scripting" was done by deriving from the vanilla particle system, overriding an update method and then writing whatever behavior you needed. The particles themselves could also be derived from and customized, so it all worked out to be quick and flexible given the time constraints.
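
To give a feel for how that worked, here's a rough sketch of the derive-and-override pattern. The real engine sits on top of j4game and none of these names are the actual API; it's just the shape of the idea, written in C++ to match the rest of this post:

// Rough sketch of the derive-and-override "scripting" pattern described above.
// These class and member names are invented for illustration, not the actual
// engine/j4game API.
#include <cstddef>
#include <vector>

struct Particle
{
    float x, y, z;       // true 3D position
    float vx, vy, vz;    // velocity
    float ax, ay, az;    // acceleration
    float life;          // remaining lifetime in seconds
};

class ParticleSystem
{
public:
    virtual ~ParticleSystem() {}

    // Vanilla behavior: integrate acceleration -> velocity -> position.
    virtual void update(float dt)
    {
        for (size_t i = 0; i < particles.size(); ++i)
        {
            Particle& p = particles[i];
            p.vx += p.ax * dt;  p.vy += p.ay * dt;  p.vz += p.az * dt;
            p.x  += p.vx * dt;  p.y  += p.vy * dt;  p.z  += p.vz * dt;
            p.life -= dt;
        }
    }

protected:
    std::vector<Particle> particles;
};

// A "script": derive, override update, and add whatever behavior you need.
class EmberSystem : public ParticleSystem
{
public:
    virtual void update(float dt)
    {
        ParticleSystem::update(dt);              // keep the vanilla integration
        for (size_t i = 0; i < particles.size(); ++i)
            particles[i].ay += 0.5f * dt;        // custom: embers drift upward over time
    }
};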

The animation system let you set animations, store animations, loop animations, and generate engine-wide events during certain frames: cast a fireball or start a particle system on frame n, for example.