Friday 29 August 2014

Rendering Post: Linearizing your depth buffer and simple fog.

Right, so enough about tools for now; let's do some rendering work. I hadn't really devoted much time to the renderer for this project, since I've been focusing on the back end and tools. But tools can be boring to talk about and not terribly useful to the reader, so I added a couple of features to the engine. Let's get started.

Linearizing Your Depth Buffer
An important fundamental step for a lot of graphics techniques is being able to read the depth of the current pixel from your depth buffer. Well... no shit, but you need to be able to do it properly. When sampling from your depth buffer in a shader, you get a value back in the range [0, 1], and you might assume that 0 maps to the near plane distance and 1 to the far plane distance... but nope, the mapping is non-linear, and we need to understand why. What we need to understand is the transformation that the z component of a vertex undergoes, and how and why it's interpolated across the surface of the triangle the way it is.

Now here is where it gets potentially confusing. When it comes to the handling of z values, there are two places where the quantity 1/z shows up, and they are not related. The first is in interpolating vertex attributes correctly during rasterization. The second is a consequence of the perspective divide operation.

Let's start with the first. (I'm going to assume that you're familiar with perspective, our perception of the world, and why we need projection to replicate it in computer graphics. If not, see here.) You see, when we go from the abstract three-dimensional description of a triangle to the two-dimensional triangle that gets interpolated across the screen, we suffer a loss of dimensionality that makes it hard to determine the correct values we need for rendering. We project the three-dimensional triangle onto a two-dimensional plane and then fill it in, but we need to be able to reconstruct the three-dimensional information from the interpolated two-dimensional information.

Here on the left is the standard similar triangles diagram that explains why you divide by z to shrink the x value the farther the point is from the camera. Easy stuff.


For this quick explanation let's assume that your near plane is set to 1.0, which means we can remove it from the equation. We have:

x' = x / z

which we can manipulate to get the equation

x = x' * z

which is great because it means that we can reconstruct the original view space x from the projected x' by multiplying by the original z value. So as we're interpolating x' during the rasterization process, if we can find z we can recalculate the original x value; we'd be able to recover a 3D attribute from a 2D interpolation. But this assumes that you have z handy, which we don't. We need to find something involving z that interpolates linearly across the screen. A linear equation is an equation of the form y = Ax + B.

Ignoring what A and B actually are for the moment, note that along our triangle in view space x is a linear function of z. Substituting x and z into that form we get:

x = Az + B

but x = x' * z, so

x' * z = Az + B

which, after some manipulation, gets us

z = B / (x' - A)

Which is hardly linear. But now for the magic trick: take the reciprocal of both sides.

(1 / z) = (x' - A) / B = (1 / B) * x' - (A / B)

Hey, that's linear!

So, we can interpolate z's reciprocal linearly in terms of x' (which we get by interpolating across the screen during rasterization). From there, we just take the interpolated value, take its reciprocal again, and we have z. And from that we can recalculate the interpolated x and y in 3D space. So we've overcome the dimension loss of 2D rasterization. The same trick is applied to every attribute associated with a vertex, such as texture coordinates.
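To make that concrete, here's a minimal GLSL-style sketch of the idea. The function and parameter names are purely illustrative (the GPU does all of this for you during rasterization): the rasterizer interpolates u/z and 1/z linearly in screen space, then divides per pixel to recover the perspective-correct attribute.

// Illustrative sketch only: perspective-correct interpolation of a single
// attribute u between two vertices with view-space depths z0 and z1.
// t is the linear screen-space interpolation factor in [0, 1].
float perspectiveCorrect(float u0, float z0, float u1, float z1, float t)
{
    // Interpolate the "pre-divided" quantities linearly in screen space.
    float uOverZ   = mix(u0 / z0, u1 / z1, t);
    float oneOverZ = mix(1.0 / z0, 1.0 / z1, t);

    // Divide to undo the 1/z and recover the correct attribute value.
    return uOverZ / oneOverZ;
}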

Now for the second place 1/z shows up: the odd value we get back when we read from the depth buffer after a perspective projection. We need to understand where it comes from and how to reverse it to recover the proper depth value. Look at the summary below:

We can see the journey of the vertex from model space, through view space, and into clip space, where clipping is performed (hence the name); after the perspective divide, the visible volume becomes a cube centered on the origin spanning [-1, 1] on each axis. As an aside, it's also here that the GPU will generate extra triangles if the clipping process requires them.

I use the stock-standard OpenGL transformation conventions: a right-handed coordinate system, with the visible part of view space lying along the negative z axis. Looking at the symmetric perspective projection matrix definition for OpenGL, we have the following:



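For reference, here's a small GLSL-style sketch of that matrix, assuming a symmetric frustum described by a vertical field of view, aspect ratio, and the near/far distances (the function name is just for illustration; in practice you'd build this on the CPU, but GLSL syntax keeps the notation consistent with the shaders below). Only the third row matters for this discussion.

// Symmetric OpenGL perspective projection. Only the third row (A and B)
// matters here: it produces A*z + B in clip z and -z in clip w.
mat4 makePerspective(float fovyRadians, float aspect, float n, float f)
{
    float t = 1.0 / tan(fovyRadians * 0.5);
    float A = -(f + n) / (f - n);
    float B = -2.0 * f * n / (f - n);
    // GLSL mat4 constructors take columns, so this reads transposed
    // compared to the usual row-major presentation.
    return mat4(
        t / aspect, 0.0, 0.0,  0.0,
        0.0,        t,   0.0,  0.0,
        0.0,        0.0, A,   -1.0,
        0.0,        0.0, B,    0.0);
}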
Renaming the items in the third row to A and B and multiplying a 4D homogeneous vector
(x, y, z, 1.0) by this matrix, our vector would be

(don't care, don't care, Az + B, -z)

Two things to note: Az + B is still a linear transform of z (after the upcoming divide by w, it's this value that ends up in the range [-1, 1], with -1 at the near plane and 1 at the far plane), but w has become -z. The negation is just because in our right-handed view space the visible portion lies along the negative z axis.
When the perspective w divide occurs we'll have a value of the form:

(1 / z) * C

Where C is -(Az + B). I'd also add that after the divide the pipeline sneakily remaps the result from the range [-1, 1] to [0, 1] (the depth-range transform), so the value that actually lands in the depth buffer is:

(C / z) * 0.5 + 0.5
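As a sanity check, here's a tiny GLSL-style sketch of that forward mapping, assuming the default glDepthRange of [0, 1] (the function name is illustrative). With n = 1 and f = 100, a point at zView = -50 already stores roughly 0.99, which is why a raw depth buffer visualization looks almost uniformly white.

// Forward mapping from view-space z to the value stored in the depth
// buffer, with A and B defined as above.
float storedDepth(float zView, float A, float B)
{
    float zNdc = (A * zView + B) / -zView;  // the perspective divide
    return zNdc * 0.5 + 0.5;                // the [-1, 1] to [0, 1] remap
}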

And then, depending on your depth buffer format, that value might be stored as a normalized integer or as a floating point value, but either way the conversion is hidden away by the API for you.

Using this information we can reconstruct the proper depth when we read from the depth buffer.
Firstly, we need to calculate some values and pass them into our shader as uniforms.
We'll calculate:

    float n = camera.GetNearClipPlaneDistance();
    float f = camera.GetFarClipPlaneDistance();
    // A and B are the third-row entries of the projection matrix above.
    float A = -(f + n) / (f - n);
    float B = (-2.0f * f * n) / (f - n);
    shader->SetUniform4f("ABnf", A, B, n, f);

And in the shader, we first reverse the offset, mapping the value from [0, 1] back to [-1, 1] so we're back in NDC space. Then we apply the inverse of the C / z equation: starting from zNdc = -(A + B / zView) and solving for zView gives zView = -B / (zNdc + A), which can be best understood by this diagram:


So there's our reverse formula that we use to go from normalized device coordinates back to view space and we're done! In GLSL this is:
#version 110

uniform sampler2D textureUnit0;

/**
 * Contains the components A, B, n, f in that order.
 */
uniform vec4 ABnf;

varying vec2 vTexCoord0;

void main()
{
    float A = ABnf.x;
    float B = ABnf.y;
    float n = ABnf.z;
    float f = ABnf.w;

    // Read the depth buffer value for this pixel and map it from [0, 1]
    // back to [-1, 1] (NDC).
    float z = texture2D(textureUnit0, vTexCoord0).x;
    z = (2.0 * z) - 1.0;

    // Transform back into view space.
    float zView = -B / (z + A);

    // Normalize to [0, 1] by dividing by the far plane distance.
    zView /= -f;

    gl_FragColor = vec4(zView, zView, zView, 1.0);
}

I divide by -f because, remember, in OpenGL's view space visible z values lie along the negative z axis, so zView comes out negative.
Here's a screenshot of what this outputs in a test level:



To sum up the findings:
1) The odd non-linear values you read from the depth buffer HAVE NOTHING TO DO WITH PERSPECTIVE CORRECT INTERPOLATION. They are caused entirely by the perspective divide that perspective projection requires.

2) There still needs to be an interpolation of 1/z somewhere in order to enable perspective correct interpolation of vertex attributes; we just don't have to care about it. Your vertex attributes go in on one end and come out perspective correct on the other; you don't need to divide by z again.

Bonus Feature: Simple exponential fog.
So what are the uses of linear depth? Well, as you'll see in future posts, it's really used all over the place, but for starters I just coded up a simple exponential fog function. All it really does is take the normalized linear distance (a value in [0, 1]) and square it, with some artist-configurable parameters of course, like a starting distance and a density multiplier. It's about as simple a shader as you can get, and it took about 30 minutes to integrate... ok, maybe a bit more, because I coded up the beginnings of the post-processing framework at the same time. Anyway, the point is that for such a simple shader the effect can be quite dramatic, as the screenshot further below shows.
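Here's a rough sketch of what such a fog pass might look like as a full-screen post-processing shader (GLSL 1.10), reusing the linear depth reconstruction from above. The uniform names (sceneTexture, depthTexture, fogColor, fogStart, fogDensity) and the exact falloff are illustrative assumptions, not the engine's actual parameters.

#version 110

uniform sampler2D sceneTexture;   // the rendered scene color
uniform sampler2D depthTexture;   // the depth buffer
uniform vec4 ABnf;                // A, B, n, f as before
uniform vec3 fogColor;
uniform float fogStart;           // normalized distance where fog begins
uniform float fogDensity;         // artist-tweakable multiplier

varying vec2 vTexCoord0;

void main()
{
    float A = ABnf.x;
    float B = ABnf.y;
    float f = ABnf.w;

    // Reconstruct the normalized linear depth in [0, 1], as derived above.
    float zBuf = texture2D(depthTexture, vTexCoord0).x * 2.0 - 1.0;
    float zView = -B / (zBuf + A);
    float dist = -zView / f;

    // Remap past the start distance, scale by the density, then square.
    float t = clamp((dist - fogStart) * fogDensity, 0.0, 1.0);
    float fog = t * t;

    vec3 scene = texture2D(sceneTexture, vTexCoord0).rgb;
    gl_FragColor = vec4(mix(scene, fogColor, fog), 1.0);
}
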


Useful resources:

http://www.songho.ca/opengl/gl_projectionmatrix.html
A beautifully comprehensive guide to the derivations of both orthographic and perspective projection matrices for OpenGL camera systems. Also where I grabbed the projection matrix image from.

http://chrishecker.com/Miscellaneous_Technical_Articles#Perspective_Texture_Mapping
All hail! The original and unsurpassed articles on perspective texture mapping. A little outdated now but if you're looking for the fundamentals of texture mapping and manage to survive them, they're still very informative. In fact, while you're at it, go read all of Chris Hecker's articles, he's up there with Michael Abrash in terms of ability to bring extremely technical topics down to a casual conversation level without dumbing it down.

http://www.amazon.com/Mathematics-Programming-Computer-Graphics-Edition/dp/1584502770
I actually have two copies of this book: one I used to keep at work and one at home. Chapters 1-3 are what you need if you want an introduction to matrices, vectors, affine transformations, projections, quaternions, and so on. It integrates nicely with OpenGL engine development, as all the conventions are based on that API.



