KarBOOM Tech

Some notes I was collecting when I was working on a KarBOOM update many years ago. This project is on pause, as I've been filling my time as a consulting input specialist.

Clustered forward shading

With clustered forward shading, we can enjoy support for an arbitrary number of lights with cost roughly proportional to the screen space they cover, as with deferred shading. However, we also enjoy hardware antialiasing support (MSAA), and support for lights on transparent objects (with the same freedoms as on opaque objects), since it's actually forward shading.

The view volume is divided into 3D tiles, which are encoded in a 3D texture. Each texel of that texture encodes which lights, from an array passed to the shader, affect the corresponding tile. Here, tiles affected by 1 light are tinted red, 2 green, 3 blue, and 4 yellow.

The same tiles shown previously are here reduced to the ranges of their corresponding lights. There's some waste, because rectangular tiles cover more space than the lights that touch them. Waste can be reduced by decreasing the tile size, so there's a CPU-GPU trade-off between tile resolution and the waste caused by lights not aligning with tiles.
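As a rough illustration of the tile assignment described above, here is a minimal Python sketch. It is not KarBOOM's actual code: the function name, the normalized [0, 1] volume, and the representation of lights as spheres are all assumptions made for the example. A real implementation would project light bounds into view space and pack the result into the 3D texture.

```python
import math
from itertools import product

def assign_lights_to_tiles(lights, grid_dims):
    """Map each 3D tile index (ix, iy, iz) to the light indices touching it.

    lights: list of (cx, cy, cz, radius) in a normalized [0, 1] volume
            (an assumed representation, chosen for illustration).
    grid_dims: (nx, ny, nz) tile counts along each axis.
    """
    tiles = {}
    for light_index, (cx, cy, cz, radius) in enumerate(lights):
        ranges = []
        for center, count in zip((cx, cy, cz), grid_dims):
            # Conservative bounds: every tile overlapped by the sphere's
            # bounding box, clamped to the grid.
            lo = max(0, math.floor((center - radius) * count))
            hi = min(count - 1, math.floor((center + radius) * count))
            ranges.append(range(lo, hi + 1))
        for tile in product(*ranges):
            tiles.setdefault(tile, []).append(light_index)
    return tiles
```

Because each light is assigned by its bounding box, some tiles list a light that never reaches them, which is the waste mentioned above. Raising `grid_dims` tightens the fit at the cost of more tiles to fill.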

The fragments are lit with per-material shaders in a forward shading pipeline, with each fragment looping through the lights that might affect it. This way, one can have different BRDFs for different materials, which is not easily (nor cheaply) done with deferred shading.
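The per-fragment loop can be sketched along these lines. This is a CPU-side Python illustration under stated assumptions, not the shader itself: the tile lookup is a plain dictionary, positions live in a normalized [0, 1) volume, and the two BRDF functions (`lambert` and `wrapped`) and the quadratic falloff are placeholders picked for the example.

```python
import math

def tile_of(position, grid_dims):
    """Tile index containing a position in the normalized [0, 1) volume."""
    return tuple(min(n - 1, max(0, int(p * n)))
                 for p, n in zip(position, grid_dims))

def lambert(normal, to_light):
    return max(0.0, sum(n * l for n, l in zip(normal, to_light)))

def wrapped(normal, to_light):
    # A softer "wrap" diffuse term, standing in for a second material's BRDF.
    return max(0.0, (sum(n * l for n, l in zip(normal, to_light)) + 0.5) / 1.5)

def shade_fragment(position, normal, brdf, tiles, lights, grid_dims):
    """Sum the contribution of every light listed for the fragment's tile.

    lights: list of (position, radius, intensity) tuples (assumed layout).
    brdf: the material's own function, so each material can differ.
    """
    total = 0.0
    for light_index in tiles.get(tile_of(position, grid_dims), []):
        light_pos, radius, intensity = lights[light_index]
        offset = [l - p for l, p in zip(light_pos, position)]
        distance = math.sqrt(sum(o * o for o in offset))
        if distance >= radius or distance == 0.0:
            continue  # the tile was conservative; this light does not reach here
        to_light = [o / distance for o in offset]
        falloff = (1.0 - distance / radius) ** 2  # assumed falloff curve
        total += intensity * falloff * brdf(normal, to_light)
    return total
```

Swapping the `brdf` argument is all it takes to light two materials differently from the same tile data, which is the flexibility the forward pipeline offers here.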

Lights can be shaped with a little extra info for spotlight effects. Unlike deferred shading, where each on-screen pixel has one position and normal as far as lights are concerned, transparent elements can sample their tile even if it doesn't correspond to an opaque surface behind them, and be lit accordingly.

Toksvig filtering

Without Toksvig filtering

With Toksvig filtering

Mipmaps as they're normally generated don't play nice with normal maps. Normals get smoothed, and specular highlights exaggerate the change in normal from pixel to pixel, causing aliasing. Toksvig filtering addresses both of these problems by modifying the roughness map to counter the smoothing of normals, and to soften the transition from one pixel to another where aliasing might occur. Roughness is really just a way of describing sub-pixel normal variations. Since scaling down the texture when zooming out turns formerly super-pixel normal variations into sub-pixel ones, it makes sense that the roughness can be modified to account for them. That's what Toksvig filtering achieves, and it's performed entirely in the mipmap generation process, so it has no effect on runtime performance.
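For reference, the core of the technique fits in a few lines. The sketch below uses the original formulation in terms of a Blinn-Phong specular power: the shorter the averaged normal, the more the source normals disagreed, and the more the power is reduced. The function name is hypothetical, and the conversion from specular power to whatever roughness convention a given engine uses is left out, since that depends on the material model.

```python
import math

def toksvig_power(normals, specular_power):
    """Adjusted Blinn-Phong power for one mip texel built from `normals`.

    normals: unit-length (x, y, z) normals of the source texels being merged.
    specular_power: the unfiltered specular exponent (assumed > 0).
    """
    count = len(normals)
    avg = [sum(n[i] for n in normals) / count for i in range(3)]
    length = math.sqrt(sum(c * c for c in avg))
    # Toksvig factor: 1 when all normals agree, approaching 0 as they diverge.
    factor = length / (length + specular_power * (1.0 - length))
    return factor * specular_power
```

Running this once per texel while building each mip level is what keeps the cost out of the frame loop: the shader just reads the pre-adjusted value.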

Antialiasing

These examples have been blown up to make the aliasing more obvious. Note the edges of the orange car, and the line where the blue tiles meet the grey wall.

No Antialiasing

Temporal Antialiasing (2x effective AA)

Temporal antialiasing combines the previous frame, rendered at a sub-pixel offset, with the current frame, for effectively 2xAA. Other implementations continuously accumulate information across frames for a very high number of effective samples per pixel, but that's more prone to flickering between frames, and relies more heavily on complex filtering to avoid blending old fragments with new objects or lighting conditions. I settled on using only the second last frame and the most recent frame, which avoids the flickering and gives effectively two samples per pixel.
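A toy version of the two-frame scheme, reduced to a single scanline, might look like the following. Everything here is an assumption for illustration: the jitter offsets of a quarter pixel, the function names, and the hard-edged test scene. It also ignores motion, so there is no reprojection or rejection of stale history.

```python
JITTERS = (-0.25, 0.25)  # assumed alternating sub-pixel offsets, in pixels

def render_scanline(edge, width, jitter):
    """Point-sample a hard edge: 1.0 left of `edge`, 0.0 to the right."""
    return [1.0 if (x + 0.5 + jitter) < edge else 0.0 for x in range(width)]

def resolve(previous, current):
    """Average the two frames, giving two effective samples per pixel."""
    return [(p + c) / 2.0 for p, c in zip(previous, current)]

def taa_frame(edge, width, frame_index, previous):
    """Render one jittered frame and blend it with the raw previous frame.

    Returns (output, history). The history is the un-blended current frame,
    so nothing older than one frame ever reaches the screen.
    """
    current = render_scanline(edge, width, JITTERS[frame_index % 2])
    output = current if previous is None else resolve(previous, current)
    return output, current
```

Keeping the raw frame as history, rather than the blended output, is what separates this from the accumulating variants: each displayed pixel is built from exactly two samples.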

Temporal Antialiasing + 2xMSAA (4x effective AA)

Temporal Antialiasing + 8xMSAA (16x effective AA)

MSAA combines really well with TAA (temporal antialiasing) to cheaply increase the effective number of samples per pixel, although there's some extra softness introduced by MSAA + TAA samples spilling over the boundaries of each pixel.

Temporal Antialiasing + 2xSSAA (4x effective AA)

Temporal Antialiasing + 8xMSAA + 2xSSAA (32x effective AA)

Supersampling antialiasing (SSAA) simply involves rendering to a larger render target than the screen, and then downscaling to the screen size. KarBOOM currently supports up to 2xSSAA (width and height are each multiplied by √2, doubling the pixel count), which combines nicely with TAA and MSAA if the user has the horsepower to spare. KarBOOM can also render at half the number of pixels, but at the moment it isn't particularly smart about upscaling, so the result just looks blurry.
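The arithmetic behind the render target size and the "effective AA" figures quoted in the captions can be written out as a short Python sketch. The function names are hypothetical, and rounding to the nearest pixel is an assumption.

```python
import math

def render_target_size(width, height, pixel_factor):
    """Scale both axes by sqrt(pixel_factor) so the pixel count scales by it.

    pixel_factor=2 corresponds to 2xSSAA; pixel_factor=0.5 to half the pixels.
    """
    scale = math.sqrt(pixel_factor)
    return round(width * scale), round(height * scale)

def effective_samples(taa=True, msaa=1, ssaa=1):
    """Multiply the per-technique sample counts, as in the captions above."""
    return (2 if taa else 1) * msaa * ssaa
```

For example, `effective_samples(taa=True, msaa=8, ssaa=2)` gives the 32x figure of the heaviest configuration, and `render_target_size(1920, 1080, 2)` gives the target size for 2xSSAA on a 1080p screen.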