Cascaded Shadow Mapping

Category: Performance Visuals

Min PC GPU: GeForce 9xx

Min Tegra Device: Tegra X1

@ GitHub: Cascaded Shadow Mapping Source Code

Description

This sample implements the cascaded shadow mapping technique using Viewport Multicast and Fast Geometry Shader.

APIs Used

glViewportIndexedfv

Shared User Interface

The Graphics samples all share a common app framework and certain user interface elements, centered around the "Tweakbar" panel on the left side of the screen which lets you interactively control certain variables in each sample.

To show and hide the Tweakbar, simply click or touch the triangular button positioned in the top-left of the view.

Other controls are listed below.

Device	Input	Result
touch	1-Finger Drag	Orbit-rotate the camera
	2-Finger Drag	Move up/down/left/right
	2-Finger Pinch	Scale the view
mouse	Left-Button Drag	Orbit-rotate the camera
	Right-Button Drag	Move up/down/left/right
	Middle-Click Drag	Scale the view (up:out, down:in)
keyboard	Escape	Quit the application
	Tab	Toggle TweakBar visibility
gamepad	Start	Toggle TweakBar visibility
	Right ThumbStick	Orbit-rotate the camera
	Left ThumbStick	Move forward/backward, Slide left/right
	Left/Right Triggers	Move up/down
	A	Show TweakBar, Toggle Focused Item
	B	Close Focused UI, Hide TweakBar
	DPAD Up/Down	Move Focus to Prev/Next Item
	DPAD Left/Right	Decrease/Increase Focused Item

Technical Details

Overview

Cascaded shadow mapping attempts to work around the quality issues of traditional shadow mapping by splitting the camera frustum into multiple "cascades," which are sections along the Z axis from the camera's point of view. For each of these sub-frusta (segments) of the view frustum, we find the area it covers when projected into the light's point of view and render a shadow map for that segment. Every segment's shadow map has the same dimensions, so the segments nearer to the camera, which cover a smaller area, get a higher effective shadow resolution.
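As a rough illustration of the splitting step, the sketch below computes cascade boundaries along the camera's Z range. The source does not say how the sample chooses its split distances. The blend of uniform and logarithmic spacing used here is a commonly used scheme and is an assumption, as are the function name computeCascadeSplits and the lambda parameter.

```cpp
#include <cmath>
#include <vector>

// Returns numCascades + 1 distances from nearZ to farZ; cascade i spans
// [splits[i], splits[i + 1]]. Assumes nearZ > 0 and numCascades >= 1.
// lambda = 0 gives uniform spacing, lambda = 1 gives logarithmic spacing.
// (Hypothetical helper; the sample's actual split scheme is not documented here.)
std::vector<float> computeCascadeSplits(float nearZ, float farZ,
                                        int numCascades, float lambda) {
    std::vector<float> splits(numCascades + 1);
    for (int i = 0; i <= numCascades; ++i) {
        float t = static_cast<float>(i) / static_cast<float>(numCascades);
        float uniform = nearZ + (farZ - nearZ) * t;
        float logarithmic = nearZ * std::pow(farZ / nearZ, t);
        splits[i] = lambda * logarithmic + (1.0f - lambda) * uniform;
    }
    return splits;
}
```

Logarithmic spacing keeps the near cascades short, which is where the equal-sized shadow maps pay off most. Uniform spacing spreads the depth range evenly. A lambda between the two is a tuning choice.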

Options

With pre-Maxwell capabilities, we can generate all of the shadow maps in a single rendering pass by using a different viewport to define the boundaries of each segment and using a geometry shader to emit each primitive once per enabled viewport. With the new Viewport Multicast feature, we can send a primitive to multiple viewports directly from the vertex shader, bypassing the geometry shader entirely. This simplifies the code and gives a noticeable performance increase.

It can still make sense to keep the geometry shader in order to cull, so that primitives are not sent to viewports they will not affect. The VS-only method is faster than the normal geometry shader method, even when that method performs culling. However, once Viewport Multicast is added to the geometry shader with culling, the geometry shader approach pulls ahead in performance.
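The culling decision amounts to building a per-primitive bitmask with one bit per viewport. The sketch below shows that logic on the CPU for clarity. In the sample this work happens in the geometry shader, and the exact test it uses is not described in the source. The bounding-box overlap test, the Vec2 and Rect types, and the function name are all assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types: a 2D point and an axis-aligned rectangle, both in the
// light's projected space.
struct Vec2 { float x, y; };
struct Rect { float minX, minY, maxX, maxY; };

// Returns a mask in which bit i is set when the triangle's bounding box
// overlaps cascade i. The test is conservative: it can keep a triangle that
// does not actually touch the cascade, but it never drops one that does.
std::uint32_t viewportMaskForTriangle(const Vec2 (&tri)[3],
                                      const std::vector<Rect>& cascades) {
    float minX = std::min({tri[0].x, tri[1].x, tri[2].x});
    float maxX = std::max({tri[0].x, tri[1].x, tri[2].x});
    float minY = std::min({tri[0].y, tri[1].y, tri[2].y});
    float maxY = std::max({tri[0].y, tri[1].y, tri[2].y});

    std::uint32_t mask = 0;
    for (std::size_t i = 0; i < cascades.size() && i < 32; ++i) {
        const Rect& c = cascades[i];
        bool overlaps = minX <= c.maxX && maxX >= c.minX &&
                        minY <= c.maxY && maxY >= c.minY;
        if (overlaps) {
            mask |= (std::uint32_t{1} << i);
        }
    }
    return mask;
}
```

A mask of zero means the primitive affects no cascade and can be dropped. Without culling, every primitive would simply get a mask with all enabled viewport bits set.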

Since the viewport mask used by Viewport Multicast is intended to be a per-primitive property, we can improve performance even more by using the new Fast Geometry Shader feature (also known as Passthrough Geometry Shader). It is intended for use cases where the geometry shader stage only sets per-primitive attributes and does not change the primitive topology. With this feature, vertex attributes are copied automatically from GS input to output, and we no longer need to call EmitVertex() and EndPrimitive(). This method gave the best performance in our own testing.

Performance Comparison

The relative performance of these shadowing methods depends on scene complexity and the number of cascades. With our sample test scene, we saw measurable differences in performance with as few as two cascades, and the gains grew noticeably as the number of cascades increased.

Overall, these are the methods ranked in order of relative performance (fastest to slowest):

1. Fast Geometry Shader, Viewport Multicast, Culling
2. Normal Geometry Shader, Viewport Multicast, Culling
3. No Geometry Shader (Vertex Shader Only), Viewport Multicast, No Culling
4. Normal Geometry Shader, No Viewport Multicast, Culling
5. Normal Geometry Shader, No Viewport Multicast, No Culling

Note that the three fastest methods here require the new OpenGL features available with the Maxwell architecture (Viewport Multicast, Fast Geometry Shader).