Feedback Particles Sample

Category: Performance Visuals

Min PC GPU: GeForce Fermi-class

Min Tegra Device: Tegra K1

@ GitHub: Feedback Particles Sample Source Code

Description

The Feedback Particles sample shows how ordinary vertex shaders can animate particles and write the results back into vertex buffer objects via Transform Feedback, for use in subsequent frames. This is another way to implement GPU-only particle animation. The sample also uses geometry shaders to expand single points into custom particles and to discard dead ones.

APIs Used

GL_EXT_transform_feedback
glBindTransformFeedback
glDrawTransformFeedback
GL_EXT_geometry_shader4

Shared User Interface

The Graphics samples all share a common app framework and certain user interface elements, centered around the "TweakBar" panel on the left side of the screen, which lets you interactively control certain variables in each sample.

To show or hide the TweakBar, click or touch the triangular button in the top-left corner of the view.

Other controls are listed below.

Device	Input	Result
touch	1-Finger Drag	Orbit-rotate the camera
	2-Finger Drag	Move up/down/left/right
	2-Finger Pinch	Scale the view
mouse	Left-Button Drag	Orbit-rotate the camera
	Right-Button Drag	Move up/down/left/right
	Middle-Click Drag	Scale the view (up:out, down:in)
keyboard	Escape	Quit the application
	Tab	Toggle TweakBar visibility
gamepad	Start	Toggle TweakBar visibility
	Right ThumbStick	Orbit-rotate the camera
	Left ThumbStick	Move forward/backward, Slide left/right
	Left/Right Triggers	Move up/down
	A	Show TweakBar, Toggle Focused Item
	B	Close Focused UI, Hide TweakBar
	DPAD Up/Down	Move Focus to Prev/Next Item
	DPAD Left/Right	Decrease/Increase Focused Item

Technical Details

Naive Implementation

The first idea that comes to mind for a GPU particle system built on transform feedback is a geometry shader that handles the logic and lifecycle of an entire particle system, or even of several particle systems. With a single GPU program, a single draw call can emit, move, and delete particles. The shader logic for this approach could be as simple as the following:

Single Pass (input from the previous frame transform feedback)

Process TTL (Time to Live): if expired -- exit the shader;
Read particle type;
If it is an emitter:
- If it is time to emit - emit new particles to the output; Reset the time to emit counter;
- Process emitter data and push it to the output;
If it is an ordinary particle:
- Process particle data and push it to the output.
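The single-pass logic above can be mirrored on the CPU to make the control flow concrete. The following is only a sketch under stated assumptions: the `Particle` layout, the `Kind` enum, and the `emitPeriod` and `childTtl` parameters are hypothetical and are not taken from the sample's source. Each call to `simulateOne` plays the role of one geometry shader invocation, and the `out` vector stands in for the transform feedback stream.

```cpp
#include <vector>

// Hypothetical particle record; the sample's real vertex layout may differ.
enum class Kind { Emitter, Ordinary };

struct Particle {
    Kind  kind;
    float ttl;        // remaining time to live, in seconds
    float emitTimer;  // emitters only: time until the next emission
    float pos[3];
    float vel[3];
};

// One "geometry shader invocation": reads one input particle and appends
// zero or more particles to the output stream.
void simulateOne(const Particle& in, float dt, float emitPeriod,
                 float childTtl, std::vector<Particle>& out)
{
    Particle p = in;
    p.ttl -= dt;
    if (p.ttl <= 0.0f)
        return;                          // expired: write nothing

    if (p.kind == Kind::Emitter) {
        p.emitTimer -= dt;
        if (p.emitTimer <= 0.0f) {
            // Time to emit: spawn one child at the emitter position.
            Particle child{Kind::Ordinary, childTtl, 0.0f,
                           {p.pos[0], p.pos[1], p.pos[2]},
                           {0.0f, 1.0f, 0.0f}};
            out.push_back(child);
            p.emitTimer = emitPeriod;    // reset the time-to-emit counter
        }
        out.push_back(p);                // the emitter itself is kept
    } else {
        for (int i = 0; i < 3; ++i)
            p.pos[i] += p.vel[i] * dt;   // simple Euler integration
        out.push_back(p);
    }
}

// One frame: every particle from the previous frame is processed once.
std::vector<Particle> simulateFrame(const std::vector<Particle>& prev,
                                    float dt, float emitPeriod, float childTtl)
{
    std::vector<Particle> cur;
    for (const Particle& p : prev)
        simulateOne(p, dt, emitPeriod, childTtl, cur);
    return cur;
}
```

Note that emitters and ordinary particles share one record type and one output stream here, which is what lets a single program handle the whole lifecycle.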

Running a simulation is straightforward:

//Turn rendering OFF
glEnable(GL_RASTERIZER_DISCARD_EXT);

glUseProgram(m_simulationProgram);
{
    //Output particles to Current feedback object
    glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, m_Current);
    glBeginTransformFeedback(GL_POINTS);

    //If this is not the first frame, run the GPU pass with input from feedback object Previous
    //On the first frame, run the GPU pass with input from the emitter VBO
    if (!m_isFirstHit)
        glDrawTransformFeedback(GL_POINTS, m_Previous);
    else
    {
        glDrawArrays(GL_POINTS, 0, m_emitterCount);
        m_isFirstHit = false;
    }

    glEndTransformFeedback();
}

//Turn rendering ON
glDisable(GL_RASTERIZER_DISCARD_EXT);

...

//Render particles from feedback object Current
glDrawTransformFeedback(GL_POINTS, m_Current);

//Swap feedback objects IDs
swap(m_Current, m_Previous);

Although this sounds appealing and makes an interesting programming challenge, the approach does not perform well. The output of the geometry shader can be placed in fast on-chip memory, which is usually quite limited in size. Because the GPU runs threads in parallel, and each geometry shader thread needs room for its own output, this limit may reduce the number of geometry shader threads running simultaneously and therefore the overall performance of the particle system.
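A rough back-of-the-envelope calculation illustrates the effect. The helper below is a sketch only: it assumes the hardware reserves output space for the worst case a shader declares, and the budget and particle size used afterwards are made-up figures, not measurements of any particular GPU.

```cpp
#include <cstddef>

// Upper bound on geometry shader invocations that can be resident at once
// for a fixed on-chip output budget. All inputs are illustrative figures;
// maxOutputVertices and bytesPerVertex must be non-zero.
constexpr std::size_t maxResidentInvocations(std::size_t onChipBytes,
                                             std::size_t maxOutputVertices,
                                             std::size_t bytesPerVertex)
{
    return onChipBytes / (maxOutputVertices * bytesPerVertex);
}
```

With a hypothetical 48 KiB budget and a 32-byte particle, a shader that may output up to 64 particles per invocation leaves room for `maxResidentInvocations(48 * 1024, 64, 32)`, which is 24 invocations, whereas a shader that outputs at most one particle leaves room for 1536.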

An Optimized Approach

The basic idea is to work around the possible GPU under-utilization caused by the limited fast memory. To do this, the particle system GPU program from the naive approach is split into two passes: a particle emission pass, and a particle processing and deletion pass. Both passes stream data out to the same transform feedback buffer, which is later used for rendering and, on the next frame, as input for the second pass.

However, there is an obstacle: the GPU program can't be changed while transform feedback is active. To work around it, we have to introduce a third pass that copies the newly generated particles. The particle emission pass is still subject to the on-chip memory limitation, but it is issued only on emitters, which are usually few in number. The second-pass GPU program works on one particle at a time and is free from this limitation, unless the particle system uses a very large particle structure. It is also best to keep emitters on the CPU side, where they are easier to control. The simplest shader logic for this approach would be:

First pass (input from CPU memory):

[GS] If it is time to emit - emit new particles to the output;

Second pass (input from the previous PASS transform feedback):

[VS] Process particle;
[GS] If TTL is not expired, push particle to the output;

Third pass (input from the previous FRAME transform feedback):

[VS] Process particle;
[GS] If TTL is not expired, push particle to the output;
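The three passes above can be mirrored on the CPU to show how the data moves between buffers. This is a minimal sketch, not the sample's implementation: the `Emitter` and `Particle` layouts, the timer handling, and the `ParticleSystem` wrapper are all hypothetical. The vectors `emitted`, `current`, and `previous` stand in for the transform feedback objects.

```cpp
#include <utility>
#include <vector>

// Hypothetical records; the sample's real layouts may differ.
struct Emitter  { float pos[3]; float timer; float period; };
struct Particle { float pos[3]; float vel[3]; float ttl; };

// First pass [GS]: each emitter outputs a new particle when its timer fires.
// Emitters live on the CPU side, so their timers are advanced here as well.
void emitPass(std::vector<Emitter>& emitters, float dt, float ttl,
              std::vector<Particle>& emitted)
{
    emitted.clear();
    for (Emitter& e : emitters) {
        e.timer -= dt;
        if (e.timer > 0.0f)
            continue;                    // not yet time to emit
        e.timer += e.period;
        emitted.push_back(Particle{{e.pos[0], e.pos[1], e.pos[2]},
                                   {0.0f, 1.0f, 0.0f}, ttl});
    }
}

// Second and third passes share this logic and append to the same output:
// [VS] advance the particle, [GS] keep it only if its TTL has not expired.
void processPass(const std::vector<Particle>& in, float dt,
                 std::vector<Particle>& out)
{
    for (Particle p : in) {
        for (int i = 0; i < 3; ++i)
            p.pos[i] += p.vel[i] * dt;
        p.ttl -= dt;
        if (p.ttl > 0.0f)
            out.push_back(p);
    }
}

struct ParticleSystem {
    std::vector<Emitter>  emitters;
    std::vector<Particle> emitted, current, previous;

    void step(float dt, float ttl)
    {
        emitPass(emitters, dt, ttl, emitted);  // first pass
        current.clear();
        processPass(emitted, dt, current);     // second pass: new particles
        processPass(previous, dt, current);    // third pass: last frame's
        // `current` is what would be rendered at this point.
        std::swap(current, previous);          // ping-pong for the next frame
    }
};
```

After `step` returns, `previous` holds the particles that are alive in the frame just simulated. On the first frame `previous` is empty, so the third pass writes nothing.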

So, pulling it all together, the new algorithm would be:

//Turn rendering OFF;
glEnable(GL_RASTERIZER_DISCARD_EXT);

//Run first GPU pass;
glUseProgram(m_emitProgram);
{
    //emit particles to EmitterFeedback transform feedback object;
    glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, m_EmitterFeedback);
    glBeginTransformFeedback(GL_POINTS);
    glDrawArrays(GL_POINTS, 0, m_EmitterCount);
    glEndTransformFeedback();
}

//Run second and third GPU passes;
glUseProgram(m_processProgram);
{
    //Output particles to Current
    glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, m_Current);
    glBeginTransformFeedback(GL_POINTS);

    //Run second GPU pass with input from feedback object EmitterFeedback
    glDrawTransformFeedback(GL_POINTS, m_EmitterFeedback);

    //If not the first frame, then run third GPU pass with
    //the input from feedback object Previous
    if (!m_isFirstHit)
        glDrawTransformFeedback(GL_POINTS, m_Previous);

    glEndTransformFeedback();
    m_isFirstHit = false;
}

//Turn rendering ON;
glDisable(GL_RASTERIZER_DISCARD_EXT);

...

//Render particles from feedback object Current
glDrawTransformFeedback(GL_POINTS, m_Current);

//Swap feedback object IDs
swap(m_Current, m_Previous);