Feedback Particles Sample

Category: Performance Visuals

Min PC GPU: Fermi-based (GTX 4xx)

Min Tegra Device: Tegra K1

Description

The Feedback Particles sample shows how ordinary vertex shaders can animate particles and write the results back into vertex buffer objects via Transform Feedback, for use in subsequent frames. This is another way of implementing GPU-only particle animation. The sample also uses geometry shaders to generate custom particles from single points and to discard expired ones.

APIs Used

GL_EXT_transform_feedback
glBindTransformFeedback
glDrawTransformFeedback
GL_EXT_geometry_shader4

Shared User Interface

The OpenGL samples all share a common app framework and certain user interface elements, centered around the "Tweakbar" panel on the left side of the screen, which lets you interactively control certain variables in each sample.

To show and hide the Tweakbar, simply click or touch the triangular button positioned in the top-left of the view.

Other controls are listed below:

Device	Input	Result
touch	1-Finger Drag	Orbit-rotate the camera
	2-Finger Drag	Move up/down/left/right
	2-Finger Pinch	Scale the view
mouse	Left-Button Drag	Orbit-rotate the camera
	Right-Button Drag	Move up/down/left/right
	Middle-Click Drag	Scale the view (up:out, down:in)
keyboard	Escape	Quit the application
	Tab	Toggle Tweakbar visibility
gamepad	Start	Toggle Tweakbar visibility
	Right Thumbstick	Orbit-rotate the camera
	Left Thumbstick	Move forward/backward. Slide left/right
	Left/Right Triggers	Move up/down

Technical Details

Naive Implementation

The first idea that comes to mind for a GPU particle system built on transform feedback is to write a geometry shader that handles the logic and lifecycle of the entire particle system, or even of several particle systems. With a single GPU program, a single draw call can emit, move, and delete particles. The shader logic for this approach could be as simple as the following:

Single Pass (input from the previous frame transform feedback)

Process TTL (Time to Live): if expired -- exit the shader;
Read particle type;
If it is an emitter:
- If it is time to emit - emit new particles to the output; Reset the time to emit counter;
- Process emitter data and push it to the output;
If it is an ordinary particle:
- Process particle data and push it to the output.
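
The single-pass logic above can be mirrored on the CPU to make the control flow concrete. The following is a minimal sketch, not the sample's shader: the `Particle` layout, the `simulateOne` and `simulateFrame` helpers, and the burst parameters are all hypothetical names introduced for illustration. Each call to `simulateOne` stands in for one geometry shader invocation, and the output vector stands in for the transform feedback buffer.

```cpp
#include <cassert>
#include <vector>

// Hypothetical particle layout; the sample's real vertex format may differ.
enum class ParticleType { Emitter, Ordinary };

struct Particle {
    ParticleType type;
    float ttl;         // remaining time to live, in seconds
    float timeToEmit;  // emitters only: countdown until the next burst
    float pos[3];
    float vel[3];
};

// CPU stand-in for one geometry shader invocation: reads one input
// particle and appends zero or more particles to `out`.
void simulateOne(const Particle& in, float dt, float emitPeriod,
                 int burstSize, float childTtl, std::vector<Particle>& out)
{
    assert(dt >= 0.0f && burstSize >= 0);

    Particle p = in;
    p.ttl -= dt;
    if (p.ttl <= 0.0f)
        return;  // expired: write nothing, which deletes the particle

    if (p.type == ParticleType::Emitter) {
        p.timeToEmit -= dt;
        if (p.timeToEmit <= 0.0f) {
            // Time to emit: spawn a burst at the emitter position.
            for (int i = 0; i < burstSize; ++i) {
                Particle child{ParticleType::Ordinary, childTtl, 0.0f,
                               {p.pos[0], p.pos[1], p.pos[2]},
                               {0.0f, 1.0f + 0.1f * i, 0.0f}};
                out.push_back(child);
            }
            p.timeToEmit = emitPeriod;  // reset the time-to-emit counter
        }
        out.push_back(p);  // the emitter itself survives to the next frame
    } else {
        for (int k = 0; k < 3; ++k)
            p.pos[k] += p.vel[k] * dt;  // simple Euler step
        out.push_back(p);
    }
}

// One frame: every particle from the previous frame is processed once.
std::vector<Particle> simulateFrame(const std::vector<Particle>& previous,
                                    float dt, float emitPeriod,
                                    int burstSize, float childTtl)
{
    std::vector<Particle> current;
    for (const Particle& p : previous)
        simulateOne(p, dt, emitPeriod, burstSize, childTtl, current);
    return current;
}
```

Note that every invocation must be able to output a whole burst plus the emitter, even though most inputs are ordinary particles that output at most one element.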

Running a simulation is straightforward:

    // Turn rendering OFF
    glEnable(GL_RASTERIZER_DISCARD_EXT);

    glUseProgram(m_simulationProgram);
    {
        // Output particles to the Current feedback object
        glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, m_Current);
        glBeginTransformFeedback(GL_POINTS);

        // If this is not the first frame, run the GPU pass with input from feedback object Previous;
        // on the first frame, run the GPU pass with input from the emitter VBO
        if (!m_isFirstHit)
            glDrawTransformFeedback(GL_POINTS, m_Previous);
        else
        {
            glDrawArrays(GL_POINTS, 0, m_emitterCount);
            m_isFirstHit = false;
        }

        glEndTransformFeedback();
    }

    // Turn rendering ON
    glDisable(GL_RASTERIZER_DISCARD_EXT);

    ...

    // Render particles from feedback object Current
    glDrawTransformFeedback(GL_POINTS, m_Current);

    // Swap feedback object IDs
    swap(m_Current, m_Previous);

However, although this may sound appealing and could be an interesting programming challenge, it does not perform well. The output of the geometry shader can be placed into fast on-chip memory, which is usually quite limited in size. Because the GPU runs threads in parallel and each thread needs room for everything it might output, this limitation may reduce the number of geometry shader threads that can run simultaneously and so reduce the overall performance of the particle system.
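
A rough back-of-the-envelope model illustrates the effect. The sketch below assumes that each geometry shader invocation reserves on-chip space for its declared maximum output, and that a fixed budget is shared by all resident invocations. The function name `maxResidentInvocations` and every number used with it are hypothetical, and real hardware allocates memory in more complicated ways.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Upper bound on geometry shader invocations that can be resident at once,
// given an on-chip output budget. All inputs are illustrative assumptions,
// not figures for any particular GPU.
std::size_t maxResidentInvocations(std::size_t onChipBytes,
                                   std::size_t maxOutputVertices,
                                   std::size_t bytesPerVertex,
                                   std::size_t hardwareThreadLimit)
{
    assert(maxOutputVertices > 0 && bytesPerVertex > 0);

    // Space one invocation must reserve for its worst-case output.
    const std::size_t perInvocation = maxOutputVertices * bytesPerVertex;

    // Either the thread limit or the memory budget is the bottleneck.
    return std::min(hardwareThreadLimit, onChipBytes / perInvocation);
}
```

With a made-up 48 KiB budget, 32-byte particles, and a limit of 1024 threads, a shader that may output 65 vertices (a burst of 64 plus the emitter) gives `maxResidentInvocations(49152, 65, 32, 1024) == 23`, while a shader that outputs at most one vertex reaches the full 1024.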

An Optimized Approach

The basic idea is to work around the possible GPU under-utilization caused by the limited fast memory space. To do this, the particle system GPU program from the naive approach has to be split into two passes: a particle emission pass and a particle processing and deletion pass. Both passes stream data out to the same transform feedback buffer, which is later used during rendering and, on the next frame, as input for the second pass.

However, there is an obstacle: the GPU program cannot be changed while transform feedback is active. To work around it, we have to introduce a third pass that copies the generated particles. The particle emission pass is still subject to the on-chip memory limitation, but it is issued only on emitters, which are usually few in number. The second-pass GPU program works on one particle at a time and is free from this limitation, unless the particle system uses a very large particle structure. It is also best to keep emitters on the CPU side, where they are easier to control. The simplest shader logic for this approach would be:

First pass (input from CPU memory):

[GS] If it is time to emit - emit new particles to the output;

Second pass (input from the previous PASS transform feedback):

[VS] Process particle;
[GS] If TTL is not expired, push particle to the output;

Third pass (input from the previous FRAME transform feedback):

[VS] Process particle;
[GS] If TTL is not expired, push particle to the output;
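
The three passes above can be sketched on the CPU as follows. This is an illustrative model only: the `Emitter` and `Particle` layouts and the `emitPass`, `processPass`, and `stepFrame` helpers are hypothetical, and where exactly the emit countdown is updated is an assumption. The point it shows is that the second and third passes run the same processing code on two different inputs and append to the same output.

```cpp
#include <cassert>
#include <vector>

// Hypothetical layouts; the sample's real vertex formats may differ.
struct Particle {
    float ttl;  // remaining time to live, in seconds
    float pos[3];
    float vel[3];
};

struct Emitter {  // kept on the CPU side
    float pos[3];
    float timeToEmit;
    float period;
    int   burstSize;
    float childTtl;
};

// First pass: only emitters are processed, so only this pass needs
// room for a whole burst of output per invocation.
std::vector<Particle> emitPass(std::vector<Emitter>& emitters, float dt)
{
    std::vector<Particle> emitted;
    for (Emitter& e : emitters) {
        e.timeToEmit -= dt;
        if (e.timeToEmit > 0.0f)
            continue;  // not time to emit yet
        e.timeToEmit = e.period;
        for (int i = 0; i < e.burstSize; ++i)
            emitted.push_back(Particle{e.childTtl,
                                       {e.pos[0], e.pos[1], e.pos[2]},
                                       {0.0f, 1.0f, 0.0f}});
    }
    return emitted;
}

// Second and third passes: one particle in, at most one particle out.
void processPass(const std::vector<Particle>& in, float dt,
                 std::vector<Particle>& out)
{
    assert(dt >= 0.0f);
    for (Particle p : in) {
        // [VS] Process particle
        p.ttl -= dt;
        for (int k = 0; k < 3; ++k)
            p.pos[k] += p.vel[k] * dt;
        // [GS] If TTL is not expired, push particle to the output
        if (p.ttl > 0.0f)
            out.push_back(p);
    }
}

// One frame: emit, then process new and old particles into one buffer.
std::vector<Particle> stepFrame(std::vector<Emitter>& emitters,
                                const std::vector<Particle>& previous,
                                float dt)
{
    std::vector<Particle> current;
    processPass(emitPass(emitters, dt), dt, current);  // second pass
    processPass(previous, dt, current);                // third pass
    return current;
}
```

Calling `stepFrame` once per frame and passing its result back in as `previous` corresponds to swapping the Current and Previous feedback objects.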

So, pulling it all together, the new algorithm would be:

    // Turn rendering OFF
    glEnable(GL_RASTERIZER_DISCARD_EXT);

    // Run the first GPU pass
    glUseProgram(m_emitProgram);
    {
        // Emit particles to the EmitterFeedback transform feedback object
        glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, m_EmitterFeedback);
        glBeginTransformFeedback(GL_POINTS);
        glDrawArrays(GL_POINTS, 0, m_EmitterCount);
        glEndTransformFeedback();
    }

    // Run the second and third GPU passes
    glUseProgram(m_processProgram);
    {
        // Output particles to Current
        glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, m_Current);
        // Begin capture; must be paired with glEndTransformFeedback below
        glBeginTransformFeedback(GL_POINTS);

        // Run the second GPU pass with input from feedback object EmitterFeedback
        glDrawTransformFeedback(GL_POINTS, m_EmitterFeedback);

        // If this is not the first frame, run the third GPU pass with
        // input from feedback object Previous
        if (!m_isFirstHit)
            glDrawTransformFeedback(GL_POINTS, m_Previous);

        glEndTransformFeedback();
        m_isFirstHit = false;
    }

    // Turn rendering ON
    glDisable(GL_RASTERIZER_DISCARD_EXT);

    ...

    // Render particles from feedback object Current
    glDrawTransformFeedback(GL_POINTS, m_Current);

    // Swap feedback object IDs
    swap(m_Current, m_Previous);