Feedback Particles Sample


Category: Performance, Visuals
Min PC GPU: Fermi-based (GTX 4xx)
Min Tegra Device: Tegra K1

Description

The Feedback Particles sample shows how ordinary vertex shaders can animate particles and write the results back into vertex buffer objects via Transform Feedback, for use in subsequent frames. This is another way to implement GPU-only particle animation. The sample also uses geometry shaders to generate custom particles from single points and to delete expired ones.

APIs Used

Shared User Interface

The OpenGL samples all share a common app framework and certain user interface elements, centered around the "Tweakbar" panel on the left side of the screen, which lets you interactively control certain variables in each sample.

To show and hide the Tweakbar, simply click or touch the triangular button positioned in the top-left of the view.

Other controls are listed below:

Device    Input                 Result
touch     1-Finger Drag         Orbit-rotate the camera
          2-Finger Drag         Move up/down/left/right
          2-Finger Pinch        Scale the view
mouse     Left-Button Drag      Orbit-rotate the camera
          Right-Button Drag     Move up/down/left/right
          Middle-Click Drag     Scale the view (up:out, down:in)
keyboard  Escape                Quit the application
          Tab                   Toggle Tweakbar visibility
gamepad   Start                 Toggle Tweakbar visibility
          Right Thumbstick      Orbit-rotate the camera
          Left Thumbstick       Move forward/backward, slide left/right
          Left/Right Triggers   Move up/down

Technical Details

Naive Implementation

The first idea that comes to mind for a GPU particle system based on transform feedback is a geometry shader that handles the logic and lifecycle of the entire particle system, or even of several particle systems. With a single GPU program, a single draw call can emit, move, and delete particles. The shader logic for this approach could be as simple as the following:

Single Pass (input from the previous frame transform feedback)

  1. Process TTL (Time to Live): if expired -- exit the shader;
  2. Read particle type;
  3. If it is an emitter:
    • If it is time to emit - emit new particles to the output; Reset the time to emit counter;
    • Process emitter data and push it to the output;
  4. If it is an ordinary particle:
    • Process particle data and push it to the output.
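
The steps above can be modeled on the CPU to make the control flow concrete. The following is only a sketch, not the sample's shader: the `Particle` layout, the constants `kEmitInterval` and `kNewParticleTtl`, and the function `simulateSinglePass` are all hypothetical names introduced for illustration, and a 1D position is used to keep it short. Each loop iteration stands in for one geometry shader invocation, and the returned vector stands in for the transform feedback output.

```cpp
#include <vector>

// Hypothetical particle layout; the sample's real vertex format is not shown here.
enum class ParticleType { Emitter, Ordinary };

struct Particle {
    ParticleType type;
    float position;   // 1D position keeps the sketch short
    float velocity;
    float ttl;        // remaining time to live, in seconds
    float emitTimer;  // emitters only: time left until the next emission
};

// Assumed tuning constants, chosen for illustration only.
constexpr float kEmitInterval = 0.5f;
constexpr float kNewParticleTtl = 2.0f;

// One simulation step: reads last frame's particles and returns this frame's.
std::vector<Particle> simulateSinglePass(const std::vector<Particle>& previous, float dt) {
    std::vector<Particle> current;
    for (Particle p : previous) {
        // Step 1: process TTL; an expired particle writes nothing to the output.
        p.ttl -= dt;
        if (p.ttl <= 0.0f) continue;

        if (p.type == ParticleType::Emitter) {
            // Step 3: emit when the timer runs out, then reset the timer.
            p.emitTimer -= dt;
            if (p.emitTimer <= 0.0f) {
                current.push_back(Particle{ParticleType::Ordinary, p.position, 1.0f,
                                           kNewParticleTtl, 0.0f});
                p.emitTimer = kEmitInterval;
            }
        } else {
            // Step 4: an ordinary particle just integrates its motion.
            p.position += p.velocity * dt;
        }
        // The surviving emitter or particle is pushed to the output.
        current.push_back(p);
    }
    return current;
}
```

Calling `simulateSinglePass` once per frame and feeding its result back in as the next frame's input mirrors the feedback loop between the two transform feedback objects.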

Running a simulation is straightforward:

    //Turn rendering OFF
    glEnable(GL_RASTERIZER_DISCARD_EXT);

    glUseProgram(m_simulationProgram);
    {
        //Output particles to Current feedback object
        glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, m_Current);
        glBeginTransformFeedback(GL_POINTS);

        //If not the first frame, run GPU pass with input from feedback object Previous
        //If the first frame, run GPU pass with input from emitter VBO
        if (!m_isFirstHit)
            glDrawTransformFeedback(GL_POINTS, m_Previous);
        else
        {
            glDrawArrays(GL_POINTS, 0, m_emitterCount);
            m_isFirstHit = false;
        }

        glEndTransformFeedback();
    }

    //Turn rendering ON
    glDisable(GL_RASTERIZER_DISCARD_EXT);

    ...

    //Render particles from feedback object Current
    glDrawTransformFeedback(GL_POINTS, m_Current);

    //Swap feedback object IDs
    swap(m_Current, m_Previous);

However, although this may sound attractive and makes for an interesting programming challenge, the approach does not perform well. The output of the geometry shader can be placed into fast on-chip memory, which is usually quite limited in size. Because the GPU runs threads in parallel, this limit can reduce the number of geometry shader threads running simultaneously and so reduce the overall performance of the particle system.
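
A rough back-of-the-envelope model shows why output size matters. The sketch below assumes, as a simplification, that each geometry shader invocation must reserve on-chip space for its worst-case output. The function `maxConcurrentInvocations` and every number passed to it are hypothetical; real limits depend on the GPU and the driver.

```cpp
#include <algorithm>
#include <cstddef>

// Simplified occupancy estimate. All hardware figures passed in are
// assumptions for illustration, not values taken from any real GPU.
std::size_t maxConcurrentInvocations(std::size_t onChipBytes,
                                     std::size_t bytesPerVertex,
                                     std::size_t maxOutputVertices,
                                     std::size_t hardwareThreadLimit) {
    // Worst-case output space assumed to be reserved per invocation.
    const std::size_t bytesPerInvocation = bytesPerVertex * maxOutputVertices;
    if (bytesPerInvocation == 0) return hardwareThreadLimit;
    // The smaller of the two limits decides how many invocations fit.
    return std::min(hardwareThreadLimit, onChipBytes / bytesPerInvocation);
}
```

With an assumed 48 KiB of on-chip space, a 32-byte particle, and a thread limit of 1024, `maxConcurrentInvocations(49152, 32, 64, 1024)` gives 24 for a shader that may emit up to 64 particles, while `maxConcurrentInvocations(49152, 32, 1, 1024)` gives the full 1024 for a shader that outputs at most one.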

An Optimized Approach

The basic idea is to work around the possible GPU under-utilization caused by the limited fast memory space. To do this, the particle system GPU program from the naive approach is split into two passes: a particle emission pass, and a particle processing and deletion pass. Both passes stream data out to the same transform feedback buffer, which is later used for rendering and, on the next frame, as input for the second pass.

However, there is an obstacle: the GPU program can't be changed during transform feedback. To work around it, we have to introduce a third pass that copies the generated particles. The particle emission pass is still subject to the memory limitation, but it is issued only on emitters, which are usually few in number. The second-pass GPU program works with only one particle at a time and is free from the described limitation, unless the particle system uses a very large particle structure. It is also best to keep emitters on the CPU side, where they are easier to control. The simplest shader logic for this approach would then be:

First pass (input from CPU memory):

Second pass (input from the previous PASS transform feedback):

Third pass (input from the previous FRAME transform feedback):
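
Taken together, the three passes can be modeled on the CPU as follows. This is a sketch under stated assumptions rather than the sample's shaders: the `Emitter` and `Particle` layouts, the emitted velocity and TTL, and the functions `emitPass`, `processPass`, and `simulateFrame` are hypothetical. It assumes the second and third passes run the same program and differ only in their input, and that emitter timers are updated on the CPU.

```cpp
#include <vector>

// Hypothetical data layouts; the sample's real structures are not shown here.
struct Emitter  { float position; float timer; float interval; };
struct Particle { float position; float velocity; float ttl; };

// First pass: runs once per CPU-side emitter and outputs only new particles.
std::vector<Particle> emitPass(std::vector<Emitter>& emitters, float dt) {
    std::vector<Particle> emitted;
    for (Emitter& e : emitters) {
        e.timer -= dt;
        if (e.timer <= 0.0f) {
            emitted.push_back(Particle{e.position, 1.0f, 2.0f}); // assumed velocity and TTL
            e.timer = e.interval;
        }
    }
    return emitted;
}

// Shared logic of the second and third passes: one particle in, at most one out.
void processPass(const std::vector<Particle>& input, std::vector<Particle>& output, float dt) {
    for (Particle p : input) {
        p.ttl -= dt;
        if (p.ttl <= 0.0f) continue;     // expired: nothing is written
        p.position += p.velocity * dt;
        output.push_back(p);             // appended to the shared output buffer
    }
}

// One frame: both process passes append to the same output, which becomes
// the "previous" input of the next frame.
std::vector<Particle> simulateFrame(std::vector<Emitter>& emitters,
                                    const std::vector<Particle>& previous, float dt) {
    std::vector<Particle> current;
    const std::vector<Particle> emitted = emitPass(emitters, dt);
    processPass(emitted, current, dt);   // second pass: input from the previous pass
    processPass(previous, current, dt);  // third pass: input from the previous frame
    return current;
}
```

A caller would keep one particle vector and replace it every frame, for example `particles = simulateFrame(emitters, particles, dt);`, which plays the role of swapping the two feedback objects.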

So, pulling it all together, the new algorithm would be:

    //Turn rendering OFF
    glEnable(GL_RASTERIZER_DISCARD_EXT);

    //Run first GPU pass
    glUseProgram(m_emitProgram);
    {
        //Emit particles to EmitterFeedback transform feedback object
        glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, m_EmitterFeedback);
        glBeginTransformFeedback(GL_POINTS);
        glDrawArrays(GL_POINTS, 0, m_EmitterCount);
        glEndTransformFeedback();
    }

    //Run second and third GPU passes
    glUseProgram(m_processProgram);
    {
        //Output particles to Current
        glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, m_Current);
        //Capture must be started before drawing, to match glEndTransformFeedback below
        glBeginTransformFeedback(GL_POINTS);

        //Run second GPU pass with input from feedback object EmitterFeedback
        glDrawTransformFeedback(GL_POINTS, m_EmitterFeedback);

        //If not the first frame, then run third GPU pass with
        //the input from feedback object Previous
        if (!m_isFirstHit)
            glDrawTransformFeedback(GL_POINTS, m_Previous);

        glEndTransformFeedback();
        m_isFirstHit = false;
    }

    //Turn rendering ON
    glDisable(GL_RASTERIZER_DISCARD_EXT);

    ...

    //Render particles from feedback object Current
    glDrawTransformFeedback(GL_POINTS, m_Current);

    //Swap feedback object IDs
    swap(m_Current, m_Previous);

See Also


NVIDIA® GameWorks™ Documentation Rev. 1.0.220830 ©2014-2022. NVIDIA Corporation and affiliates. All Rights Reserved.