The Sound Source Localization feature analyzes the audio packets and computes the direction of the dominant sound source relative to a reference axis. The direction is computed as an azimuth angle in radians, measured counter-clockwise from the reference axis. In this release, the feature supports only circular microphone arrays with at least 4 microphones.
This feature is available as the Sound Source Localization codelet.
Parameter | Description |
---|---|
audio_duration | The duration of audio, in seconds, used for computing sound source localization. This duration should be an integer multiple of the input audio duration. |
microphone_distance | The distance in meters between two diagonally opposite microphones that form one pair in the microphone_pairs list. For example, for a ReSpeaker 4-mic array, the microphone distance is 0.064 m. |
microphone_pairs | List of pairs of diagonally opposite microphone raw audio channel indices. The pairs should be listed with the microphones in counter-clockwise order. For example, microphone_pairs: [[1,3], [2,4]] corresponds to microphones 1-4 arranged in counter-clockwise direction. |
reference_offset_angle | The angle (in degrees) from the reference axis to the first microphone pair, measured in counter-clockwise direction. Only integer values are accepted. |
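The constraints on microphone_pairs and audio_duration described above can be checked before the values are written to the configuration. The following is a minimal Python sketch, not part of the codelet: the helper names `diagonal_pairs` and `is_integer_multiple` are hypothetical, and the sketch assumes an even number of microphones whose raw channels are numbered consecutively in counter-clockwise order, starting at `first_index`.

```python
def diagonal_pairs(num_microphones, first_index=1):
    """Build pairs of diagonally opposite channels for a circular array.

    Assumption: raw channels are numbered consecutively, in counter-clockwise
    order, starting at first_index.
    """
    if num_microphones < 4 or num_microphones % 2 != 0:
        raise ValueError("expected an even number of microphones, at least 4")
    half = num_microphones // 2
    # Channel i is paired with the channel half a circle away from it.
    return [[first_index + i, first_index + i + half] for i in range(half)]


def is_integer_multiple(audio_duration, packet_duration, tolerance=1e-9):
    """Check that audio_duration is a whole number of input packets."""
    ratio = audio_duration / packet_duration
    nearest = round(ratio)
    return nearest >= 1 and abs(ratio - nearest) < tolerance


print(diagonal_pairs(4))
print(is_integer_multiple(0.5, 0.1))
```

For a 4-microphone array with channels 1-4, `diagonal_pairs(4)` reproduces the pairing used in the example above, [[1, 3], [2, 4]].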
Message | Proto Type | Name |
---|---|---|
Input | AudioDataProto | audio_packets |
Output | SourceAngleState | audio_angle |
The computed azimuth angle (in radians) is converted to degrees. This converted angle in degrees is available in Sight as angle.
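As an illustration of the conversion, the short Python sketch below turns an azimuth in radians into degrees. The function name `azimuth_to_degrees` is hypothetical, and wrapping the result into the range [0, 360) is an assumption of this sketch; the range actually used by the codelet is not stated here.

```python
import math


def azimuth_to_degrees(angle_radians):
    """Convert an azimuth from radians to degrees.

    Assumption: the result is wrapped into [0, 360) for plotting.
    """
    return math.degrees(angle_radians) % 360.0


print(azimuth_to_degrees(math.pi / 2))
```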
The Sound Source Localization sample application demonstrates both the sound source localization and the audio energy calculation features. This application contains four codelets:
Audio Capture codelet: Captures multi-channel audio data from the microphone array.
Sound Source Localization codelet: Computes the angle in radians of the direction of the dominant sound source in every 100-millisecond audio packet. A dominant sound is one which spans the frequencies with higher energy when captured by the microphone array. This codelet receives audio packets from the Audio Capture codelet.
Audio Energy Calculation codelet: Computes the average energy of the raw audio channels in the audio packets captured from the microphone array. This codelet receives audio packets from the Audio Capture codelet.
Direction Of Audio Event codelet: A custom codelet which receives the azimuth angles of dominant sound sources from the Sound Source Localization codelet and the audio energies of the audio packets from the Audio Energy Calculation codelet. Both input messages are synchronized by timestamp. The azimuth angles are plotted in Sight only for audio packets whose energy is higher than the configured energy_threshold.
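The filtering performed by the Direction Of Audio Event codelet can be sketched in Python as follows. This is an illustration of the idea rather than the codelet's implementation: the function name `angles_above_threshold`, the representation of messages as (timestamp, value) tuples, and the matching of messages by exactly equal timestamps are all assumptions of this sketch.

```python
def angles_above_threshold(angle_messages, energy_messages, energy_threshold):
    """Keep the angles whose matching audio packet is louder than the threshold.

    angle_messages:  list of (timestamp, angle) tuples
    energy_messages: list of (timestamp, energy_db) tuples
    Assumption: two messages belong together when their timestamps are equal.
    """
    energy_by_timestamp = dict(energy_messages)
    selected = []
    for timestamp, angle in angle_messages:
        energy = energy_by_timestamp.get(timestamp)
        # Skip angles with no matching energy message or with low energy.
        if energy is not None and energy > energy_threshold:
            selected.append((timestamp, angle))
    return selected


angles = [(0, 10.0), (100, 45.0), (200, 90.0)]
energies = [(0, 70.0), (100, 85.0), (200, 80.0)]
print(angles_above_threshold(angles, energies, energy_threshold=80.0))
```

Only packets strictly louder than the threshold pass the filter in this sketch, so a packet at exactly 80 dB is dropped.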
This application requires a microphone array connected to the host/device and set as the default audio input device in the system settings. The capture volume of the microphone should be set to 100%. The specifications of the connected microphone array should be used to configure the audio capture component (num_channels and sample_rate), the sound source localization component (microphone_distance and microphone_pairs) and the audio energy calculation component (channel_indices and reference_energy).
The application is configured for a ReSpeaker 4-mic array v2.0 with microphone_distance as 0.064 meters and reference_energy as 120 dB. The reference_offset_angle is configured for the reference axis as shown in the image for the ReSpeaker 4-mic array.

The energy_threshold configuration parameter of the direction of audio event component defines the volume threshold, in dB, above which the angle of an audio packet is published. It is configured to 80 dB. The appropriate value for this parameter can be determined by observing the average_energy_per_audio_packet value in the Audio Energy Calculator window in Websight (as shown in the image below). The threshold can be set to the highest value this plot reaches when only ambient noise is present in the environment.
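The same rule can be applied to recorded values instead of reading them off the plot. The Python sketch below assumes the ambient-noise energies have already been collected into a list by some other means; the function name `threshold_from_ambient` and the sample values are hypothetical.

```python
def threshold_from_ambient(ambient_energies_db):
    """Return the highest energy (in dB) observed with only ambient noise."""
    if not ambient_energies_db:
        raise ValueError("at least one ambient energy reading is required")
    return max(ambient_energies_db)


# Hypothetical readings taken while no sound source of interest is active.
ambient_readings = [62.5, 71.0, 68.2]
print(threshold_from_ambient(ambient_readings))
```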

The application can be launched with the following command:
bob@desktop:~/isaac$ bazel run apps/samples/sound_source_localization
The application plots the angle in degrees and the average energy in dB for every audio packet published by the audio capture component, and the angle of direction of audio event, which is the angle in degrees for those audio packets whose energy is above the threshold. These plots are accessible in the Sight UI at http://localhost:3000 for the desktop or http://ROBOTIP:3000 for Jetson.
Platforms: Desktop, Jetson TX2, Jetson Xavier, Jetson Nano
Hardware: Requires a ReSpeaker 4 microphone array with the default configuration. Any circular microphone array with at least 4 channels can be used by updating the configuration accordingly.