Performance¶
Measured performance of the Riva ASR, NLP, and TTS services is reported below for NVIDIA T4, V100 SXM2 16 GB, A30, and A100 SXM4 40 GB GPUs. CPU specifications for each system can be found here:
ASR¶
The latency numbers below were measured using the streaming recognition mode, with the
BERT-based punctuation model enabled, a 4-gram language model, a decoder beam width of
128, and timestamps enabled. The Jasper, QuartzNet, and Citrinet-1024 acoustic models were tested.
The client and the server used audio chunks of the same duration (100 ms, 160 ms, 800 ms,
1600 ms, or 3200 ms, depending on the server configuration). The Riva streaming client
riva_streaming_asr_client, provided in the Riva client image, was used with the
--simulate_realtime flag to simulate transcription from a microphone. Each stream performed
5 iterations over a sample audio file from the LibriSpeech dataset (1272-135031-0000.wav).
The command used was:
riva_streaming_asr_client \
--chunk_duration_ms=<chunk_duration> --simulate_realtime=true \
--automatic_punctuation=true --num_parallel_requests=<num_streams> \
--word_time_offsets=true --print_transcripts=false \
--interim_results=false --num_iterations=<5*num_streams> \
--audio_file=1272-135031-0000.wav --output_filename=/tmp/output.json
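The stream counts in the tables below imply that the command above was run once per configuration. As a minimal sketch, assuming Python is available on the client machine, the argument list for each run could be generated as follows. The helper name build_asr_benchmark_command and the list of stream counts are illustrative; the flags are exactly those shown in the command above, and the iteration count follows the `<5*num_streams>` placeholder.

```python
import shlex


def build_asr_benchmark_command(num_streams, chunk_duration_ms,
                                audio_file="1272-135031-0000.wav",
                                output_filename="/tmp/output.json",
                                iterations_per_stream=5):
    """Return the benchmark command from this section as an argument list.

    Hypothetical helper: only the flags shown in the documented command are
    used, and --num_iterations is set to iterations_per_stream * num_streams.
    """
    return [
        "riva_streaming_asr_client",
        f"--chunk_duration_ms={chunk_duration_ms}",
        "--simulate_realtime=true",
        "--automatic_punctuation=true",
        f"--num_parallel_requests={num_streams}",
        "--word_time_offsets=true",
        "--print_transcripts=false",
        "--interim_results=false",
        f"--num_iterations={iterations_per_stream * num_streams}",
        f"--audio_file={audio_file}",
        f"--output_filename={output_filename}",
    ]


# Print one shell-quoted command per stream count (illustrative sweep).
for num_streams in (1, 8, 16, 32, 48, 64, 96, 128):
    command = build_asr_benchmark_command(num_streams, chunk_duration_ms=160)
    print(shlex.join(command))
```

Returning a list rather than a string keeps the command safe to pass to subprocess.run without shell quoting issues; shlex.join is used only for display.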
Note
Each stream carries one audio channel. For example, a stereo audio file with two channels requires two streams.
After executing the benchmark task, riva_streaming_asr_client returns latency measured in three different ways:
intermediate latency: latency to return an intermediate transcript with is_final == false
final latency: latency of messages returned with is_final == true
latency: the overall latency of all returned message types
The overall latency numbers are reported below.
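The tables below summarize each run with the average latency and the p50, p90, p95, and p99 percentiles. The following sketch shows one way such a summary can be computed from raw per-message latencies. It uses the nearest-rank percentile definition, which is an assumption: this section does not state which percentile method the client uses. The function names are illustrative.

```python
import math
import statistics


def percentile(sorted_samples, pct):
    """Nearest-rank percentile of a non-empty list sorted in ascending order."""
    rank = max(1, math.ceil(pct * len(sorted_samples) / 100))
    return sorted_samples[rank - 1]


def summarize_latencies(latencies_ms):
    """Return avg/p50/p90/p95/p99 for a collection of latencies in milliseconds."""
    ordered = sorted(latencies_ms)
    if not ordered:
        raise ValueError("at least one latency sample is required")
    return {
        "avg": statistics.fmean(ordered),
        "p50": percentile(ordered, 50),
        "p90": percentile(ordered, 90),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
    }


# Synthetic samples 1..100 ms, used only to make the result easy to check.
print(summarize_latencies(range(1, 101)))
```

The gap between p50 and p99 is often more informative than the average: in several of the tables, p99 grows much faster than p50 as the number of streams increases.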
NVIDIA A100 GPU¶
Streaming, low-latency¶
Acoustic model | # of streams | Chunk size (ms) | Latency avg (ms) | Latency p50 (ms) | Latency p90 (ms) | Latency p95 (ms) | Latency p99 (ms) | Throughput (RTFX) |
---|---|---|---|---|---|---|---|---|
citrinet | 1 | 160 | 9.7577 | 9.5604 | 9.9649 | 11.466 | 14.906 | 0.99969 |
citrinet | 8 | 160 | 14.403 | 14.171 | 14.973 | 16.205 | 29.493 | 7.9947 |
citrinet | 16 | 160 | 26.812 | 26.518 | 29.656 | 30.356 | 59.582 | 15.979 |
citrinet | 32 | 160 | 41.707 | 41.789 | 43.589 | 45.316 | 98.952 | 31.923 |
citrinet | 48 | 160 | 56.107 | 55.825 | 59.398 | 60.751 | 139.32 | 47.837 |
citrinet | 64 | 160 | 59.71 | 58.399 | 66.993 | 69.52 | 161.63 | 63.734 |
citrinet | 96 | 160 | 73.294 | 74.294 | 85.567 | 91.818 | 229.74 | 95.508 |
citrinet | 128 | 160 | 91.074 | 90.655 | 102.74 | 107.51 | 292.04 | 127.08 |
jasper | 1 | 100 | 13.531 | 13.183 | 15.061 | 17.599 | 21.585 | 0.99955 |
jasper | 8 | 100 | 22.796 | 22.713 | 29.995 | 31.778 | 48.767 | 7.9914 |
jasper | 16 | 100 | 31.498 | 29.847 | 40.571 | 44.482 | 59.163 | 15.979 |
jasper | 32 | 100 | 41.884 | 41.578 | 50.799 | 54.911 | 79.555 | 31.942 |
jasper | 48 | 100 | 46.696 | 46.577 | 57.675 | 63.062 | 90.114 | 47.89 |
jasper | 64 | 100 | 54.044 | 54.195 | 66.216 | 71.833 | 112.54 | 63.83 |
jasper | 96 | 100 | 71.604 | 72.763 | 90.908 | 96.76 | 182.88 | 95.631 |
jasper | 128 | 100 | 98.472 | 93.921 | 120.42 | 132.16 | 385.83 | 127.43 |
quartznet | 1 | 100 | 9.1328 | 8.6998 | 10.633 | 11.756 | 17.986 | 0.99955 |
quartznet | 8 | 100 | 14.043 | 13.178 | 18.048 | 21.065 | 36.566 | 7.9927 |
quartznet | 16 | 100 | 17.884 | 17.083 | 23.236 | 26.338 | 45.201 | 15.981 |
quartznet | 32 | 100 | 25.94 | 25.778 | 32.443 | 36.645 | 65.529 | 31.946 |
quartznet | 48 | 100 | 34.368 | 34.593 | 42.599 | 46.881 | 92.013 | 47.879 |
quartznet | 64 | 100 | 41.66 | 41.664 | 50.617 | 54.864 | 114.01 | 63.808 |
quartznet | 96 | 100 | 51.074 | 49.639 | 64.404 | 71.843 | 170.55 | 95.633 |
quartznet | 128 | 100 | 55.536 | 53.139 | 74.938 | 84.083 | 186.51 | 127.46 |
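The Throughput (RTFX) column is not defined in this section. A common definition, assumed here, is the inverse real-time factor: seconds of audio processed per second of wall-clock time. Under that assumption, and because the client simulates real-time audio, a server that keeps up with N streams should report a value close to N, while a clearly lower value suggests saturation. The sketch below applies this reading to two rows from the low-latency tables in this document; the 0.98 threshold is an arbitrary illustration, not an NVIDIA recommendation.

```python
def rtfx(total_audio_seconds, wall_clock_seconds):
    """Inverse real-time factor: audio seconds processed per wall-clock second."""
    return total_audio_seconds / wall_clock_seconds


def stream_efficiency(measured_rtfx, num_streams):
    """Fraction of the ideal real-time throughput (one RTFX unit per stream)."""
    return measured_rtfx / num_streams


# (GPU, model, streams, reported RTFX) copied from the low-latency tables.
rows = [
    ("A100", "citrinet", 128, 127.08),
    ("T4", "citrinet", 96, 62.502),
]

KEEPING_UP_THRESHOLD = 0.98  # illustrative cut-off, chosen for this sketch

for gpu, model, streams, reported in rows:
    efficiency = stream_efficiency(reported, streams)
    status = "keeping up" if efficiency >= KEEPING_UP_THRESHOLD else "saturated"
    print(f"{gpu} {model} {streams} streams: efficiency {efficiency:.3f} ({status})")
```

Rows flagged as saturated by this check are also the rows where the reported latencies jump from tens or hundreds of milliseconds to several seconds.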
Streaming, high-throughput¶
Acoustic model | # of streams | Chunk size (ms) | Latency avg (ms) | Latency p50 (ms) | Latency p90 (ms) | Latency p95 (ms) | Latency p99 (ms) | Throughput (RTFX) |
---|---|---|---|---|---|---|---|---|
citrinet | 1 | 800 | 10.267 | 9.9903 | 11.102 | 12.933 | 14.493 | 0.99973 |
citrinet | 64 | 800 | 67.571 | 66.531 | 76.846 | 105.14 | 151.98 | 63.796 |
citrinet | 128 | 800 | 97.206 | 100.76 | 122.88 | 147.72 | 222.68 | 127.41 |
citrinet | 256 | 800 | 167.38 | 175.42 | 197.88 | 266.98 | 435.39 | 253.48 |
citrinet | 384 | 800 | 238.29 | 251.99 | 291.62 | 372.06 | 613.68 | 379 |
citrinet | 512 | 800 | 293.79 | 309.88 | 358.29 | 479.51 | 865.01 | 503.17 |
citrinet | 768 | 800 | 436.27 | 439.45 | 520.01 | 727.25 | 2058.7 | 748.34 |
citrinet | 1024 | 800 | 661.79 | 573.26 | 865.74 | 1552.2 | 4643.1 | 987.68 |
jasper | 1 | 800 | 20.922 | 20.692 | 28.875 | 29.405 | 29.865 | 0.99955 |
jasper | 64 | 800 | 84.316 | 82.369 | 117.57 | 134 | 163 | 63.803 |
jasper | 128 | 800 | 119.35 | 119.19 | 158.91 | 198.17 | 235.79 | 127.41 |
jasper | 256 | 800 | 173.3 | 169.53 | 241.81 | 307.34 | 372.86 | 253.97 |
jasper | 384 | 800 | 235.77 | 230.55 | 352.95 | 445.74 | 544.94 | 379.37 |
jasper | 512 | 800 | 286.03 | 281.09 | 430.92 | 592.05 | 725.74 | 504.25 |
jasper | 768 | 800 | 422.54 | 376.28 | 664.59 | 1232.5 | 1454 | 750.46 |
jasper | 1024 | 800 | 700.63 | 466.39 | 1740.5 | 2513.7 | 3490.7 | 988.89 |
quartznet | 1 | 800 | 17.209 | 17.765 | 23.747 | 24.169 | 25.651 | 0.99958 |
quartznet | 64 | 800 | 71.822 | 70.378 | 101.56 | 120.75 | 142.12 | 63.808 |
quartznet | 128 | 800 | 95.425 | 92.052 | 137.83 | 172.7 | 219.42 | 127.44 |
quartznet | 256 | 800 | 142.75 | 131.17 | 223.14 | 287.92 | 350.75 | 254.04 |
quartznet | 384 | 800 | 184.19 | 169.87 | 284.12 | 374.8 | 468.78 | 380.15 |
quartznet | 512 | 800 | 214.21 | 198.39 | 330.84 | 494.27 | 617.66 | 505.23 |
quartznet | 768 | 800 | 294.87 | 257.71 | 486.65 | 761.83 | 1144.2 | 752.05 |
quartznet | 1024 | 800 | 377.75 | 308.13 | 672.65 | 1197.6 | 1678.3 | 998.09 |
Offline¶
Acoustic model | # of streams | Chunk size (ms) | Latency avg (ms) | Latency p50 (ms) | Latency p90 (ms) | Latency p95 (ms) | Latency p99 (ms) | Throughput (RTFX) |
---|---|---|---|---|---|---|---|---|
citrinet | 1 | 1600 | 11.581 | 11.076 | 13.587 | 14.2 | 16.066 | 0.99972 |
citrinet | 256 | 1600 | 203.91 | 208.22 | 280.15 | 375.9 | 449.63 | 253.71 |
citrinet | 512 | 1600 | 366.76 | 377.84 | 525.1 | 730.28 | 882.79 | 503.15 |
citrinet | 768 | 1600 | 520.51 | 537.58 | 789.66 | 1073.8 | 1305.8 | 748.03 |
citrinet | 1024 | 1600 | 680.4 | 696.13 | 1046.5 | 1421.1 | 2420.1 | 989.19 |
citrinet | 1280 | 1600 | 809.97 | 762.33 | 1278.9 | 2497.8 | 2975.9 | 1226 |
citrinet | 1512 | 1600 | 981.25 | 833.21 | 1525.6 | 2928.7 | 4528 | 1437 |
jasper | 1 | 3200 | 35.778 | 37.577 | 40.806 | 40.839 | 40.839 | 0.9994 |
jasper | 256 | 3200 | 370.35 | 371.36 | 486.26 | 506.86 | 531.23 | 253.55 |
jasper | 512 | 3200 | 631.34 | 637.79 | 855.65 | 892.55 | 956.61 | 502.68 |
jasper | 768 | 3200 | 993.58 | 1004 | 1437.5 | 1792.7 | 1997.5 | 744.72 |
jasper | 1024 | 3200 | 1495.5 | 1481.1 | 2371.6 | 2474.2 | 2620.5 | 977.04 |
jasper | 1280 | 3200 | 2028.4 | 2040.6 | 3173.1 | 4182.9 | 4434.1 | 1198.3 |
jasper | 1512 | 3200 | 2544.2 | 2512.9 | 4790.7 | 5109.1 | 5445.4 | 1395.1 |
quartznet | 1 | 3200 | 34.388 | 34.271 | 38.126 | 38.376 | 38.376 | 0.99941 |
quartznet | 256 | 3200 | 267.27 | 260.58 | 376.31 | 396.37 | 434.8 | 254.11 |
quartznet | 512 | 3200 | 457.53 | 445.7 | 683.32 | 715.81 | 757.45 | 504.45 |
quartznet | 768 | 3200 | 637.18 | 620.67 | 972.11 | 1020.9 | 1094.7 | 751.29 |
quartznet | 1024 | 3200 | 941.86 | 939.81 | 1549.6 | 1687.6 | 1860.7 | 988.58 |
quartznet | 1280 | 3200 | 1290.3 | 1245.3 | 2183.7 | 2300 | 2462.1 | 1218.6 |
quartznet | 1512 | 3200 | 1592.7 | 1590.9 | 2678.5 | 2861.7 | 3493.8 | 1421 |
NVIDIA A30 GPU¶
Streaming, low-latency¶
Acoustic model | # of streams | Chunk size (ms) | Latency avg (ms) | Latency p50 (ms) | Latency p90 (ms) | Latency p95 (ms) | Latency p99 (ms) | Throughput (RTFX) |
---|---|---|---|---|---|---|---|---|
citrinet | 1 | 160 | 14.097 | 13.963 | 14.241 | 14.467 | 23.026 | 0.99952 |
citrinet | 8 | 160 | 31.256 | 33.222 | 34.823 | 35.302 | 51.29 | 7.9904 |
citrinet | 16 | 160 | 44.911 | 44.726 | 48.284 | 49.754 | 83.231 | 15.97 |
citrinet | 32 | 160 | 63.817 | 64.056 | 69.934 | 71.409 | 141.25 | 31.902 |
citrinet | 48 | 160 | 70.657 | 69.99 | 76.363 | 79.853 | 186.81 | 47.811 |
citrinet | 64 | 160 | 85.287 | 84.761 | 93.108 | 100.53 | 240.1 | 63.674 |
citrinet | 96 | 160 | 126.2 | 122.43 | 135.75 | 149.39 | 349.59 | 95.277 |
citrinet | 128 | 160 | 177.57 | 161.16 | 208.79 | 302.54 | 515.74 | 126.76 |
jasper | 1 | 100 | 15.144 | 14.742 | 16.153 | 18.412 | 25.529 | 0.99947 |
jasper | 8 | 100 | 23.389 | 21.772 | 30.426 | 35.951 | 53.913 | 7.9893 |
jasper | 16 | 100 | 40.269 | 39.476 | 45.645 | 49.016 | 73.797 | 15.974 |
jasper | 32 | 100 | 50.508 | 49.153 | 58.838 | 62.989 | 91.88 | 31.936 |
jasper | 48 | 100 | 61.582 | 60.531 | 70.907 | 76.604 | 114.88 | 47.879 |
jasper | 64 | 100 | 70.52 | 72.573 | 85.199 | 91.798 | 176.93 | 63.796 |
jasper | 96 | 100 | 139.76 | 119.76 | 169.73 | 203.13 | 667.9 | 95.558 |
jasper | 128 | 100 | 2663.2 | 2734.4 | 3690.8 | 4282.6 | 5406.3 | 120.24 |
quartznet | 1 | 100 | 10.224 | 9.7236 | 11.821 | 12.977 | 20.453 | 0.99948 |
quartznet | 8 | 100 | 17.442 | 16.234 | 21.907 | 24.661 | 45.751 | 7.9914 |
quartznet | 16 | 100 | 25.758 | 24.922 | 30.201 | 33.584 | 58.757 | 15.979 |
quartznet | 32 | 100 | 34.549 | 33.219 | 41.661 | 46.362 | 89.622 | 31.928 |
quartznet | 48 | 100 | 41.023 | 38.696 | 53.807 | 59.854 | 118.91 | 47.855 |
quartznet | 64 | 100 | 46.091 | 44.091 | 58.148 | 66.58 | 152.35 | 63.798 |
quartznet | 96 | 100 | 55.743 | 54.269 | 67.866 | 73.946 | 171.05 | 95.65 |
quartznet | 128 | 100 | 65.466 | 62.553 | 79.624 | 90.737 | 255.93 | 127.42 |
Streaming, high-throughput¶
Acoustic model | # of streams | Chunk size (ms) | Latency avg (ms) | Latency p50 (ms) | Latency p90 (ms) | Latency p95 (ms) | Latency p99 (ms) | Throughput (RTFX) |
---|---|---|---|---|---|---|---|---|
citrinet | 1 | 800 | 15.273 | 14.994 | 15.455 | 19.696 | 19.949 | 0.99958 |
citrinet | 64 | 800 | 109.02 | 107.38 | 128.03 | 151.81 | 202.02 | 63.715 |
citrinet | 128 | 800 | 148.75 | 152.08 | 192.99 | 211.04 | 297.74 | 127.18 |
citrinet | 256 | 800 | 262.4 | 283.86 | 317.88 | 356.92 | 560.89 | 252.86 |
citrinet | 384 | 800 | 371.76 | 404.16 | 447.92 | 523.96 | 862.57 | 377.17 |
citrinet | 512 | 800 | 494.45 | 506.77 | 581.76 | 775.86 | 1891.1 | 500.13 |
citrinet | 768 | 800 | 2783.7 | 2135.2 | 4189.8 | 7105.5 | 15928 | 690.64 |
citrinet | 1024 | 800 | 12500 | 11704 | 23256 | 25405 | 31698 | 690.07 |
jasper | 1 | 800 | 22.319 | 22.159 | 31.174 | 32.097 | 33.375 | 0.99948 |
jasper | 64 | 800 | 114.17 | 115.44 | 161.14 | 176.17 | 201.95 | 63.745 |
jasper | 128 | 800 | 153.32 | 150.7 | 207.83 | 234.87 | 274.78 | 127.28 |
jasper | 256 | 800 | 252.14 | 255.48 | 342.14 | 422.99 | 500.16 | 253.31 |
jasper | 384 | 800 | 343 | 348.23 | 486.63 | 628.56 | 750.23 | 378.22 |
jasper | 512 | 800 | 454.08 | 435.64 | 634.21 | 1209.4 | 1391.3 | 501.97 |
jasper | 768 | 800 | 1147.9 | 630.28 | 3483.1 | 4374 | 6244.5 | 738.67 |
jasper | 1024 | 800 | 7770.8 | 2115.2 | 24299 | 27926 | 39791 | 722.18 |
quartznet | 1 | 800 | 19.7 | 20.492 | 26.528 | 26.984 | 28.295 | 0.99956 |
quartznet | 64 | 800 | 90.109 | 88.12 | 136.98 | 155.58 | 186.51 | 63.745 |
quartznet | 128 | 800 | 134.38 | 130.89 | 206.04 | 236.54 | 284.1 | 127.22 |
quartznet | 256 | 800 | 177.91 | 167.48 | 256.5 | 336.28 | 417.29 | 253.86 |
quartznet | 384 | 800 | 228.15 | 214.13 | 333.46 | 484.6 | 603.72 | 379.36 |
quartznet | 512 | 800 | 293.71 | 274.95 | 437.49 | 633.69 | 959.8 | 503.66 |
quartznet | 768 | 800 | 416.94 | 367.62 | 682.84 | 1227.2 | 1486.6 | 749.88 |
quartznet | 1024 | 800 | 654.73 | 459.37 | 1488.7 | 2206.2 | 3094.9 | 986.78 |
Offline¶
Acoustic model | # of streams | Chunk size (ms) | Latency avg (ms) | Latency p50 (ms) | Latency p90 (ms) | Latency p95 (ms) | Latency p99 (ms) | Throughput (RTFX) |
---|---|---|---|---|---|---|---|---|
citrinet | 1 | 1600 | 16.888 | 16.21 | 20.823 | 21.084 | 21.733 | 0.99957 |
citrinet | 256 | 1600 | 321.49 | 343.74 | 433.47 | 527.6 | 611.02 | 252.89 |
citrinet | 512 | 1600 | 574.1 | 618.1 | 787.82 | 973.16 | 1119.4 | 500.59 |
citrinet | 768 | 1600 | 815.24 | 855.14 | 1169.4 | 1446.5 | 2576.5 | 742.88 |
citrinet | 1024 | 1600 | 1166.1 | 1042 | 1577.1 | 3085.1 | 5150.6 | 979.48 |
citrinet | 1280 | 1600 | 3802.6 | 1546.3 | 13707 | 25415 | 29243 | 1064.1 |
citrinet | 1512 | 1600 | 8992.4 | 7448.2 | 18628 | 28306 | 37498 | 1107.8 |
jasper | 1 | 3200 | 40.267 | 42.947 | 46.948 | 47.195 | 47.195 | 0.99932 |
jasper | 256 | 3200 | 584.09 | 590.71 | 726.94 | 753.54 | 783.72 | 252.4 |
jasper | 512 | 3200 | 1149.5 | 1115.7 | 1500.8 | 2114 | 2228.4 | 497.58 |
jasper | 768 | 3200 | 2000.4 | 2040.5 | 2956.6 | 3123.5 | 3267.1 | 728.64 |
jasper | 1024 | 3200 | 2981.7 | 2702.2 | 5284.5 | 5551.6 | 5768 | 947.66 |
jasper | 1280 | 3200 | 11241 | 10417 | 22915 | 26534 | 30618 | 1018.4 |
jasper | 1512 | 3200 | 18136 | 17164 | 39755 | 42884 | 45775 | 977.51 |
quartznet | 1 | 3200 | 41.432 | 41.73 | 47.32 | 47.393 | 47.393 | 0.99922 |
quartznet | 256 | 3200 | 389.99 | 387.1 | 525.14 | 551.07 | 588.59 | 253.36 |
quartznet | 512 | 3200 | 696.41 | 691.1 | 973.09 | 1016.4 | 1057.1 | 502.43 |
quartznet | 768 | 3200 | 1081.9 | 1059.1 | 1654.5 | 1940.8 | 2132.9 | 743.25 |
quartznet | 1024 | 3200 | 1536.5 | 1562.8 | 2509.1 | 2619.1 | 2743.1 | 973.64 |
quartznet | 1280 | 3200 | 2082.1 | 2110.5 | 3373.1 | 4299.3 | 4604.1 | 1195.5 |
quartznet | 1512 | 3200 | 2712.4 | 2633.9 | 5057.5 | 5444 | 5799.6 | 1391.9 |
NVIDIA V100 GPU¶
Streaming, low-latency¶
Acoustic model | # of streams | Chunk size (ms) | Latency avg (ms) | Latency p50 (ms) | Latency p90 (ms) | Latency p95 (ms) | Latency p99 (ms) | Throughput (RTFX) |
---|---|---|---|---|---|---|---|---|
citrinet | 1 | 160 | 11.155 | 11.046 | 11.288 | 11.41 | 17.872 | 0.99962 |
citrinet | 8 | 160 | 19.291 | 19.115 | 20.081 | 20.55 | 38.068 | 7.9929 |
citrinet | 16 | 160 | 34.927 | 34.655 | 37.423 | 38.275 | 67.863 | 15.97 |
citrinet | 32 | 160 | 55.109 | 54.295 | 59.011 | 59.798 | 120 | 31.904 |
citrinet | 48 | 160 | 71.089 | 70.934 | 75.116 | 76.805 | 176.51 | 47.814 |
citrinet | 64 | 160 | 86.125 | 84.837 | 95.294 | 98.694 | 247.09 | 63.67 |
citrinet | 96 | 160 | 126.16 | 122.67 | 135.84 | 151.61 | 381.84 | 95.277 |
citrinet | 128 | 160 | 213.25 | 178.64 | 293.87 | 381.58 | 610.95 | 126.6 |
jasper | 1 | 100 | 19.313 | 19.058 | 19.922 | 21.069 | 25.638 | 0.99949 |
jasper | 8 | 100 | 22.77 | 22.278 | 24.183 | 26.782 | 39.872 | 7.9928 |
jasper | 16 | 100 | 38.901 | 38.689 | 41.238 | 44.288 | 63.133 | 15.978 |
jasper | 32 | 100 | 62.84 | 64 | 73.879 | 77.532 | 99.325 | 31.925 |
jasper | 48 | 100 | 79.371 | 79.701 | 91.871 | 95.88 | 232.79 | 47.845 |
jasper | 64 | 100 | 123.11 | 104.12 | 159.64 | 194.57 | 621.62 | 63.73 |
jasper | 96 | 100 | 6296 | 6773.9 | 9091.7 | 9558.3 | 10582 | 84.651 |
jasper | 128 | 100 | 14234 | 12951 | 27171 | 28922 | 30684 | 85.557 |
quartznet | 1 | 100 | 8.2814 | 7.9853 | 9.1149 | 9.938 | 14.611 | 0.99964 |
quartznet | 8 | 100 | 12.319 | 11.417 | 14.926 | 17.17 | 33.032 | 7.9933 |
quartznet | 16 | 100 | 17.689 | 16.901 | 20.816 | 22.627 | 45.56 | 15.981 |
quartznet | 32 | 100 | 22.753 | 21.951 | 26.937 | 28.943 | 66.365 | 31.942 |
quartznet | 48 | 100 | 28.46 | 27.856 | 34.195 | 37.101 | 83.296 | 47.88 |
quartznet | 64 | 100 | 147.01 | 149.18 | 273.91 | 301.15 | 379.3 | 63.724 |
quartznet | 96 | 100 | 102.59 | 63.048 | 200.29 | 260.69 | 317.83 | 95.218 |
quartznet | 128 | 100 | 183.08 | 169.84 | 312.08 | 380.39 | 562.45 | 127.12 |
Streaming, high-throughput¶
Acoustic model | # of streams | Chunk size (ms) | Latency avg (ms) | Latency p50 (ms) | Latency p90 (ms) | Latency p95 (ms) | Latency p99 (ms) | Throughput (RTFX) |
---|---|---|---|---|---|---|---|---|
citrinet | 1 | 800 | 12.499 | 12.171 | 12.917 | 16.227 | 16.715 | 0.99964 |
citrinet | 64 | 800 | 96.624 | 95.566 | 104.06 | 139.88 | 190.72 | 63.727 |
citrinet | 128 | 800 | 154.45 | 160.09 | 176.91 | 216.03 | 310.11 | 127.15 |
citrinet | 256 | 800 | 284.54 | 312.12 | 337.24 | 426.03 | 623.07 | 252.74 |
citrinet | 384 | 800 | 407.1 | 427.27 | 471.66 | 592 | 1288.1 | 376.46 |
citrinet | 512 | 800 | 553.92 | 544.28 | 632.77 | 1066 | 2346.7 | 498.69 |
citrinet | 768 | 800 | 4443.9 | 3536.1 | 7644 | 10741 | 19530 | 649.68 |
citrinet | 1024 | 800 | 13985 | 13048 | 25700 | 27722 | 35948 | 662.18 |
jasper | 1 | 800 | 23.608 | 23.412 | 29.76 | 30.542 | 30.636 | 0.99949 |
jasper | 64 | 800 | 117.65 | 113.98 | 158.53 | 179.15 | 206.35 | 63.764 |
jasper | 128 | 800 | 191.68 | 191.68 | 249.16 | 301.47 | 354.78 | 127.15 |
jasper | 256 | 800 | 336.55 | 341.41 | 471.84 | 575.72 | 678.39 | 252.51 |
jasper | 384 | 800 | 498.33 | 480.12 | 658.43 | 1267.7 | 1449.2 | 376.51 |
jasper | 512 | 800 | 937.12 | 652.12 | 2284.5 | 3164 | 4349.1 | 497.21 |
jasper | 768 | 800 | 10850 | 4084.4 | 29171 | 35335 | 53749 | 463.86 |
jasper | 1024 | 800 | 21063 | 11314 | 55549 | 61478 | 87355 | 447.81 |
quartznet | 1 | 800 | 13.688 | 13.055 | 18.644 | 19.929 | 20.172 | 0.99966 |
quartznet | 64 | 800 | 79.151 | 67.019 | 146.7 | 165.51 | 225.99 | 63.799 |
quartznet | 128 | 800 | 137.63 | 128.42 | 235.86 | 278.86 | 340.59 | 127.09 |
quartznet | 256 | 800 | 196.42 | 188.09 | 331.67 | 400.93 | 530.61 | 253.69 |
quartznet | 384 | 800 | 256.63 | 229.5 | 414.07 | 556.41 | 712.33 | 378.83 |
quartznet | 512 | 800 | 308.14 | 270.6 | 509.56 | 719.1 | 1115.9 | 501.67 |
quartznet | 768 | 800 | 458.11 | 376.72 | 737.8 | 1322.9 | 1950.3 | 747.79 |
quartznet | 1024 | 800 | 758.74 | 462.36 | 2045.4 | 2778.1 | 3697 | 978.44 |
Offline¶
Acoustic model | # of streams | Chunk size (ms) | Latency avg (ms) | Latency p50 (ms) | Latency p90 (ms) | Latency p95 (ms) | Latency p99 (ms) | Throughput (RTFX) |
---|---|---|---|---|---|---|---|---|
citrinet | 1 | 1600 | 12.722 | 12.232 | 15.655 | 15.965 | 15.969 | 0.99967 |
citrinet | 256 | 1600 | 350.88 | 368.51 | 479.78 | 606.16 | 703.21 | 252.38 |
citrinet | 512 | 1600 | 627.13 | 675.13 | 879.89 | 1126.6 | 1326 | 498.74 |
citrinet | 768 | 1600 | 936.74 | 966.71 | 1311.3 | 2066.7 | 2932.3 | 738.16 |
citrinet | 1024 | 1600 | 1500.5 | 1277.3 | 2426.4 | 5046.6 | 7855.1 | 971.78 |
citrinet | 1280 | 1600 | 5291.8 | 2806.3 | 16199 | 29723 | 35459 | 983.95 |
citrinet | 1512 | 1600 | 10587 | 9166.5 | 22602 | 32147 | 41210 | 1041.7 |
jasper | 1 | 3200 | 35.26 | 37.546 | 40.014 | 40.316 | 40.316 | 0.99939 |
jasper | 256 | 3200 | 740.53 | 734.81 | 918.48 | 944.47 | 982.03 | 251.35 |
jasper | 512 | 3200 | 1757.2 | 1681.1 | 2699.1 | 2842.6 | 2960.5 | 488.75 |
jasper | 768 | 3200 | 3138.9 | 2733 | 5488.5 | 5763.1 | 5991.5 | 711.18 |
jasper | 1024 | 3200 | 17168 | 15068 | 34410 | 41135 | 44935 | 688.57 |
jasper | 1280 | 3200 | 24240 | 22543 | 50907 | 54452 | 57914 | 713.93 |
jasper | 1512 | 3200 | 31701 | 30755 | 66014 | 69236 | 74463 | 725.97 |
quartznet | 1 | 3200 | 29.538 | 31.703 | 33.113 | 33.295 | 33.295 | 0.99946 |
quartznet | 256 | 3200 | 365.4 | 368.32 | 530.98 | 557.93 | 593.19 | 253.27 |
quartznet | 512 | 3200 | 669.9 | 648 | 1005.1 | 1056.1 | 1113.1 | 501.11 |
quartznet | 768 | 3200 | 1199.8 | 1127.8 | 1973.3 | 2081.5 | 2215.7 | 737.01 |
quartznet | 1024 | 3200 | 1662.2 | 1658.7 | 2784.6 | 2940 | 3352.4 | 965 |
quartznet | 1280 | 3200 | 2237.2 | 2140.3 | 4251.7 | 4754.7 | 5143.7 | 1182.6 |
quartznet | 1512 | 3200 | 4715.5 | 5637.5 | 7989.1 | 8529.6 | 10924 | 1305.6 |
NVIDIA T4 GPU¶
Streaming, low-latency¶
Acoustic model | # of streams | Chunk size (ms) | Latency avg (ms) | Latency p50 (ms) | Latency p90 (ms) | Latency p95 (ms) | Latency p99 (ms) | Throughput (RTFX) |
---|---|---|---|---|---|---|---|---|
citrinet | 1 | 160 | 22.635 | 22.249 | 22.582 | 22.685 | 42.217 | 0.99917 |
citrinet | 8 | 160 | 49.495 | 49.966 | 50.891 | 53.131 | 113.45 | 7.98 |
citrinet | 16 | 160 | 66.058 | 64.554 | 72.181 | 75.346 | 128.65 | 15.955 |
citrinet | 32 | 160 | 101.06 | 98.529 | 102.66 | 111.5 | 245.29 | 31.846 |
citrinet | 48 | 160 | 155.53 | 146.41 | 157.71 | 222.9 | 384.39 | 47.653 |
citrinet | 64 | 160 | 1803.7 | 1540.8 | 3654.9 | 3920 | 4135.9 | 61.772 |
citrinet | 96 | 160 | 14700 | 14635 | 26812 | 28045 | 28853 | 62.502 |
citrinet | 128 | 160 | 27775 | 25526 | 51278 | 54219 | 58549 | 62.646 |
jasper | 1 | 100 | 46.574 | 46.514 | 48.436 | 52.968 | 61.571 | 0.99882 |
jasper | 8 | 100 | 47.432 | 47.816 | 53.724 | 61.504 | 86.435 | 7.9854 |
jasper | 16 | 100 | 71.97 | 69.933 | 80.435 | 88.609 | 166.16 | 15.96 |
jasper | 32 | 100 | 242.83 | 213.96 | 313.9 | 380.22 | 769.68 | 31.817 |
jasper | 48 | 100 | 11109 | 10564 | 22207 | 23345 | 26297 | 39.077 |
jasper | 64 | 100 | 17960 | 17380 | 32749 | 34696 | 36515 | 39.716 |
jasper | 96 | 100 | 32650 | 31021 | 59770 | 63664 | 69997 | 40.135 |
jasper | 128 | 100 | 47530 | 46526 | 89112 | 95926 | 1.0421e+05 | 40.237 |
quartznet | 1 | 100 | 16.095 | 15.359 | 18.511 | 20.095 | 32.045 | 0.99927 |
quartznet | 8 | 100 | 26.138 | 24.417 | 31.26 | 35.824 | 68.615 | 7.9863 |
quartznet | 16 | 100 | 40.712 | 38.862 | 52.142 | 57.644 | 96.548 | 15.963 |
quartznet | 32 | 100 | 48.311 | 47.788 | 61.818 | 70.364 | 113.81 | 31.91 |
quartznet | 48 | 100 | 67.572 | 68.755 | 84.728 | 107.42 | 175.52 | 47.817 |
quartznet | 64 | 100 | 104.47 | 100.67 | 146.32 | 179.42 | 282.26 | 63.672 |
quartznet | 96 | 100 | 3101 | 3131.4 | 5489.3 | 5709.5 | 5847.9 | 86.397 |
quartznet | 128 | 100 | 13448 | 13493 | 24066 | 25320 | 26140 | 85.856 |
Streaming, high-throughput¶
Acoustic model | # of streams | Chunk size (ms) | Latency avg (ms) | Latency p50 (ms) | Latency p90 (ms) | Latency p95 (ms) | Latency p99 (ms) | Throughput (RTFX) |
---|---|---|---|---|---|---|---|---|
citrinet | 1 | 800 | 25.995 | 25.455 | 26.11 | 34.216 | 34.388 | 0.99932 |
citrinet | 64 | 800 | 178.69 | 175.21 | 202.32 | 269.79 | 320.77 | 63.617 |
citrinet | 128 | 800 | 300.44 | 335.46 | 369.27 | 423.07 | 568.59 | 126.51 |
citrinet | 256 | 800 | 710.44 | 667.58 | 844.33 | 1521.5 | 3202.1 | 249.32 |
citrinet | 384 | 800 | 8847 | 8405.9 | 15005 | 16980 | 25160 | 289.74 |
citrinet | 512 | 800 | 19569 | 20095 | 34106 | 35865 | 44480 | 292.51 |
citrinet | 768 | 800 | 39067 | 38517 | 68252 | 73955 | 95873 | 292.64 |
citrinet | 1024 | 800 | 58068 | 59692 | 1.0214e+05 | 1.1194e+05 | 1.3164e+05 | 294.07 |
jasper | 1 | 800 | 74.805 | 74.618 | 83.792 | 88.073 | 88.556 | 0.9985 |
jasper | 64 | 800 | 218.84 | 222.47 | 276.46 | 313.51 | 352.64 | 63.597 |
jasper | 128 | 800 | 359.54 | 377.23 | 447.89 | 537.72 | 612.07 | 126.45 |
jasper | 256 | 800 | 1030.6 | 722.97 | 2353.5 | 3576.4 | 4687.6 | 248.78 |
jasper | 384 | 800 | 10746 | 6039.5 | 29346 | 36416 | 46165 | 254.18 |
jasper | 512 | 800 | 20701 | 13827 | 53326 | 57297 | 76804 | 246.6 |
jasper | 768 | 800 | 38972 | 27584 | 97278 | 1.03e+05 | 1.2926e+05 | 251 |
jasper | 1024 | 800 | 59655 | 44370 | 1.4406e+05 | 1.5043e+05 | 1.8314e+05 | 251.25 |
quartznet | 1 | 800 | 28.383 | 29.148 | 36.584 | 36.763 | 37.501 | 0.99928 |
quartznet | 64 | 800 | 144.08 | 139.86 | 212.4 | 241.42 | 305.26 | 63.626 |
quartznet | 128 | 800 | 190.34 | 175.36 | 259.42 | 338.67 | 411.78 | 126.94 |
quartznet | 256 | 800 | 296.54 | 280.51 | 433.4 | 592.82 | 733.86 | 252.22 |
quartznet | 384 | 800 | 422.68 | 383.37 | 641.46 | 1206.5 | 1443.6 | 375.73 |
quartznet | 512 | 800 | 645.59 | 493.39 | 1437 | 2161.5 | 2949.2 | 496.04 |
quartznet | 768 | 800 | 5576.4 | 776.85 | 17509 | 21659 | 29913 | 606.3 |
quartznet | 1024 | 800 | 13133 | 8042.4 | 36110 | 41498 | 53066 | 617.3 |
Offline¶
Acoustic model | # of streams | Chunk size (ms) | Latency avg (ms) | Latency p50 (ms) | Latency p90 (ms) | Latency p95 (ms) | Latency p99 (ms) | Throughput (RTFX) |
---|---|---|---|---|---|---|---|---|
citrinet | 1 | 1600 | 28.264 | 27.277 | 35.521 | 35.658 | 35.894 | 0.9993 |
citrinet | 256 | 1600 | 709.2 | 787.66 | 919.32 | 1094.4 | 1232.4 | 249.91 |
citrinet | 512 | 1600 | 3510.8 | 1507 | 12597 | 22842 | 26429 | 449.24 |
citrinet | 768 | 1600 | 16036 | 14794 | 33148 | 43614 | 55698 | 466.63 |
citrinet | 1024 | 1600 | 28188 | 26410 | 50763 | 70603 | 80471 | 472.93 |
citrinet | 1280 | 1600 | 39776 | 36591 | 68475 | 98183 | 1.0802e+05 | 475.27 |
citrinet | 1512 | 1600 | 51037 | 48000 | 87882 | 1.2297e+05 | 1.3285e+05 | 476.79 |
jasper | 1 | 3200 | 96.734 | 99.951 | 103 | 103.1 | 103.1 | 0.9983 |
jasper | 256 | 3200 | 1888.4 | 1823.2 | 2825.5 | 2928.4 | 3014.4 | 245.39 |
jasper | 512 | 3200 | 15452 | 13663 | 31656 | 36348 | 40256 | 367.19 |
jasper | 768 | 3200 | 32172 | 30853 | 63834 | 65991 | 72260 | 390.75 |
jasper | 1024 | 3200 | 49360 | 56457 | 92969 | 99690 | 1.0594e+05 | 390.25 |
jasper | 1280 | 3200 | 66512 | 77675 | 1.2441e+05 | 1.3225e+05 | 1.3976e+05 | 391.93 |
jasper | 1512 | 3200 | 87966 | 91424 | 1.6906e+05 | 1.7142e+05 | 1.7813e+05 | 380.89 |
quartznet | 1 | 3200 | 54.223 | 56.042 | 60.447 | 60.611 | 60.611 | 0.99896 |
quartznet | 256 | 3200 | 770.9 | 767.62 | 1003.3 | 1045.2 | 1093.5 | 251.06 |
quartznet | 512 | 3200 | 1685.9 | 1713.3 | 2603.6 | 2711.8 | 2847.1 | 486.01 |
quartznet | 768 | 3200 | 2953.8 | 2811.9 | 5399.7 | 5776.5 | 6094.3 | 704.69 |
quartznet | 1024 | 3200 | 14030 | 12917 | 28169 | 34882 | 38790 | 728.52 |
quartznet | 1280 | 3200 | 21590 | 19951 | 47433 | 49750 | 55253 | 727.14 |
quartznet | 1512 | 3200 | 27735 | 26985 | 60254 | 62852 | 66594 | 740.31 |
NLP¶
Performance was measured for the Riva named entity recognition (NER) service (using a BERT-base model, sequence length of 128) and the Riva Question Answering (QA) service (using a BERT-large model, sequence length of 384). Both batch size 1 latency and maximum throughput were measured.
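Latency and throughput in the NLP tables can be cross-checked against each other with Little's law (throughput is approximately concurrency divided by average latency). The sketch below applies it to the A100 rows reported in this section. It assumes that "# of streams" is the number of concurrent requests and that each stream keeps exactly one request in flight, neither of which is stated in the source.

```python
def expected_throughput(num_streams, avg_latency_ms):
    """Little's law estimate of throughput in sequences per second."""
    return num_streams / (avg_latency_ms / 1000.0)


# (task, streams, avg latency in ms, reported seq/s) from the A100 NLP table.
a100_rows = [
    ("NER", 1, 2.64, 375.374),
    ("NER", 256, 245, 993.923),
    ("Q&A", 1, 5.96, 166.997),
    ("Q&A", 128, 526, 237.02),
]

for task, streams, avg_ms, reported in a100_rows:
    estimate = expected_throughput(streams, avg_ms)
    print(f"{task:4s} streams={streams:3d} "
          f"estimated={estimate:8.1f} reported={reported:8.1f} seq/s")
```

For these four rows the estimate lands within roughly 6% of the reported throughput, which is consistent with the single-request-per-stream reading but does not prove it.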
NVIDIA A100 GPU¶
Task | # of streams | Latency avg (ms) | Latency p50 (ms) | Latency p90 (ms) | Latency p95 (ms) | Latency p99 (ms) | Throughput (seq/s) |
---|---|---|---|---|---|---|---|
NER | 1 | 2.64 | 2.62 | 2.74 | 2.79 | 3.03 | 375.374 |
NER | 256 | 245 | 250 | 275 | 285 | 285 | 993.923 |
Q&A | 1 | 5.96 | 5.77 | 6.59 | 6.97 | 7.38 | 166.997 |
Q&A | 128 | 526 | 535 | 547 | 549 | 554 | 237.02 |
NVIDIA A30 GPU¶
Task | # of streams | Latency avg (ms) | Latency p50 (ms) | Latency p90 (ms) | Latency p95 (ms) | Latency p99 (ms) | Throughput (seq/s) |
---|---|---|---|---|---|---|---|
NER | 1 | 3.6 | 3.53 | 3.88 | 4.09 | 4.34 | 274.549 |
NER | 256 | 280 | 287 | 293 | 294 | 338 | 868.025 |
Q&A | 1 | 7.98 | 7.76 | 8 | 10.6 | 10.7 | 124.643 |
Q&A | 128 | 671 | 684 | 688 | 688 | 715 | 185.882 |
NVIDIA V100 GPU¶
Task | # of streams | Latency avg (ms) | Latency p50 (ms) | Latency p90 (ms) | Latency p95 (ms) | Latency p99 (ms) | Throughput (seq/s) |
---|---|---|---|---|---|---|---|
NER | 1 | 3.48 | 3.48 | 3.61 | 3.63 | 3.75 | 284.68 |
NER | 256 | 393 | 418 | 427 | 429 | 431 | 617.25 |
Q&A | 1 | 9.38 | 9.37 | 9.55 | 9.59 | 9.65 | 106.053 |
Q&A | 128 | 932 | 955 | 957 | 959 | 964 | 133.901 |
NVIDIA T4 GPU¶
Task | # of streams | Latency avg (ms) | Latency p50 (ms) | Latency p90 (ms) | Latency p95 (ms) | Latency p99 (ms) | Throughput (seq/s) |
---|---|---|---|---|---|---|---|
NER | 1 | 5.23 | 5 | 6.52 | 6.62 | 7.16 | 189.031 |
NER | 256 | 541 | 560 | 567 | 568 | 621 | 450.158 |
Q&A | 1 | 14.9 | 14.2 | 14.9 | 15.2 | 26.2 | 66.8848 |
Q&A | 128 | 1.55e+03 | 1.58e+03 | 1.59e+03 | 1.59e+03 | 1.59e+03 | 80.5507 |
TTS¶
Performance of the Riva text-to-speech (TTS) service was measured for different numbers of parallel streams. Each parallel stream performed 10 iterations over 10 input strings from the LJSpeech dataset. Three quantities were measured: latency to the first audio chunk, latency between successive audio chunks, and throughput.
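The two latency metrics can be derived from the arrival times of the audio chunks of a single request. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the timestamps are made up for illustration, and throughput is taken to be seconds of audio generated per second of wall-clock time, which this section does not spell out.

```python
def chunk_latencies(request_sent_s, chunk_arrival_s):
    """Return (latency to first audio, list of gaps between successive chunks).

    request_sent_s: time the request was sent, in seconds.
    chunk_arrival_s: arrival times of the audio chunks, in ascending order.
    """
    if not chunk_arrival_s:
        raise ValueError("at least one audio chunk is required")
    first_audio = chunk_arrival_s[0] - request_sent_s
    between_chunks = [
        later - earlier
        for earlier, later in zip(chunk_arrival_s, chunk_arrival_s[1:])
    ]
    return first_audio, between_chunks


def tts_rtfx(audio_seconds_generated, wall_clock_seconds):
    """Assumed throughput definition: audio seconds per wall-clock second."""
    return audio_seconds_generated / wall_clock_seconds


# Made-up timestamps: request sent at t=10.0 s, three chunks received.
first_audio, gaps = chunk_latencies(10.0, [10.5, 10.75, 11.0])
print(first_audio, gaps)
```

Latency to first audio determines how quickly playback can start, while the gaps between chunks determine whether playback can continue without stalling.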
NVIDIA A100 GPU¶
Model | # of streams | First audio avg (s) | First audio p90 (s) | First audio p95 (s) | First audio p99 (s) | Between chunks avg (s) | Between chunks p90 (s) | Between chunks p95 (s) | Between chunks p99 (s) | Throughput (RTFX) |
---|---|---|---|---|---|---|---|---|---|---|
FastPitch + Hifi-GAN | 1 | 0.028 | 0.035 | 0.036 | 0.038 | 0.003 | 0.004 | 0.004 | 0.005 | 133.139 |
FastPitch + Hifi-GAN | 4 | 0.044 | 0.056 | 0.062 | 0.069 | 0.006 | 0.009 | 0.010 | 0.012 | 340.038 |
FastPitch + Hifi-GAN | 6 | 0.057 | 0.073 | 0.079 | 0.092 | 0.007 | 0.011 | 0.012 | 0.015 | 389.500 |
FastPitch + Hifi-GAN | 8 | 0.066 | 0.086 | 0.091 | 0.107 | 0.009 | 0.013 | 0.015 | 0.018 | 443.373 |
FastPitch + Hifi-GAN | 10 | 0.070 | 0.090 | 0.095 | 0.110 | 0.009 | 0.014 | 0.016 | 0.019 | 464.114 |
Tacotron 2 + WaveGlow | 1 | 0.046 | 0.051 | 0.052 | 0.059 | 0.022 | 0.024 | 0.025 | 0.028 | 34.007 |
Tacotron 2 + WaveGlow | 4 | 0.258 | 0.364 | 0.392 | 0.442 | 0.027 | 0.040 | 0.047 | 0.060 | 59.364 |
Tacotron 2 + WaveGlow | 6 | 0.381 | 0.503 | 0.548 | 0.603 | 0.033 | 0.052 | 0.061 | 0.079 | 65.714 |
Tacotron 2 + WaveGlow | 8 | 0.509 | 0.675 | 0.715 | 0.789 | 0.036 | 0.059 | 0.069 | 0.088 | 69.662 |
Tacotron 2 + WaveGlow | 10 | 0.612 | 0.787 | 0.887 | 1.041 | 0.039 | 0.066 | 0.076 | 0.096 | 72.609 |
NVIDIA A30 GPU¶
Model | # of streams | First audio avg (s) | First audio p90 (s) | First audio p95 (s) | First audio p99 (s) | Between chunks avg (s) | Between chunks p90 (s) | Between chunks p95 (s) | Between chunks p99 (s) | Throughput (RTFX) |
---|---|---|---|---|---|---|---|---|---|---|
FastPitch + Hifi-GAN | 1 | 0.032 | 0.036 | 0.037 | 0.039 | 0.004 | 0.004 | 0.005 | 0.005 | 118.981 |
FastPitch + Hifi-GAN | 4 | 0.055 | 0.071 | 0.075 | 0.084 | 0.007 | 0.011 | 0.013 | 0.016 | 265.075 |
FastPitch + Hifi-GAN | 6 | 0.071 | 0.091 | 0.096 | 0.108 | 0.009 | 0.015 | 0.017 | 0.020 | 308.260 |
FastPitch + Hifi-GAN | 8 | 0.084 | 0.107 | 0.114 | 0.126 | 0.011 | 0.018 | 0.021 | 0.024 | 343.723 |
FastPitch + Hifi-GAN | 10 | 0.094 | 0.121 | 0.129 | 0.156 | 0.012 | 0.020 | 0.022 | 0.027 | 349.511 |
Tacotron 2 + WaveGlow | 1 | 0.065 | 0.070 | 0.071 | 0.073 | 0.031 | 0.033 | 0.034 | 0.035 | 24.627 |
Tacotron 2 + WaveGlow | 4 | 0.328 | 0.456 | 0.480 | 0.554 | 0.039 | 0.057 | 0.064 | 0.086 | 44.569 |
Tacotron 2 + WaveGlow | 6 | 0.506 | 0.683 | 0.737 | 0.814 | 0.048 | 0.076 | 0.089 | 0.112 | 47.723 |
Tacotron 2 + WaveGlow | 8 | 0.688 | 0.914 | 0.968 | 1.059 | 0.055 | 0.089 | 0.103 | 0.139 | 49.749 |
Tacotron 2 + WaveGlow | 10 | 0.840 | 1.178 | 1.315 | 1.480 | 0.061 | 0.106 | 0.124 | 0.155 | 49.764 |
NVIDIA V100 GPU¶
Model | # of streams | First audio avg (s) | First audio p90 (s) | First audio p95 (s) | First audio p99 (s) | Between chunks avg (s) | Between chunks p90 (s) | Between chunks p95 (s) | Between chunks p99 (s) | Throughput (RTFX) |
---|---|---|---|---|---|---|---|---|---|---|
FastPitch + Hifi-GAN | 1 | 0.030 | 0.033 | 0.033 | 0.034 | 0.005 | 0.006 | 0.006 | 0.007 | 107.159 |
FastPitch + Hifi-GAN | 4 | 0.065 | 0.085 | 0.092 | 0.108 | 0.010 | 0.017 | 0.020 | 0.025 | 212.439 |
FastPitch + Hifi-GAN | 6 | 0.095 | 0.129 | 0.141 | 0.159 | 0.013 | 0.023 | 0.027 | 0.033 | 225.813 |
FastPitch + Hifi-GAN | 8 | 0.125 | 0.168 | 0.177 | 0.206 | 0.016 | 0.029 | 0.034 | 0.042 | 235.513 |
FastPitch + Hifi-GAN | 10 | 0.150 | 0.204 | 0.228 | 0.313 | 0.018 | 0.033 | 0.037 | 0.052 | 232.068 |
Tacotron 2 + WaveGlow | 1 | 0.057 | 0.059 | 0.060 | 0.060 | 0.031 | 0.033 | 0.033 | 0.033 | 25.236 |
Tacotron 2 + WaveGlow | 4 | 0.388 | 0.545 | 0.592 | 0.677 | 0.047 | 0.071 | 0.080 | 0.100 | 37.289 |
Tacotron 2 + WaveGlow | 6 | 0.598 | 0.847 | 0.893 | 1.039 | 0.057 | 0.090 | 0.103 | 0.136 | 40.055 |
Tacotron 2 + WaveGlow | 8 | 0.814 | 1.097 | 1.162 | 1.307 | 0.063 | 0.100 | 0.116 | 0.148 | 42.555 |
Tacotron 2 + WaveGlow | 10 | 0.979 | 1.345 | 1.472 | 1.630 | 0.071 | 0.114 | 0.134 | 0.178 | 42.907 |
NVIDIA T4 GPU¶
Model | # of streams | First audio avg (s) | First audio p90 (s) | First audio p95 (s) | First audio p99 (s) | Between chunks avg (s) | Between chunks p90 (s) | Between chunks p95 (s) | Between chunks p99 (s) | Throughput (RTFX) |
---|---|---|---|---|---|---|---|---|---|---|
FastPitch + Hifi-GAN | 1 | 0.052 | 0.058 | 0.059 | 0.075 | 0.006 | 0.007 | 0.008 | 0.008 | 73.429 |
FastPitch + Hifi-GAN | 4 | 0.105 | 0.131 | 0.139 | 0.156 | 0.016 | 0.025 | 0.028 | 0.034 | 132.176 |
FastPitch + Hifi-GAN | 6 | 0.148 | 0.188 | 0.199 | 0.233 | 0.022 | 0.036 | 0.041 | 0.048 | 140.982 |
FastPitch + Hifi-GAN | 8 | 0.188 | 0.240 | 0.258 | 0.313 | 0.028 | 0.047 | 0.052 | 0.062 | 148.327 |
FastPitch + Hifi-GAN | 10 | 0.211 | 0.275 | 0.295 | 0.327 | 0.030 | 0.050 | 0.058 | 0.071 | 150.003 |
Tacotron 2 + WaveGlow | 1 | 0.107 | 0.115 | 0.117 | 0.118 | 0.051 | 0.055 | 0.055 | 0.056 | 14.859 |
Tacotron 2 + WaveGlow | 4 | 0.717 | 1.018 | 1.100 | 1.225 | 0.107 | 0.167 | 0.191 | 0.246 | 18.454 |
Tacotron 2 + WaveGlow | 6 | 1.158 | 1.607 | 1.723 | 1.929 | 0.137 | 0.224 | 0.255 | 0.311 | 18.964 |
Tacotron 2 + WaveGlow | 8 | 1.643 | 2.252 | 2.390 | 2.606 | 0.163 | 0.269 | 0.307 | 0.391 | 19.275 |
Tacotron 2 + WaveGlow | 10 | 2.065 | 2.848 | 3.176 | 3.615 | 0.174 | 0.296 | 0.338 | 0.430 | 18.726 |
Performance Considerations¶
When the server is under high load, requests might time out, because the server does not start inference for a new request until a previous request has been completely generated and its inference slot has been freed. This is done to maximize throughput for the TTS service and allow for real-time interaction. NVIDIA does not recommend making more than 8-10 simultaneous requests with the models provided in Riva 1.0.0 beta.
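One way to stay within this guidance on the client side is to cap the number of requests in flight. The sketch below does so with a bounded thread pool. It is illustrative only: synthesize_all and fake_synthesize are hypothetical names, and fake_synthesize is a placeholder for whatever function actually sends a TTS request; no Riva client API is used here.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 8  # lower end of the 8-10 simultaneous requests mentioned above


def synthesize_all(texts, synthesize, max_in_flight=MAX_IN_FLIGHT):
    """Call synthesize(text) for every text with at most max_in_flight
    concurrent calls, returning the results in input order."""
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        return list(pool.map(synthesize, texts))


def fake_synthesize(text):
    """Hypothetical stand-in for a real TTS request; not a Riva API."""
    time.sleep(0.01)  # simulate request latency
    return f"<audio for {text!r}>"


results = synthesize_all([f"sentence {i}" for i in range(20)], fake_synthesize)
print(len(results))
```

Requests beyond the cap wait in the client-side queue rather than on the server, so they are less likely to hit a server-side timeout while waiting for an inference slot.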