Scores#

EN#

Model Name

Language

EuroParl Test Set (en)

Fisher Test Set (en)

Librispeech Dev-Clean

Librispeech Dev-Other

Librispeech Test-Clean

Librispeech Test-Other

MCV Test-Set v11.0 (en)

MCV Test-Set v8.0 (en)

MLS Dev (en)

MLS Test (en)

NSC Part1

NSC Part6

Peoples Speech Test v1

SLR 83 Test

SPGI Test

VoxPopuli Test (en)

WSJ Dev 93

WSJ Eval 92

stt_en_conformer_ctc_small

en

3.6

8.1

3.7

8.1

stt_en_conformer_ctc_medium

en

2.5

5.8

2.6

5.9

stt_en_conformer_ctc_large

en

1.9

4.4

2.1

4.5

stt_en_conformer_ctc_xlarge

en

1.77 %

3.79 %

2.00 %

3.74 %

7.88 %

5.99 %

6.44 %

22.90 %

5.50 %

2.36 %

stt_en_conformer_ctc_small_ls

en

3.3

8.8

3.4

8.8

stt_en_conformer_ctc_medium_ls

en

2.7

7.4

3.0

7.3

stt_en_conformer_ctc_large_ls

en

2.4

6.2

2.7

6.0

stt_en_conformer_transducer_small

en

2.8

6.6

2.5

6.6

stt_en_conformer_transducer_medium

en

2.0

4.6

2.1

4.7

stt_en_conformer_transducer_large

en

1.6

3.5

1.7

3.7

stt_en_conformer_transducer_large_ls

en

2.1

5.0

2.3

5.1

stt_en_conformer_transducer_xlarge

en

1.48 %

2.95 %

1.62 %

3.01 %

6.46 %

4.59 %

5.32 %

5.70 %

6.47 %

21.32 %

2.05 %

1.17 %

stt_en_conformer_transducer_xxlarge

en

1.52 %

3.09 %

1.72 %

3.14 %

5.29 %

5.85 %

6.64 %

2.42 %

1.49 %

stt_en_fastconformer_hybrid_large_streaming_80ms (CTC)

en

3.5 %

8.1 %

10.2 %

7.2 %

3.5 %

2.3 %

stt_en_fastconformer_hybrid_large_streaming_480ms (CTC)

en

3.6 %

7.5 %

9.8 %

7.0 %

3.5 %

2.1 %

stt_en_fastconformer_hybrid_large_streaming_1040ms (CTC)

en

2.7 %

6.4 %

9.0 %

7.0 %

3.2 %

1.9 %

stt_en_fastconformer_hybrid_large_streaming_80ms (RNNT)

en

2.7 %

6.5 %

9.1 %

6.9 %

3.2 %

1.9 %

stt_en_fastconformer_hybrid_large_streaming_480ms (RNNT)

en

2.7 %

6.1 %

8.5 %

6.7 %

3.1 %

1.8 %

stt_en_fastconformer_hybrid_large_streaming_1040ms (RNNT)

en

2.3 %

5.5 %

8.0 %

6.6 %

2.9 %

1.6 %

stt_en_fastconformer_hybrid_large_streaming_multi (RNNT - 0ms)

en

7.0 %

stt_en_fastconformer_hybrid_large_streaming_multi (RNNT - 80ms)

en

6.4 %

stt_en_fastconformer_hybrid_large_streaming_multi (RNNT - 480)

en

5.7 %

stt_en_fastconformer_hybrid_large_streaming_multi (RNNT - 1040)

en

5.4 %

stt_en_fastconformer_hybrid_large_streaming_multi (CTC - 0ms)

en

8.4 %

stt_en_fastconformer_hybrid_large_streaming_multi (CTC - 80ms)

en

7.8 %

stt_en_fastconformer_hybrid_large_streaming_multi (CTC - 480)

en

6.7 %

stt_en_fastconformer_hybrid_large_streaming_multi (CTC - 1040)

en

6.2 %


Model Name

Language

EuroParl Test Set (en)

Fisher Test Set (en)

Librispeech Dev-Clean

Librispeech Dev-Other

Librispeech Test-Clean

Librispeech Test-Other

MCV Test-Set v11.0 (en)

MCV Test-Set v8.0 (en)

MLS Dev (en)

MLS Test (en)

NSC Part1

NSC Part6

Peoples Speech Test v1

SLR 83 Test

SPGI Test

VoxPopuli Test (en)

WSJ Dev 93

WSJ Eval 92

stt_en_fastconformer_ctc_large

en

1.9

4.2

2.1

4.2

stt_en_fastconformer_transducer_large

en

2.0

3.8

1.8

3.8

stt_en_fastconformer_hybrid_large_pc

en

8.0 %

10.3 %

2.0 %

4.1 %

8.2 %

4.5 %

4.6 %

2.3 %

4.5 %


BE#

Model Name

Language

MCV Test-Set v10 (be)

stt_be_conformer_ctc_large

be

4.7 %

stt_be_conformer_transducer_large

be

3.8 %


BY#

Model Name

Language

MCV Dev-Set v12.0 (be)

MCV Test-Set v12.0 (be)

stt_by_fastconformer_hybrid_large_pc

by

2.7 %

2.7 %


CA#

Model Name

Language

MCV Dev-Set (v??) (ca)

MCV Dev-Set v9.0 (ca)

MCV Test-Set v9.0 (ca)

stt_ca_conformer_ctc_large

ca

4.70

4.27

stt_ca_conformer_transducer_large

ca

4.43

3.85


DE#

Model Name

Language

MCV Dev-Set (v??) (de)

MCV Dev-Set v12.0 (de)

MCV Dev-Set v7.0 (de)

MCV Test-Set v12.0 (de)

MCV Test-Set v7.0 (de)

MLS Dev (en)

MLS Test (en)

VoxPopuli Dev (de)

VoxPopuli Test (de)

stt_de_conformer_ctc_large

de

5.84

6.68

3.85

4.63

12.56

10.51

stt_de_conformer_transducer_large

de

4.75

5.36

3.46

4.19

11.21

9.14


Model Name

Language

MCV Dev-Set (v??) (de)

MCV Dev-Set v12.0 (de)

MCV Dev-Set v7.0 (de)

MCV Test-Set v12.0 (de)

MCV Test-Set v7.0 (de)

MLS Dev (en)

MLS Test (en)

VoxPopuli Dev (de)

VoxPopuli Test (de)

stt_de_fastconformer_hybrid_large_pc

de

4.2 %

4.9 %

3.3 %

3.8 %

10.8 %

8.7 %


ENES#

Model Name

Language

Fisher-Dev-En

Fisher-Dev-Es

Fisher-Test-En

Fisher-Test-Es

Librispeech Dev-Clean

Librispeech Dev-Other

Librispeech Test-Clean

Librispeech Test-Other

MCV Dev-Set v7.0 (en)

MCV Dev-Set v7.0 (es)

MCV Test-Set v7.0 (en)

MCV Test-Set v7.0 (es)

MLS Dev (en)

MLS Dev (es)

MLS Test (en)

MLS Test (es)

VoxPopuli Dev (en)

VoxPopuli Dev (es)

VoxPopuli Test (en)

VoxPopuli Test (es)

stt_enes_conformer_ctc_large

enes

16.7 %

2.2 %

5.5 %

2.6 %

5.5 %

5.8 %

3.5 %

5.7 %

stt_enes_conformer_ctc_large_codesw

enes

16.51 %

16.31 %

2.22 %

5.36 %

2.55 %

5.38 %

5.00 %

5.51 %

3.46 %

3.73 %

5.58 %

6.63 %

stt_enes_conformer_transducer_large

enes

16.2 %

2.0 %

4.6 %

2.2 %

4.6 %

5.0 %

3.3 %

5.3 %

stt_enes_conformer_transducer_large_codesw

enes

15.70 %

15.66 %

1.97 %

4.54 %

2.17 %

4.53 %

4.51 %

5.06 %

3.27 %

3.67 %

5.28 %

6.54 %


EO#

Model Name

Language

MCV Dev-Set v11.0 (eo)

MCV Test-Set v11.0 (eo)

stt_eo_conformer_ctc_large

eo

2.9 %

4.8 %

stt_eo_conformer_transducer_large

eo

2.4 %

4.0 %


ES#

Model Name

Language

Call Home Dev Test (es)

Call Home Eval Test (es)

Call Home Train (es)

Fisher Dev Set (es)

Fisher Test Set (es)

MCV Dev-Set (v??) (es)

MCV Dev-Set v12.0 (es)

MCV Dev-Set v7.0 (es)

MCV Test-Set (v??) (es)

MCV Test-Set v12.0 (es)

MCV Test-Set v7.0 (es)

MLS Dev (en)

MLS Test (en)

VoxPopuli Dev (es)

VoxPopuli Test (es)

stt_es_conformer_ctc_large

es

23.7 %

25.3 %

22.4 %

18.3 %

18.5 %

6.3 %

6.9 %

4.3 %

4.2 %

6.1 %

7.5 %

stt_es_conformer_transducer_large

es

18.0 %

19.4 %

17.2 %

14.7 %

14.8 %

4.6 %

5.2 %

2.7 %

3.2 %

4.7 %

6.0 %


Model Name

Language

Call Home Dev Test (es)

Call Home Eval Test (es)

Call Home Train (es)

Fisher Dev Set (es)

Fisher Test Set (es)

MCV Dev-Set (v??) (es)

MCV Dev-Set v12.0 (es)

MCV Dev-Set v7.0 (es)

MCV Test-Set (v??) (es)

MCV Test-Set v12.0 (es)

MCV Test-Set v7.0 (es)

MLS Dev (en)

MLS Test (en)

VoxPopuli Dev (es)

VoxPopuli Test (es)

stt_es_fastconformer_hybrid_large_pc

es

29.4 %

28.9 %

7.1 %

7.5 %

10.6 %

11.8 %

8.6 %

9.8 %


FR#

Model Name

Language

MCV Dev-Set (v??) (fr)

MCV Dev-Set v7.0 (fr)

MCV Dev-Set v7.0 (fr) (No Hyphen)

MCV Test-Set v7.0 (fr)

MCV Test-Set v7.0 (fr) (No Hyphen)

MLS Dev (en)

MLS Dev (en) (No Hyphen)

MLS Test (en)

MLS Test (en) (No Hyphen)

stt_fr_conformer_ctc_large

fr

8.35

7.88

9.63

9.01

5.88

5.90

4.91

4.63

stt_fr_conformer_transducer_large

fr

6.85

7.95

5.05

4.10


HR#

Model Name

Language

ParlaSpeech Dev-Set v1.0 (hr)

ParlaSpeech Test-Set v1.0 (hr)

Parlaspeech Dev-Set (v??) (hr)

Parlaspeech Test-Set (v??) (hr)

stt_hr_conformer_ctc_large

hr

4.43

4.70

stt_hr_conformer_transducer_large

hr

4.56

4.69


Model Name

Language

ParlaSpeech Dev-Set v1.0 (hr)

ParlaSpeech Test-Set v1.0 (hr)

Parlaspeech Dev-Set (v??) (hr)

Parlaspeech Test-Set (v??) (hr)

stt_hr_fastconformer_hybrid_large_pc

hr

4.5 %

4.2 %


IT#

Model Name

Language

MCV Dev-Set (v??) (it)

MCV Dev-Set v11.0 (it)

MCV Dev-Set v12.0 (it)

MCV Test-Set v11.0 (it)

MCV Test-Set v12.0 (it)

MLS Dev (en)

MLS Test (en)

VoxPopuli Dev (it)

VoxPopuli Test (it)

stt_it_conformer_ctc_large

it

5.38

5.92

13.16

10.62

13.43

16.75

stt_it_conformer_transducer_large

it

4.80

5.24

14.62

12.18

12.00

15.15


Model Name

Language

MCV Dev-Set (v??) (it)

MCV Dev-Set v11.0 (it)

MCV Dev-Set v12.0 (it)

MCV Test-Set v11.0 (it)

MCV Test-Set v12.0 (it)

MLS Dev (en)

MLS Test (en)

VoxPopuli Dev (it)

VoxPopuli Test (it)

stt_it_fastconformer_hybrid_large_pc

it

5.2 %

5.8 %

13.6 %

11.5 %

12.7 %

15.6 %


KAB#

Model Name

Language

MCV Test-Set v10.0 (kab)

stt_kab_conformer_transducer_large

kab

18.86


NL#

Model Name

Language

MCV Test-Set v12.0 (nl)

MLS Test (nl)

stt_nl_fastconformer_hybrid_large_pc

nl

9.2 %

12.1 %


PL#

Model Name

Language

MCV Dev-Set (v??) (pl)

MCV Dev-Set v12.0 (pl)

MCV Test-Set v12.0 (pl)

MLS Dev (en)

MLS Test (en)

VoxPopuli Dev (pl)

VoxPopuli Test (pl)

stt_pl_fastconformer_hybrid_large_pc

pl

6.0 %

8.7 %

7.1 %

5.8 %

11.3 %

8.5 %


RU#

Model Name

Language

GOLOS Crowd Test-Set (v??) (ru)

GOLOS Farfield Test-Set (v??) (ru)

Librispeech Test

MCV Dev-Set (v??) (ru)

MCV Dev-Set v10.0 (ru)

MCV Test-Set v10.0 (ru)

stt_ru_conformer_ctc_large

ru

2.8 %

7.1 %

13.5 %

3.9 %

4.3 %

stt_ru_conformer_transducer_large

ru

2.7%

7.6%

12.0%

3.5%

4.0%


RW#

Model Name

Language

MCV Test-Set v9.0 (rw)

stt_rw_conformer_ctc_large

rw

18.2 %

stt_rw_conformer_transducer_large

rw

16.2 %


UA#

Model Name

Language

MCV Test-Set v12.0 (ua)

stt_ua_fastconformer_hybrid_large_pc

ua

5.2 %


ZH#

Model Name

Language

AIShell Dev-Android v2

AIShell Dev-Ios v1

AIShell Dev-Ios v2

AIShell Dev-Mic v2

AIShell Test-Android v2

AIShell Test-Ios v1

AIShell Test-Ios v2

AIShell Test-Mic v2

stt_zh_conformer_transducer_large

zh

3.4

3.2

3.4

3.4

3.2

3.4


Scores with Punctuation and Capitalization#

EN with P&C#

Model Name

Language

EuroParl Test Set (en)

Fisher Test Set (en)

Librispeech Test-Clean

Librispeech Test-Other

MCV Test-Set v11.0 (en)

MLS Test (en)

NSC Part1

SPGI Test

VoxPopuli Test (en)

stt_en_fastconformer_hybrid_large_pc

en

12.5 %

19.0 %

7.3 %

9.2 %

10.1 %

12.7 %

7.2 %

5.1 %

6.7 %


BY with P&C#

Model Name

Language

MCV Dev-Set v12.0 (be)

MCV Test-Set v12.0 (be)

stt_by_fastconformer_hybrid_large_pc

by

3.8 %

3.9 %


DE with P&C#

Model Name

Language

MCV Dev-Set v12.0 (de)

MCV Test-Set v12.0 (de)

MLS Dev (en)

MLS Test (en)

VoxPopuli Dev (de)

VoxPopuli Test (de)

stt_de_fastconformer_hybrid_large_pc

de

4.7 %

5.4 %

10.1 %

11.1 %

12.6 %

10.4 %


ES with P&C#

Model Name

Language

Fisher Dev Set (es)

Fisher Test Set (es)

MCV Dev-Set v12.0 (es)

MCV Test-Set v12.0 (es)

MLS Dev (en)

MLS Test (en)

VoxPopuli Dev (es)

VoxPopuli Test (es)

stt_es_fastconformer_hybrid_large_pc

es

14.7 %

14.6 %

4.5 %

5.0 %

3.1 %

3.9 %

4.4 %

5.6 %


HR with P&C#

Model Name

Language

Parlaspeech Dev-Set (v??) (hr)

Parlaspeech Test-Set (v??) (hr)

stt_hr_fastconformer_hybrid_large_pc

hr

10.4 %

8.7 %


IT with P&C#

Model Name

Language

MCV Dev-Set v12.0 (it)

MCV Test-Set v12.0 (it)

MLS Dev (en)

MLS Test (en)

VoxPopuli Dev (it)

VoxPopuli Test (it)

stt_it_fastconformer_hybrid_large_pc

it

7.8 %

8.2 %

26.4 %

22.5 %

16.8 %

19.6 %


NL with P&C#

Model Name

Language

MCV Test-Set v12.0 (nl)

MLS Test (nl)

stt_nl_fastconformer_hybrid_large_pc

nl

32.1 %

25.1 %


PL with P&C#

Model Name

Language

MCV Dev-Set v12.0 (pl)

MCV Test-Set v12.0 (pl)

MLS Dev (en)

MLS Test (en)

VoxPopuli Dev (pl)

VoxPopuli Test (pl)

stt_pl_fastconformer_hybrid_large_pc

pl

8.9 %

11.0 %

16.0 %

11.0 %

14.0 %

11.4 %


UA with P&C#

Model Name

Language

MCV Test-Set v12.0 (ua)

stt_ua_fastconformer_hybrid_large_pc

ua

7.3 %