Custom Models#
The following NLP tasks are supported in Riva:

- text classification
- token classification (named entity recognition)
- joint intent and slots
- question answering (extractive)
- punctuation and capitalization
Custom, supported NLP models trained with NVIDIA NeMo can be deployed in Riva using the `riva-build` and `riva-deploy` commands as documented in the Riva Build and Riva Deploy sections. In the simplest case, you can deploy an NLP pipeline as follows:
```shell
riva-build <task_name> \
    <rmir_filename>:<encryption_key> \
    <riva_filename>:<encryption_key>
```
Where:

- `<task_name>` is the type of NLP pipeline to deploy. Supported values are `intent_slot`, `qa`, `token_classification`, `text_classification`, and `punctuation`.
- `<rmir_filename>` is the Riva `rmir` file that is generated.
- `<riva_filename>` is the name of the `riva` file to use as input.
- `<encryption_key>` is the key used to encrypt the files. The encryption key for the pre-trained Riva models uploaded on NGC is `tlt_encode`.
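For example, a question answering pipeline could be built from a pre-trained NGC model as follows; the file paths here are illustrative, and the `tlt_encode` key applies only to the pre-trained models from NGC:

```shell
riva-build qa \
    /servicemaker-dev/qa-bert.rmir:tlt_encode \
    /servicemaker-dev/qa-bert.riva:tlt_encode
```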
The three NLP classification tasks (that is, `token_classification`, `intent_slot`, and `text_classification`) support an optional parameter called `--domain_name` that enables you to name your custom models. This is useful if you plan to deploy multiple models of the same task.
The punctuation and capitalization task (that is, `<task_name>=punctuation`) supports an optional parameter called `--language_code`, which must be set to the BCP-47 (https://www.rfc-editor.org/rfc/bcp/bcp47.txt) language code of the language the target model was trained on. When receiving ASR requests with the `enable_automatic_punctuation` Boolean flag set to `true`, the Riva server will look for a punctuation and capitalization model with the requested language code, and use it to add punctuation and capitalization to the ASR transcript.
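As an illustration, a US English punctuation model might be built with its BCP-47 language code set as follows; the file paths are hypothetical:

```shell
riva-build punctuation --language_code=en-US \
    /servicemaker-dev/punctuation-en-us.rmir:tlt_encode \
    /servicemaker-dev/punctuation-en-us.riva:tlt_encode
```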
Each of the tasks supports a set of arguments that enables you to configure your settings using the CLI. Use the format `riva-build <task_name> -h` to view a list of available CLI inputs for each task.
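For example, to list the arguments accepted by the intent and slot pipeline:

```shell
riva-build intent_slot -h
```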
Pretrained Models#
| Task | Architecture | Language | Dataset | Domain | Accuracy | Link |
|---|---|---|---|---|---|---|
| QA | BERT | English | SQuAD 2.0 | | EM: 71.24, F1: 74.32 | |
| QA | Megatron | English | SQuAD 2.0 | | TBM | |
| Entity Recognition | BERT | English | GMB (Groningen Meaning Bank) | LOC, ORG, PER, GPE, TIME, MISC, O | | |
| Punctuation/Capitalization | BERT | English | Tatoeba sentences, Books from the Project Gutenberg, Transcripts from Fisher English Training Speech | | | |
| Intent Detection & Slot Tagging | BERT | English | Proprietary | Weather | | |
| Intent Detection & Slot Tagging | DistilBERT | English | Proprietary | Misty (weather, smalltalk, places of interest) | | |
| Text Classification | BERT | English | Proprietary | | | |