Custom Models
The following NLP tasks are supported in Riva:
- text classification 
- token classification (named entity recognition) 
- joint intent and slots 
- question answering (extractive) 
- punctuation and capitalization 
Custom NLP models trained with TAO Toolkit can be deployed in Riva using the riva-build and
riva-deploy commands as documented in the Riva Build and Riva Deploy sections. In the simplest case, you can deploy an NLP pipeline as follows:
riva-build <task_name> \
    <rmir_filename>:<encryption_key> \
    <riva_filename>:<encryption_key>
Where:
- <task_name> is the type of NLP pipeline to deploy. Supported values are intent_slot, qa, token_classification, text_classification, and punctuation.
- <rmir_filename> is the name of the Riva rmir file that is generated.
- <riva_filename> is the name of the riva file to use as input.
- <encryption_key> is the key used to encrypt the files. The encryption key for the pre-trained Riva models uploaded on NGC is tlt_encode.
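As a minimal sketch of the template above with the placeholders filled in, the following script assembles a riva-build command for a text classification model. The file names and the RUN switch are hypothetical and exist only for illustration. The key shown is the one the documentation gives for NGC pre-trained models, so replace it with the key your own file was encrypted with.

```shell
# Hypothetical values: substitute your own task, file names, and key.
TASK_NAME="text_classification"
RIVA_FILE="my_text_classifier.riva"   # input file (hypothetical name)
RMIR_FILE="my_text_classifier.rmir"   # output file to generate (hypothetical name)
KEY="tlt_encode"                      # key for NGC pre-trained models; use the key your file was encrypted with

# Assemble the command as positional parameters so each argument stays intact.
set -- riva-build "$TASK_NAME" "${RMIR_FILE}:${KEY}" "${RIVA_FILE}:${KEY}"

# Dry run by default: print the command. Set RUN=1 to execute it
# in an environment where riva-build is installed.
BUILD_CMD="$*"
echo "$BUILD_CMD"
if [ "${RUN:-0}" = "1" ]; then
    "$@"
fi
```

Printing the command first makes it easy to check the argument order (rmir output before riva input) before anything is built.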
The three NLP classification tasks (that is, token_classification, intent_slot, and text_classification) support an optional parameter
called --domain_name that enables you to name your custom models. This is useful if you plan to deploy multiple models of the same task.
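To illustrate deploying multiple models of the same task, the sketch below prepares two intent_slot builds that differ only in their domain name. The domain names, file names, and the RUN switch are hypothetical. Passing the option after the positional arguments in the --domain_name=<value> form is an assumption, so confirm the exact syntax with riva-build intent_slot -h.

```shell
KEY="tlt_encode"   # placeholder; use the key your files were encrypted with

# Hypothetical domains, each with its own riva input and rmir output.
for domain in weather banking; do
    set -- riva-build intent_slot \
        "intent_slot_${domain}.rmir:${KEY}" \
        "intent_slot_${domain}.riva:${KEY}" \
        "--domain_name=${domain}"   # assumed flag form; check with -h

    # Dry run by default: print the command. Set RUN=1 to execute it.
    LAST_CMD="$*"
    echo "$LAST_CMD"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
done
```

Giving each model a distinct domain name is what lets several models of the same task be told apart once deployed.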
The punctuation and capitalization task (that is, <task_name>=punctuation) supports an optional parameter called --language_code which must be set to the BCP-47 (https://www.rfc-editor.org/rfc/bcp/bcp47.txt) language code of the language the target model was trained on. When receiving ASR requests with the enable_automatic_punctuation Boolean flag set to true, the Riva server will look for a punctuation and capitalization model with the requested language code, and use it to add punctuation and capitalization to the ASR transcript.
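A punctuation and capitalization build with an explicit language code might look like the sketch below. The file names and the RUN switch are hypothetical, en-US is used only as an example BCP-47 tag, and the --language_code=<value> form and its position after the positional arguments are assumptions to verify with riva-build punctuation -h.

```shell
KEY="tlt_encode"        # placeholder; use the key your files were encrypted with
LANGUAGE_CODE="en-US"   # BCP-47 tag of the language the model was trained on

# Hypothetical file names that embed the language code for clarity.
set -- riva-build punctuation \
    "punct_${LANGUAGE_CODE}.rmir:${KEY}" \
    "punct_${LANGUAGE_CODE}.riva:${KEY}" \
    "--language_code=${LANGUAGE_CODE}"   # assumed flag form; check with -h

# Dry run by default: print the command. Set RUN=1 to execute it.
PUNCT_CMD="$*"
echo "$PUNCT_CMD"
if [ "${RUN:-0}" = "1" ]; then
    "$@"
fi
```

The language code set here is the value the server matches against the language code of an ASR request that enables automatic punctuation.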
Each task supports a set of arguments that let you configure it from the CLI. Run
riva-build <task_name> -h to view the list of available CLI options for that task.
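As a small convenience sketch, the loop below walks through the five supported task names and shows the help command for each. The RUN switch is hypothetical. By default the script only prints the commands, so it can be read without riva-build installed.

```shell
# Task names as listed for <task_name> in this section.
TASKS="intent_slot qa token_classification text_classification punctuation"

for task in $TASKS; do
    echo "riva-build $task -h"
    # Set RUN=1 to actually print each task's help text.
    if [ "${RUN:-0}" = "1" ]; then
        riva-build "$task" -h
    fi
done
```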
Pretrained Models
| Task | Architecture | Language | Dataset | Domain | Accuracy | Link |
|---|---|---|---|---|---|---|
| QA | BERT | English | SQuAD 2.0 | | EM: 71.24 F1: 74.32 | |
| QA | Megatron | English | SQuAD 2.0 | | TBM | |
| Entity Recognition | BERT | English | GMB (Groningen Meaning Bank) | LOC, ORG, PER, GPE, TIME, MISC, O | | |
| Punctuation/Capitalization | BERT | English | Tatoeba sentences, Books from the Project Gutenberg, Transcripts from Fisher English Training Speech | | | |
| Intent Detection & Slot Tagging | BERT | English | Proprietary | Weather | | |
| Intent Detection & Slot Tagging | DistilBERT | English | Proprietary | Misty (weather, smalltalk, places of interest) | | |
| Text Classification | BERT | English | Proprietary | | | |
