Question Answering
======================

Introduction
------------

In the Question Answering (or Reading Comprehension) task, the model is given a question and a
passage of content (the context) that may contain the answer. The model predicts the span of
text within the context, marked by a start and an end position, that answers the question.
For datasets like SQuAD 2.0, the model also supports cases where the answer is not contained
in the content.

For every token in the context of a given question, the model is trained to predict:

- The likelihood that this token is the start of the span
- The likelihood that this token is the end of the span

The model chooses the start and end tokens with the maximal probabilities. When the content does
not contain the answer, the start and end of the span should both point to the first token.

A pretrained BERT encoder with two span prediction heads is used to predict the start and
end positions of the answer. The span prediction heads are token classifiers consisting of a
single linear layer.
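
The selection rule described above can be sketched in a few lines of plain Python. This is a
minimal illustration, not TLT's implementation: the function name :code:`best_span`, the
:code:`max_answer_len` limit, and the requirement that the end position must not precede the
start position are assumptions added for the example. Position 0 stands in for the first token,
which marks "no answer".

.. code:: python

    def best_span(start_scores, end_scores, max_answer_len=30):
        """Return (start, end) token indices of the highest-scoring span.

        Position 0 is treated as the "no answer" position: (0, 0) is returned
        unless some real span scores higher than the null span.
        """
        best = (0, 0)
        best_score = start_scores[0] + end_scores[0]
        for start in range(1, len(start_scores)):
            # Assumption: the end may not precede the start, and spans are capped in length.
            last = min(start + max_answer_len, len(end_scores))
            for end in range(start, last):
                score = start_scores[start] + end_scores[end]
                if score > best_score:
                    best, best_score = (start, end), score
        return best


    # Token 2 is the most likely start and token 3 the most likely end.
    print(best_span([0.1, 0.2, 2.5, 0.3, 0.1], [0.1, 0.1, 0.4, 3.0, 0.2]))

When the scores at position 0 dominate, the function returns :code:`(0, 0)`, which corresponds
to the unanswerable case.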

TLT provides a sample notebook on `NGC resources`_ that outlines the end-to-end workflow for
training a Question Answering model using TLT and deploying it in Jarvis format.

.. _NGC resources: https://ngc.nvidia.com/catalog/resources/nvidia:tlt-jarvis:questionanswering_notebook

Downloading Sample Spec files
-----------------------------

Before proceeding, download the sample spec files that are needed for the rest of the subtasks.

.. code::

    tlt question_answering download_specs -r /results/question_answering/default_specs/ \
                                          -o /specs/nlp/question_answering

Data Format
-----------

This model expects the dataset in `SQuAD format`_ (i.e., a JSON file for each dataset split).
The code snippet below shows an example of the training file.
Each title has one or multiple paragraph entries, each consisting of the "context" and
question-answer entries. Each question-answer entry has:

- A question
- A globally unique id
- The Boolean flag "is_impossible", which shows whether a question is answerable or not
- (if the question is answerable) One answer entry containing the text span and its starting
  character index in the context.
- (if the question is not answerable) An empty "answers" list

.. _SQuAD format: https://rajpurkar.github.io/SQuAD-explorer/

The evaluation files (for validation and testing) follow the above format, except that they can
provide more than one answer to the same question. The inference file also follows the above
format, except that it does not require the "answers" and "is_impossible" keywords.

The following is an example of the data format (JSON file):

.. code::

    {
        "data": [
            {
                "title": "Super_Bowl_50",
                "paragraphs": [
                    {
                        "context": "Super Bowl 50 was an American football game to determine the champion of the National Football League (NFL) for the 2015 season. The American Football Conference (AFC) champion Denver Broncos defeated the National Football Conference (NFC) champion Carolina Panthers 24\u201310 to earn their third Super Bowl title. The game was played on February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California. As this was the 50th Super Bowl, the league emphasized the \"golden anniversary\" with various gold-themed initiatives, as well as temporarily suspending the tradition of naming each Super Bowl game with Roman numerals (under which the game would have been known as \"Super Bowl L\"), so that the logo could prominently feature the Arabic numerals 50.",
                        "qas": [
                            {
                                "question": "Where did Super Bowl 50 take place?",
                                "is_impossible": false,
                                "id": "56be4db0acb8001400a502ee",
                                "answers": [
                                    {
                                        "answer_start": 403,
                                        "text": "Santa Clara, California"
                                    }
                                ]
                            },
                            {
                                "question": "What was the winning score of the Super Bowl 50?",
                                "is_impossible": true,
                                "id": "56be4db0acb8001400a502ez",
                                "answers": [
                                ]
                            }
                        ]
                    }
                ]
            }
        ]
    }
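
Before training, it can be useful to check that a file follows the rules listed above. The sketch
below is a hypothetical helper (:code:`check_squad_answers` is not part of TLT) that walks a
loaded SQuAD-style dictionary and reports entries whose answer text does not match the context at
:code:`answer_start`, or whose "answers" list disagrees with "is_impossible". It accepts the flag
and the index either as native JSON values or as strings, which is an assumption made for
robustness.

.. code:: python

    import json


    def check_squad_answers(dataset):
        """Yield (question_id, problem) for every inconsistent question-answer entry."""
        for article in dataset["data"]:
            for paragraph in article["paragraphs"]:
                context = paragraph["context"]
                for qa in paragraph["qas"]:
                    answers = qa.get("answers", [])
                    # Accept both the JSON boolean and the strings "true"/"false".
                    impossible = str(qa.get("is_impossible", False)).lower() == "true"
                    if impossible:
                        if answers:
                            yield qa["id"], "unanswerable question has answers"
                        continue
                    if not answers:
                        yield qa["id"], "answerable question has no answers"
                    for answer in answers:
                        start = int(answer["answer_start"])
                        text = answer["text"]
                        if context[start:start + len(text)] != text:
                            yield qa["id"], "answer text does not match context at answer_start"


    def check_squad_file(path):
        """Load a SQuAD-format JSON file and return the list of problems found."""
        with open(path, encoding="utf-8") as handle:
            return list(check_squad_answers(json.load(handle)))

An empty result from :code:`check_squad_file` means that no inconsistencies of these kinds were
found. It does not guarantee that the file is otherwise valid.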


Dataset Conversion
------------------

To train the QA model on the SQuAD dataset, you must first download it from `here
<https://rajpurkar.github.io/SQuAD-explorer/>`_. You can choose either SQuAD version 1.1, which
contains no unanswerable questions and has 100,000+ question-answer pairs on 500+
articles, or the newer SQuAD version 2.0, which combines the 100,000 questions from SQuAD 1.1 with
over 50,000 unanswerable questions. To do well on SQuAD 2.0, a system must not only answer
questions when possible, but also determine when no answer is supported by the paragraph and
abstain from answering.

After downloading the files, you should have a :code:`squad` data folder that contains the
following four files for training and evaluation:

.. code::

    |--squad
         |-- v1.1/train-v1.1.json
         |-- v1.1/dev-v1.1.json
         |-- v2.0/train-v2.0.json
         |-- v2.0/dev-v2.0.json
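
A quick way to confirm this layout before launching a job is sketched below. The helper name
:code:`missing_squad_files` is hypothetical, and the relative paths are taken directly from the
listing above.

.. code:: python

    from pathlib import Path

    # Relative paths expected inside the squad data folder.
    EXPECTED_FILES = [
        "v1.1/train-v1.1.json",
        "v1.1/dev-v1.1.json",
        "v2.0/train-v2.0.json",
        "v2.0/dev-v2.0.json",
    ]


    def missing_squad_files(squad_dir):
        """Return the expected files that are not present under squad_dir."""
        root = Path(squad_dir)
        return [name for name in EXPECTED_FILES if not (root / name).is_file()]


    if __name__ == "__main__":
        missing = missing_squad_files("squad")
        print("All files present" if not missing else f"Missing: {missing}")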


Model Training
--------------

.. _model_training_question_answering:

The following is an example of the training spec file (:code:`train.yaml`). You can
change any of these parameters and pass them to the training command.

.. code::

  trainer:
    max_epochs: 2

    # Name of the .tlt file where trained model will be saved.
    save_to: trained-model.tlt

  model:

    dataset:
        do_lower_case: true
        version_2_with_negative: true

    tokenizer:
        tokenizer_name: ${model.language_model.pretrained_model_name} # or sentencepiece
        vocab_file: null # path to vocab file
        tokenizer_model: null # only used if tokenizer is sentencepiece
        special_tokens: null

    language_model:
      pretrained_model_name: bert-base-uncased
      lm_checkpoint: null
      config_file: null # json file, precedence over config
      config: null

    token_classifier:
      num_layers: 1
      dropout: 0.0
      num_classes: 2
      activation: relu
      log_softmax: false
      use_transformer_init: true


  training_ds:
    file: ??? # e.g. squad/v2.0/train-v2.0.json
    batch_size: 12 # per GPU
    shuffle: true
    num_samples: -1

  validation_ds:
    file: ??? # e.g. squad/v2.0/dev-v2.0.json
    batch_size: 12 # per GPU
    shuffle: false
    num_samples: -1

  optim:
    # optimizer arguments
    name: adamw
    lr: 3e-5
    betas: [0.9, 0.999]
    weight_decay: 0.0
    # scheduler config override
    sched:
      name: SquareRootAnnealing
      warmup_steps: null
      warmup_ratio: 0.0
      last_epoch: -1

      # pytorch lightning args
      monitor: val_loss
      reduce_on_plateau: false


+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| **Parameter**                             | **Data Type**   |   **Default**                                                                    | **Description**                                                                                              |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| trainer.max_epochs                        | integer         | 2                                                                                | The number of epochs to train                                                                                |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| save_to                                   | string          | trained-model.tlt                                                                | The filename of the trained model                                                                            |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| data_dir                                  | string          | --                                                                               | The path to the data converted to the specified format                                                       |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| model.tokenizer.tokenizer_name            | string          | Will be filled automatically based on model.language_model.pretrained_model_name | The tokenizer name                                                                                           |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| model.tokenizer.vocab_file                | string          | null                                                                             | The path to tokenizer vocabulary                                                                             |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| model.tokenizer.tokenizer_model           | string          | null                                                                             | The path to tokenizer model (for sentencepiece tokenizer only)                                               |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| model.tokenizer.special_tokens            | string          | null                                                                             | Special tokens for the tokenizer (if they exist)                                                             |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| model.language_model.pretrained_model_name| string          | bert-base-uncased                                                                | The pre-trained language model name (choose from `bert-base-cased`, `bert-base-uncased`,                     |
|                                           |                 |                                                                                  | `megatron-bert-345m-cased` and `megatron-bert-345m-uncased`)                                                 |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| model.language_model.lm_checkpoint        | string          | null                                                                             | The path to the pre-trained language model checkpoint                                                        |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| model.language_model.config_file          | string          | null                                                                             | The path to the pre-trained language model config file                                                       |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| model.language_model.config               | dictionary      | null                                                                             | The config of the pre-trained language model                                                                 |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| model.token_classifier.num_layers         | integer         | 1                                                                                | The number of fully connected layers of the Classifier on top of the Bert model                              |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| model.token_classifier.dropout            | float           | 0.0                                                                              | The dropout ratio of the fully connected layers                                                              |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| model.token_classifier.num_classes        | integer         | 2                                                                                | The number of Classifiers (two for QA)                                                                       |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| model.token_classifier.activation         | string          | relu                                                                             | The activation function to use                                                                               |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| model.token_classifier.log_softmax        | boolean         | false                                                                            | A flag specifying whether to use log soft max                                                                |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| training_ds.file                          | string          | --                                                                               | The training file names                                                                                      |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| training_ds.batch_size                    | integer         | 12                                                                               | The training data batch size                                                                                 |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| training_ds.shuffle                       | bool            | true                                                                             | A flag specifying whether to shuffle the training data                                                       |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| training_ds.num_samples                   | integer         | -1                                                                               | The number of samples to use from the training dataset (use -1 to specify all samples)                       |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| validation_ds.file                        | string          | --                                                                               | The validation file names                                                                                    |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| validation_ds.batch_size                  | integer         | 12                                                                               | The validation data batch size                                                                               |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| validation_ds.shuffle                     | bool            | false                                                                            | A flag specifying whether to shuffle the validation data                                                     |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| validation_ds.num_samples                 | integer         | -1                                                                               | The number of samples to use from the validation dataset (use -1 to specify all samples)                     |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| optim.name                                | string          | adamw                                                                            | The optimizer to use for training                                                                            |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| optim.lr                                  | float           | 2e-5                                                                             | The learning rate to use for training                                                                        |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| optim.weight_decay                        | float           | 0.0                                                                              | The weight decay to use for training                                                                         |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| optim.sched.name                          | string          | SquareRootAnnealing                                                              | The warmup schedule                                                                                          |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| optim.sched.warmup_ratio                  | float           | 0.0                                                                              | The warmup ratio                                                                                             |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+

The following is an example of the command for training the model:

.. code::

    tlt question_answering train -e /specs/nlp/question_answering/train.yaml \
                            data_dir=PATH_TO_DATA \
                            trainer.max_epochs=2 \
                            trainer.amp_level="O1" \
                            trainer.precision=16 \
                            -g 1

.. Note:: The first time you run training, it will take an extra 5-10 minutes to process
   the dataset. Subsequent training runs use the processed dataset, which is
   automatically cached in files in the same directory as the data.

Required Arguments for Training
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* :code:`-e`: The experiment specification file to set up training.
* :code:`data_dir`: The dataset directory.


Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`trainer.max_epochs`: The number of training epochs
* :code:`-g`: The number of GPUs to use for training
* :code:`trainer.amp_level` and :code:`trainer.precision`: These fields allow you to use 16-bit
  mixed precision to accelerate training.

.. Note:: You can use other arguments to override fields in the specification file.
   To do so, use the name of the config parameter with a desired value and pass it as a parameter in
   the script call (e.g., :code:`trainer.val_check_interval=0.25`).
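
To illustrate how a dotted override such as :code:`trainer.val_check_interval=0.25` addresses a
nested field of the spec file, here is a small plain-Python sketch. It is not how TLT parses its
command line; :code:`apply_overrides` and :code:`parse_value` are hypothetical names, and the
value parsing is deliberately simplistic.

.. code:: python

    import copy


    def parse_value(raw):
        """Interpret a raw override value as int or float when possible, else keep the string."""
        for cast in (int, float):
            try:
                return cast(raw)
            except ValueError:
                pass
        return raw


    def apply_overrides(config, overrides):
        """Return a copy of config with "a.b.c=value" style overrides applied."""
        result = copy.deepcopy(config)
        for item in overrides:
            dotted_key, _, raw_value = item.partition("=")
            *parents, leaf = dotted_key.split(".")
            node = result
            for name in parents:
                # Create intermediate sections that the spec does not define yet.
                node = node.setdefault(name, {})
            node[leaf] = parse_value(raw_value)
        return result


    spec = {"trainer": {"max_epochs": 2}}
    print(apply_overrides(spec, ["trainer.max_epochs=1", "trainer.val_check_interval=0.25"]))

The original :code:`spec` dictionary is left unchanged, which mirrors the idea that a
command-line override affects a single run rather than the spec file on disk.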

Training Procedure
^^^^^^^^^^^^^^^^^^

At the start of training, TLT will print out a log of the experiment specification, then load
and preprocess the training data. For the SQuAD dataset, it can initially take several minutes to
tokenize the content. To make subsequent runs faster, the preprocessed dataset is cached in
files in the same directory as the original dataset. TLT will then display the detailed model
architecture.

As the model starts training, you should see a progress bar per epoch.
Since QA datasets like SQuAD are big, it is usually enough to train for two epochs.
To follow the training progress more closely, you can add the :code:`trainer.val_check_interval`
parameter to the script with a value less than one (e.g., :code:`trainer.val_check_interval=0.25`,
which specifies four evaluations on the validation dataset per epoch of training).
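
The arithmetic behind a fractional :code:`val_check_interval` can be sketched as follows. The
sketch assumes that the fraction is applied to the number of training batches in one epoch and
rounded down, which is how PyTorch Lightning documents this option. The function name
:code:`validation_points` is hypothetical.

.. code:: python

    def validation_points(num_train_batches, val_check_interval):
        """Return the training-batch counts within one epoch after which validation runs."""
        every = int(num_train_batches * val_check_interval)
        if every < 1:
            raise ValueError("val_check_interval is too small for this number of batches")
        return list(range(every, num_train_batches + 1, every))


    # With 1000 training batches per epoch and an interval of 0.25,
    # validation runs after batches 250, 500, 750, and 1000.
    print(validation_points(1000, 0.25))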

At the end of training, TLT saves the checkpoint that performed best on the validation dataset
to the path specified in the experiment spec file.

.. code::

  TPU available: None, using: 0 TPU cores
  LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2]
  [NeMo W 2021-01-28 14:52:19 exp_manager:299] There was no checkpoint folder at checkpoint_dir :results/checkpoints. Training from scratch.
  [NeMo I 2021-01-28 14:52:19 exp_manager:186] Experiments will be logged at results
  ...
  Validating: 100%|███████████████████████████| 1020/1020 [01:00<00:00, 21.60it/s][NeMo I 2021-01-29 10:17:18 qa_model:175] val exact match 50.10528088941295
  [NeMo I 2021-01-29 10:17:18 qa_model:176] val f1 50.10528088941295
  Epoch 0:  25%|██▎      | 3770/15076 [09:18<27:54,  6.75it/s, loss=1.34, lr=3e-5]
        Epoch 0, global step 2748: val_loss reached 1.19158 (best 1.19158), saving model to...
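
If you capture this output in a file, the validation metrics can be pulled out with a short
script. The sketch below is based only on the two :code:`val exact match` and :code:`val f1`
lines in the sample log above; the exact log format may differ between versions, and
:code:`parse_val_metrics` is a hypothetical helper name.

.. code:: python

    import re

    # Matches "val exact match <number>" and "val f1 <number>" as they appear in the sample log.
    METRIC_PATTERN = re.compile(r"val (exact match|f1) ([0-9]+(?:\.[0-9]+)?)")


    def parse_val_metrics(log_text):
        """Return (metric_name, value) pairs in the order they appear in the log text."""
        return [(name, float(value)) for name, value in METRIC_PATTERN.findall(log_text)]


    sample = (
        "[NeMo I 2021-01-29 10:17:18 qa_model:175] val exact match 50.10528088941295\n"
        "[NeMo I 2021-01-29 10:17:18 qa_model:176] val f1 50.10528088941295\n"
    )
    for name, value in parse_val_metrics(sample):
        print(f"{name}: {value:.2f}")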


Model Fine-tuning
-----------------

The following is an example spec for fine-tuning of the model:

.. code::

    trainer:
      max_epochs: 1

    # Name of the .tlt file where finetuned model will be saved.
    save_to: finetuned-model.tlt

    # Fine-tuning settings: training dataset.
    finetuning_ds:
      file: ??? # e.g. squad/v1.1/train-v1.1.json
      num_samples: 500 # DEMO purposes # -1 # number of samples to be considered, -1 means all the dataset

    # Fine-tuning settings: validation dataset.
    validation_ds:
      file: ??? # e.g. squad/v1.1/dev-v1.1.json
      num_samples: 500 # DEMO purposes # -1 # number of samples to be considered, -1 means all the dataset

    # Fine-tuning settings: different optimizer.
    optim:
      name: adamw
      lr: 5e-6


+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| **Parameter**                             | **Data Type**   |   **Default**                                                                    | **Description**                                                                                              |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| trainer.max_epochs                        | integer         | 2                                                                                | The number of epochs to train                                                                                |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| restore_from                              | string          | trained-model.tlt                                                                | The path to the pre-trained model                                                                            |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| save_to                                   | string          | finetuned-model.tlt                                                              | The path to save trained model to                                                                            |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| finetuning_ds.file                        | string          | --                                                                               | The data file for fine tuning                                                                                |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| finetuning_ds.num_samples                 | integer         | 500                                                                              | The number of samples to use from the fine-tuning dataset (use -1 to specify all samples)                    |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| validation_ds.file                        | string          | --                                                                               | The validation data file                                                                                     |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| validation_ds.num_samples                 | integer         | 500                                                                              | The number of samples to use from the validation dataset (use -1 to specify all samples)                     |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| optim.name                                | string          | adam                                                                             | The optimizer to use for training                                                                            |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| optim.lr                                  | float           | 1e-5                                                                             | The learning rate to use for training                                                                        |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+

Use the following command to fine-tune the model:

.. code::

    !tlt question_answering finetune \
                            -e /specs/nlp/question_answering/finetune.yaml \
                            -g 1 \
                            data_dir=PATH_TO_DATA


Required Arguments for Fine-tuning
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* :code:`-e`: The experiment specification file to set up fine-tuning
* :code:`data_dir`: The path to the data

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-g`: The number of GPUs to be used for fine-tuning in a multi-GPU scenario (default: 1)

.. Note:: You can use other arguments to override fields in the specification file.
   To do so, use the name of the config parameter with a desired value and pass it as a parameter in
   the script call (e.g. :code:`trainer.val_check_interval=0.25`).


Fine-tuning Procedure
^^^^^^^^^^^^^^^^^^^^^

The fine-tuning procedure and logs look similar to those described in the Model Training section,
except that the model is initially loaded from a previously trained checkpoint.


Model Evaluation
----------------

The following is an example spec to evaluate the pre-trained model:

.. code::

    # Test settings: dataset.
    test_ds:
      file: ??? # e.g. squad/v1.1/dev-v1.1.json
      batch_size: 32
      shuffle: false
      num_samples: 500 # 500 for demo purposes; set to -1 to use the whole dataset


+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| **Parameter**                             | **Data Type**   |   **Default**                                                                    | **Description**                                                                                              |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| test_ds.file                              | string          | --                                                                               | The evaluation data file                                                                                     |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| test_ds.batch_size                        | integer         | 32                                                                               | The evaluation data batch size                                                                               |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| test_ds.shuffle                           | bool            | false                                                                            | A flag specifying whether to shuffle the evaluation data                                                     |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| test_ds.num_samples                       | integer         | 500                                                                              | The number of samples to use from the evaluation dataset (use -1 to specify all samples)                     |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+


Use the following command to evaluate the model:

.. code::

    !tlt question_answering evaluate \
                            -e /specs/nlp/question_answering/evaluate.yaml \
                            data_dir=PATH_TO_DATA

Required Arguments for Evaluation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* :code:`-e`: The experiment specification file to set up evaluation
* :code:`data_dir`: The path to the pre-processed data to run evaluation on


Evaluation Procedure
^^^^^^^^^^^^^^^^^^^^

After the previously trained model is initialized, it runs evaluation against the provided test set.
For extractive QA models, which return an answer span, accuracy is measured with two metrics:
the exact match (EM) and the F1 score of the returned answer spans compared to the ground-truth
answers. The overall EM and F1 scores for a model are computed by averaging
the individual example scores.

* :code:`Exact match`: If the answer span is exactly equal to the correct one, it returns 1;
  otherwise, it returns 0. When assessing against a negative example (SQuAD 2.0), if the model
  predicts any text at all, it automatically receives a 0 for that example.
* :code:`F1`: The F1 score is a common metric for classification problems and is widely used in QA.
  It is appropriate when precision and recall matter equally. In this case, it is computed
  over the individual words in the prediction against those in the ground-truth answer. The number of
  shared words between the prediction and the ground truth is the basis of the F1 score: precision is the
  ratio of the number of shared words to the total number of words in the prediction, and recall is
  the ratio of the number of shared words to the total number of words in the ground truth.
  :code:`F1 = 2 * (precision * recall) / (precision + recall)`
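
The two metrics above can be sketched in Python as follows. This is a minimal illustration and not
the TLT implementation: the function names are hypothetical, the tokenization (lowercasing, dropping
ASCII punctuation, splitting on whitespace) is a simplifying assumption, and scaling the averages
to percentages is assumed from the magnitude of the values in the evaluation log.

.. code:: python

    import string
    from collections import Counter


    def tokenize(text):
        """Lowercase, drop ASCII punctuation, and split on whitespace (assumed normalization)."""
        cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
        return cleaned.split()


    def exact_match(prediction, truth):
        """Return 1 if the normalized token sequences are identical, otherwise 0."""
        return int(tokenize(prediction) == tokenize(truth))


    def f1_score(prediction, truth):
        """Word-overlap F1 between a predicted span and the ground-truth span."""
        pred_tokens = tokenize(prediction)
        truth_tokens = tokenize(truth)
        # Negative example (empty ground truth): any predicted text scores 0.
        if not pred_tokens or not truth_tokens:
            return float(pred_tokens == truth_tokens)
        # Multiset intersection counts each shared word at most as often as it appears in both.
        shared = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
        if shared == 0:
            return 0.0
        precision = shared / len(pred_tokens)
        recall = shared / len(truth_tokens)
        return 2 * precision * recall / (precision + recall)


    def average_scores(pairs):
        """Average EM and F1 over (prediction, truth) pairs, scaled to percentages."""
        em = sum(exact_match(p, t) for p, t in pairs) / len(pairs)
        f1 = sum(f1_score(p, t) for p, t in pairs) / len(pairs)
        return 100.0 * em, 100.0 * f1


    pairs = [("Paris", "Paris"), ("the Eiffel Tower", "Eiffel Tower in Paris")]
    em, f1 = average_scores(pairs)
    print(f"exact match: {em:.2f}, F1: {f1:.2f}")

In this sketch, the second pair shares two words, so precision is 2/3, recall is 2/4, and the
per-example F1 is 4/7, while its exact match score is 0.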


The evaluation output will look similar to the following:

.. code::

    Testing: 100%|████████████████████████████████| 383/383 [01:36<00:00,  3.77it/s][NeMo I 2021-01-29 10:26:40 qa_model:175] test exact match 50.11370336056599
    [NeMo I 2021-01-29 10:26:40 qa_model:176] test f1 50.11370336056599
    Testing: 100%|████████████████████████████████| 383/383 [02:07<00:00,  3.01it/s]
    --------------------------------------------------------------------------------
    DATALOADER:0 TEST RESULTS
    {'test_exact_match': 50.11370336056599,
     'test_f1': 50.11370336056599,
     'test_loss': tensor(1.1229, device='cuda:0')}


Model Inference
----------------

The following is an example of the spec file for model inference:

.. code::

    # Name of the file containing data used as inputs during inference.
    input_file: ??? # e.g. squad/v1.1/dev-v1.1.json

    # Name of output nbest list file to store predictions to
    output_nbest_file: nbest.txt

    # Name of output file to store predictions to
    output_prediction_file: prediction.txt


+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| **Parameter**                             | **Data Type**   |   **Default**                                                                    | **Description**                                                                                              |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| input_file                                | string          | --                                                                               | The file containing the data used as inputs during the inference                                             |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| output_nbest_file                         | string          | nbest.txt                                                                        | The name of the output nbest list file to store predictions in                                               |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| output_prediction_file                    | string          | prediction.txt                                                                   | The name of the output file to store predictions in                                                          |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+


The following example shows how to run inference:

.. code::

    !tlt question_answering infer \
                            -e /specs/nlp/question_answering/infer.yaml \
                            -m trained-model.tlt

Required Arguments for Inference
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* :code:`-e`: The experiment specification file to set up inference.
  This requires the :code:`input_file` containing the examples to run inference on.
* :code:`-m`: The path to the pre-trained model checkpoint from which to infer. The value should
  be a :code:`.tlt` file.


Inference Procedure
^^^^^^^^^^^^^^^^^^^

After the trained model is loaded, it runs on the input file, which is in the same format as
the file used for training and evaluation. It writes the predicted answer span for each question
in the input file to the :code:`prediction.txt` output file.
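
Because the input file uses the same SQuAD-style layout as the training data, a small input file
can be assembled programmatically. The following is a minimal sketch under stated assumptions:
the helper names, the sample passage, and the :code:`my_questions.json` file name are hypothetical,
and the answer offset is derived with a simple substring search.

.. code:: python

    import json


    def build_squad_dict(title, context, qa_pairs, version="1.1"):
        """Wrap one passage and its (question, answer) pairs in a SQuAD-style dictionary."""
        qas = []
        for index, (question, answer) in enumerate(qa_pairs):
            start = context.find(answer)  # character offset of the answer in the context
            if start < 0:
                raise ValueError(f"answer not found in context: {answer!r}")
            qas.append({
                "id": f"{title}-{index}",  # ids must be unique across the whole file
                "question": question,
                "answers": [{"text": answer, "answer_start": start}],
            })
        paragraph = {"context": context, "qas": qas}
        return {"version": version, "data": [{"title": title, "paragraphs": [paragraph]}]}


    def write_squad_file(path, squad):
        """Serialize the dictionary to a UTF-8 JSON file."""
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(squad, handle, indent=2)


    squad = build_squad_dict(
        title="sample",
        context="The Eiffel Tower is located in Paris.",
        qa_pairs=[("Where is the Eiffel Tower located?", "Paris")],
    )
    # Hypothetical file name; set input_file in the inference spec to this path.
    write_squad_file("my_questions.json", squad)

The resulting file could then be referenced through :code:`input_file` in the inference spec.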


Model Export
------------

The following is an example of the spec file for model export:

.. code::

    # Name of the .tlt EFF archive to be loaded/model to be exported.
    restore_from: trained-model.tlt

    # Set export format: ONNX | JARVIS
    export_format: ONNX

    # Output EFF archive containing ONNX.
    export_to: exported-model.eonnx

+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| **Parameter**                             | **Data Type**   |   **Default**                                                                    | **Description**                                                                                              |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| restore_from                              | string          | trained-model.tlt                                                                | The path to the pre-trained model                                                                            |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| export_format                             | string          | ONNX                                                                             | The export format (either "ONNX" or "JARVIS")                                                                |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+
| export_to                                 | string          | exported-model.eonnx                                                             | The path to the exported model                                                                               |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+

To export a pre-trained model, run the following:

.. code::

    # For export to ONNX
    !tlt question_answering export \
                            -e /specs/nlp/question_answering/export.yaml \
                            -m finetuned-model.tlt \
                            -k $KEY


Required Arguments for Export
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* :code:`-e`: The experiment specification file to set up export
* :code:`-m`: The path to the pre-trained model checkpoint to export. The file should
  have a :code:`.tlt` extension.
* :code:`-k`: The encryption key


Model Deployment
----------------

You can use the Jarvis framework to deploy the trained model at runtime.
For more details, refer to the `Jarvis documentation`_.

.. _Jarvis documentation: https://docs.nvidia.com/deeplearning/jarvis/user-guide/docs/model-servicemaker.html