## Introduction

In the Question Answering (or Reading Comprehension) task, the model is given a question and a passage of content (the context) that may contain the answer. It predicts the span of text that answers the question, expressed as a start and an end position within the context. For datasets such as SQuAD 2.0, the model also supports cases where the answer is not contained in the content.

For every word in the context of a given question, the model will be trained to predict:

• The likelihood this word is the start of the span

• The likelihood this word is the end of the span

The model chooses the start and end words with the maximal probabilities. When the content does not contain the answer, the start and end of the span should both be set to the first token.
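
The selection rule above can be sketched in plain Python. This is a minimal illustration, not the toolkit's implementation: the function name `pick_span` is hypothetical, and treating an end position that precedes the start as "no answer" is an assumption of this sketch.

```python
def pick_span(start_scores, end_scores):
    """Return (start, end) token indices, or None when no answer is predicted."""
    if not start_scores or len(start_scores) != len(end_scores):
        raise ValueError("score lists must be non-empty and of equal length")

    # Independently pick the most likely start token and end token.
    start = max(range(len(start_scores)), key=start_scores.__getitem__)
    end = max(range(len(end_scores)), key=end_scores.__getitem__)

    # Both positions on the first token encode "no answer in the context".
    if start == 0 and end == 0:
        return None
    # Assumption of this sketch: an end before the start is not a valid span.
    if end < start:
        return None
    return start, end


print(pick_span([0.1, 0.2, 0.9, 0.3], [0.1, 0.1, 0.2, 0.8]))  # (2, 3)
print(pick_span([0.9, 0.1], [0.8, 0.2]))                      # None
```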

A pretrained BERT encoder with two span prediction heads is used to predict the start and end positions of the answer. The span prediction heads are token classifiers, each consisting of a single linear layer.
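
To make the role of the two heads concrete, the sketch below applies a single linear layer to each token vector and produces one start score and one end score per token. It is an illustration under stated assumptions: the encoder output is stubbed with plain lists, and `span_logits`, `weight`, and `bias` are hypothetical names; a real model would use a deep learning framework's linear layer on top of BERT.

```python
def span_logits(hidden_states, weight, bias):
    """Map each token vector to a (start_logit, end_logit) pair.

    hidden_states: list of per-token vectors, each of length H
    weight: two rows of length H (row 0 = start head, row 1 = end head)
    bias: two floats, one per head
    """
    logits = []
    for vec in hidden_states:
        # One dot product plus bias per head: a single linear layer.
        pair = tuple(
            sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weight, bias)
        )
        logits.append(pair)
    return logits


# Two tokens with hidden size 2 (toy values, not real encoder output).
hidden = [[1.0, 2.0], [0.0, 1.0]]
print(span_logits(hidden, [[1.0, 0.0], [0.0, 1.0]], [0.5, -0.5]))
# [(1.5, 1.5), (0.5, 0.5)]
```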

TAO Toolkit provides a sample notebook to outline the end-to-end workflow on how to train a Question Answering model using TAO Toolkit and deploy it in Riva format on NGC resources.

Before proceeding, download the sample spec files that are needed for the rest of the subtasks.

    tao question_answering download_specs -r /results/question_answering/default_specs/ \
                                          -o /specs/nlp/questions_answering


## Data Format

This model expects the dataset in SQuAD format (i.e., a JSON file for each dataset split). The code snippet below shows an example of the training file. Each title has one or multiple paragraph entries, each consisting of the “context” and question-answer entries. Each question-answer entry has:

• A question

• A globally unique id

• The Boolean flag “is_impossible”, which shows whether a question is answerable or not

• (if the question is answerable) One answer entry containing the text span and its starting character index in the context.

The evaluation files (for validation and testing) follow the above format, except that they can provide more than one answer to the same question. The inference file also follows the above format, except that it does not require the “answers” and “is_impossible” keywords.
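
As an illustration of the difference between the evaluation and inference files, the following sketch derives an inference-style structure from a SQuAD-style dictionary by dropping the two optional keywords. The helper name `to_inference_format` is hypothetical and is not part of TAO Toolkit.

```python
import copy


def to_inference_format(squad):
    """Return a copy of a SQuAD-style dict without "answers" and "is_impossible"."""
    stripped = copy.deepcopy(squad)  # leave the caller's data untouched
    for article in stripped["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                qa.pop("answers", None)
                qa.pop("is_impossible", None)
    return stripped
```

Each question entry in the result keeps only its "question" and "id" fields, alongside the unchanged "context".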

The following is an example of the data format (JSON file):

    {
        "data": [
            {
                "title": "Super_Bowl_50",
                "paragraphs": [
                    {
                        "context": "Super Bowl 50 was an American football game to determine the champion of the National Football League (NFL) for the 2015 season. The American Football Conference (AFC) champion Denver Broncos defeated the National Football Conference (NFC) champion Carolina Panthers 24\u201310 to earn their third Super Bowl title. The game was played on February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California. As this was the 50th Super Bowl, the league emphasized the \"golden anniversary\" with various gold-themed initiatives, as well as temporarily suspending the tradition of naming each Super Bowl game with Roman numerals (under which the game would have been known as \"Super Bowl L\"), so that the logo could prominently feature the Arabic numerals 50.",
                        "qas": [
                            {
                                "question": "Where did Super Bowl 50 take place?",
                                "is_impossible": "false",
                                "id": "56be4db0acb8001400a502ee",
                                "answers": [
                                    {
                                        "answer_start": 403,
                                        "text": "Santa Clara, California"
                                    }
                                ]
                            },
                            {
                                "question": "What was the winning score of the Super Bowl 50?",
                                "is_impossible": "true",
                                "id": "56be4db0acb8001400a502ez",
                                "answers": []
                            }
                        ]
                    }
                ]
            }
        ]
    }
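
Because each answer stores both its text and its starting character index, a quick consistency check can catch misaligned offsets before training. The sketch below is a hedged example: `find_misaligned_answers` is a hypothetical helper, and it assumes the standard SQuAD key name `answer_start` for the character index.

```python
def find_misaligned_answers(squad):
    """Return the ids of questions with an answer not found at its answer_start."""
    bad = []
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                # Unanswerable questions may have no "answers" or an empty list.
                for answer in qa.get("answers", []):
                    start = int(answer["answer_start"])
                    text = answer["text"]
                    if context[start:start + len(text)] != text:
                        bad.append(qa["id"])
    return bad
```

An empty result means every answer span matches the context at the stated offset.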


## Dataset Conversion

After downloading the SQuAD dataset files, you should have a squad data folder that contains the following four files for training and evaluation:

    |-- squad
        |-- v1.1/train-v1.1.json
        |-- v1.1/dev-v1.1.json
        |-- v2.0/train-v2.0.json
        |-- v2.0/dev-v2.0.json
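
A small check like the one below can confirm that the folder has the expected layout before training starts. This is only a sketch: the function name `missing_squad_files` is hypothetical, and the location of the squad folder on your machine is an assumption you need to supply.

```python
from pathlib import Path

# Relative paths taken from the folder layout listed above.
EXPECTED_FILES = (
    "v1.1/train-v1.1.json",
    "v1.1/dev-v1.1.json",
    "v2.0/train-v2.0.json",
    "v2.0/dev-v2.0.json",
)


def missing_squad_files(squad_dir):
    """Return the expected relative paths that do not exist under squad_dir."""
    root = Path(squad_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

For example, `missing_squad_files("squad")` returns an empty list when all four files are present under a local `squad` directory.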


## Model Training

The following is an example of the training spec file (train.yaml). You can change any of these parameters and pass them to the training command.

    trainer:
      max_epochs: 2

    # Name of the .tlt file where trained model will be saved.
    save_to: trained-model.tlt

    model:

      dataset:
        do_lower_case: true
        version_2_with_negative: true

      tokenizer:


### Required Arguments for Export

• -e: The experiment specification file to set up inference. This requires the input_batch with a list of examples to run inference on.

• -m: The path to the pre-trained model checkpoint from which to infer. The file should have a .tlt extension.

• -k: The encryption key.

## Model Deployment

You can use the Riva framework to deploy the trained model at runtime. For more details, refer to the Riva documentation.