Virtual Assistant (with Google Dialogflow)

This Virtual Assistant (with Google Dialogflow) sample application demonstrates the integration of Google Dialogflow and Jarvis Speech Services in the form of a weather chatbot web application.

In this sample, we use Jarvis for ASR and TTS and Google Dialogflow for NLP and Dialog Management (DM).

Demo Video

To see how the Jarvis and Google Dialogflow weather chatbot service works, watch the demo video here.

Implementation

At a high level, the integration takes advantage of the native API support of Google Dialogflow and the gRPC support in Jarvis. The Weatherbot Client coordinates the workflow between Jarvis Services and Dialogflow and interacts with the end user through a web UI. The solution has three primary parts: Jarvis AI Services, the Dialogflow Weatherbot, and the Weatherbot Client application.

Here is the implementation at a high level:

  • Jarvis AI Services

    • Exposes Speech Services (ASR/NLP/TTS) over gRPC endpoints.

    • Needs a GPU.

  • Jarvis and Dialogflow Chatbot

    • Dialogflow Weatherbot

      • Exposes API endpoints to communicate with the chatbot.

      • Takes user text as input and returns a response.

      • Responsible for fulfillment, when needed.

      • Runs on GCP.

    • Weatherbot Client application

      • Includes the Jarvis Client Python library.

      • Communicates with Jarvis AI Services and Dialogflow Weatherbot over gRPC and REST API endpoints respectively.

      • Pipelines ASR, NLP, TTS, and dialog manager functionalities.

      • Contains the web UI and the web service.

      • Does not need GPUs.

Architecture

Jarvis and Dialogflow virtual assistant architecture

The diagram above shows the architecture of the Jarvis and Dialogflow Weatherbot. The web UI of the Weatherbot Client application collects audio input from the user through the microphone. The client application sends this audio to Jarvis AI Services for ASR, and Jarvis AI Services returns the transcribed text to the client application. The client application then sends the transcribed text to the Dialogflow Weatherbot (running on GCP), which returns the appropriate response text. The response text is sent to Jarvis AI Services for TTS, and the resulting voice response is returned to the client application, where the web UI plays it on the user’s speakers.
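
The flow described above can be sketched as a single function that chains three steps. This is only an illustrative sketch: the name run_turn and the three callables are hypothetical stand-ins, not names taken from the sample code, and in the real client each step would call Jarvis or Dialogflow rather than a local function.

```python
from typing import Callable


def run_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    get_reply: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run one conversational turn: audio in, ASR, dialog, TTS, audio out.

    All three callables are hypothetical placeholders for the service calls.
    """
    transcript = transcribe(audio_in)   # stands in for the Jarvis ASR call
    reply_text = get_reply(transcript)  # stands in for the Dialogflow Weatherbot call
    return synthesize(reply_text)       # stands in for the Jarvis TTS call
```

With stand-in callables such as lambda a: a.decode() for ASR and lambda t: t.encode() for TTS, the function can be exercised without any running service, which is convenient for checking the client-side wiring on its own.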

Code Structure

This section shows the high-level code structure of the Weatherbot Client application (in Jarvis and Dialogflow Chatbot).

  • asr.py

    • This file contains the functionality to make the gRPC call to Jarvis ASR with an audio snippet, using the Jarvis Python Client libraries, and return the text transcript.

    • ASR is used in streaming mode.

  • dialogflow.py

    • This file contains the functionality to make an API call to Dialogflow with the user input and sender ID, and return the text response from Dialogflow.

  • tts.py and tts_stream.py

    • These files contain the functionality to make the gRPC call to Jarvis TTS with a text snippet, using the Jarvis Python Client libraries, and return the corresponding speech audio.

    • TTS can be used in either batch or streaming mode, depending on whether tts.py or tts_stream.py is used. To switch modes, change the import statements on lines 3 and 4 of the chatbot.py script.

  • chatbot.py

    • This file contains the Chatbot class which is responsible for pipelining all the ASR, TTS and Dialogflow operations.

    • One instance of the Chatbot class is created per conversation.
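
As an illustration of the kind of call dialogflow.py makes, the sketch below builds a Dialogflow ES detect-intent request and sends it with the google-cloud-dialogflow client. It is not the sample's actual code. It assumes version 2.x of the google-cloud-dialogflow package, assumes that the sender ID is used as the Dialogflow session ID, and uses hypothetical helper names (session_path, build_detect_intent_request, detect_intent_text).

```python
def session_path(project_id: str, session_id: str) -> str:
    """Build the Dialogflow ES session resource name."""
    return f"projects/{project_id}/agent/sessions/{session_id}"


def build_detect_intent_request(project_id, session_id, text, language_code="en-US"):
    """Return a detect-intent request as a plain dict (no library needed)."""
    return {
        "session": session_path(project_id, session_id),
        "query_input": {"text": {"text": text, "language_code": language_code}},
    }


def detect_intent_text(project_id, session_id, text, language_code="en-US"):
    """Send the text to Dialogflow and return the agent's text response.

    Requires valid Google credentials and network access.
    """
    # Imported lazily so the helpers above work without the library installed.
    from google.cloud import dialogflow  # assumes google-cloud-dialogflow 2.x

    client = dialogflow.SessionsClient()
    response = client.detect_intent(
        request=build_detect_intent_request(project_id, session_id, text, language_code)
    )
    return response.query_result.fulfillment_text
```

Using a distinct session ID per conversation keeps the Dialogflow context of one user separate from another, which matches the one-Chatbot-instance-per-conversation design described above.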

Running the Demo

  1. Start Jarvis Speech Services per the Jarvis Quick Start Guide.

  2. Run the Jarvis and Dialogflow Virtual Assistant.

    1. Run the Jarvis Sample container.

      1. Create a directory to hold the code for the Jarvis and Dialogflow Virtual Assistant.

        mkdir jarvis-dialogflow-va-temp
        
      2. Pull the Jarvis Sample container.

        docker pull nvcr.io/nvidia/jarvis/jarvis-speech-client:1.3.0-beta-samples
        
      3. Run the Jarvis Sample container.

        docker run -it --rm -p 6006:6006 -v <Path to jarvis-dialogflow-va-temp>:/jarvis-dialogflow-va-temp nvcr.io/nvidia/jarvis/jarvis-speech-client:1.3.0-beta-samples /bin/bash
        
      4. Navigate to the samples/dialogflow-jarvis-va directory.

        cd samples/dialogflow-jarvis-va
        
      5. Modify the API endpoint setting per the Network Configuration section.

    2. Set up Google Dialogflow. The entire setup process for Dialogflow can take some time to complete. As much of the setup as possible has already been completed in the Docker container; however, the following steps must still be completed.

      1. Read through Dialogflow Basics and About the Google Cloud Console to better understand the basics of Dialogflow.

      2. Follow the steps in Create a Project and Enable the API from the Dialogflow setup. Enable Billing and Enable audit logs are not needed for this demo.

      3. Follow the Set up Authentication instructions. When done, run the command from the Use the service account key file in your environment step inside the Jarvis Samples container.

        export GOOGLE_APPLICATION_CREDENTIALS="<Path to key json file>"
        
      4. Install and initialize the Cloud SDK in the container, except for initializing the gcloud CLI. In the Jarvis Samples container, run:

        gcloud init
        

        During this command, you will need to provide your Project ID. To find your Project ID, perform the following steps:

        1. In the Google Cloud Platform (GCP) Dashboard, select your project from the top-left drop-down, found on the right side of the GCP banner.

        2. Under the DASHBOARD tab, the Project ID can be found in the Project Info section.

      5. Complete Test the SDK and authentication.

        gcloud auth application-default print-access-token
        
      6. Skip Install the Dialogflow client library. This step has already been completed in the Jarvis Samples container, so no action is needed.

      7. Update the config.py PROJECT_ID parameter with your project ID. To find your Project ID, perform the following steps:

        1. In the Google Cloud Platform (GCP) Dashboard, select your project from the top-left drop-down, found on the right side of the GCP banner.

        2. Under the DASHBOARD tab, the Project ID can be found in the Project Info section.

    3. Initialize and start the Dialogflow Weatherbot.

      1. Copy the code from the Samples container to the host system.

        cp -r /workspace/samples/dialogflow-jarvis-va/* /jarvis-dialogflow-va-temp/
        
      2. Follow the steps here to create an agent.

      3. Click the Settings button next to the agent name in the Dialogflow console. Under the Export and Import tab, choose Restore From ZIP and upload the zipped folder from your host at <Path to jarvis-dialogflow-va-temp>/dialogflow-weatherbot/dialogflow-weatherbot.zip.

      4. Add fulfillment.

        1. Open the Fulfillment section and enable the Inline Editor in the Dialogflow console.

        2. Copy and paste the contents of the <Path to jarvis-dialogflow-va-temp>/dialogflow-weatherbot/fulfillment/index.js into index.js under the Inline Editor.

        3. Copy and paste the contents of the <Path to jarvis-dialogflow-va-temp>/dialogflow-weatherbot/fulfillment/package.json into package.json under the Inline Editor.

      5. In index.js, at line 4, update the weatherstack_APIkey with your Weatherstack API key. A new Weatherstack API key can be obtained from here.

    4. Start the Jarvis and Dialogflow chatbot client.

      1. Activate the chatbot client Python environment.

        . /pythonenvs/dialogflow/bin/activate
        
      2. Navigate to the chatbot client folder.

        cd dialogflow-jarvis-weatherbot-webapp
        
      3. Start the chatbot web server.

        python3 main.py
        
    5. Open the web UI in a web browser.

      https://<jarvis chatbot server host IP>:6006/jarvisWeather
      

      For example:

      https://0.0.0.0:6006/jarvisWeather
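
Before starting the chatbot client, it can help to confirm that the key file referenced by GOOGLE_APPLICATION_CREDENTIALS is readable and belongs to the project set in config.py. The following is a minimal sketch of such a check, not part of the sample. The function name check_service_account_key is hypothetical, and the list of required fields reflects the usual layout of a Google service account key file.

```python
import json
import os

# Fields normally present in a Google service account key file.
REQUIRED_KEYS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(path):
    """Return the project_id from a service account key file, or raise ValueError."""
    if not path or not os.path.isfile(path):
        raise ValueError(f"key file not found: {path!r}")
    with open(path, encoding="utf-8") as fh:
        try:
            key = json.load(fh)
        except json.JSONDecodeError as exc:
            raise ValueError(f"key file is not valid JSON: {exc}") from exc
    if not isinstance(key, dict):
        raise ValueError("key file must contain a JSON object")
    missing = [name for name in REQUIRED_KEYS if name not in key]
    if missing:
        raise ValueError(f"key file is missing fields: {missing}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected key type: {key['type']!r}")
    return key["project_id"]
```

For example, calling check_service_account_key(os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")) inside the Jarvis Samples container returns the project ID stored in the key, which can then be compared with the PROJECT_ID value in config.py.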
      

Network Configuration

Update the config.py script for inter-service communication.

# uncomment and populate the section below
jarvis_config = {
    ...
    "JARVIS_SPEECH_API_URL": "[jarvis speech service host IP]:50051",
    ...
}

If the services run on the local machine, set [dialogflow server host IP] to the local machine’s IP address instead of localhost.
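
A configured endpoint can be checked for a leftover localhost value with a few lines of standard-library Python. This is a sketch under stated assumptions, not part of the sample: the helper names parse_endpoint and is_loopback are hypothetical, the endpoint is assumed to use the host:port form shown in config.py, and IPv6 literals are not handled.

```python
import ipaddress


def parse_endpoint(url, default_port=50051):
    """Split a 'host:port' string into (host, port).

    Falls back to default_port when no port is given.
    """
    host, sep, port = url.rpartition(":")
    if not sep:
        return url, default_port
    return host, int(port)


def is_loopback(host):
    """Return True for 'localhost' or a loopback IP such as 127.0.0.1."""
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a hostname other than localhost
```

Running is_loopback(parse_endpoint(url)[0]) on the JARVIS_SPEECH_API_URL value flags a configuration that still points at the loopback interface.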

Sample Conversations

Sample 1

Virtual Assistant: Hi, welcome to JARVIS weather service. How can I help you?

User: How’s the weather?

Virtual Assistant: For which location?

User: In Seattle.

Virtual Assistant: It is Partly cloudy in Seattle at the moment. The temperature is 16 degrees, the humidity is 97 percent and the wind speed is 0 miles per hour.

Sample 2

Virtual Assistant: Hi, welcome to JARVIS weather service. How can I help you?

User: How’s the weather in Paris?

Virtual Assistant: It is Clear in Paris at the moment. The temperature is 20 degrees, the humidity is 73 percent and the wind speed is 9 miles per hour.

Sample 3

Virtual Assistant: Hi, welcome to JARVIS weather service. How can I help you?

User: Hi

Virtual Assistant: Hey! How can I help you?

User: What is your name?

Virtual Assistant: My friends call me Ja-Di.

Sample 4

Virtual Assistant: Hi, welcome to JARVIS weather service. How can I help you?

User: Good morning

Virtual Assistant: Hey! How can I help you?

User: See you around

Virtual Assistant: Bye

Limitations

  • The provided samples are not complete virtual assistants, but are intended as simple examples of how to build basic task-oriented chatbots with Jarvis. Consequently, the intent classifier and slot filling models have been trained with small amounts of data and are not expected to be highly accurate.

  • The Dialogflow Weatherbot sample supports intents for cloudy, humidity, rainfall, snow, sunny, temperature, weather and windy checks. It does not support general conversational queries or other domains.

  • The Dialogflow Weatherbot supports only one slot, for the city. It does not take into account the day associated with the query when providing the response.

  • This sample supports up to four concurrent users. This restriction is not because of Jarvis, but because of the web framework (Flask and Flask-SocketIO) used by the client web application. The socket connection that streams audio to (TTS) and from (ASR) the user cannot sustain more than four concurrent socket connections.

License

The End User License Agreement is included with the product. Licenses are also available along with the model application zip file. By pulling and using the Jarvis SDK container and downloading models, you accept the terms and conditions of these licenses.

The licenses for the libraries that are used in this sample are:

  1. google-cloud-dialogflow - The license for this library can be found here.

  2. actions-on-google - The license for this library can be found here.

  3. firebase-admin - The license for this library can be found here.

  4. firebase-functions - The license for this library can be found here.

  5. dialogflow - The license for this library can be found here.

  6. dialogflow-fulfillment - The license for this library can be found here.