.. _custom_models:

Deploying Your Custom Model into Jarvis
=======================================

This section provides a brief overview of the two main tools used in the deploy process:

1. The build phase using ``jarvis-build``.
2. The deploy phase using ``jarvis-deploy``.

Build process
-------------

For your custom-trained model, refer to the corresponding section (ASR, NLP, TTS) for your model type for the ``jarvis-build`` phase. At the end of this phase, you have the Jarvis Model Intermediate Representation (JMIR) archive for your custom model.

Deploy process
--------------

At this point, you already have your Jarvis Model Intermediate Representation (JMIR) archive. You have two options for deploying this JMIR:

**Option 1:** Use the Quick Start scripts (``jarvis_init.sh`` and ``jarvis_start.sh``) with the appropriate parameters in ``config.sh``.

**Option 2:** Manually run ``jarvis-deploy``, then start ``jarvis-server`` with the target model repository.

Option 1: Using Quick Start Scripts to Deploy Your Models (Recommended path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The Quick Start scripts (``jarvis_init.sh`` and ``jarvis_start.sh``) use a particular directory for their operations. This directory is defined by the ``$jarvis_model_loc`` variable in ``config.sh``. By default, this is set to use a Docker volume; however, you can point this variable at any local directory.

By default, the ``jarvis_init.sh`` Quick Start script performs the following:

- Downloads the JMIRs defined and enabled in ``config.sh`` from NGC into a subdirectory of ``$jarvis_model_loc``, specifically ``$jarvis_model_loc/jmir``.
- Executes ``jarvis-deploy`` for each of the JMIRs at ``$jarvis_model_loc/jmir`` to generate their corresponding Triton Inference Server model repository at ``$jarvis_model_loc/models``.

When you execute ``jarvis_start.sh``, it starts the ``jarvis-speech`` container, mounting the ``$jarvis_model_loc`` directory to ``/data`` inside the container.

To deploy your own custom JMIR, or set of JMIRs, simply place them inside the ``$jarvis_model_loc/jmir`` directory. Ensure that the ``$jarvis_model_loc`` variable in ``config.sh`` is set to a directory that you have access to, since you need to copy your JMIRs into its subdirectory. If the ``$jarvis_model_loc/jmir`` subdirectory does not exist, create it, then copy your custom JMIRs there. If you would like to skip downloading the default JMIRs from NGC, set the ``$use_existing_jmirs`` variable to ``true``.

After your custom JMIRs are inside the ``$jarvis_model_loc/jmir`` directory, run ``jarvis_init.sh``. It executes ``jarvis-deploy`` on your custom JMIRs, along with any other JMIRs present in that directory, and generates the Triton Inference Server model repository at ``$jarvis_model_loc/models``. Next, run ``jarvis_start.sh``; it starts the ``jarvis-speech`` container and loads your custom models, along with any other models present at ``$jarvis_model_loc/models``. If you only want to load your specific models, ensure that ``$jarvis_model_loc/models`` is empty, or that the ``/models`` directory is not present, before you run ``jarvis_init.sh``. The ``jarvis_init.sh`` script creates the ``/jmir`` and ``/models`` subdirectories if they are not already there.
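As an illustration, the following sketch walks through the full Option 1 flow from the host. The directory ``/jarvis-models`` and the archive name ``my_model.jmir`` are hypothetical placeholders; substitute your own values.

.. code-block:: bash

   # In config.sh (hypothetical values):
   #   jarvis_model_loc="/jarvis-models"   # any local directory you have access to
   #   use_existing_jmirs=true             # skip downloading the default JMIRs from NGC

   # Create the jmir subdirectory if needed and copy your custom JMIR into it.
   mkdir -p /jarvis-models/jmir
   cp my_model.jmir /jarvis-models/jmir/

   # Optionally remove any existing model repository so only your models are loaded.
   rm -rf /jarvis-models/models

   # Run jarvis-deploy on every JMIR in /jarvis-models/jmir, writing the Triton
   # model repository to /jarvis-models/models, then start the jarvis-speech container.
   bash jarvis_init.sh
   bash jarvis_start.sh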
For more information about viewing logs and using the client containers to test your models, refer to the ``Server Deployment > Local (Docker)`` section.

Option 2: Using ``jarvis-deploy`` and the Jarvis Speech Container (Advanced)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

#. Execute ``jarvis-deploy``. Refer to the Deploy section in ``Services and Models > Overview`` for a brief overview of ``jarvis-deploy``.

   .. code-block:: bash
      :substitutions:

      jarvis-deploy -f <jmir_filename>:<encryption_key> /data/models

   If your ``.jmir`` archives are encrypted, you need to include ``:<encryption_key>`` at the end of the JMIR filename; otherwise, this is unnecessary.

   The above command creates the Triton Inference Server model repository at ``/data/models``. Writing to any location other than ``/data/models`` requires additional manual changes, because some of the Triton Inference Server model repositories embed artifact directories in their configs for model-specific artifacts such as class labels. Therefore, stick with ``/data/models`` unless you are familiar with Triton Inference Server model repository configurations.

#. Manually start the ``jarvis-server`` Docker container using ``docker run``. After the Triton Inference Server model repository for your custom model is generated, start the Jarvis server on that target repository. The following command assumes you generated the model repository at ``/data/models``.

   .. code-block:: bash
      :substitutions:

      docker run -d --gpus 1 --init --shm-size=1G --ulimit memlock=-1 --ulimit stack=67108864 \
          -v /data:/data \
          -p 50051:50051 \
          -e "CUDA_VISIBLE_DEVICES=0" \
          --name jarvis-speech \
          jarvis-api \
          start-jarvis --jarvis-uri=0.0.0.0:50051 --nlp_service=true --asr_service=true --tts_service=true

   This launches the Jarvis Speech Service API server, similar to the Quick Start script ``jarvis_start.sh``. Example output:

   ::

      Starting Jarvis Speech Services
      > Waiting for Triton server to load all models...retrying in 10 seconds
      > Waiting for Triton server to load all models...retrying in 10 seconds
      > Waiting for Triton server to load all models...retrying in 10 seconds
      > Triton server is ready...

#. Verify that the servers have started correctly and check that the output of ``docker logs jarvis-speech`` shows:

   ::

      I0428 03:14:50.440943 1 jarvis_server.cc:66] TTS Server connected to Triton Inference Server at 0.0.0.0:8001
      I0428 03:14:50.440943 1 jarvis_server.cc:66] NLP Server connected to Triton Inference Server at 0.0.0.0:8001
      I0428 03:14:50.440951 1 jarvis_server.cc:68] ASR Server connected to Triton Inference Server at 0.0.0.0:8001
      I0428 03:14:50.440955 1 jarvis_server.cc:71] Jarvis Conversational AI Server listening on 0.0.0.0:50051
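As a quick command-line check, you can filter the container logs for the lines above. This is a minimal sketch; the container name ``jarvis-speech`` matches the ``docker run`` command in the previous step.

.. code-block:: bash

   # Print only the service-connection and listening lines from the server logs.
   # An empty result means the server has not finished starting yet.
   docker logs jarvis-speech 2>&1 | grep -E "connected to Triton Inference Server|listening on"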