User Guide (24.04)


The tool currently supports only event logs stored on ABFS. Remote output storage must also be on ABFS; DBFS paths are not supported.
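As a rough illustration of the storage requirement above, the following sketch checks whether a path uses the ABFS URI scheme (abfs:// or abfss://) before it is passed to the tool. The function name is_abfs_path and the example paths are hypothetical and are not part of the tools; the check only inspects the URI form and does not verify that the location exists.

```python
from urllib.parse import urlparse


def is_abfs_path(path: str) -> bool:
    """Return True if the path has an abfs:// or abfss:// scheme and a
    non-empty authority part (container@account host)."""
    parsed = urlparse(path)
    return parsed.scheme in ("abfs", "abfss") and bool(parsed.netloc)


# Hypothetical example paths, for illustration only.
candidates = [
    "abfss://logs@myaccount.dfs.core.windows.net/eventlogs",
    "dbfs:/cluster-logs/eventlogs",
]
for candidate in candidates:
    status = "ABFS" if is_abfs_path(candidate) else "not ABFS"
    print(f"{candidate}: {status}")
```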

  • Install Databricks CLI

    • Install the Databricks CLI version 0.200+. Follow the instructions on Install the CLI.

    • Set the configuration settings and credentials of the Databricks CLI:

      • Set up authentication by following these instructions.

      • Verify that the access credentials are stored in the file ~/.databrickscfg on Unix, Linux, or macOS, or in another file defined by the environment variable DATABRICKS_CONFIG_FILE.

    • If the configuration does not use the default values, explicitly set the environment variables that the tools command reads, such as DATABRICKS_CONFIG_FILE, DATABRICKS_HOST, and DATABRICKS_TOKEN. See the environment variables docs for a description of each variable.
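The Databricks checks above can be sketched in Python as follows. This is a minimal sketch, not how the tools resolve their settings: the helper names databricks_config_path and missing_databricks_env are hypothetical, and the only facts taken from the steps above are the default file ~/.databrickscfg and the three environment variable names.

```python
import os
from pathlib import Path


def databricks_config_path(env=None) -> Path:
    """Return the file named by DATABRICKS_CONFIG_FILE if it is set,
    otherwise the default ~/.databrickscfg."""
    env = os.environ if env is None else env
    override = env.get("DATABRICKS_CONFIG_FILE")
    return Path(override) if override else Path.home() / ".databrickscfg"


def missing_databricks_env(env=None) -> list:
    """List which of DATABRICKS_HOST and DATABRICKS_TOKEN are unset.
    Unset variables are only a concern when the configuration does not
    use the default values."""
    env = os.environ if env is None else env
    names = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")
    return [name for name in names if not env.get(name)]


path = databricks_config_path()
print(f"config file: {path} (exists: {path.is_file()})")
print("unset variables:", missing_databricks_env())
```

Passing a plain dictionary as env makes the helpers easy to try out without changing the real environment.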

  • Install Azure CLI

    • Install the Azure CLI. Follow the instructions on How to install the Azure CLI.

    • Set the configuration settings and credentials of the Azure CLI:

      • Set up the authentication by following these instructions.

      • Configure the Azure CLI by following these instructions.

        • location is used for retrieving instance type descriptions (default is westus).

        • output should keep its default value of json in the core section.

        • Verify that the configurations are stored in the file $AZURE_CONFIG_DIR/config where the default value of AZURE_CONFIG_DIR is $HOME/.azure on Linux or macOS.

    • If the configuration does not use the default values, explicitly set the environment variables that the tools command reads, such as AZURE_CONFIG_DIR and AZURE_DEFAULTS_LOCATION.
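The Azure CLI settings described above can be inspected with a short Python sketch. It assumes the Azure CLI config file is INI-formatted with a [core] section holding output and a [defaults] section holding location, and that AZURE_DEFAULTS_LOCATION takes precedence over the file. The helper names azure_config_file and read_azure_settings and the sample file content are hypothetical; the westus fallback mirrors the default stated above.

```python
import configparser
import os
from pathlib import Path


def azure_config_file(env=None) -> Path:
    """Return $AZURE_CONFIG_DIR/config, defaulting the directory to
    $HOME/.azure when AZURE_CONFIG_DIR is unset."""
    env = os.environ if env is None else env
    config_dir = env.get("AZURE_CONFIG_DIR") or str(Path.home() / ".azure")
    return Path(config_dir) / "config"


def read_azure_settings(config_text: str, env=None) -> dict:
    """Extract location and output from INI-formatted config text.
    Assumption: AZURE_DEFAULTS_LOCATION overrides the file value."""
    env = os.environ if env is None else env
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    location = env.get("AZURE_DEFAULTS_LOCATION") or parser.get(
        "defaults", "location", fallback="westus"
    )
    output = parser.get("core", "output", fallback="json")
    return {"location": location, "output": output}


# Hypothetical config content, for illustration only.
sample = """
[core]
output = json

[defaults]
location = eastus
"""
print("config file:", azure_config_file())
print(read_azure_settings(sample, env={}))
```

To inspect a real setup, read the text of the file returned by azure_config_file() and pass it to read_azure_settings.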

© Copyright 2024, NVIDIA. Last updated on Apr 23, 2024.