Deploy NVIDIA Parabricks

Get started by deploying Parabricks using a Docker image. Then you can run and customize Parabricks.

Getting the Parabricks Docker Image

Run the following command to obtain the image:

$ docker pull nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1

Running NVIDIA Parabricks

You can run Parabricks either from the command line or on NVIDIA Base Command Platform.

Using the Command Line to Run Parabricks

After deploying Parabricks using a Docker image, you can begin customizing it. There are two parts to customizing a Parabricks run:

Docker-specific options: These options are passed to the docker command before the name of the image. For example, mount your data directories inside the Docker container by passing the -v (--volume) option to Docker. Refer to the Tutorials for more detailed examples.
Parabricks-specific options: These options are passed to the Parabricks command line to customize the Parabricks run. For example, you can choose which tool to run and pass tool-specific options.

For example, use the following command to run the Parabricks fq2bam (BWA-MEM + GATK) tool using a Docker container. Refer to the tutorial for more information on how this command works.

$ docker run \
      --gpus all \
      --rm \
      --volume $(pwd):/workdir \
      --volume $(pwd):/outputdir \
    nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1 \
    pbrun fq2bam \
      --ref /workdir/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
      --in-fq /workdir/parabricks_sample/Data/sample_1.fq.gz /workdir/parabricks_sample/Data/sample_2.fq.gz \
      --out-bam /outputdir/fq2bam_output.bam
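
The split between Docker options and Parabricks options can be made explicit in a small wrapper. The sketch below is an illustration only and is not part of Parabricks: the function name run_in_parabricks and the DRY_RUN variable are hypothetical. With DRY_RUN at its default, the script only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical wrapper: keeps Docker options apart from Parabricks options.
IMAGE="nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1"
DRY_RUN="${DRY_RUN-1}"   # default: print only; set DRY_RUN= (empty) to execute

# Part 1: Docker-specific options, placed before the image name.
run_in_parabricks() {
    ${DRY_RUN:+echo} docker run \
        --gpus all \
        --rm \
        --volume "$(pwd)":/workdir \
        --volume "$(pwd)":/outputdir \
        "$IMAGE" \
        "$@"
}

# Part 2: Parabricks-specific options, passed after the image name.
run_in_parabricks pbrun fq2bam \
    --ref /workdir/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
    --in-fq /workdir/parabricks_sample/Data/sample_1.fq.gz \
            /workdir/parabricks_sample/Data/sample_2.fq.gz \
    --out-bam /outputdir/fq2bam_output.bam
```

Keeping the Docker options in one function means a different tool can be run by changing only the arguments after the function name.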

Sample data is freely available. Refer to the Getting The Sample Data section in the Tutorials for instructions on obtaining the sample data and a step-by-step guide to using both fq2bam and HaplotypeCaller.

Some useful Docker options to consider:

--gpus all lets the Docker container use all the GPUs on the system. To limit the GPUs available to the Parabricks container, use the --gpus "device=<list of GPUs>" option. Use nvidia-smi to see how many GPUs you have and which index belongs to which GPU.
--rm tells Docker to remove the container once the command has finished. The image itself stays on the system.
--volume $(pwd):/image/data mounts your current directory (a path on the host) at /image/data (a path inside the Docker container). If your data is not in the current directory, use an option similar to --volume /path/to/your/data:/image/data.
--workdir tells Docker what working directory to execute the commands from (inside the container).
The rest of the command is the Parabricks tool you want to run, followed by its arguments. For those familiar with pre-v4.0 versions of Parabricks and its pbrun command, this Docker invocation takes the place of pbrun.
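
As a sketch of how these options combine, the following restricts the container to selected GPUs and sets the working directory so that Parabricks paths can be relative. The GPU indices 0 and 1, the DATA_DIR and DRY_RUN variables, and the function name parabricks_on_gpus are assumptions made for illustration. The escaped double quotes around device=... are passed through to Docker, which expects them when the device list contains a comma. With DRY_RUN at its default, the command is only printed.

```shell
#!/bin/sh
# Hypothetical helper: run a Parabricks tool on a chosen subset of GPUs.
IMAGE="nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1"
DATA_DIR="${DATA_DIR:-$(pwd)}"   # host directory assumed to hold parabricks_sample/
DRY_RUN="${DRY_RUN-1}"           # default: print only; set DRY_RUN= (empty) to execute

parabricks_on_gpus() {
    gpu_list="$1"   # e.g. "0,1" (assumed indices; list yours with: nvidia-smi -L)
    shift
    ${DRY_RUN:+echo} docker run \
        --gpus "\"device=${gpu_list}\"" \
        --rm \
        --volume "${DATA_DIR}":/workdir \
        --workdir /workdir \
        "$IMAGE" \
        "$@"
}

# Paths below are relative to /workdir because of --workdir.
parabricks_on_gpus 0,1 pbrun fq2bam \
    --ref parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
    --in-fq parabricks_sample/Data/sample_1.fq.gz parabricks_sample/Data/sample_2.fq.gz \
    --out-bam fq2bam_output.bam
```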

Running Parabricks Using the Base Command Platform

The following example command launches a Base Command Platform container on a single-GPU instance:

ngc batch run --name "parabricks-germline" \
    --instance dgxa100.80g.1.norm \
    --commandline "pbrun germline \
--ref /workspace/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
--in-fq /Data/HG002-NA24385-pFDA_S2_L002_R1_001-30x.fastq.gz /Data/HG002-NA24385-pFDA_S2_L002_R2_001-30x.fastq.gz \
--knownSites /workspace/parabricks_sample/Ref/Homo_sapiens_assembly38.known_indels.vcf.gz \
--out-bam output.bam \
--out-variants output.vcf \
--out-recal-file report.txt \
--run-partition \
--no-alt-contigs" \
    --result /results \
    --image "nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1"

Note

For other Parabricks commands, such as fq2bam, HaplotypeCaller, and DeepVariant, the ngc batch run command is similar. Make sure to use the correct paths for the workspace or dataset that contains the data you intend to use.
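
As an illustration of that similarity, the sketch below adapts the germline job to fq2bam. It reuses only the ngc batch run options and input paths shown above. The job name parabricks-fq2bam, the output location under /results, the function name submit_fq2bam, and the DRY_RUN variable are assumptions. Any workspace or dataset mounts your job needs are not shown, as in the germline example. With DRY_RUN at its default, the command is printed rather than submitted.

```shell
#!/bin/sh
# Hypothetical sketch: submit fq2bam instead of germline with the same job settings.
DRY_RUN="${DRY_RUN-1}"   # default: print only; set DRY_RUN= (empty) to submit

# Parabricks command run inside the container (joined into a single line).
PBRUN_CMD="pbrun fq2bam \
--ref /workspace/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
--in-fq /Data/HG002-NA24385-pFDA_S2_L002_R1_001-30x.fastq.gz /Data/HG002-NA24385-pFDA_S2_L002_R2_001-30x.fastq.gz \
--out-bam /results/fq2bam_output.bam"

submit_fq2bam() {
    ${DRY_RUN:+echo} ngc batch run --name "parabricks-fq2bam" \
        --instance dgxa100.80g.1.norm \
        --commandline "$PBRUN_CMD" \
        --result /results \
        --image "nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1"
}

submit_fq2bam
```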