GETTING STARTED

QUICKSTART GUIDE

Get NVIDIA Clara Parabricks Pipelines up and running on a server and start using the software in 10 minutes.

STEP 1: Make sure installation requirements are met

The following are required to install Parabricks:

  • Access to the internet

  • An NVIDIA driver that supports CUDA 9.0 or higher

  • An NVIDIA driver that supports CUDA 10.0 or higher if you want to run deepvariant or cnnscorevariants

  • nvidia-docker or singularity version 2.6.1 or higher

  • Python 3

  • curl (Most Linux systems will already have this installed)

The hardware requirements are as follows:

  • Any GPU that supports CUDA architecture 60, 61, 70, or 75 and has 12GB of GPU RAM or more. Parabricks has been tested on NVIDIA P100, NVIDIA V100, and NVIDIA T4 GPUs.

  • A 1-GPU server should have 64GB of CPU RAM and at least 16 CPU threads

  • A 2-GPU server should have 100GB of CPU RAM and at least 24 CPU threads

  • A 4-GPU server should have 196GB of CPU RAM and at least 32 CPU threads

  • An 8-GPU server should have 392GB of CPU RAM and at least 48 CPU threads
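
The sizing guidance above can be turned into a quick pre-installation check. The following is a minimal sketch, not part of Parabricks: the function names requirements_for, check_host, and check_tools are hypothetical, and only the RAM and thread numbers come from this guide. It does not check the GPU model, the driver version, or the container runtime.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical; the numbers come from this guide.

# Print "<min CPU RAM in GB> <min CPU threads>" for a supported GPU count.
requirements_for() {
    case "$1" in
        1) echo "64 16" ;;
        2) echo "100 24" ;;
        4) echo "196 32" ;;
        8) echo "392 48" ;;
        *) echo "no guidance for GPU count: $1" >&2; return 1 ;;
    esac
}

# Print this host's CPU RAM and thread count next to the guidance.
check_host() {
    local req need_ram need_threads have_ram have_threads
    req=$(requirements_for "$1") || return 1
    need_ram=${req% *}
    need_threads=${req#* }
    # MemTotal is reported in kB and excludes memory reserved by the kernel,
    # so a machine sold as "64GB" can report slightly less here.
    have_ram=$(awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' /proc/meminfo 2>/dev/null) || true
    have_threads=$(nproc 2>/dev/null) || true
    echo "CPU RAM:     ${have_ram:-unknown} GB (guidance: ${need_ram} GB)"
    echo "CPU threads: ${have_threads:-unknown} (guidance: ${need_threads})"
}

# Report which of the required command-line tools are on PATH.
check_tools() {
    local tool
    for tool in python3 curl nvidia-smi; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
        fi
    done
}

check_tools
check_host 4
```

Change the argument of check_host to the number of GPUs in your server (1, 2, 4, or 8).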

STEP 2: Download the installation package

Request NVIDIA Parabricks access from https://developer.nvidia.com/clara-parabricks to get an installation package for your GPU server.

STEP 3: Install Parabricks suite

Install the Parabricks package to your system:

# Step 1: Unpack the package.
$ tar -xzf parabricks.tar.gz

# Step 2: Run the installer.
$ sudo ./parabricks/installer.py

# Step 3: Verify your installation.
# This should display the Parabricks version number:
$ pbrun version

After installation, pbrun is the executable that starts any tool in the Parabricks software suite. During installation, you can choose to create a link at /usr/bin/pbrun to make it available system-wide. Otherwise, you can run pbrun from your local installation directory (default: /opt/parabricks/pbrun).
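
A script that calls pbrun may need to handle both cases described above. The following is a minimal sketch under that assumption: find_pbrun is a hypothetical helper, not part of Parabricks, and /opt/parabricks is the default installation directory stated in this guide.

```shell
#!/usr/bin/env bash
# Sketch only: find_pbrun is a hypothetical helper, not part of Parabricks.

# Print the path of the pbrun executable. Prefer one found on PATH, then
# fall back to the installation directory given as the first argument
# (default: /opt/parabricks).
find_pbrun() {
    local install_dir="${1:-/opt/parabricks}"
    if command -v pbrun >/dev/null 2>&1; then
        command -v pbrun
    elif [ -x "$install_dir/pbrun" ]; then
        echo "$install_dir/pbrun"
    else
        echo "pbrun not found on PATH or in $install_dir" >&2
        return 1
    fi
}

if PBRUN=$(find_pbrun); then
    "$PBRUN" version
else
    echo "Install Parabricks, or pass your installation directory to find_pbrun."
fi
```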

STEP 4: Example run

# Run the fq2bam tool, which aligns, co-ordinate sorts and marks duplicates
# in a paired-end fastq file. Ref.fa is the bwa-indexed reference file

$ pbrun fq2bam --ref Ref.fa --in-fq sample_1.fq.gz sample_2.fq.gz --out-bam output.bam

You can download a sample dataset using the following command:

$ wget -O parabricks_sample.tar.gz \
"https://s3.amazonaws.com/parabricks.sample/parabricks_sample.tar.gz?Expires=1613069864&Signature=WxLeyitbvR%2B0rO4MX%2B0GohDw89g%3D&AWSAccessKeyId=AKIAJGDUNN2G2ZAH3Q3A"

To run the sample dataset:

$ tar -xvzf parabricks_sample.tar.gz
$ /parabricks/pbrun fq2bam --ref parabricks_sample/Ref/Homo_sapiens_assembly38.fasta --in-fq parabricks_sample/Data/sample_1.fq.gz parabricks_sample/Data/sample_2.fq.gz --out-bam output.bam

The above test should take under 250 seconds on a system with four V100 GPUs.

PERFORMANCE TUNING

The goal of Parabricks software is to get the highest performance for bioinformatics and genomic analysis. There are a few key system options that a user can tune to achieve maximum performance.

Using fast local SSD for files

Parabricks software operates with two kinds of files:

  • Input/Output files specified by user

  • Temporary files created during execution and deleted at the end

Performance is best when both kinds of files are on a fast local SSD. Input/Output files can instead be placed on fast network storage, but for tools and pipelines that use temporary files we highly recommend keeping the temporary files on fast local storage such as an SSD.

Use the --tmp-dir option to specify where temporary files are stored.

Note

Empirically, we have observed that you can run with up to 4 GPUs and still get good performance when the Input/Output files are on a Lustre network file system. If you plan to use more than 4 GPUs, we highly recommend using local SSDs for both kinds of files.

DGX Users

DGX systems generally come with an SSD mounted at /raid. Use a directory on that disk as --tmp-dir. For initial testing, you can also copy the input files to this disk to eliminate variability in performance.
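
As an illustration of the advice above, the following sketch creates a private scratch directory and passes it to fq2bam. It makes several assumptions: make_tmp_dir and SSD_BASE are hypothetical names, the file names are placeholders taken from the other examples in this guide, and the command is only printed (a dry run) so that the sketch works on a machine without Parabricks.

```shell
#!/usr/bin/env bash
# Sketch only: make_tmp_dir and SSD_BASE are hypothetical names.

# Create a private scratch directory under the given base and print its path.
make_tmp_dir() {
    mktemp -d "$1/pbrun_tmp.XXXXXX"
}

# Point SSD_BASE at a fast local disk, for example /raid on a DGX system.
# It falls back to /tmp here only so that the sketch runs anywhere.
SSD_BASE="${SSD_BASE:-/tmp}"
TMP_DIR=$(make_tmp_dir "$SSD_BASE")

# Dry run: print the command instead of executing it.
echo pbrun fq2bam --ref Ref.fa --in-fq sample_1.fq.gz sample_2.fq.gz \
    --out-bam output.bam --tmp-dir "$TMP_DIR"
```

Remove the leading echo to run the command for real, for example with SSD_BASE=/raid set in the environment.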

Specifying GPUs to use

For certain tools and pipelines, you can choose the number of GPUs to use with the command-line option --num-gpus. To select specific GPUs, also set the environment variable NVIDIA_VISIBLE_DEVICES:

$ NVIDIA_VISIBLE_DEVICES="0,1" pbrun fq2bam --num-gpus 2 --ref Ref.fa --in-fq S1_1.fastq.gz S1_2.fastq.gz --out-bam output.bam
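
When the list of GPU IDs changes often, it can help to derive the --num-gpus value from that list so that the two always agree. The following is a minimal sketch: count_devices is a hypothetical helper, the input and output arguments follow the other fq2bam examples in this guide, and the command is only printed (a dry run).

```shell
#!/usr/bin/env bash
# Sketch only: count_devices is a hypothetical helper, not part of Parabricks.

# Print the number of comma-separated GPU IDs in the argument.
count_devices() {
    printf '%s\n' "$1" | awk -F',' '{ print NF }'
}

DEVICES="0,1"
NUM_GPUS=$(count_devices "$DEVICES")

# Dry run: print the command instead of executing it.
echo "NVIDIA_VISIBLE_DEVICES=\"$DEVICES\"" pbrun fq2bam --num-gpus "$NUM_GPUS" \
    --ref Ref.fa --in-fq S1_1.fastq.gz S1_2.fastq.gz --out-bam output.bam
```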

RUNNING NVIDIA CLARA PARABRICKS PIPELINES

Parabricks software can run either a single tool or a full pipeline. This page introduces both options. The first argument to pbrun is the name of the tool or pipeline to run.

The details of the tools can be found at Tools Overview and the details of the pipelines can be found at Pipeline Overview. Performance optimization techniques can be found on the Performance Tuning page.

Running a standalone tool

The example below shows how you can run the fq2bam tool.

# Run the fq2bam tool, which aligns, co-ordinate sorts and marks duplicates
# in a paired-end fastq file. Ref.fa is the bwa-indexed reference file

$ pbrun fq2bam --ref Ref.fa --in-fq sample_1.fq.gz sample_2.fq.gz --out-bam output.bam

Running a pipeline

The example below shows how you can run the germline pipeline.

# Run the germline pipeline on sample_1.fq.gz and sample_2.fq.gz to generate
# variant calls using Ref.fa, which is the bwa-indexed reference file.

$ pbrun germline --ref Ref.fa --in-fq sample_1.fq.gz sample_2.fq.gz --out-bam output.bam \
--knownSites dbsnp_146.vcf.gz --out-recal-file recal.txt --out-variants result.vcf

OUTPUT COMPARISON

Many users want to compare the output generated by Parabricks software with that of other standard tools. We recommend the following methods for comparing Parabricks output with the output of the non-accelerated counterpart software.

BAM COMPARISON

GATK4 sorts SAM records based on QNAME, FLAG, RNAME, POS, MAPQ, MRNM/RNEXT, MPOS/PNEXT, and ISIZE. If all of these fields are the same for two records, the records are considered equal for sorting purposes. We therefore compare two sorted BAM files by using the BamUtil diff tool on these fields:

$ bam diff --in1 mark_dups_gpu.bam --in2 mark_dups_cpu.bam --noCigar --isize --flag --mate --mapQual

The output of this comparison should result in no differences.

BQSR REPORT COMPARISON

The BQSR reports generated by Parabricks and GATK4 should be exactly the same. The following command should produce no output:

$ diff -w recal_gpu.txt recal_cpu.txt
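
In a script, the same check can rely on the exit status of diff, which is 0 when no differences are found. The following is a minimal sketch: reports_match is a hypothetical helper, and the two temporary files stand in for recal_gpu.txt and recal_cpu.txt so that the sketch runs without real reports.

```shell
#!/usr/bin/env bash
# Sketch only: reports_match is a hypothetical helper.

# Return 0 when the two files are identical apart from whitespace.
# A missing or unreadable file makes diff fail, so it counts as a mismatch.
reports_match() {
    diff -w "$1" "$2" >/dev/null 2>&1
}

# Stand-in files; replace them with recal_gpu.txt and recal_cpu.txt.
gpu_report=$(mktemp)
cpu_report=$(mktemp)
printf 'ReadGroup  QualityScore\n' > "$gpu_report"
printf 'ReadGroup QualityScore\n' > "$cpu_report"

if reports_match "$gpu_report" "$cpu_report"; then
    echo "reports match"
else
    echo "reports differ"
fi
rm -f "$gpu_report" "$cpu_report"
```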

VCF COMPARISON

To compare VCF files, we use the GATK Concordance tool to get the sensitivity and specificity of SNPs and INDELs. When the following command is run, the variant accuracy results are stored in out.txt:

$ gatk Concordance --evaluation result_gpu.vcf --truth result_cpu.vcf --summary out.txt