Best Performance - NVIDIA Docs

Parabricks software delivers its best performance when it is given all the computing resources it needs. The system should meet all the requirements in the Installation Requirements section. The following examples show command lines tuned for best performance.

See the Performance Tuning section for basic performance enhancement ideas.

See the Hardware Requirements section for minimum hardware requirements.

Best Performance for Germline Pipeline

# This command assumes all the inputs are in INPUT_DIR and all the outputs go to OUTPUT_DIR.
$ docker run --rm --gpus all --volume INPUT_DIR:/workdir --volume OUTPUT_DIR:/outputdir \
    --workdir /workdir \
    --env TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=268435456 \
    nvcr.io/nvidia/clara/clara-parabricks:4.1.0-1 \
    pbrun germline \
    --ref /workdir/Homo_sapiens_assembly38.fasta \
    --in-fq /workdir/fastq1.gz /workdir/fastq2.gz \
    --out-bam /outputdir/fq2bam_output.bam \
    --tmp-dir /workdir \
    --num-cpu-threads 16 \
    --out-recal-file recal.txt \
    --knownSites /workdir/hg.known_indels.vcf \
    --out-variants /outputdir/out.vcf \
    --run-partition --no-alt-contigs \
    --gpusort \
    --gpuwrite \
    --read-from-tmp-dir

Best Performance for fq2bam

# This command assumes all the inputs are in INPUT_DIR and all the outputs go to OUTPUT_DIR.
$ docker run --rm --gpus all --volume INPUT_DIR:/workdir --volume OUTPUT_DIR:/outputdir \
    --workdir /workdir \
    --env TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=268435456 \
    nvcr.io/nvidia/clara/clara-parabricks:4.1.0-1 \
    pbrun fq2bam \
    --ref /workdir/Homo_sapiens_assembly38.fasta \
    --in-fq /workdir/fastq1.gz /workdir/fastq2.gz \
    --out-bam /outputdir/fq2bam_output.bam \
    --tmp-dir /workdir \
    --num-cpu-threads 16 \
    --out-recal-file recal.txt \
    --knownSites /workdir/hg.known_indels.vcf \
    --gpusort \
    --gpuwrite

Best Performance for deepvariant

DeepVariant from Parabricks can use multiple streams on a GPU. The number of streams that can be used depends on the available resources. The default is two streams per GPU, and this can be increased up to a maximum of six for better performance. Finding the optimal number for your system requires experimentation.

# This command assumes all the inputs are in INPUT_DIR and all the outputs go to OUTPUT_DIR.
$ docker run --rm --gpus all --volume INPUT_DIR:/workdir --volume OUTPUT_DIR:/outputdir \
    --workdir /workdir \
    --env TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=268435456 \
    nvcr.io/nvidia/clara/clara-parabricks:4.1.0-1 \
    nvcr.io/nvidia/clara/clara-parabricks:4.1.0-1 \
    pbrun deepvariant \
    --ref /workdir/Homo_sapiens_assembly38.fasta \
    --in-bam /outputdir/fq2bam_output.bam \
    --out-variants /outputdir/out.vcf \
    --num-streams-per-gpu 4 \
    --run-partition \
    --gpu-num-per-partition 1
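
Because the best stream count has to be found by experiment, one possible approach is to time the same deepvariant run for each value from the default of two up to the maximum of six. The script below is a minimal sketch of that idea, not an official tuning procedure. The `RUNNER` variable, the `run_deepvariant` helper, and the per-run output name `out_streams<N>.vcf` are hypothetical names introduced for illustration. The paths and flags are copied from the command above. By default the script only prints each command, so it can be reviewed before anything is executed.

```shell
#!/bin/sh
# Sketch: time pbrun deepvariant for each stream count from 2 (default) to 6 (maximum).
# INPUT_DIR and OUTPUT_DIR are the same placeholders as in the command above;
# replace them with absolute host paths before a real run.
INPUT_DIR="${INPUT_DIR:-INPUT_DIR}"
OUTPUT_DIR="${OUTPUT_DIR:-OUTPUT_DIR}"

# Assumption for illustration: RUNNER=echo (the default) prints each command
# instead of executing it. Export RUNNER as an empty string to actually run docker.
RUNNER="${RUNNER-echo}"

# Hypothetical helper: run (or print) one deepvariant invocation with the given
# number of streams per GPU, writing a separate VCF for each setting.
run_deepvariant() {
    streams="$1"
    ${RUNNER} docker run --rm --gpus all \
        --volume "${INPUT_DIR}:/workdir" --volume "${OUTPUT_DIR}:/outputdir" \
        --workdir /workdir \
        --env TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=268435456 \
        nvcr.io/nvidia/clara/clara-parabricks:4.1.0-1 \
        pbrun deepvariant \
        --ref /workdir/Homo_sapiens_assembly38.fasta \
        --in-bam /outputdir/fq2bam_output.bam \
        --out-variants "/outputdir/out_streams${streams}.vcf" \
        --num-streams-per-gpu "${streams}" \
        --run-partition \
        --gpu-num-per-partition 1
}

for streams in 2 3 4 5 6; do
    start=$(date +%s)
    run_deepvariant "${streams}"
    end=$(date +%s)
    # Wall-clock seconds for this setting; only meaningful when RUNNER is empty.
    echo "streams=${streams} elapsed=$((end - start))s"
done
```

To run the real measurements, export `RUNNER` as an empty string before invoking the script and compare the elapsed times. The setting with the shortest time that completes without running out of GPU memory is a reasonable starting choice for that system.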