deepvariant - NVIDIA Docs

Run a GPU-accelerated DeepVariant algorithm.

This tool is a deep-learning-based germline variant caller that can apply different models trained for specific sample types (such as whole genome vs. whole exome sequencing) to achieve higher accuracy.

Parabricks has accelerated Google DeepVariant to make extensive use of GPUs, finishing a 30x WGS analysis in 25 minutes instead of hours. The Parabricks flavor of DeepVariant works like other command-line tools that users are familiar with: it takes a BAM and a reference as inputs and produces variants as output.

Currently, DeepVariant is supported on T4, V100, and A100 GPUs out of the box. See the Models for additional GPUs section for details on other GPUs.

Note

In version 3.8 the --run-partition option was added, which can lead to a significant speed increase. However, using the --run-partition, --proposed-variants, and --gvcf options at the same time would lead to a substantial slowdown, so when all three are given a warning is issued and the --run-partition option is ignored.
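
The option combination described in this note can be checked before launching a job. The following is a minimal sketch, not part of pbrun: has_partition_conflict is a hypothetical helper name, and it assumes the options are passed as separate words exactly as spelled above.

```shell
# Hypothetical helper (not part of pbrun): succeeds when an argument list
# contains --run-partition, --proposed-variants, and --gvcf together.
has_partition_conflict() {
    partition=0; proposed=0; gvcf=0
    for arg in "$@"; do
        case "$arg" in
            --run-partition)     partition=1 ;;
            --proposed-variants) proposed=1 ;;
            --gvcf)              gvcf=1 ;;
        esac
    done
    [ "$partition" -eq 1 ] && [ "$proposed" -eq 1 ] && [ "$gvcf" -eq 1 ]
}
```

A wrapper script could call it with the arguments it is about to hand to pbrun deepvariant and drop --run-partition itself, so that the warning never appears in the logs.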

Quick Start

# This command assumes all the inputs are in <INPUT_DIR> and all the outputs go to <OUTPUT_DIR>.
$ docker run --rm --gpus all --volume <INPUT_DIR>:/workdir --volume <OUTPUT_DIR>:/outputdir \
    -w /workdir \
    nvcr.io/nvidia/clara/clara-parabricks:4.0.0-1 \
    pbrun deepvariant \
    --ref /workdir/${REFERENCE_FILE} \
    --in-bam /workdir/${INPUT_BAM} \
    --out-variants /outputdir/${OUTPUT_VCF}
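
Because the command above reads every input from the mounted input directory, a missing file only surfaces once the container has started. A minimal pre-flight sketch follows; check_inputs is a hypothetical helper, and the file names you pass it are whatever you set in REFERENCE_FILE and INPUT_BAM.

```shell
# Hypothetical pre-flight check (not part of Parabricks): verify that each
# named file exists in the host input directory before starting the container.
check_inputs() {
    input_dir=$1
    shift
    missing=0
    for name in "$@"; do
        if [ ! -f "$input_dir/$name" ]; then
            echo "missing input: $input_dir/$name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_inputs "$HOME/data" "$REFERENCE_FILE" "$INPUT_BAM"` returns non-zero and names each absent file on stderr, assuming "$HOME/data" is the directory you mount as /workdir.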

Compatible Google DeepVariant Commands

The command below is the Google counterpart of the Parabricks command above. Its output will be identical to the output from the above command. See the Output Comparison page for how to compare the results.

sudo docker run \
--volume <INPUT_DIR>:/input \
--volume <OUTPUT_DIR>:/output \
google/deepvariant:1.4.0 \
/opt/deepvariant/bin/run_deepvariant \
--model_type WGS \
--ref /input/${REFERENCE_FILE} \
--reads /input/${INPUT_BAM} \
--output_vcf /output/${OUTPUT_VCF} \
--num_shards $(nproc) \
--make_examples_extra_args "ws_use_window_selector_model=true"
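
As a quick sanity check of the "identical output" claim, separate from the procedure on the Output Comparison page, the records of two VCF files can be compared while ignoring the "##" meta-information lines. This is only a sketch: same_variant_records is a hypothetical helper, it assumes uncompressed VCF files, and it assumes that meta-information lines (such as the recorded command line) may differ between the two tools.

```shell
# Hypothetical helper: succeeds when two uncompressed VCF files have identical
# content once the "##" meta-information lines are removed. The "#CHROM"
# column header line and all variant records are compared byte for byte.
same_variant_records() {
    [ "$(grep -v '^##' "$1" | cksum)" = "$(grep -v '^##' "$2" | cksum)" ]
}
```

Comparing checksums keeps memory use flat for large call sets. When the check fails, running diff on the two filtered streams shows which records differ.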

Models for additional GPUs

Parabricks DeepVariant supports the following models:

DeepVariant WGS
DeepVariant WES
DeepTrio
1. Parent
2. Child

DeepVariant models for T4, V100, and A100 GPUs ship with the software. Additional models for A10, A30, A40, and A6000 GPUs can be downloaded from this NGC resource.
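
To find out which case applies to a given machine, the GPU name reported by the driver can be matched against the two lists above. The following is a minimal sketch: gpu_model_source is a hypothetical helper, and it assumes the name contains the model as a separate token (for example "NVIDIA A100-SXM4-40GB" or "Tesla T4").

```shell
# Hypothetical helper: classify a GPU name such as one line printed by
#   nvidia-smi --query-gpu=name --format=csv,noheader
# Prints "bundled" (model ships with the software), "extra" (model must be
# downloaded separately), or "unknown" (not in either list above).
gpu_model_source() {
    # Split the name on spaces and hyphens, then compare whole tokens so that
    # "A100" is never mistaken for "A10".
    for token in $(printf '%s' "$1" | tr ' -' '\n\n'); do
        case "$token" in
            T4|V100|A100)      echo bundled; return ;;
            A10|A30|A40|A6000) echo extra; return ;;
        esac
    done
    echo unknown
}
```

Whole-token matching is deliberately strict: a name with a suffix, such as "A10G", is reported as unknown rather than assumed to be covered by the A10 model.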

Downloading the additional model files will create a sub-directory in your current directory, containing a single .tar.gz archive. Extract the contents of that archive:

$ ls -lF
drwx------  2 user user  4096 Aug 31 14:46 deepvariant_model_files_v4.0.0-1.extramodels/
$ cd deepvariant_model_files_v4.0.0-1.extramodels/
$ tar xvf deepvariant_extra_model_files_v4.0.0-1.tar.gz

Then copy the model file for your GPU into <INPUT_DIR>. Use

deepvariant.eng for WGS,
deepvariant_wes.eng for WES or
pacbio/deepvariant.eng for the PacBio model.
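
The mapping above can be captured in a small lookup for use in scripts. This is a sketch only: model_file_for and the keys wgs, wes, and pacbio are hypothetical names chosen for illustration, and the returned path is relative to wherever the files for your GPU were extracted.

```shell
# Hypothetical helper: print the model file name listed above for a sample type.
# The keys (wgs, wes, pacbio) are illustrative and are not pbrun options.
model_file_for() {
    case "$1" in
        wgs)    echo "deepvariant.eng" ;;
        wes)    echo "deepvariant_wes.eng" ;;
        pacbio) echo "pacbio/deepvariant.eng" ;;
        *)      echo "unknown sample type: $1" >&2; return 1 ;;
    esac
}
```

The printed name is the file to copy into <INPUT_DIR> and to reference with --pb-model-file.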

Tell pbrun to use the alternate model file by adding the --pb-model-file option:

# This command assumes all the inputs are in <INPUT_DIR> and all the outputs go to <OUTPUT_DIR>.
# The --pb-model-file line is the only addition to the Quick Start command.
$ docker run --rm --gpus all --volume <INPUT_DIR>:/workdir --volume <OUTPUT_DIR>:/outputdir \
    -w /workdir \
    nvcr.io/nvidia/clara/clara-parabricks:4.0.0-1 \
    pbrun deepvariant \
    --ref /workdir/${REFERENCE_FILE} \
    --in-bam /workdir/${INPUT_BAM} \
    --pb-model-file /workdir/<NAME_OF_MODEL_FILE> \
    --out-variants /outputdir/${OUTPUT_VCF}

deepvariant Reference

Run DeepVariant to convert BAM/CRAM to VCF.

Input/Output file options:

--ref REF
--in-bam IN_BAM
--interval-file INTERVAL_FILE
--out-variants OUT_VARIANTS
--pb-model-file PB_MODEL_FILE
--proposed-variants PROPOSED_VARIANTS

Tool Options:

--disable-use-window-selector-model
--gvcf
--norealign-reads
--sort-by-haplotypes
--keep-duplicates
--vsc-min-count-snps VSC_MIN_COUNT_SNPS
--vsc-min-count-indels VSC_MIN_COUNT_INDELS
--vsc-min-fraction-snps VSC_MIN_FRACTION_SNPS
--vsc-min-fraction-indels VSC_MIN_FRACTION_INDELS
--min-mapping-quality MIN_MAPPING_QUALITY
--min-base-quality MIN_BASE_QUALITY
--mode MODE
--alt-aligned-pileup ALT_ALIGNED_PILEUP
--variant-caller VARIANT_CALLER
--add-hp-channel
--parse-sam-aux-fields
--use-wes-model
--run-partition
--include-med-dp
--normalize-reads
--channel-insert-size
--no-channel-insert-size
--max-read-size-512
--prealign-helper-thread
--max-reads-per-partition MAX_READS_PER_PARTITION
--partition-size PARTITION_SIZE
--track-ref-reads
--phase-reads
-L INTERVAL, --interval INTERVAL

Common options:

--logfile LOGFILE
--tmp-dir TMP_DIR
--with-petagene-dir WITH_PETAGENE_DIR
--keep-tmp
--no-seccomp-override
--version

GPU options:

--num-gpus NUM_GPUS