deepvariant - NVIDIA Docs

Run the GPU-accelerated DeepVariant algorithm.

Parabricks accelerates Google DeepVariant to make extensive use of GPUs, completing a 30x WGS analysis in 25 minutes instead of hours. The Parabricks version of DeepVariant works like other familiar command-line tools: it takes a BAM file and a reference as inputs and produces variants as output.

Currently, DeepVariant supports T4, V100, and A100 GPUs out of the box. See the Models for additional GPUs section for details on other GPUs.

Note

Version 3.8 introduces the --run-partition option, which can lead to significant speedups. However, combining the three options --run-partition, --proposed-variants, and --gvcf would cause a substantial slowdown, so when all three are given, a warning is issued and the --run-partition option is ignored.
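
As an illustration of the note above, the following sketch assembles a pbrun deepvariant command and leaves out --run-partition when both --gvcf and --proposed-variants are requested. The function name run_deepvariant, the DRY_RUN switch, and the REF, BAM, and OUT overrides are assumptions made for this example and are not part of Parabricks; the default paths are taken from the Quick Start command.

```shell
# Hypothetical helper: build (and optionally run) a pbrun deepvariant command.
# Arguments: use_gvcf (yes/no), proposed_vcf (path or empty), use_partition (yes/no).
run_deepvariant() {
    dv_gvcf="$1"
    dv_proposed="$2"
    dv_partition="$3"

    # Start from the Quick Start arguments; REF, BAM, and OUT may override them.
    set -- deepvariant \
        --ref "${REF:-Ref/Homo_sapiens_assembly38.fasta}" \
        --in-bam "${BAM:-mark_dups_gpu.bam}" \
        --out-variants "${OUT:-output.vcf}"

    if [ "$dv_gvcf" = "yes" ]; then
        set -- "$@" --gvcf
    fi
    if [ -n "$dv_proposed" ]; then
        set -- "$@" --proposed-variants "$dv_proposed"
    fi

    if [ "$dv_partition" = "yes" ]; then
        if [ "$dv_gvcf" = "yes" ] && [ -n "$dv_proposed" ]; then
            # Per the note, pbrun would ignore --run-partition in this combination.
            echo "skipping --run-partition: --gvcf and --proposed-variants are both set" >&2
        else
            set -- "$@" --run-partition
        fi
    fi

    if [ "${DRY_RUN:-yes}" = "yes" ]; then
        echo "pbrun $*"    # print only; this is the default in this sketch
    else
        pbrun "$@"
    fi
}

# Prints the assembled command, including --gvcf and --run-partition.
run_deepvariant yes "" yes
```

Setting DRY_RUN=no makes the helper execute pbrun instead of printing the command.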

Quick Start

$ pbrun deepvariant \
    --ref Ref/Homo_sapiens_assembly38.fasta \
    --in-bam mark_dups_gpu.bam \
    --out-variants output.vcf

Compatible Google DeepVariant Commands

The commands below are the Google DeepVariant counterparts of the Parabricks command above, and their output will be identical to the output of that command. See the Output Comparison page for how to compare the results.

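# Assumes N_SHARDS, LOGDIR, BIN_VERSION, REF, BAM, OUTPUT_DIR,
# CALL_VARIANTS_OUTPUT, MODEL, and FINAL_OUTPUT_VCF are already set.
# Run make_examples in parallel, one task per shard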
seq 0 $((N_SHARDS-1)) | \
parallel --eta --halt 2 --joblog "${LOGDIR}/log" --res "${LOGDIR}" \
sudo docker run \
-v ${HOME}:${HOME} \
gcr.io/deepvariant-docker/deepvariant:"${BIN_VERSION}" \
/opt/deepvariant/bin/make_examples \
--mode calling \
--ref "${REF}" \
--reads "${BAM}" \
--examples "${OUTPUT_DIR}/examples.tfrecord@${N_SHARDS}.gz" \
--task {}

# Run call_variants on the sharded examples
sudo docker run \
-v ${HOME}:${HOME} \
gcr.io/deepvariant-docker/deepvariant:"${BIN_VERSION}" \
/opt/deepvariant/bin/call_variants \
--outfile "${CALL_VARIANTS_OUTPUT}" \
--examples "${OUTPUT_DIR}/examples.tfrecord@${N_SHARDS}.gz" \
--checkpoint "${MODEL}"

# Run postprocess_variants to write the final VCF
sudo docker run \
-v ${HOME}:${HOME} \
gcr.io/deepvariant-docker/deepvariant:"${BIN_VERSION}" \
/opt/deepvariant/bin/postprocess_variants \
--ref "${REF}" \
--infile "${CALL_VARIANTS_OUTPUT}" \
--outfile "${FINAL_OUTPUT_VCF}"
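
One rough way to check that the two pipelines agree is to compare the VCF records while ignoring the "##" meta-information lines, which can differ (command line, tool version) even when the calls match. The sketch below is only an illustration under that assumption; compare_vcf_records is a hypothetical helper name, and the Output Comparison page may describe a different procedure.

```shell
# Hypothetical helper: compare two uncompressed VCF files, ignoring "##" lines.
# Prints "records identical" (exit 0) or "records differ" (exit 1).
compare_vcf_records() {
    cmp_a=$(mktemp) && cmp_b=$(mktemp) || return 2

    # Keep the "#CHROM" column header and all variant records.
    grep -v '^##' "$1" > "$cmp_a" || true
    grep -v '^##' "$2" > "$cmp_b" || true

    if diff -q "$cmp_a" "$cmp_b" > /dev/null; then
        echo "records identical"
        cmp_status=0
    else
        echo "records differ"
        cmp_status=1
    fi

    rm -f "$cmp_a" "$cmp_b"
    return "$cmp_status"
}

# Usage (paths from the commands above):
#   compare_vcf_records output.vcf "${FINAL_OUTPUT_VCF}"
```

Because the "#CHROM" line is kept, differing sample names in the two files would also be reported as a difference.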

Models for additional GPUs

Parabricks DeepVariant supports the following models:

DeepVariant WGS
DeepVariant WES
DeepTrio
    1. Parent
    2. Child

DeepVariant models for T4, V100, and A100 GPUs ship with the software. Additional models for A10, A30, A40, and A6000 GPUs can be downloaded from this NGC resource.
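
The GPU lists above can be turned into a quick check of whether a given GPU uses a bundled model or needs a downloaded one. The sketch below is an assumption-laden illustration: gpu_model_source is a hypothetical helper, the matching is a loose substring test on the name reported by nvidia-smi, and passing the downloaded file through --pb-model-file is assumed rather than confirmed by this page.

```shell
# Hypothetical helper: classify a GPU name as "bundled", "download", or "unknown".
gpu_model_source() {
    case "$1" in
        # A100 must be tested before A10, since "A100" also contains "A10".
        *T4*|*V100*|*A100*)        echo "bundled" ;;
        *A10*|*A30*|*A40*|*A6000*) echo "download" ;;
        *)                         echo "unknown" ;;
    esac
}

# Query the first GPU only when nvidia-smi is available on this machine.
if command -v nvidia-smi > /dev/null 2>&1; then
    gpu_name=$(nvidia-smi --query-gpu=name --format=csv,noheader 2>/dev/null | head -n 1)
    case "$(gpu_model_source "$gpu_name")" in
        bundled)  echo "$gpu_name: model ships with the software" ;;
        download) echo "$gpu_name: download a model (assumed to be passed via --pb-model-file)" ;;
        *)        echo "$gpu_name: not listed on this page" ;;
    esac
fi
```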

deepvariant Reference

Run DeepVariant to convert BAM/CRAM to VCF.

Input/Output file options:

--ref REF
--in-bam IN_BAM
--interval-file INTERVAL_FILE
--out-variants OUT_VARIANTS
--pb-model-file PB_MODEL_FILE
--proposed-variants PROPOSED_VARIANTS

Tool options:

--disable-use-window-selector-model
--gvcf
--norealign-reads
--sort-by-haplotypes
--keep-duplicates
--vsc-min-count-snps VSC_MIN_COUNT_SNPS
--vsc-min-count-indels VSC_MIN_COUNT_INDELS
--vsc-min-fraction-snps VSC_MIN_FRACTION_SNPS
--vsc-min-fraction-indels VSC_MIN_FRACTION_INDELS
--min-mapping-quality MIN_MAPPING_QUALITY
--min-base-quality MIN_BASE_QUALITY
--mode MODE
--alt-aligned-pileup ALT_ALIGNED_PILEUP
--variant-caller VARIANT_CALLER
--add-hp-channel
--parse-sam-aux-fields
--use-hp-information
--use-wes-model
--run-partition
-L INTERVAL, --interval INTERVAL

Common options:

--logfile LOGFILE
--tmp-dir TMP_DIR
--with-petagene-dir WITH_PETAGENE_DIR
--keep-tmp
--license-file LICENSE_FILE
--no-seccomp-override
--version

GPU options:

--num-gpus NUM_GPUS
--gpu-devices GPU_DEVICES
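
To show how options from the groups above might be combined, here is a minimal sketch that prints a gVCF run with a log file, a temporary directory, and a fixed GPU count. print_deepvariant_cmd is a hypothetical helper, every path is a placeholder, and the assumption that --num-gpus takes a positive integer is not stated on this page.

```shell
# Hypothetical helper: print a pbrun deepvariant command for a given GPU count.
print_deepvariant_cmd() {
    pd_num_gpus="$1"

    # Assumption: --num-gpus expects a positive integer.
    case "$pd_num_gpus" in
        ''|*[!0-9]*|0)
            echo "num_gpus must be a positive integer" >&2
            return 1
            ;;
    esac

    # All file and directory names below are placeholders.
    cat <<EOF
pbrun deepvariant \\
    --ref Ref/Homo_sapiens_assembly38.fasta \\
    --in-bam mark_dups_gpu.bam \\
    --out-variants output.g.vcf \\
    --gvcf \\
    --logfile deepvariant.log \\
    --tmp-dir ./tmp \\
    --num-gpus $pd_num_gpus
EOF
}

# Print the command for a two-GPU run; copy the output into a shell to execute it.
print_deepvariant_cmd 2
```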