deeptrio - NVIDIA Docs

Run GPU-accelerated DeepTrio for calling de novo variants.

Parabricks has accelerated Google DeepTrio to make extensive use of GPUs. The Parabricks version works like other command-line tools that users are familiar with: it takes two or three BAM files and a reference as inputs and produces both VCF and gVCF files as outputs.

Currently, DeepVariant is supported on T4, V100, and A100 GPUs only.
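
The GPU restriction above can be checked before launching a run. The sketch below is illustrative only: `is_supported_gpu` is a hypothetical helper, not part of Parabricks, and it assumes that the name reported by `nvidia-smi` contains the model string (`T4`, `V100`, or `A100`) as a separate token.

```shell
# Hypothetical helper: succeed if the GPU name contains T4, V100, or A100
# as a standalone token (so "Quadro T400" does not count as a T4).
is_supported_gpu() {
    printf '%s\n' "$1" |
        grep -Eq '(^|[^[:alnum:]])(T4|V100|A100)([^[:alnum:]]|$)'
}

# List the installed GPUs, but only when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name --format=csv,noheader |
        while IFS= read -r name; do
            if is_supported_gpu "$name"; then
                echo "supported:   $name"
            else
                echo "unsupported: $name"
            fi
        done
fi
```

The name match is a heuristic; the list of supported GPUs above remains the authority.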

Quick Start

$ pbrun deeptrio \
    --ref Ref/Homo_sapiens_assembly38.fasta \
    --in-bam-child child.bam \
    --in-bam-parent1 parent1.bam \
    --in-bam-parent2 parent2.bam \
    --sample-name-child sample_child \
    --sample-name-parent1 sample_parent1 \
    --sample-name-parent2 sample_parent2 \
    --out-variants-child child.vcf \
    --out-variants-parent1 parent1.vcf \
    --out-variants-parent2 parent2.vcf \
    --out-variants-gvcf-child child.g.vcf.gz \
    --out-variants-gvcf-parent1 parent1.g.vcf.gz \
    --out-variants-gvcf-parent2 parent2.g.vcf.gz
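
Because the command takes several input paths, it can help to confirm that they exist before starting a GPU job. The following is a minimal sketch: `require_files` is a hypothetical helper, not a Parabricks feature, and it checks only that each path is readable, not that the files are valid or indexed.

```shell
# Hypothetical helper: report every missing or unreadable path on stderr
# and return non-zero if at least one path failed the check.
require_files() {
    missing=0
    for path in "$@"; do
        if [ ! -r "$path" ]; then
            echo "missing or unreadable: $path" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example with the Quick Start paths:
#   require_files Ref/Homo_sapiens_assembly38.fasta \
#       child.bam parent1.bam parent2.bam && echo "inputs look present"
```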

Compatible Google DeepVariant Commands

The commands below are the Google counterpart of the Parabricks command above. The output from these commands will be identical to the output from the above command. See the Output Comparison page for comparing the results.

mkdir -p output
mkdir -p output/intermediate_results_dir

BIN_VERSION=1.1.0

time sudo docker run \
-v "${PWD}/input":"/input"   \
-v "${PWD}/output":"/output"  \
-v "${PWD}/reference":"/reference" \
google/deepvariant:deeptrio-"${BIN_VERSION}" \
/opt/deepvariant/bin/deeptrio/run_deeptrio \
--model_type WGS \
--ref /reference/Homo_sapiens_assembly38.fasta \
--reads_child /input/child.bam \
--reads_parent1 /input/parent1.bam \
--reads_parent2 /input/parent2.bam \
--output_vcf_child /output/child.vcf \
--output_vcf_parent1 /output/parent1.vcf \
--output_vcf_parent2 /output/parent2.vcf \
--sample_name_child sample_child \
--sample_name_parent1 sample_parent1 \
--sample_name_parent2 sample_parent2 \
--num_shards $(nproc)  \
--intermediate_results_dir /output/intermediate_results_dir \
--output_gvcf_child /output/child.g.vcf.gz \
--output_gvcf_parent1 /output/parent1.g.vcf.gz \
--output_gvcf_parent2 /output/parent2.g.vcf.gz \
--make_examples_extra_args "ws_use_window_selector_model=True"
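
One simple way to spot-check that the two pipelines agree is to compare the VCF records while ignoring the `##` meta lines, which typically differ because they record tool versions and command lines. This is a rough sketch and not the procedure from the Output Comparison page; `same_vcf_records` is a hypothetical helper and it assumes uncompressed VCF files.

```shell
# Hypothetical helper: return 0 when two uncompressed VCF files have the
# same content once "##" meta lines are dropped. The "#CHROM" column
# header is kept, so sample names are compared as well.
same_vcf_records() {
    a=$(mktemp) && b=$(mktemp) || return 2
    grep -v '^##' "$1" > "$a"
    grep -v '^##' "$2" > "$b"
    cmp -s "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}

# Example with the output paths used above:
#   same_vcf_records child.vcf output/child.vcf && echo "records match"
```

A byte-level comparison is strict: it reports a mismatch for any difference in record order or formatting, even when the called variants are the same.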

deeptrio Reference

Run DeepTrio on 3 samples for de novo variant detection.

Input/Output file options

--ref REF
--in-bam-child IN_BAM_CHILD
--in-bam-parent1 IN_BAM_PARENT1
--in-bam-parent2 IN_BAM_PARENT2
--interval-file INTERVAL_FILE
--out-variants-child OUT_VARIANTS_CHILD
--out-variants-parent1 OUT_VARIANTS_PARENT1
--out-variants-parent2 OUT_VARIANTS_PARENT2
--out-variants-gvcf-child OUT_VARIANTS_GVCF_CHILD
--out-variants-gvcf-parent1 OUT_VARIANTS_GVCF_PARENT1
--out-variants-gvcf-parent2 OUT_VARIANTS_GVCF_PARENT2
--pb-model-file-child PB_MODEL_FILE_CHILD
--pb-model-file-parent PB_MODEL_FILE_PARENT

Options specific to this tool

--disable-use-window-selector-model
--keep-duplicates
--vsc-min-count-snps VSC_MIN_COUNT_SNPS
--vsc-min-count-indels VSC_MIN_COUNT_INDELS
--vsc-min-fraction-snps VSC_MIN_FRACTION_SNPS
--vsc-min-fraction-indels VSC_MIN_FRACTION_INDELS
--min-mapping-quality MIN_MAPPING_QUALITY
--min-base-quality MIN_BASE_QUALITY
--sample-name-child SAMPLE_NAME_CHILD
--sample-name-parent1 SAMPLE_NAME_PARENT1
--sample-name-parent2 SAMPLE_NAME_PARENT2
--mode MODE
-L INTERVAL, --interval INTERVAL

Common options

--logfile LOGFILE
--tmp-dir TMP_DIR
--with-petagene-dir WITH_PETAGENE_DIR
--keep-tmp
--license-file LICENSE_FILE
--no-seccomp-override
--version

GPU options

--num-gpus NUM_GPUS
--gpu-devices GPU_DEVICES