VARIANT CALLERS - NVIDIA Docs

NVIDIA Clara Parabricks Pipelines accelerated variant callers

HAPLOTYPECALLER

GPU accelerated haplotypecaller.

This tool runs the GPU-accelerated haplotypecaller. Users can optionally provide a BQSR report, which is applied to the BAM in the same way as ApplyBQSR. In that case the updated base qualities are used for variant calling.

QUICK START

$ pbrun haplotypecaller --ref Ref/Homo_sapiens_assembly38.fasta \
--in-bam mark_dups_gpu.bam \
--in-recal-file recal_gpu.txt \
--out-variants result.vcf

The commands below are the GATK4 counterpart of the Parabricks command above. Their output is exactly the same as the output of the Parabricks command. See the Output Comparison page for how to compare the results.

# Run the ApplyBQSR step
$ gatk ApplyBQSR --java-options -Xmx30g -R Ref/Homo_sapiens_assembly38.fasta \
-I mark_dups_cpu.bam --bqsr-recal-file recal_file.txt -O cpu_nodups_BQSR.bam

# Run HaplotypeCaller on the recalibrated BAM
$ gatk HaplotypeCaller --java-options -Xmx30g --input cpu_nodups_BQSR.bam \
--output result_cpu.vcf --reference Ref/Homo_sapiens_assembly38.fasta \
--native-pair-hmm-threads 16
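
As a rough first check before following the Output Comparison page, the variant records of the two VCF files can be compared while ignoring the "#" header lines, which usually differ because they record the tool and its command line. The helper name compare_vcf_records is hypothetical and is not part of Parabricks or GATK. This is a minimal sketch, not the procedure the Output Comparison page describes.

```shell
# Hypothetical helper: compare the variant records of two VCF files,
# skipping "#" header lines. Prints "identical" or "different".
compare_vcf_records() {
    first_vcf=$1
    second_vcf=$2
    # Both inputs must be readable, uncompressed VCF files.
    [ -r "$first_vcf" ] && [ -r "$second_vcf" ] || return 2
    tmp_dir=$(mktemp -d) || return 2
    # grep exits non-zero when a file has no records, so tolerate that.
    grep -v '^#' "$first_vcf" > "$tmp_dir/first.records" || true
    grep -v '^#' "$second_vcf" > "$tmp_dir/second.records" || true
    if cmp -s "$tmp_dir/first.records" "$tmp_dir/second.records"; then
        echo "identical"
        status=0
    else
        echo "different"
        status=1
    fi
    rm -rf "$tmp_dir"
    return "$status"
}
```

For the files produced above, the call would be compare_vcf_records result.vcf result_cpu.vcf.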

OPTIONS

--ref
--in-bam
--out-variants
--in-recal-file
--haplotypecaller-options
--static-quantized-quals
--ploidy
--interval-file
--interval
--interval-padding
--gvcf
--batch
--disable-read-filter
--max-alternate-alleles
--annotation-group
--gvcf-gq-bands
--tmp-dir
--num-gpus
--gpu-devices

MUTECTCALLER

GPU accelerated mutect2.

mutectcaller supports tumor-only or tumor-normal variant calling. The figure below shows the high-level functionality of mutectcaller. All dotted boxes are optional, with some constraints.

QUICK START

$ pbrun mutectcaller --ref Ref/Homo_sapiens_assembly38.fasta \
--in-tumor-bam tumor.bam \
--tumor-name foobar \
--out-vcf output.vcf

COMPATIBLE GATK4 COMMAND

The command below is the GATK4 counterpart of the Parabricks command above. Its output is exactly the same as the output of the Parabricks command. See the Output Comparison page for how to compare the results.

$ gatk Mutect2 -R Ref/Homo_sapiens_assembly38.fasta --input tumor.bam --tumor-sample foobar --output result.vcf

OPTIONS

--ref
--in-tumor-bam
--tumor-name
--out-vcf
--in-tumor-recal-file
--in-normal-bam
--in-normal-recal-file
--normal-name
--ploidy
--interval-file
--interval
--interval-padding
--tmp-dir
--num-gpus
--gpu-devices
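
The quick start covers tumor-only calling. For tumor-normal calling, the --in-normal-bam and --normal-name options from the list above would be added. The sketch below prints such a command line rather than running it. The function name mutectcaller_cmd, its argument order, and the sample file names in the usage note are assumptions made for illustration.

```shell
# Hypothetical helper: print (do not run) a pbrun mutectcaller command.
# Arguments: reference, tumor BAM, tumor sample name, output VCF,
# then optionally a normal BAM and the normal sample name.
mutectcaller_cmd() {
    ref=$1
    tumor_bam=$2
    tumor_name=$3
    out_vcf=$4
    normal_bam=${5:-}
    normal_name=${6:-}
    set -- pbrun mutectcaller --ref "$ref" \
        --in-tumor-bam "$tumor_bam" --tumor-name "$tumor_name"
    # Tumor-normal mode: add the normal sample only when it was supplied.
    if [ -n "$normal_bam" ]; then
        set -- "$@" --in-normal-bam "$normal_bam" --normal-name "$normal_name"
    fi
    set -- "$@" --out-vcf "$out_vcf"
    echo "$*"
}
```

For example, mutectcaller_cmd Ref/Homo_sapiens_assembly38.fasta tumor.bam foobar output.vcf normal.bam foobar_normal prints a tumor-normal command, where normal.bam and foobar_normal are placeholder names.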

DEEPVARIANT

Run the GPU-accelerated DeepVariant algorithm.

Parabricks has accelerated Google DeepVariant to make extensive use of GPUs and finish a 30x WGS analysis in 25 minutes. The Parabricks flavor of DeepVariant behaves more like the other command-line tools that users are familiar with: it takes a BAM and a reference as inputs and produces variants as output. Future versions will allow users to choose the exact model to use.

QUICK START

$ pbrun deepvariant --ref Ref/Homo_sapiens_assembly38.fasta \
--in-bam mark_dups_gpu.bam \
--out-variants output.vcf

COMPATIBLE GOOGLE DEEPVARIANT COMMANDS

The commands below are the Google DeepVariant counterpart of the Parabricks command above. Their output is exactly the same as the output of the Parabricks command. See the Output Comparison page for how to compare the results.

# Run make_examples in parallel
seq 0 $((N_SHARDS-1)) | \
parallel --eta --halt 2 --joblog "${LOGDIR}/log" --res "${LOGDIR}" \
sudo docker run \
-v ${HOME}:${HOME} \
gcr.io/deepvariant-docker/deepvariant:"${BIN_VERSION}" \
/opt/deepvariant/bin/make_examples \
--mode calling \
--ref "${REF}" \
--reads "${BAM}" \
--examples "${OUTPUT_DIR}/examples.tfrecord@${N_SHARDS}.gz" \
--regions '"chr20:10,000,000-10,010,000"' \
--task {}

# Run call_variants
sudo docker run \
-v ${HOME}:${HOME} \
gcr.io/deepvariant-docker/deepvariant:"${BIN_VERSION}" \
/opt/deepvariant/bin/call_variants \
--outfile "${CALL_VARIANTS_OUTPUT}" \
--examples "${OUTPUT_DIR}/examples.tfrecord@${N_SHARDS}.gz" \
--checkpoint "${MODEL}"

# Run postprocess_variants
sudo docker run \
-v ${HOME}:${HOME} \
gcr.io/deepvariant-docker/deepvariant:"${BIN_VERSION}" \
/opt/deepvariant/bin/postprocess_variants \
--ref "${REF}" \
--infile "${CALL_VARIANTS_OUTPUT}" \
--outfile "${FINAL_OUTPUT_VCF}"
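
The commands above rely on shell variables that are not defined on this page. The sketch below shows one way they might be set before running the three steps. The variable names come from the commands. Every value, the helper name setup_deepvariant_env, and the directory layout are placeholder assumptions to adapt to your own data and DeepVariant version.

```shell
# Hypothetical helper: define the variables used by the DeepVariant commands.
# All values are placeholders, not settings taken from the documentation.
setup_deepvariant_env() {
    work_dir=$1
    BIN_VERSION="0.10.0"                      # example image tag; pick your own
    N_SHARDS=$(getconf _NPROCESSORS_ONLN)     # one make_examples shard per CPU core
    REF="${work_dir}/Ref/Homo_sapiens_assembly38.fasta"
    BAM="${work_dir}/mark_dups_gpu.bam"
    MODEL="${work_dir}/model/model.ckpt"      # placeholder checkpoint location
    OUTPUT_DIR="${work_dir}/output"
    LOGDIR="${work_dir}/logs"
    CALL_VARIANTS_OUTPUT="${OUTPUT_DIR}/call_variants_output.tfrecord.gz"
    FINAL_OUTPUT_VCF="${OUTPUT_DIR}/output.vcf"
    # Create the directories that the commands write into.
    mkdir -p "${OUTPUT_DIR}" "${LOGDIR}"
}
```

Because the docker commands mount ${HOME} into the container, the working directory passed to the helper would need to live under ${HOME} for these paths to be visible inside it.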

OPTIONS

--ref
--in-bam
--out-variants
--pb-model-file
--interval-file
--interval
--interval-padding
--disable-use-window-selector-model
--gvcf
--tmp-dir
--num-gpus
--gpu-devices

CNVKIT

CPU accelerated copy number variant calling.

Run CNVkit with accelerated coverage calculation from read depths. CNVkit is not available as part of the free-for-COVID-19 program.

QUICK START

$ pbrun cnvkit --ref Ref/Homo_sapiens_assembly38.fasta \
--in-bam mark_dups_gpu.bam \
--out-file output.vcf

OPTIONS

--ref
--in-bam
--out-file
--cnvkit-options
--generate-vcf