VARIANT PROCESSING OVERVIEW

ACCELERATED VARIANT MANIPULATION METHOD

GPU-accelerated CNNScoreVariants

Generate variant scores using a Convolutional Neural Network.

QUICK START

$ pbrun cnnscorevariants --ref Ref.fa \
    --in-bam sample.bam \
    --in-vcf sample.vcf \
    --out-vcf output.vcf


COMPATIBLE GATK4 COMMAND

gatk CNNScoreVariants -R Ref.fa \
    -I sample.bam \
    -V sample.vcf \
    -O output.vcf \
    --tensor-type read_tensor


POST-ANALYSIS FILTERING

CNNScoreVariants adds an INFO field named CNN_2D to each variant. To turn these scores into filters, run the GATK4 tool FilterVariantTranches on the CNNScoreVariants output.
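As a minimal sketch of that follow-up step, the function below wraps a GATK4 FilterVariantTranches call keyed on CNN_2D. The function name, the resource files hapmap.vcf.gz and mills.vcf.gz, and the tranche thresholds are placeholders chosen for illustration; substitute the truth resources and cutoffs that suit your data.

```shell
# Sketch: apply tranche filters to a CNN-scored VCF using the CNN_2D INFO key.
# hapmap.vcf.gz and mills.vcf.gz are placeholder resource files, and the
# tranche values are illustrative, not recommendations.
filter_cnn_tranches() {
    in_vcf=$1    # output of cnnscorevariants
    out_vcf=$2   # filtered VCF to write
    gatk FilterVariantTranches \
        -V "$in_vcf" \
        --resource hapmap.vcf.gz \
        --resource mills.vcf.gz \
        --info-key CNN_2D \
        --snp-tranche 99.95 \
        --indel-tranche 99.4 \
        -O "$out_vcf"
}
```

For example, `filter_cnn_tranches output.vcf filtered.vcf` would filter the file produced by the quick start command above.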

OPTIONS

--ref

(required) Path to the reference file.

--in-bam

(required) Path to the input bam file.

--in-vcf

(required) Path to the input vcf file.

--out-vcf

(required) Path to the output vcf file.

--pb-model-file

Path to a non-default Parabricks model file for cnnscorevariants.

--num-gpus

Defaults to the number of GPUs in the system.

Number of GPUs to use for a run.

--gpu-devices

Which GPU devices to use for a run. By default, all GPU devices will be used. To set specific GPU devices, enter a comma-separated list of GPU device numbers.
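To pin a run to particular GPUs, the device list can be checked before it is passed on. The sketch below is one way to do that; the function name and the validation rule are assumptions made for illustration, and the file names are those from the quick start.

```shell
# Sketch: run cnnscorevariants on a chosen set of GPUs.
# The argument is a comma-separated list of device numbers, e.g. "0,1".
score_on_devices() {
    devices=$1
    # Reject an empty list or anything containing characters other than
    # digits and commas.
    case $devices in
        ''|*[!0-9,]*)
            echo "expected a comma-separated list of GPU numbers, got '$devices'" >&2
            return 1 ;;
    esac
    pbrun cnnscorevariants \
        --ref Ref.fa \
        --in-bam sample.bam \
        --in-vcf sample.vcf \
        --out-vcf output.vcf \
        --gpu-devices "$devices"
}
```

Calling `score_on_devices 0,1` would restrict the run to devices 0 and 1.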

Accelerated variant filtration based on conditions

Filter a vcf using a boolean expression.

QUICK START

$ pbrun variantfiltration --in-vcf sample.vcf \
    --out-file output.vcf \
    --expression "QD < 2.0 || ReadPosRankSum < -20.0" \
    --filter-name FILTER


COMPATIBLE GATK4 COMMAND

gatk VariantFiltration -V sample.vcf \
    -O output.vcf \
    --filter-expression "QD < 2.0 || ReadPosRankSum < -20.0" \
    --filter-name FILTER


OPTIONS

--in-vcf

(required) Path to the input vcf file.

--out-file

(required) Path to the output variants file with an extension of either ‘.vcf’ or ‘.csv’.

--expression

(required) Boolean expression for filtering variants.

--filter-name

(required) Name written to the FILTER field of variants that match the filter expression.

--mode

Defaults to BOTH.

Type of variants to include in the filter. Possible values are SNP, INDEL, or BOTH.
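The options above can be combined to restrict a filter to one variant type. The following is a sketch only: the function name, the LowQD filter name, the single-condition expression, and the output naming scheme are illustrative choices, not part of the tool.

```shell
# Sketch: apply a QD filter to one variant type at a time.
# The argument is one of the documented --mode values: SNP, INDEL, or BOTH.
filter_by_type() {
    mode=$1
    case $mode in
        SNP|INDEL|BOTH) ;;
        *) echo "mode must be SNP, INDEL, or BOTH" >&2; return 1 ;;
    esac
    # Quote the expression so the shell does not treat "<" as a redirection.
    pbrun variantfiltration \
        --in-vcf sample.vcf \
        --out-file "output.$mode.vcf" \
        --expression "QD < 2.0" \
        --filter-name LowQD \
        --mode "$mode"
}
```

For instance, `filter_by_type SNP` would write output.SNP.vcf with the filter applied to SNPs only.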

Accelerated variant filtration using VQSR

Build a recalibration model to score variant quality and apply a score cutoff to filter variants.

QUICK START

$ pbrun vqsr --in-vcf sample.vcf \
    --out-vcf output.vcf \
    --out-recal output.recal \
    --out-tranches output.tranches \
    --resource omni,known=false,training=true,truth=true,prior=12.0:1000G_omni2.5.hg38.vcf \
    --annotation QD --annotation MQ --annotation MQRankSum --annotation ReadPosRankSum


COMPATIBLE GATK4 COMMAND

gatk VariantRecalibrator -V sample.vcf \
    -O output.recal \
    --tranches-file output.tranches \
    --resource omni,known=false,training=true,truth=true,prior=12.0:1000G_omni2.5.hg38.vcf \
    -an QD -an MQ -an MQRankSum -an ReadPosRankSum \
    --mode BOTH

gatk ApplyVQSR -V sample.vcf \
    --recal-file output.recal \
    --tranches-file output.tranches \
    -O output.vcf \
    --mode BOTH


OPTIONS

--in-vcf

(required) Path to the input vcf file.

--out-vcf

(required) Path to the output vcf file.

--out-recal

(required) Path to the output recal file.

--out-tranches

(required) Path to the output tranches file.

--resource

(required) Known, truth, and training sets. The format string is

<set name>,known=<boolean>,training=<boolean>,truth=<boolean>,prior=<float>:<path to the vcf file>.

There must be at least one resource that is training and one resource that is truth. Any resource can be both. This option can be used multiple times.
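Because the resource string packs six values into one argument, it can help to assemble it with a small helper. The sketch below does that and passes two resources to a single run. Both function names are hypothetical, and the second resource (hapmap, prior 15.0, hapmap.hg38.vcf) is a placeholder added only to show the option being repeated.

```shell
# Sketch: build a --resource value in the documented format
#   <set name>,known=<boolean>,training=<boolean>,truth=<boolean>,prior=<float>:<path>
# Arguments: set name, known, training, truth, prior, path to the VCF file.
vqsr_resource() {
    printf '%s,known=%s,training=%s,truth=%s,prior=%s:%s\n' \
        "$1" "$2" "$3" "$4" "$5" "$6"
}

run_vqsr() {
    omni=$(vqsr_resource omni false true true 12.0 1000G_omni2.5.hg38.vcf)
    # Placeholder second resource, marked as both training and truth.
    hapmap=$(vqsr_resource hapmap false true true 15.0 hapmap.hg38.vcf)
    pbrun vqsr \
        --in-vcf sample.vcf \
        --out-vcf output.vcf \
        --out-recal output.recal \
        --out-tranches output.tranches \
        --resource "$omni" \
        --resource "$hapmap" \
        --annotation QD --annotation MQ
}
```

With the arguments shown, `vqsr_resource` reproduces the omni resource string used in the quick start command.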

--annotation

(required) Annotation which should be used for calculations. This option can be used multiple times.

--mode

Defaults to BOTH.

Type of variants to include in the recalibration. Possible values are SNP, INDEL, or BOTH.

--max-gaussians

Defaults to 8.

Max number of Gaussians for the positive model.

--truth-sensitivity-level

The truth sensitivity level at which to start filtering.

--lod-score-cutoff

The VQSLOD score below which to start filtering.
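The optional settings can be added to the quick start command as in the sketch below. The function name is hypothetical, and the values are assumptions for illustration: the sensitivity level is written as a percentage such as 99.0 on the assumption that it follows the convention of GATK ApplyVQSR, and 6 is an arbitrary value for --max-gaussians. Neither is a recommendation.

```shell
# Sketch: SNP-only recalibration with a truth sensitivity cutoff.
# The level is ASSUMED to be a percentage (e.g. 99.0), mirroring GATK ApplyVQSR;
# the text above does not specify the expected format.
vqsr_with_sensitivity() {
    level=$1
    pbrun vqsr \
        --in-vcf sample.vcf \
        --out-vcf output.vcf \
        --out-recal output.recal \
        --out-tranches output.tranches \
        --resource omni,known=false,training=true,truth=true,prior=12.0:1000G_omni2.5.hg38.vcf \
        --annotation QD --annotation MQ \
        --mode SNP \
        --max-gaussians 6 \
        --truth-sensitivity-level "$level"
}
```

A call such as `vqsr_with_sensitivity 99.0` would then run the recalibration with that cutoff.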

© Copyright 2020, NVIDIA. Last updated on Sep 22, 2020.