vqsr
Accelerated variant filtration using VQSR.
Build a recalibration model to score variant quality and apply a score cutoff to filter variants.
$ pbrun vqsr \
--in-vcf sample.vcf \
--out-vcf output.vcf \
--out-recal output.recal \
--out-tranches output.tranches \
--resource omni,known=false,training=true,truth=true,prior=12.0:1000G_omni2.5.hg38.vcf \
--annotation QD \
--annotation MQ \
--annotation MQRankSum \
--annotation ReadPosRankSum
The corresponding GATK4 commands:
$ gatk VariantRecalibrator -V sample.vcf \
-O output.recal \
--tranches-file output.tranches \
--resource omni,known=false,training=true,truth=true,prior=12.0:1000G_omni2.5.hg38.vcf \
-an QD -an MQ -an MQRankSum -an ReadPosRankSum \
--mode BOTH
$ gatk ApplyVQSR -V sample.vcf \
--recal-file output.recal \
--tranches-file output.tranches \
-O output.vcf \
--mode BOTH
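Once filtering has run, it can be useful to see how many records passed and how many were filtered. The following is a minimal sketch, assuming the output is an uncompressed, tab-delimited VCF whose seventh column is FILTER, as defined by the VCF specification. The function name `count_filters` is hypothetical and not part of pbrun or GATK, and the filter labels it reports are whatever the tool wrote to the file.

```shell
# Hypothetical helper: tally the FILTER column (column 7) of a VCF.
# Header lines, which start with '#', are skipped.
count_filters() {
    awk -F'\t' '!/^#/ { n[$7]++ } END { for (f in n) print f "\t" n[f] }' "$1" | sort
}
```

Calling `count_filters output.vcf` prints one line per FILTER value followed by its record count.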
Input/Output file options
- --in-vcf IN_VCF
  Path to the input VCF file. (default: None)
  Option is required.
- --out-recal OUT_RECAL
  Path to the output recal file. (default: None)
  Option is required.
- --out-tranches OUT_TRANCHES
  Path to the output tranches file. (default: None)
  Option is required.
- -r RESOURCE [RESOURCE ...], --resource RESOURCE [RESOURCE ...]
  Known, truth, and training sets. The format string is "[set name],known=[boolean],training=[boolean],truth=[boolean],prior=[float]:[path to the VCF file]". There must be at least one resource that is training and one resource that is truth. Any resource can be both (e.g. "--resource omni,known=false,training=true,truth=true,prior=12.0:1000G_omni2.5.hg38.vcf"). (default: None)
  Option is required.
- --out-vcf OUT_VCF
  Path to the output VCF file. (default: None)
  Option is required.
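Because the --resource value packs six fields into one string, a stray space or a misspelled boolean is easy to introduce. The sketch below assembles the string from its parts and rejects booleans other than true or false. The function `make_resource` is a hypothetical helper and not part of pbrun; it does not check the prior value or whether the file exists.

```shell
# Hypothetical helper: build a --resource value in the documented format
# "[set name],known=[boolean],training=[boolean],truth=[boolean],prior=[float]:[path]".
make_resource() {
    local name=$1 known=$2 training=$3 truth=$4 prior=$5 path=$6
    local flag
    # Only the literal strings "true" and "false" are accepted for the booleans.
    for flag in "$known" "$training" "$truth"; do
        case $flag in
            true|false) ;;
            *) echo "make_resource: expected true or false, got '$flag'" >&2
               return 1 ;;
        esac
    done
    printf '%s,known=%s,training=%s,truth=%s,prior=%s:%s\n' \
        "$name" "$known" "$training" "$truth" "$prior" "$path"
}

# Reproduces the resource string used in the example command.
make_resource omni false true true 12.0 1000G_omni2.5.hg38.vcf
```

The result can be passed directly, for example `--resource "$(make_resource omni false true true 12.0 1000G_omni2.5.hg38.vcf)"`.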
Tool Options:
- -a ANNOTATION [ANNOTATION ...], --annotation ANNOTATION [ANNOTATION ...]
  Annotation which should be used for calculations (e.g. "-a QD"). (default: None)
  Option is required.
- -m MODE, --mode MODE
  Type of variants to include in the recalibration. Possible values are {SNP, INDEL, BOTH}. (default: BOTH)
- -g MAX_GAUSSIANS, --max-gaussians MAX_GAUSSIANS
  Max number of Gaussians for the positive model. (default: 8)
- -t TRUTH_SENSITIVITY_LEVEL, --truth-sensitivity-level TRUTH_SENSITIVITY_LEVEL
  The truth sensitivity level at which to start filtering. (default: None)
- -l LOD_SCORE_CUTOFF, --lod-score-cutoff LOD_SCORE_CUTOFF
  The VQSLOD score below which to start filtering. (default: None)
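When the annotation list is kept in a script, the repeated --annotation flags can be generated rather than typed out. The sketch below shows one way to do this and prints the resulting command as a dry run instead of executing it. The function `annotation_args` is a hypothetical helper and not part of pbrun, and --mode SNP is chosen only to illustrate one of the documented values.

```shell
# Hypothetical helper (not part of pbrun): expand a list of annotation names
# into repeated --annotation flags.
annotation_args() {
    local out="" a
    for a in "$@"; do
        out="$out --annotation $a"
    done
    # Drop the leading space left by the first iteration.
    printf '%s\n' "${out# }"
}

# Dry run: echo prints the assembled command instead of executing it.
# The unquoted $(...) is deliberate so that each flag and name becomes its own word.
echo pbrun vqsr \
    --in-vcf sample.vcf \
    --out-vcf output.vcf \
    --out-recal output.recal \
    --out-tranches output.tranches \
    --resource omni,known=false,training=true,truth=true,prior=12.0:1000G_omni2.5.hg38.vcf \
    $(annotation_args QD MQ MQRankSum ReadPosRankSum) \
    --mode SNP
```

Removing the leading `echo` would run the command for real, which requires a working Parabricks installation and the input files to be present.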
Common options:
- --logfile LOGFILE
  Path to the log file. If not specified, messages will only be written to the standard error output. (default: None)
- --tmp-dir TMP_DIR
  Full path to the directory where temporary files will be stored.
- --with-petagene-dir WITH_PETAGENE_DIR
  Full path to the PetaGene installation directory. By default, this should have been installed at /opt/petagene. Use of this option also requires that the PetaLink library has been preloaded by setting the LD_PRELOAD environment variable. Optionally set the PETASUITE_REFPATH and PGCLOUD_CREDPATH environment variables that are used for data and credentials. (default: None)
- --keep-tmp
  Do not delete the directory storing temporary files after completion.
- --license-file LICENSE_FILE
  Path to license file license.bin if not in the installation directory.
- --no-seccomp-override
  Do not override seccomp options for docker. (default: None)
- --version
  View compatible software versions.