What's New?

New Tools

In this release we have added the following tools:

fq2ubam

Converts FASTQ files to an unaligned BAM file.

duplexconsensusreads

Calls consensus sequences from reads with the same double-stranded source molecule.

bcftoolscsq

Predicts consequences from variants in a VCF file.

Improvements

Tools with new command line options

consensusreads:

--error-rate-pre-umi ERROR_RATE_PRE_UMI

The Phred-scaled error rate for an error prior to the UMIs being integrated. (default: 45)

--error-rate-post-umi ERROR_RATE_POST_UMI

The Phred-scaled error rate for an error after the UMIs have been integrated. (default: 40)

--min-input-base-quality MIN_INPUT_BASE_QUALITY

Ignore bases in raw reads that have Q below this value. (default: 10)

--min-consensus-base-quality MIN_CONSENSUS_BASE_QUALITY

Mask (make ‘N’) consensus bases with quality less than this threshold. (default: 2)
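As a rough illustration of what these quality options mean, the sketch below converts a Phred-scaled value to an error probability and masks low-quality consensus bases. The function names are hypothetical and this is not the tool's implementation; it only assumes the standard Phred definition, p = 10^(-Q/10), and the masking rule stated above.

```python
from typing import Sequence


def phred_to_probability(q: float) -> float:
    """Convert a Phred-scaled quality Q to an error probability, p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def mask_low_quality(bases: str, quals: Sequence[int], min_quality: int) -> str:
    """Replace every base whose quality is below min_quality with 'N'.

    Hypothetical helper mirroring the described behavior of
    --min-consensus-base-quality; not taken from the tool itself.
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")
    return "".join(b if q >= min_quality else "N" for b, q in zip(bases, quals))


if __name__ == "__main__":
    # Default pre-UMI (45) and post-UMI (40) error rates as probabilities.
    print(phred_to_probability(45))
    print(phred_to_probability(40))
    # With the default threshold of 2, qualities 1 and 0 are masked.
    print(mask_low_quality("ACGT", [30, 1, 40, 0], 2))
```

Under this definition, the default of 40 corresponds to about one error in 10,000 bases, and 45 to roughly one in 31,600.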

cnvkit:

The CNVkit implementation now supports the batch, autobin and coverage subcommands.

denovo mutation:

--run-partition

Divide the whole genome into multiple partitions and run multiple processes at the same time, each on one partition.
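The idea behind partitioning can be sketched as follows. This is only a conceptual illustration: the release notes do not describe how the partitions are actually chosen, so the even split by base count, the function name, and the interval representation are all assumptions. Each resulting partition would then be handed to its own process.

```python
from typing import Dict, List, Tuple

# (contig, start, end), 0-based half-open; an assumed representation.
Interval = Tuple[str, int, int]


def partition_genome(
    contig_lengths: Dict[str, int], num_partitions: int
) -> List[List[Interval]]:
    """Split contigs into num_partitions groups of roughly equal total length.

    Hypothetical helper: an even split by base count, not the tool's scheme.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    total = sum(contig_lengths.values())
    target = -(-total // num_partitions)  # ceiling division
    partitions: List[List[Interval]] = [[] for _ in range(num_partitions)]
    idx, room = 0, target
    for contig, length in contig_lengths.items():
        start = 0
        while start < length:
            # Take as much of this contig as the current partition can hold.
            take = min(room, length - start)
            partitions[idx].append((contig, start, start + take))
            start += take
            room -= take
            # Move on to the next partition once this one is full.
            if room == 0 and idx < num_partitions - 1:
                idx += 1
                room = target
    return partitions


if __name__ == "__main__":
    for i, part in enumerate(partition_genome({"chr1": 100, "chr2": 50}, 3)):
        print(i, part)
```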

germline:

--run-partition

Divide the whole genome into multiple partitions and run multiple processes at the same time, each on one partition.

--no-alt-contigs

Exclude output records for alternate contigs.

groupreadbyumi:

--min-map-q MIN_MAP_Q

Minimum mapping quality. (default: 30)

haplotypecaller:

--run-partition

Divide the whole genome into multiple partitions and run multiple processes at the same time, each on one partition.

--no-alt-contigs

Exclude output records for alternate contigs.

human_par:

--use-GRCh37-regions

Use the pseudoautosomal regions for GRCh37 reference types. This flag should be used for GRCh37 and UCSC hg19 references. By default, GRCh38 regions are used.

rna_gatk:

--run-partition

Divide the whole genome into multiple partitions and run multiple processes at the same time, each on one partition.

--no-alt-contigs

Exclude output records for alternate contigs.

samtoolsmpileup:

--anomalous-reads

Do not discard anomalous read pairs.

--output-mq

Output mapping quality.

--output-bp

Output base positions on reads.

smoove:

--exclude-bed-file

Optional bgzip-compressed/tabix-indexed BED file containing the set of regions to exclude.

umi_fgbio:

--min-map-q MIN_MAP_Q

Minimum mapping quality. (default: 30)

--error-rate-pre-umi ERROR_RATE_PRE_UMI

The Phred-scaled error rate for an error prior to the UMIs being integrated. (default: 45)

--error-rate-post-umi ERROR_RATE_POST_UMI

The Phred-scaled error rate for an error after the UMIs have been integrated. (default: 40)

--min-input-base-quality MIN_INPUT_BASE_QUALITY

Ignore bases in raw reads that have Q below this value. (default: 10)

--min-consensus-base-quality MIN_CONSENSUS_BASE_QUALITY

Mask (make ‘N’) consensus bases with quality less than this threshold. (default: 2)

vcfqcbybam:

--output-mq

Output mapping quality.

--output-bp

Output base positions on reads.

General

Added support for A10, A30, A40, and A6000 GPUs for all tools in Clara Parabricks. Visit the DeepVariant documentation page for more details.

Improved tumor-only calling in LoFreq.

Added end-to-end support for germline on Hopper.

Improved CNNScoreVariants tool.

Added multi-TSV annotation database support.

Added consequence calling using the bcftools 'csq' option.

Sped up reading and writing of VCF files.

Added mapquality, basequality and fwd/rev alleles to the summary in vcfqcbybam.

Improved float comparison in vcfdiff.

Added two-sample support to votebasedvcfmerger.

Updated CNVkit to v0.9.9.

bam2fq no longer requires a reference file if BAM input is used.

haplotypecaller now uses multiple CPUs to process multiple partitions.

Bug Fixes

Corrected the documentation, which erroneously reported that AWS S3 buckets and Google Cloud Storage objects could be used as input or output for several tools.

Fixed a regular expression escape character bug in bamtagger.

Fixed how the reference type is determined for human_par.

In mutect2, if the tumor and normal BAM files had the same PU value, the output VCF would be empty. This is fixed by adding the prefix "tm_" to the tumor PU and "nm_" to the normal PU to differentiate them.

licenseinfo for Flexera licenses no longer requires a GPU to function.

Fixed an issue where vcfqcbybam could generate an incorrect mapping quality score.
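The mutect2 PU fix listed above can be pictured with a small sketch. The function name and the sample PU value are hypothetical; only the "tm_" and "nm_" prefixes come from the note itself, and this does not show how the tool applies them internally.

```python
from typing import List, Sequence, Tuple


def disambiguate_platform_units(
    tumor_pus: Sequence[str], normal_pus: Sequence[str]
) -> Tuple[List[str], List[str]]:
    """Prefix tumor PU values with 'tm_' and normal PU values with 'nm_'.

    Hypothetical helper: after prefixing, a tumor PU can never equal a
    normal PU, even if the original values were identical.
    """
    tumor = [f"tm_{pu}" for pu in tumor_pus]
    normal = [f"nm_{pu}" for pu in normal_pus]
    return tumor, normal


if __name__ == "__main__":
    # "FLOWCELL.1" is a made-up PU value shared by both samples.
    tumor, normal = disambiguate_platform_units(["FLOWCELL.1"], ["FLOWCELL.1"])
    print(tumor, normal)
    print(set(tumor).isdisjoint(normal))
```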