What's New?

New Tools

This release provides a flexible end-to-end solution for analyzing Unique Molecular Index (UMI) data, built on an accelerated version of the fgbio pipeline. The Clara Parabricks fgbio solution can be run with a single command or as individual steps.

annotatebamwithumis

Annotates existing BAM files with UMIs (Unique Molecular Indices) from a separate FASTQ file.

bamsort

Sorts a BAM file. Five sort modes are supported:

  • Coordinate sort (Picard-compatible)

  • Coordinate sort (fgbio-compatible)

  • Queryname sort (Picard-compatible)

  • Queryname sort (fgbio-compatible)

  • Template coordinate sort (fgbio-compatible)

consensusreads

Calls consensus sequences from reads with the same unique molecular tag.
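To illustrate the idea of a consensus call, the sketch below takes a simple per-position majority vote over reads that share a UMI. It is a simplification for illustration only and is not the algorithm used by consensusreads or fgbio, which may also weigh base qualities. The function name `majority_consensus` and the example reads are hypothetical.

```python
from collections import Counter


def majority_consensus(reads):
    """Return a per-position majority-vote consensus for equal-length reads.

    Hypothetical helper for illustration; not the consensusreads algorithm.
    """
    if not reads:
        return ""
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("all reads must have the same length")

    consensus = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        # Emit N when no base holds a strict majority at this position.
        consensus.append(base if count * 2 > len(column) else "N")
    return "".join(consensus)


# Three reads assumed to carry the same UMI; the third has one mismatch.
umi_family = ["ACGTAC", "ACGTAC", "ACCTAC"]
print(majority_consensus(umi_family))
```

The single mismatching base is outvoted by the other two reads, which is the error-suppression effect that UMI-based consensus calling is meant to provide.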

groupreadsbyumi

Groups reads together that appear to have come from the same original molecule.
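As a rough sketch of the grouping idea, the example below buckets reads by mapping position and exact UMI sequence. This is an assumption made for illustration: the actual tool may tolerate UMI mismatches and use additional read attributes. The record layout and the name `group_by_umi` are hypothetical.

```python
from collections import defaultdict


def group_by_umi(reads):
    """Group read names by (chromosome, position, UMI) using exact matching.

    Hypothetical helper for illustration; not the groupreadsbyumi algorithm.
    """
    groups = defaultdict(list)
    for name, chrom, pos, umi in reads:
        groups[(chrom, pos, umi)].append(name)
    return dict(groups)


# Each record is (read name, chromosome, start position, UMI sequence).
reads = [
    ("r1", "chr1", 100, "AACG"),
    ("r2", "chr1", 100, "AACG"),
    ("r3", "chr1", 100, "TTGC"),
    ("r4", "chr2", 250, "AACG"),
]
groups = group_by_umi(reads)
for key, names in groups.items():
    print(key, names)
```

Reads r1 and r2 end up in the same group because they share both position and UMI, while r3 and r4 differ in one of the two and are kept apart.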

setmateinfo

Adds and/or fixes mate information on paired-end reads.

We are also adding the following new tools in this release:

deeptrio

GPU-accelerated DeepTrio for calling de novo variants. This is an accelerated version of DeepTrio from the Google DeepVariant team.

kallisto

Quantifies abundances of transcripts from bulk and single-cell RNA-Seq data.

muse

The MuSE somatic caller has been added to the Parabricks toolkit and runs about 10x faster than its original implementation. MuSE is the fifth somatic caller in Parabricks. It takes a novel approach to mutation calling based on the F81 Markov substitution model for molecular evolution, which models the evolution of the reference allele to the allelic composition of the matched tumor and normal tissue at each genomic locus. You can read more here.

prepon

Generates an index for a Panel of Normals (PON) file. This is a prerequisite for using the "--pon" option with mutect.

postpon

Annotates variants based on a Panel of Normals (PON) file by modifying the "INFO" field of the input VCF file. This is the post-processing step for the "--pon" option in mutect. If you are using a PON, this step is required after the mutect2 VCF is generated.

snpswift

Annotates variants in a VCF file with VCF or GTF databases.

votebasedvcfmerger

Creates union and intersection VCFs based on a minimum number of variant callers supporting a variant. This tool was previously called vbvm.
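The voting idea can be sketched as follows: count how many callers report each variant, then keep the variants that reach a minimum vote count. This is an illustrative simplification rather than the tool's implementation. The caller names, the variant tuples, and the function `vote_merge` are hypothetical.

```python
from collections import Counter


def vote_merge(callsets, min_votes):
    """Return (union, supported) variant sets from per-caller call lists.

    Hypothetical helper for illustration; not the votebasedvcfmerger code.
    """
    votes = Counter()
    for variants in callsets.values():
        # Use a set so each caller contributes at most one vote per variant.
        votes.update(set(variants))
    union = set(votes)
    supported = {variant for variant, n in votes.items() if n >= min_votes}
    return union, supported


# Variants are (chromosome, position, ref, alt); caller names are made up.
callsets = {
    "caller_a": [("chr1", 100, "A", "T"), ("chr1", 200, "G", "C")],
    "caller_b": [("chr1", 100, "A", "T"), ("chr2", 50, "C", "G")],
    "caller_c": [("chr1", 100, "A", "T"), ("chr1", 200, "G", "C")],
}
union, supported = vote_merge(callsets, min_votes=2)
print(sorted(union))
print(sorted(supported))
```

Setting `min_votes` to the number of callers yields the strict intersection, while `min_votes=1` reproduces the union.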

Improvements

General

  • genotypegvcf now supports .gz files.

  • Fixed problems in triocombinegvcf and genotypegvcf when processing gVCF files generated by deepvariant.

Germline/Somatic

  • Strelka workflow now accepts interval files.

RNA

  • STAR is roughly 2x faster for specific sets of data.

  • splitncigar is significantly faster than before.

Bug Fixes

Germline/Somatic

  • Fixed a CRAM support bug for fq2bam.

  • Fixed a CRAM support bug for human_par.

  • Two sources of deadlock in lofreq are fixed.

RNA

  • STAR deadlock bug is fixed.

  • Fixed an assertion failure in rna_fq2bam: ReadAlign_outputTranscriptCIGARp.cpp:81:string chimericDetector::outputTranscriptCIGARp_pb(const chimericTrans&, PBWindow*): Assertion P.readFilesIn.size() > 1 failed.

  • Fixed a possible deadlock in rna_fq2bam.

  • Removed duplicate @HD lines in the output of rna_fq2bam.

CollectMultipleMetrics

  • Output of collectmultiplemetrics is now correctly tab separated, instead of using spaces.

  • Fixed a failure that occurred when the --gen-insert-size option was used.

  • The --gen-all-metrics option failed to create the sequencing artifact report. It now correctly generates all available reports.

  • Fixed a report generation bug in collectmultiplemetrics that occurred when --gen-alignment or --gen-insert-size was specified.