What's New?
This release provides a flexible end-to-end solution for analyzing Unique Molecular Indices (UMI) data by accelerating the fgbio pipeline. The Clara Parabricks fgbio solution can be run with a single command or as individual steps.
annotatebamwithumis
Annotates existing BAM files with UMIs (Unique Molecular Indices) from a separate FASTQ file.
bamsort
Sorts a BAM file. Five sort modes are supported:
Coordinate sort (Picard-compatible)
Coordinate sort (fgbio-compatible)
Queryname sort (Picard-compatible)
Queryname sort (fgbio-compatible)
Template coordinate sort (fgbio-compatible)
consensusreads
Calls consensus sequences from reads with the same unique molecular tag.
groupreadsbyumi
Groups reads together that appear to have come from the same original molecule.
setmateinfo
Adds and/or fixes mate information on paired-end reads.
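The grouping and consensus steps above can be illustrated with a minimal Python sketch. It is not the fgbio or Parabricks implementation: it assumes reads are plain (UMI, position, sequence) tuples, groups them only on exact UMI and position matches, and calls consensus by a simple per-base majority vote. The function names `group_reads_by_umi` and `call_consensus` are hypothetical and exist only for this illustration.

```python
from collections import Counter, defaultdict


def group_reads_by_umi(reads):
    """Group reads that share a mapping position and an identical UMI.

    Assumption: each read is a (umi, position, sequence) tuple. Real tools
    also tolerate UMI sequencing errors; this sketch matches exactly.
    """
    groups = defaultdict(list)
    for umi, pos, seq in reads:
        groups[(pos, umi)].append(seq)
    return dict(groups)


def call_consensus(seqs):
    """Return the per-column majority base of equal-length sequences."""
    consensus = []
    for column in zip(*seqs):
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Toy input: three reads from one molecule (one carries an error),
# plus two reads from other molecules.
reads = [
    ("ACGT", 100, "TTGACA"),
    ("ACGT", 100, "TTGACA"),
    ("ACGT", 100, "TTGTCA"),
    ("GGCA", 100, "TTGACA"),
    ("GGCA", 250, "CCATGG"),
]

groups = group_reads_by_umi(reads)
consensus = {key: call_consensus(seqs) for key, seqs in groups.items()}

for (pos, umi), seq in sorted(consensus.items()):
    print(pos, umi, seq)
```

In this toy data the single mismatching base in the third read is outvoted by the other two reads in its group, which is the basic idea behind using UMIs to suppress sequencing errors.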
We are also adding the following new tools in this release:
deeptrio
GPU-accelerated DeepTrio for calling de novo variants. This is an accelerated version of the Google DeepVariant team's DeepTrio.
kallisto
Quantifies transcript abundances from bulk and single-cell RNA-Seq data.
muse
The MuSE somatic caller has been added to the Parabricks toolkit and runs 10x faster than its original implementation. MuSE is the fifth somatic caller in Parabricks. MuSE takes a novel approach to mutation calling based on the F81 Markov substitution model for molecular evolution, which models the evolution of the reference allele to the allelic composition of the matched tumor and normal tissue at each genomic locus. You can read more here.
prepon
Generates an index for a Panel of Normals (PON) file. This is a prerequisite for using the "--pon" option with mutect.
postpon
Annotates variants based on a Panel of Normals (PON) file by modifying the "INFO" field of the input VCF file. This is the post-processing step for using the "--pon" option with mutect. If you are using a PON, this step is required after the mutect2 VCF is generated.
snpswift
Annotates variants in a VCF file with VCF or GTF databases.
votebasedvcfmerger
Creates union and intersection VCFs based on a minimum number of variant callers supporting a variant. This tool was previously called vbvm.
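The vote-based merging idea can be sketched in a few lines of Python. This is an illustration of the concept only, not the votebasedvcfmerger implementation or its command-line interface: it assumes each caller's output has already been reduced to a set of (chromosome, position, ref, alt) tuples, and the function name `vote_merge` and the caller names are hypothetical.

```python
from collections import defaultdict


def vote_merge(calls_by_caller, min_votes):
    """Return (union, supported) variant lists.

    union:     every variant reported by at least one caller
    supported: variants reported by at least `min_votes` callers
    """
    votes = defaultdict(set)
    for caller, variants in calls_by_caller.items():
        for variant in variants:
            votes[variant].add(caller)
    union = sorted(votes)
    supported = sorted(v for v, callers in votes.items() if len(callers) >= min_votes)
    return union, supported


# Toy call sets from three hypothetical callers.
calls = {
    "caller_a": {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C")},
    "caller_b": {("chr1", 100, "A", "T"), ("chr2", 50, "C", "G")},
    "caller_c": {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C")},
}

union, supported = vote_merge(calls, min_votes=2)
# Requiring a vote from every caller gives the strict intersection.
_, intersection = vote_merge(calls, min_votes=len(calls))

print(len(union), len(supported), len(intersection))
```

Raising `min_votes` trades sensitivity for precision: a threshold of 1 reproduces the union, while a threshold equal to the number of callers reproduces the strict intersection.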
General
genotypegvcf now supports .gz files.
Fixed problems in triocombinegvcf and genotypegvcf when processing gVCF files produced by deepvariant.
Germline/Somatic
Strelka workflow now accepts interval files.
RNA
STAR is roughly 2x faster for specific sets of data.
splitncigar is significantly faster than before.
Germline/Somatic
Fixed a CRAM support bug for fq2bam.
Fixed a CRAM support bug for human_par.
Two sources of deadlock in lofreq are fixed.
RNA
STAR deadlock bug is fixed.
Fixed an assertion failure in rna_fq2bam: ReadAlign_outputTranscriptCIGARp.cpp:81:string chimericDetector::outputTranscriptCIGARp_pb(const chimericTrans&, PBWindow*): Assertion P.readFilesIn.size() > 1 failed.
Fixed a possible deadlock in rna_fq2bam.
Removed duplicate @HD lines from the output of rna_fq2bam.
CollectMultipleMetrics
Output of collectmultiplemetrics is now correctly tab separated, instead of using spaces.
Fixed a failure that occurred when the --gen-insert-size option was used.
The --gen-all-metrics option failed to create the sequencing artifact report. It now correctly generates all available reports.
Fixed a report generation bug in collectmultiplemetrics that occurred when --gen-alignment or --gen-insert-size was specified.