4.5.0-1 Release Notes
Highlights:
Support for the following NVIDIA Blackwell GPU architectures: SM_100 and SM_120.
As of v4.5 we are dropping support for the Volta (V100) line of GPUs.
Splice-aware support and performance improvements for minimap2.
Performance improvements for giraffe.
Marking duplicates is now on by default for giraffe. Use --no-markdups to turn off marking duplicates.
Faster rna_fq2bam with --num-streams-per-gpu.
Performance improvements for force-calling mode in haplotypecaller and mutectcaller.
Tool Updates
fq2bam and fq2bam_meth:
Added --in-fq-list and --in-se-fq-list support. You can specify a file containing a list of paired-end or single-end FASTQ file paths along with their corresponding read group information.
Recovery mode has been improved to avoid falling back to CPU, which improves performance. This situation may arise often when an input FASTQ produces a large number of SMEMs (supermaximal exact matches) per read.
Peak host memory use can be indirectly modified by changing --bwa-normalized-queue-capacity.
Improved performance on devices with compute capability SM_90 (e.g. H100, H200) and SM_100 (e.g. B200).
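As an illustration of the list-file input described above, the following Python sketch writes a file that could be passed to --in-fq-list. The line layout (two FASTQ paths and a read group string separated by single spaces, with tabs in the read group written as a literal backslash-t) is an assumption made for this example, as are the helper names `read_group` and `write_fq_list`; check the fq2bam documentation for the exact format before relying on it.

```python
from pathlib import Path


def read_group(rg_id, sample, library, platform, unit):
    """Build an @RG string with literal backslash-t separators (assumed format)."""
    fields = ["@RG", f"ID:{rg_id}", f"LB:{library}", f"PL:{platform}",
              f"SM:{sample}", f"PU:{unit}"]
    return r"\t".join(fields)


def write_fq_list(pairs, out_path):
    """Write one line per FASTQ pair: '<fastq_1> <fastq_2> <read group>'.

    `pairs` is an iterable of (fastq_1, fastq_2, read_group_string) tuples.
    Returns the list of lines that were written.
    """
    lines = []
    for fq1, fq2, rg in pairs:
        for path in (fq1, fq2):
            # A space inside a path would make the space-separated line ambiguous.
            if " " in path:
                raise ValueError(f"path contains a space: {path!r}")
        lines.append(f"{fq1} {fq2} {rg}")
    Path(out_path).write_text("\n".join(lines) + "\n")
    return lines
```

Calling `write_fq_list([("s1_1.fq.gz", "s1_2.fq.gz", read_group("rg1", "S1", "lib1", "ILLUMINA", "unit1"))], "fq_list.txt")` would produce a one-line file; the file names and read group values here are placeholders.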
Improved performance across the board.
Marking duplicates is now enabled by default for giraffe. The option --no-markdups has been added to skip the marking duplicates step.
Added support for --ref-paths. This option lets you specify an ordered list of paths in the graph and helps support pangenome graphs containing multiple reference paths in variant calling pipelines.
Added --in-fq-list and --in-se-fq-list support. You can specify a file containing a list of paired-end or single-end FASTQ file paths along with their corresponding read group information.
Added --copy-comment, which allows the user to copy FASTQ comments to BAM output via the auxiliary tag. Similar to -C in fq2bam and BWA-MEM.
Added --markdups-single-ended-start-end to mark duplicates on single-ended reads by 5' and 3' end.
Added minimap2 support for the --preset options splice and splice:hq.
Performance improvements.
Added --max-queue-reads and --nstreams performance options.
Added --num-streams-per-gpu to rna_fq2bam to support multiple GPU streams.
Faster force-variant calling with --mutect-alleles in mutectcaller.
Faster force-variant calling with --htvc-alleles in haplotypecaller.
Added support for --keep_legacy_allele_counter_behavior.
Added --use-tf32 to utilize tensor cores during inference to achieve better performance on Ampere+ GPUs. Note that accuracy might be slightly impacted because of the lower precision of TF32.
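To give a sense of the precision trade-off mentioned above, the sketch below emulates TF32 storage in plain Python. TF32 keeps the 8-bit exponent of FP32 but only 10 of its 23 mantissa bits. The function name `to_tf32` is hypothetical, and the code truncates the low mantissa bits as a simplification, whereas hardware conversion may round instead; it illustrates the number format only, not how Parabricks applies it during inference.

```python
import struct

FP32_MANTISSA_BITS = 23
TF32_MANTISSA_BITS = 10


def to_tf32(x: float) -> float:
    """Round x to FP32, then zero the mantissa bits that TF32 does not keep."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    dropped = FP32_MANTISSA_BITS - TF32_MANTISSA_BITS  # 13 low bits are cleared
    mask = 0xFFFFFFFF & ~((1 << dropped) - 1)
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


def relative_error(x: float) -> float:
    """Relative difference between x and its truncated TF32 approximation."""
    return abs(to_tf32(x) - x) / abs(x)
```

With truncation, the relative error for a normal (non-denormal) value stays below 2**-10, roughly three significant decimal digits, which is why small accuracy differences are possible when the option is enabled.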
General
Support for the following NVIDIA Blackwell GPU architectures: SM_100 and SM_120.
Improved error messaging when Parabricks receives signals from the OS, such as out-of-memory (OOM) killer events.
Fixed a crash that occurred when the input FASTQ did not contain read comments.
Fixed a hash table overflow in the De Bruijn graph.
Fixed a crash that occurred when the input BAM did not have a read group line. The error is now handled at the start of the run.
Fixed a correctness issue that occurred when the read query name had non-numeric characters in the colon-delimited end of the name.
For further information see the Parabricks datasheet.