4.5.0-1 Release Notes
Highlights:
Support for the following NVIDIA Blackwell GPU architectures: SM_100 and SM_120.
As of v4.5 we are dropping support for the Volta (V100) line of GPUs.
Splice-aware support and performance improvements for minimap2.
Performance improvements for giraffe.
Marking duplicates is now on by default for giraffe. Use --no-markdups to turn off marking duplicates.
Faster rna_fq2bam with --num-streams-per-gpu.
Performance improvements for force-calling mode in haplotypecaller and mutectcaller.
Tool Updates
fq2bam and fq2bam_meth:
- Added --in-fq-list and --in-se-fq-list support. You can specify a file containing a list of paired-end or single-end FASTQ file paths along with their corresponding read group information.
- Recovery mode has been improved to avoid falling back to the CPU, which improves performance. Recovery may be needed often when an input FASTQ produces a large number of SMEMs (supermaximal exact matches) per read.
- Peak host memory use can be indirectly modified by changing --bwa-normalized-queue-capacity.
- Improved performance on devices with compute capability SM_90 (e.g. H100, H200) and SM_100 (e.g. B200).
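As a rough illustration of the list-file input mentioned above, the following Python sketch writes a paired-end list file. It assumes that each line holds two FASTQ paths followed by a read group string, separated by single spaces, and that the read group uses literal \t separators as in BWA-style read group strings. That line layout, the helper names, and the sample and file names are assumptions for illustration only; check the fq2bam documentation for the exact format.

```python
import tempfile
from pathlib import Path


def read_group(rg_id, sample, library="lib1", platform="ILLUMINA"):
    """Build a read group string with literal backslash-t separators (assumed layout)."""
    fields = [f"ID:{rg_id}", f"LB:{library}", f"PL:{platform}", f"SM:{sample}", f"PU:{rg_id}"]
    return "@RG" + "".join("\\t" + field for field in fields)


def write_fq_list(pairs, out_path):
    """Write one line per FASTQ pair: <fastq_1> <fastq_2> <read group> (assumed layout)."""
    lines = [
        f"{fq1} {fq2} {read_group(rg_id, sample)}"
        for fq1, fq2, rg_id, sample in pairs
    ]
    Path(out_path).write_text("\n".join(lines) + "\n")
    return lines


# Hypothetical lanes of one sample; the file names are placeholders.
pairs = [
    ("sample1_L001_R1.fastq.gz", "sample1_L001_R2.fastq.gz", "sample1.L001", "sample1"),
    ("sample1_L002_R1.fastq.gz", "sample1_L002_R2.fastq.gz", "sample1.L002", "sample1"),
]
out_file = Path(tempfile.mkdtemp()) / "in_fq_list.txt"
lines = write_fq_list(pairs, out_file)
print(out_file.read_text(), end="")
```

The resulting file would then be passed with --in-fq-list, so that each lane keeps its own read group.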
- Improved performance across the board.
- Marking duplicates is now enabled by default. The option --no-markdups has been added to skip the marking duplicates step.
- Added support for --ref-paths. This option specifies an ordered list of paths in the graph, which helps support pangenome graphs containing multiple reference paths in variant calling pipelines.
- Added --in-fq-list and --in-se-fq-list support. You can specify a file containing a list of paired-end or single-end FASTQ file paths along with their corresponding read group information.
- Added --copy-comment, which allows the user to copy FASTQ comments to BAM output via the auxiliary tag. Similar to -C in fq2bam and BWA-MEM.
- Added --markdups-single-ended-start-end to mark duplicates on single-ended reads by 5' and 3' end.
- Added support for --preset options splice and splice:hq.
- Performance improvements.
- Added --max-queue-reads and --nstreams performance options.
- Added --num-streams-per-gpu to support multiple GPU streams.
- Faster force-calling of variants with --mutect-alleles in mutectcaller.
- Faster force-calling of variants with --htvc-alleles in haplotypecaller.
- Added support for --keep_legacy_allele_counter_behavior.
- Added --use-tf32 to utilize tensor cores during inference for better performance on Ampere and newer GPUs. Note that accuracy might be slightly affected because TF32 has lower precision.
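To give a feel for the precision trade-off mentioned for TF32, here is a minimal Python sketch. TF32 keeps the 8-bit exponent of float32 but only 10 of its 23 mantissa bits. The sketch emulates that by truncating the low 13 mantissa bits of a float32 value. Truncation is a simplifying assumption: actual tensor core rounding may differ, and nothing here models the tool's inference path.

```python
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_tf32(x: float) -> float:
    """Emulate TF32 storage by keeping 10 of float32's 23 mantissa bits."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000  # clear the low 13 mantissa bits (truncation, an assumption)
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# 1 + 2**-12 is exactly representable in float32 but not with 10 mantissa bits.
x = 1.0 + 2.0 ** -12
print(to_fp32(x) == x)  # True
print(to_tf32(x))       # 1.0

# Relative error of a single product when both operands are reduced to TF32.
a, b = 0.1, 0.3
exact = to_fp32(a) * to_fp32(b)
approx = to_tf32(a) * to_tf32(b)
rel_err = abs(approx - exact) / exact
print(f"relative error: {rel_err:.2e}")
```

Each truncated operand is off by less than one part in 2**10, so a single product stays within about one part in 2**9 of the float32 result. How such differences affect called variants depends on the model and data.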
General
Support for the following NVIDIA Blackwell GPU architectures: SM_100 and SM_120.
Improved error messaging when Parabricks receives signals from the OS, such as out-of-memory (OOM) killer events.
- Fixed a crash that occurred when the input FASTQ did not contain read comments.
- Fixed a hash table overflow in the De Bruijn graph.
- Fixed a crash that occurred when the input BAM did not have a read group line. The error is now handled at the start of the run.
- Fixed a correctness issue that occurred when the read query name had non-numeric characters in its final colon-delimited field.
For further information see the Parabricks datasheet.