4.5.0-1 Release Notes

Highlights:

Support for the following NVIDIA Blackwell GPU architectures: SM_100 and SM_120.
As of v4.5 we are dropping support for the Volta (V100) line of GPUs.
Splice-aware support and performance improvements for minimap2.
Performance improvements for giraffe.
Marking duplicates is now on by default for giraffe. Use --no-markdups to turn off marking duplicates.
Faster rna_fq2bam with --num-streams-per-gpu.
Performance improvements for force-calling mode in haplotypecaller and mutectcaller.

Improvements

Added --in-fq-list and --in-se-fq-list support. You can specify a file containing a list of paired-end or single-end FASTQ file paths along with their corresponding read group information.
Recovery mode has been improved to avoid falling back to the CPU, which improves performance. Recovery may be needed often when an input FASTQ produces a large number of SMEMs (supermaximal exact matches) per read.
Peak host memory use can be indirectly modified by changing --bwa-normalized-queue-capacity.
Improved performance on devices with compute capability SM_90 (e.g. H100, H200) and SM_100 (e.g. B200).
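
As an illustration of the list file accepted by --in-fq-list, the following Python sketch writes one line per FASTQ pair. It assumes the line syntax is <fastq_1> <fastq_2> <read group>, separated by single spaces; the helper name write_fq_list, the file names, and the read group values are hypothetical, so check the option's documentation for the exact format before relying on it.

```python
from pathlib import Path


def write_fq_list(pairs, out_path):
    """Write one line per FASTQ pair in the assumed form:
    <fastq_1> <fastq_2> <read group>
    """
    lines = []
    for fq1, fq2, read_group in pairs:
        # Fields are space-separated (assumption), so a path that
        # contains a space could not be parsed back unambiguously.
        for field in (fq1, fq2):
            if " " in field:
                raise ValueError(f"path contains a space: {field!r}")
        lines.append(f"{fq1} {fq2} {read_group}")
    Path(out_path).write_text("\n".join(lines) + "\n")
    return lines


# Hypothetical inputs: two lanes of the same sample.
# Raw strings keep the backslash-t sequences literal in the read group.
EXAMPLE_PAIRS = [
    ("lane1_R1.fastq.gz", "lane1_R2.fastq.gz",
     r"@RG\tID:lane1\tLB:lib1\tPL:ILLUMINA\tSM:sample1\tPU:unit1"),
    ("lane2_R1.fastq.gz", "lane2_R2.fastq.gz",
     r"@RG\tID:lane2\tLB:lib1\tPL:ILLUMINA\tSM:sample1\tPU:unit2"),
]
```

Calling write_fq_list(EXAMPLE_PAIRS, "fq_list.txt") would produce a two-line file whose path can then be passed to --in-fq-list. A single-end list for --in-se-fq-list would presumably carry one FASTQ path per line instead of two.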

Improved performance across the board.
Marking duplicates is now enabled by default. The option --no-markdups has been added to skip the marking duplicates step.
Added support for --ref-paths. This option lets you specify an ordered list of paths in the graph, which helps support pangenome graphs that contain multiple reference paths in variant calling pipelines.
Added --in-fq-list and --in-se-fq-list support. You can specify a file containing a list of paired-end or single-end FASTQ file paths along with their corresponding read group information.
Added --copy-comment, which copies FASTQ comments to the BAM output via the auxiliary tag, similar to -C in fq2bam and BWA-MEM.

Added --markdups-single-ended-start-end to mark duplicates on single-ended reads by their 5' and 3' ends.

Added --use-tf32 to use tensor cores during inference for better performance on Ampere and later GPUs. Note that accuracy might be slightly affected because of the lower precision of TF32.
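
To give a sense of the precision involved, the sketch below emulates TF32 in Python. TF32 keeps the 8 exponent bits of FP32 but only 10 of its 23 mantissa bits. The helper tf32_truncate is hypothetical and simply clears the low 13 mantissa bits; real hardware may round rather than truncate, and this says nothing about how the option is implemented internally.

```python
import struct


def tf32_truncate(x: float) -> float:
    """Approximate TF32 by clearing the low 13 mantissa bits of a float32.

    Truncation is a simplification chosen for clarity (assumption);
    hardware conversion may use a different rounding mode.
    """
    # Reinterpret the float32 encoding of x as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Keep sign (1 bit), exponent (8 bits), and the top 10 mantissa bits.
    bits &= 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# Near 1.0 the spacing of representable values is 2**-10 in TF32,
# compared with 2**-23 in FP32.
step_kept = tf32_truncate(1.0 + 2**-10)   # survives: uses the 10th mantissa bit
step_lost = tf32_truncate(1.0 + 2**-11)   # collapses to 1.0
```

Differences smaller than about one part in a thousand relative to a value are therefore lost in each TF32 operand, which is the source of the small accuracy impact noted above.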

Support for the following NVIDIA Blackwell GPU architectures: SM_100 and SM_120.
Improved error messaging when Parabricks receives signals from the OS, such as out-of-memory (OOM) killer events.

Fixed a crash that occurred when the input BAM did not have a read group line. The error is now handled at the start of the run.
Fixed a correctness issue that occurred when the read queryname had non-numeric characters in the colon-delimited end of the name.

For further information see the Parabricks datasheet.