4.5.0-1 Release Notes
Highlights:
- Support for the following NVIDIA Blackwell GPU architectures: SM_100 and SM_120. 
- As of v4.5 we are dropping support for the Volta (V100) line of GPUs. 
- Splice-aware support and performance improvements for minimap2. 
- Performance improvements for giraffe. 
- Marking duplicates is now on by default for giraffe. Use --no-markdups to turn off marking duplicates.
- Faster rna_fq2bam with --num-streams-per-gpu.
- Performance improvements for force-calling mode in haplotypecaller and mutectcaller. 
Tool Updates
fq2bam and fq2bam_meth:
- Added --in-fq-list and --in-se-fq-list support. You can specify a file containing a list of paired-end or single-end FASTQ file paths along with their corresponding read group information.
- Recovery mode has been improved to avoid falling back to the CPU, which improves performance. Recovery may be needed often when an input FASTQ produces a large number of SMEMs (supermaximal exact matches) per read.
- Peak host memory use can be adjusted indirectly by changing --bwa-normalized-queue-capacity.
- Improved performance on devices with compute capability SM_90 (e.g. H100, H200) and SM_100 (e.g. B200).
- Improved performance across the board.
- Marking duplicates is now enabled by default. The option --no-markdups has been added to skip the marking duplicates step.
- Added support for --ref-paths. This option lets you specify an ordered list of paths in the graph, which helps support pangenome graphs containing multiple reference paths in variant calling pipelines.
- Added --in-fq-list and --in-se-fq-list support. You can specify a file containing a list of paired-end or single-end FASTQ file paths along with their corresponding read group information.
- Added --copy-comment, which allows the user to copy FASTQ comments to the BAM output via the auxiliary tag. Similar to -C in fq2bam and BWA-MEM.
- Added --markdups-single-ended-start-end to mark duplicates on single-ended reads by 5' and 3' end.
- Added support for --preset options splice and splice:hq.
- Performance improvements.
- Added --max-queue-reads and --nstreams performance options.
- Added --num-streams-per-gpu to support multiple GPU streams.
- Faster force-calling with --mutect-alleles in mutectcaller.
- Faster force-calling with --htvc-alleles in haplotypecaller.
- Added support for --keep_legacy_allele_counter_behavior.
- Added --use-tf32 to utilize tensor cores during inference for better performance on Ampere+ GPUs. Note that accuracy might be slightly affected because of the lower precision of TF32.
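The precision trade-off behind --use-tf32 can be illustrated with a small sketch. TF32 keeps the 8-bit exponent of float32 but only 10 explicit mantissa bits instead of 23. The helper `to_tf32` below is hypothetical and is not part of Parabricks; it emulates the format in software, and the round-to-nearest-even mode is an assumption made for illustration rather than a statement about how the hardware converts inputs.

```python
import struct


def to_tf32(x: float) -> float:
    """Round a finite value in float32 range to TF32 precision (software emulation)."""
    # Reinterpret the float32 encoding of x as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # float32 stores 23 mantissa bits; TF32 keeps 10, so the 13 lowest bits go away.
    drop = 13
    # Round to nearest, ties to even (assumed rounding mode), then clear dropped bits.
    lsb = (bits >> drop) & 1
    bits = (bits + (1 << (drop - 1)) - 1 + lsb) & ~((1 << drop) - 1) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    for value in (1.0, 0.1, 3.14159265):
        rounded = to_tf32(value)
        rel_err = abs(rounded - value) / abs(value)
        print(f"{value!r} -> {rounded!r} (relative error {rel_err:.2e})")
```

With 10 explicit mantissa bits, the relative rounding error of a single value is bounded by roughly 2^-11 (about 5e-4), which is why small differences in inference results are possible when the option is enabled.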
General
- Support for the following NVIDIA Blackwell GPU architectures: SM_100 and SM_120. 
- Improved error messaging when Parabricks receives signals from the OS, such as out-of-memory (OOM) killer events. 
- Fixed a crash that occurred when the input FASTQ did not contain read comments.
- Fixed a hash table overflow in the De Bruijn graph.
- Fixed a crash that occurred when the input BAM did not have a read group line. The error is now handled at the start.
- Fixed a correctness issue that occurred when the read queryname had non-numeric characters in the colon-delimited end of the name.
For further information see the Parabricks datasheet.