Software Overview

Parabricks is a software suite for genomic analysis. It substantially shortens the run time of common analytical tasks in genomics, including germline and somatic analysis. The core of the Parabricks software is its data pipeline, which takes raw data and transforms it according to the user's requirements.

Parabricks supports the standalone tools and pipelines listed in the Software Tools and Pipelines sections below.

The Parabricks software can be configured to run specific accelerated tools or to run full, commonly used pipelines. The standalone tools page covers individual tools, and the pipelines page discusses how to run commonly used pipelines.

NVIDIA Parabricks pipelines have been tested on Dell, HPE, IBM, and NVIDIA servers at Amazon Web Services, Google Cloud, Oracle Cloud Infrastructure, and Microsoft Azure.

Software Tools

The following standalone tools can be used with the NVIDIA Clara Parabricks Pipelines software. Click on a tool name for tool-specific options.

Standalone Tools Overview

Tool	Details
annotatebamwithumis	Annotates existing BAM files with UMIs (Unique Molecular Indices) from a separate FASTQ file
applybqsr	Apply a BQSR report to a BAM file and generate a new BAM file
arriba	Tool for the detection of gene fusions from RNA-Seq data
bam2fq	Convert a BAM to FASTQ
bammetrics	Collect WGS metrics on a BAM file
bamsort	Sort a BAM file
bcftoolscall	Call variants from mpileup output
bcftoolscsq	Consequence prediction for genomic variants
bcftoolsmpileup	Generate BCF/VCF pileup for one or multiple BAM files
bqsr	Collect BQSR report on a BAM file
cnnscorevariants	Generate variant scores using a Convolutional Neural Network
cnvkit	Run CNVkit with accelerated coverage calculation from read depths
collectmultiplemetrics	Collect multiple classes of metrics on a BAM file
consensusreads	Calls consensus sequences from reads with the same unique molecular tag
dbsnp	Annotate variants based on a dbSNP database
deeptrio	Run GPU-DeepTrio for calling de novo variants
deepvariant	Run GPU-DeepVariant for calling germline variants
demuxfastqs	Perform sample demultiplexing on FASTQs
duplexconsensusreads	Calls consensus sequences from reads with the same double-stranded source molecule
expansionhunter	Estimate the size of large repeats in a BAM file
fq2bam	Run BWA mem alignment, coordinate sorting, duplicate marking, and Base Quality Score Recalibration
fq2ubam	Convert FASTQs to an unaligned BAM file
frequencyfiltration	Filter a VCF by allele frequency or allele count
genotypegvcf	Convert a GVCF to VCF
glnexus	Merge and joint-call input gVCF files, emitting multi-sample BCF
groupreadsbyumi	Groups reads together that appear to have come from the same original molecule
haplotypecaller	Run GPU-HaplotypeCaller for calling germline variants
indexgvcf	Index a GVCF file
kallisto	Quantify abundances of transcripts from bulk and single-cell RNA-Seq data
lofreq	Call variants with high sensitivity, predicting variants below the average base-call quality
lofreq_call	Call variants from BAM file
manta	Analyze germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs
muse	Call somatic variants with accelerated MuSE variant caller
mutectcaller	Run GPU-Mutect2 for tumor-normal analysis
postpon	Generate the final VCF output of a Mutect panel of normals (PON) run
prepon	Build an index for a PON file, which is a prerequisite for running Mutect with a PON
rna_fq2bam	Run RNA-seq data through the fq2bam pipeline
samtoolsmpileup	Generate text pileup for one or multiple BAM files
setmateinfo	Adds and/or fixes mate information on paired-end reads
smoove	Call and genotype SVs for short reads
snpswift	Annotate variants in a VCF file with VCF or GTF databases
somaticsniper	Identify single nucleotide positions that are different between tumor and normal BAM files
somaticsniper_workflow	Run the somaticsniper variant caller workflow
splitncigar	Split reads in a BAM file that contain Ns in their cigar string
starfusion	Identify candidate fusion transcripts supported by Illumina reads
strelka	Analyze germline variation in small cohorts and somatic variation in tumor/normal sample pairs
strelka_workflow	Run the strelka variant caller workflow
triocombinegvcf	Combine GVCF of 2 or 3 samples
umi_fgbio	Run a UMI pipeline based on the Fulcrum Genomics toolkit (fgbio) that processes sequencing reads with molecular barcodes (also known as Unique Molecular Indices, UMIs), which enable error correction and increased accuracy at the consensus-read level
variantfiltration	Filter a VCF using a boolean expression
vcfanno	Annotate a VCF using dbsnp and annotation files
vcfqc	Generate QC plots on a VCF file
vcfqcbybam	Generate a summary file using samtoolsmpileup that can be used for plotting and report generation
votebasedvcfmerger	Create union and intersection VCFs based on a minimum number of variant callers supporting a variant
vqsr	Build a recalibration model to score variant quality and apply a score cutoff to filter variants
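
As an illustration of running standalone tools one at a time, the sketch below assembles command lines for two tools from the table, fq2bam followed by haplotypecaller. It only builds and prints the argument lists and does not execute anything. The `pbrun` launcher name, the flag spellings (`--ref`, `--in-fq`, `--out-bam`, `--in-bam`, `--out-variants`), and all file names are assumptions for illustration; the tool-specific options pages are the authority for any given release. The helper `pbrun_command` is hypothetical and is not part of Parabricks.

```python
from typing import List


def pbrun_command(tool: str, **options: object) -> List[str]:
    """Build an argument list for a `pbrun <tool>` call (nothing is executed).

    Keyword names are turned into flags: in_fq -> --in-fq.
    A list or tuple value is expanded into several arguments after the flag.
    """
    args = ["pbrun", tool]
    for name, value in options.items():
        args.append("--" + name.replace("_", "-"))
        values = value if isinstance(value, (list, tuple)) else [value]
        args.extend(str(v) for v in values)
    return args


# Hypothetical file names; flag names are assumed, not taken from this page.
align = pbrun_command(
    "fq2bam",
    ref="ref.fa",
    in_fq=["sample_1.fq.gz", "sample_2.fq.gz"],
    out_bam="sample.bam",
)
call = pbrun_command(
    "haplotypecaller",
    ref="ref.fa",
    in_bam="sample.bam",
    out_variants="sample.vcf",
)

for cmd in (align, call):
    print(" ".join(cmd))
```

On a machine with Parabricks installed, each list could be passed to `subprocess.run(cmd, check=True)`; building the list first keeps the example testable without a GPU.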

Pipelines

In Clara Parabricks, each pipeline is a collection of individual tools that are commonly used together, wrapped up as a single tool. For example, the deepvariant_germline pipeline takes FASTA and FASTQ files as input and produces a VCF file and a BAM file as output. Internally, it runs BWA mem alignment, performs coordinate sorting, marks duplicates, and then runs DeepVariant.
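
The deepvariant_germline example above can be sketched as a single command that replaces the internal stages. The snippet below is a minimal sketch, not the official invocation: the `pbrun` launcher, the flag names, and the file names are assumptions, and the helper `deepvariant_germline_command` is hypothetical. It prints the command rather than running it.

```python
import shlex
from typing import List

# Stages the text above attributes to deepvariant_germline, in order.
INTERNAL_STAGES = (
    "BWA mem alignment",
    "coordinate sorting",
    "duplicate marking",
    "DeepVariant variant calling",
)


def deepvariant_germline_command(
    ref_fasta: str, fastq_r1: str, fastq_r2: str, out_bam: str, out_vcf: str
) -> List[str]:
    """Return the argument list for one pipeline call (flag names assumed)."""
    return [
        "pbrun", "deepvariant_germline",
        "--ref", ref_fasta,              # FASTA reference (input)
        "--in-fq", fastq_r1, fastq_r2,   # paired FASTQ files (input)
        "--out-bam", out_bam,            # BAM file (output)
        "--out-variants", out_vcf,       # VCF file (output)
    ]


# Hypothetical file names for illustration only.
cmd = deepvariant_germline_command(
    "ref.fa", "sample_1.fq.gz", "sample_2.fq.gz", "sample.bam", "sample.vcf"
)
print("One command covers:", ", ".join(INTERNAL_STAGES))
print(shlex.join(cmd))
```

The point of the design is that the caller supplies only the pipeline's inputs and outputs; the intermediate files between stages never appear on the command line.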

The following pipelines can be used with the NVIDIA Clara Parabricks Pipelines software. Click on a pipeline name for pipeline-specific options.

Pipeline Tools Overview

Tool	Details
deepvariant_germline	Run the germline pipeline from FASTQ to VCF using a deep neural network analysis
denovomutation	(BETA) Run the de novo mutation pipeline with three samples for de novo variant detection
germline	Run the germline pipeline from FASTQ to VCF
human_par	Run the germline pipeline from FASTQ to VCF with correct ploidy values for human sex chromosome handling
rna_gatk	Run the GATK Best Practices pipeline for RNA-seq data from FASTQ to VCF
somatic	Run the somatic pipeline from FASTQ to VCF

Compatible CPU Software Versions

Clara Parabricks produces the same results as the following versions of the corresponding CPU tools:

Tool	Version
arriba	2.1.0
bcftools	1.10.2
BWA	0.7.15
cnvkit	0.9.7
DeepVariant	1.1
ExpansionHunter	5.0.0
fgbio	1.4.0
GATK	4.2.0.0
glnexus	1.2.7
Kallisto	0.46.2
lofreq	2.1.5
manta	1.6.0
samtools	1.10
somaticsniper	1.0.5.0
STAR	2.7.2a
STAR-Fusion	1.7.0
strelka	2.9.0
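
When comparing Parabricks output against a CPU run, it can help to confirm that the CPU tool is the version listed above. The following is a minimal sketch of such a check: the version numbers are copied from the table, while the helper name `matches_baseline` and the name normalization (lower-casing and removing spaces) are illustrative assumptions.

```python
# Baseline CPU tool versions, copied from the table above.
# Keys are lower-cased with spaces removed.
BASELINE_VERSIONS = {
    "arriba": "2.1.0",
    "bcftools": "1.10.2",
    "bwa": "0.7.15",
    "cnvkit": "0.9.7",
    "deepvariant": "1.1",
    "expansionhunter": "5.0.0",
    "fgbio": "1.4.0",
    "gatk": "4.2.0.0",
    "glnexus": "1.2.7",
    "kallisto": "0.46.2",
    "lofreq": "2.1.5",
    "manta": "1.6.0",
    "samtools": "1.10",
    "somaticsniper": "1.0.5.0",
    "star": "2.7.2a",
    "star-fusion": "1.7.0",
    "strelka": "2.9.0",
}


def matches_baseline(tool: str, installed_version: str) -> bool:
    """Return True if installed_version equals the baseline for tool.

    Raises KeyError if the tool is not in the table.
    """
    key = tool.lower().replace(" ", "")
    expected = BASELINE_VERSIONS[key]
    # Accept an optional leading "v", as in "v1.10".
    return installed_version.strip().lstrip("v") == expected


print(matches_baseline("GATK", "4.2.0.0"))
print(matches_baseline("samtools", "1.15"))
```

The comparison is an exact string match on purpose: the table names specific releases, so a newer CPU version is reported as a mismatch rather than assumed to be equivalent.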