JOINT CALLING OVERVIEW

NVIDIA Clara Parabricks Pipelines provides accelerated tools for joint calling.

TRIO COMBINE GVCF

Quickly combine 2 or 3 GVCF samples.

QUICK START

$ pbrun triocombinegvcf --ref Ref.fa --in-gvcf father.g.vcf \
                        --in-gvcf mother.g.vcf --in-gvcf child.g.vcf \
                        --out-variants combined.g.vcf

COMPATIBLE CPU GATK4 COMMAND

$ gatk CombineGVCFs -R Ref.fa -V father.g.vcf -V mother.g.vcf \
                    -V child.g.vcf -O combined.g.vcf

OPTIONS

--ref

(required) The reference file in fasta format.

--in-gvcf

(required) Path to a g.vcf or g.vcf.gz file. This option can be used 2 or 3 times.

--out-variants

(required) Path to output merged g.vcf file.
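Because --in-gvcf is only accepted 2 or 3 times, a wrapper script may want to check the input count before launching the job. The following is a minimal sketch under that assumption; the helper name trio_combine_cmd is hypothetical and not part of Parabricks, and it only prints the command rather than running it.

```shell
# Hypothetical helper: prints the pbrun command for 2 or 3 GVCFs instead of running it.
# Usage: trio_combine_cmd REF OUT GVCF GVCF [GVCF]
trio_combine_cmd() {
    ref=$1
    out=$2
    shift 2
    # --in-gvcf may be given 2 or 3 times, so reject any other count.
    if [ "$#" -lt 2 ] || [ "$#" -gt 3 ]; then
        echo "error: expected 2 or 3 GVCF files, got $#" >&2
        return 1
    fi
    cmd="pbrun triocombinegvcf --ref $ref"
    # Repeat --in-gvcf once per input file.
    for gvcf in "$@"; do
        cmd="$cmd --in-gvcf $gvcf"
    done
    echo "$cmd --out-variants $out"
}

trio_combine_cmd Ref.fa combined.g.vcf father.g.vcf mother.g.vcf child.g.vcf
```

The command is flattened into a single string for display, so this sketch does not handle paths containing spaces.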

CREATE GENOMICS DB

Start a genomic database for multiple samples.

QUICK START

$ pbrun creategenomicsdb --dir database_dir

OPTIONS

--dir

(required) Path to the directory which will serve as the genomic database.

IMPORT GVCF TO DB

Add samples to a genomic database.

QUICK START

$ pbrun importgvcftodb --db-dir database_dir --in-gvcf input.g.vcf

OPTIONS

--db-dir

(required) Path to the directory containing an already initialized genomic database.

--in-gvcf

(required) Path to input g.vcf.gz file to be added to the database. This option can be repeated multiple times.

--num-threads

Number of worker threads. Defaults to 4.
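Since --in-gvcf can be repeated, several samples can be added in one importgvcftodb call once the database directory has been initialized. The sketch below assembles such a command from a list of files; the helper name import_gvcfs_cmd and the sample file names are hypothetical, and the commands are printed rather than executed.

```shell
# Hypothetical helper: prints one importgvcftodb command covering every GVCF given.
# Usage: import_gvcfs_cmd DB_DIR NUM_THREADS GVCF...
import_gvcfs_cmd() {
    db_dir=$1
    threads=$2
    shift 2
    if [ "$#" -lt 1 ]; then
        echo "error: at least one GVCF is required" >&2
        return 1
    fi
    cmd="pbrun importgvcftodb --db-dir $db_dir --num-threads $threads"
    # Repeat --in-gvcf once per sample.
    for gvcf in "$@"; do
        cmd="$cmd --in-gvcf $gvcf"
    done
    echo "$cmd"
}

# The database directory must be initialized before samples are imported.
echo "pbrun creategenomicsdb --dir database_dir"
import_gvcfs_cmd database_dir 8 sample1.g.vcf.gz sample2.g.vcf.gz
```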

SELECT VARIANTS

Select variants from a database and create a gvcf.

QUICK START

$ pbrun selectvariants --ref Ref.fa \
                       --db-dir database_dir \
                       --out-gvcf-dir gvcf_dir

OPTIONS

--ref

(required) Path to the reference file.

--db-dir

(required) Path to the directory which is the genomic database.

--out-gvcf-dir

(required) Path to the output directory that will contain all the GVCF files with the selected variants.

--num-threads

Number of worker threads. Defaults to 4.
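The three database tools are naturally chained: create the database, import the samples, then select variants. The following sketch shows that ordering using only the options documented above. The run and joint_call helpers, the DRY_RUN variable, and the sample file names are assumptions made for illustration; by default the script only prints the commands.

```shell
# Dry run by default: set DRY_RUN=0 to actually execute the commands.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: joint_call DB_DIR REF OUT_GVCF_DIR GVCF...
joint_call() {
    db_dir=$1
    ref=$2
    out_dir=$3
    shift 3
    # Step 1: initialize the genomic database directory.
    run pbrun creategenomicsdb --dir "$db_dir" || return 1
    # Step 2: add each sample to the initialized database.
    for gvcf in "$@"; do
        run pbrun importgvcftodb --db-dir "$db_dir" --in-gvcf "$gvcf" || return 1
    done
    # Step 3: select variants from the database into the output directory.
    run pbrun selectvariants --ref "$ref" --db-dir "$db_dir" --out-gvcf-dir "$out_dir"
}

joint_call database_dir Ref.fa gvcf_dir sample1.g.vcf.gz sample2.g.vcf.gz
```

Importing one sample per call keeps a failure easy to attribute to a specific file; passing all samples with repeated --in-gvcf in a single call is an equally valid choice.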