POPULATION STUDIES PIPELINE

Use the NVIDIA Clara Parabricks Pipelines Genomics Database tool to perform population studies. Create a genomic database for multiple samples and import data into it.

The population studies pipeline is used as shown below. The Germline step is optional: skip it if you already have all the g.vcf.gz files generated during variant calling.

[Figure: parabricks-web-graphics-1259949-r2-fq2bam.svg]


# Create a genomics database
pbrun creategenomicsdb --dir <genomics db address>

# Populate the database with data
pbrun importgvcftodb --dir <genomics db address> --in-gvcf <input GVCF> --in-gvcf <input GVCF> --in-gvcf <input GVCF>

# Select variants from the database
pbrun selectvariants --ref <Reference Genome> --dir <genomics db address> --out-gvcf <output GVCF>
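The three commands above depend on each other, so it can help to run them from one script that stops at the first failure. The following is a minimal sketch, not an official wrapper: the names DRY_RUN, run and run_population_study are hypothetical, and it assumes the database option is spelled --dir. By default it only prints the commands so they can be reviewed before anything is executed.

```shell
#!/bin/sh
# Sketch: run the three population study steps in order, stopping at the first failure.
# DRY_RUN, run and run_population_study are hypothetical names used for illustration.
set -eu

# 1 (default) only prints each command; 0 prints and executes it.
DRY_RUN=${DRY_RUN:-1}

# Print a command, and execute it only when DRY_RUN is 0.
run() {
    echo "$*"
    if [ "$DRY_RUN" -eq 0 ]; then
        "$@"
    fi
}

# Usage: run_population_study <db dir> <reference> <output GVCF> <input GVCF>...
run_population_study() {
    if [ "$#" -lt 4 ]; then
        echo "need a db dir, a reference, an output GVCF and at least one input GVCF" >&2
        return 1
    fi
    db_dir=$1
    ref=$2
    out_gvcf=$3
    shift 3
    # Rewrite the remaining arguments as "--in-gvcf <file>" pairs.
    n=$#
    while [ "$n" -gt 0 ]; do
        set -- "$@" --in-gvcf "$1"
        shift
        n=$((n - 1))
    done
    run pbrun creategenomicsdb --dir "$db_dir"
    run pbrun importgvcftodb --dir "$db_dir" "$@"
    run pbrun selectvariants --ref "$ref" --dir "$db_dir" --out-gvcf "$out_gvcf"
}
```

For example, run_population_study /data/db ref.fa merged.g.vcf.gz a.g.vcf.gz b.g.vcf.gz prints the three commands with placeholder paths filled in; setting DRY_RUN=0 would execute them, which requires pbrun to be installed.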

--dir

(required) Path to the directory where creategenomicsdb will store the database.

--dir

(required) Directory of the database into which importgvcftodb will import the GVCF data.

--in-gvcf

(required) Input GVCF file in g.vcf.gz format. It should be either generated by the Parabricks germline pipeline or compressed with bgzip. Repeat the option once for each input GVCF.
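Because the import step expects compressed GVCF input, a quick check before importing can catch a wrongly named or uncompressed file. This is a sketch under stated assumptions: is_gzip_gvcf is a hypothetical helper, it accepts the suffixes .g.vcf.gz and .gvcf.gz, and it only looks at the two gzip magic bytes (1f 8b), so it cannot tell a bgzip file from a plain gzip file and does not validate the GVCF content.

```shell
#!/bin/sh
# Sketch: check that a file looks like a gzip-compressed GVCF before importing it.
# is_gzip_gvcf is a hypothetical helper, not part of Parabricks.

# Return 0 if the name ends in .g.vcf.gz or .gvcf.gz and the file starts
# with the gzip magic bytes 1f 8b; return non-zero otherwise.
is_gzip_gvcf() {
    f=$1
    case $f in
        *.g.vcf.gz | *.gvcf.gz) ;;
        *) return 1 ;;
    esac
    [ -f "$f" ] || return 1
    # Read the first two bytes as hex and strip the whitespace od adds.
    magic=$(od -An -tx1 -N2 "$f" | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}
```

A call such as is_gzip_gvcf sample1.g.vcf.gz || echo "skipping sample1" could be placed in front of the import command; the file name here is a placeholder.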


--ref

(required) Reference human genome in FASTA format. The indexing required to run bwa must already have been completed by the user.

--dir

(required) Location of the genomics database from which selectvariants will select variants.

--out-gvcf

Path to the file where the merged GVCF result will be stored.
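The --ref option assumes that the reference has already been indexed for bwa. A small pre-flight check can confirm this before selectvariants is started. The sketch below relies on the fact that bwa index writes five files next to the FASTA, with the extensions .amb, .ann, .bwt, .pac and .sa; check_bwa_index is a hypothetical helper name, and the check only tests that the files exist, not that they are complete or up to date.

```shell
#!/bin/sh
# Sketch: verify that the BWA index files sit next to the reference FASTA.
# check_bwa_index is a hypothetical helper, not part of Parabricks or bwa.

# Return 0 if all five index files exist; otherwise report each missing
# file on stderr and return 1.
check_bwa_index() {
    ref=$1
    missing=0
    for ext in amb ann bwt pac sa; do
        if [ ! -f "$ref.$ext" ]; then
            echo "missing index file: $ref.$ext" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

If the check fails, running bwa index on the reference FASTA creates the missing files.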

© Copyright 2020, NVIDIA. Last updated on Sep 21, 2020.