VARIANT CALLERS

NVIDIA Clara Parabricks Pipelines accelerated variant callers

Accelerated bcftools call.

bcftools-call calls variants from mpileup output

QUICK START

$ pbrun bcftoolscall --in-file pileup.bcf \
    --out-file output.vcf


COMPATIBLE CPU COMMAND

The command below is the CPU counterpart of the Parabricks command above. It produces the same results as the Parabricks command. See the Output Comparison page for how to compare the results.

bcftools call pileup.bcf -c -o output.vcf


OPTIONS

--in-file

Path to the input mpileup file (default: None)

--out-file

Path of output file. If this option is not used, it will write to standard output (default: None)

--num-threads

Number of threads for worker (default: 1)

--variant-sites

Output variant sites only (default: None)

--tmp-dir TMP_DIR

Full path to the directory where temporary files will be stored.

--seccomp-override

Do not override seccomp options for docker

--with-petagene-dir WITH_PETAGENE_DIR

Full path to the PetaGene installation directory where bin/ and species/ folders are located.

--keep-tmp

Do not delete the directory storing temporary files after completion.

--license-file LICENSE_FILE

Path to license file license.bin if not in installation directory.

--version

View compatible software versions.

GPU accelerated haplotypecaller.

This tool runs GPU accelerated haplotypecaller. Users can provide an optional BQSR report to fix the BAM similar to ApplyBQSR. In that case the updated base qualities will be used.

[Figure: haplotypecaller overview]

QUICK START

$ pbrun haplotypecaller --ref Ref/Homo_sapiens_assembly38.fasta \
    --in-bam mark_dups_gpu.bam \
    --in-recal-file recal_gpu.txt \
    --out-variants result.vcf


COMPATIBLE GATK4 COMMAND

The commands below are the GATK4 counterpart of the Parabricks command above. They produce the same results as the Parabricks command. See the Output Comparison page for how to compare the results.

# Run ApplyBQSR step
$ gatk ApplyBQSR --java-options -Xmx30g -R Ref/Homo_sapiens_assembly38.fasta \
    -I=mark_dups_cpu.bam --bqsr-recal-file=recal_file.txt -O=cpu_nodups_BQSR.bam

# Run HaplotypeCaller
$ gatk HaplotypeCaller --java-options -Xmx30g --input cpu_nodups_BQSR.bam --output \
    result_cpu.vcf --reference Ref/Homo_sapiens_assembly38.fasta \
    --native-pair-hmm-threads 16


OPTIONS

--ref

(required) The reference genome in fasta format.

--in-bam

(required) Path to the input BAM/CRAM file.

--out-variants

(required) Path of .vcf, g.vcf, or g.vcf.gz file.

--in-recal-file

Path to the input BQSR report. Only required if ApplyBQSR step is needed.

--haplotypecaller-options

Pass supported haplotypecaller options as one string. Currently supported original haplotypecaller options: -min-pruning, -standard-min-confidence-threshold-for-calling, -max-reads-per-alignment-start, -min-dangling-branch-length, and -pcr-indel-model.

--static-quantized-quals

Use static quantized quality scores to a given number of levels. Repeat this option multiple times for multiple bins.

--ploidy

Defaults to 2.

Ploidy assumed for the bam file. Currently only haploid (ploidy 1) and diploid (ploidy 2) are supported.

--interval-file

Path to an interval file for BQSR step with possible formats: Picard-style (.interval_list or .picard), GATK-style (.list or .intervals), or BED file (.bed). This option can be used multiple times (default: None)

--interval

(-L) Interval within which to call variants from the input reads. All intervals will have a padding of 100 to get read records and overlapping intervals will be combined. Interval files should be passed using the --interval-file option. This option can be used multiple times.

e.g. "-L chr1 -L chr2:10000 -L chr3:20000+ -L chr4:10000-20000" (default: None)
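As an illustration of the repeated -L syntax above, the following sketch builds the interval arguments from a list of regions. The helper name interval_args is hypothetical and not part of Parabricks, and the command is only printed as a dry run rather than executed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Parabricks): turn a list of regions
# into repeated "-L <region>" arguments, as described for --interval.
interval_args() {
    args=""
    for region in "$@"; do
        args="${args:+$args }-L $region"
    done
    printf '%s\n' "$args"
}

# Dry run: print the command instead of executing it.
echo pbrun haplotypecaller --ref Ref/Homo_sapiens_assembly38.fasta \
    --in-bam mark_dups_gpu.bam --out-variants result.vcf \
    $(interval_args chr1 chr2:10000 chr4:10000-20000)
```

Removing the leading echo would run the command, assuming pbrun is installed and the input files exist.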

--interval-padding

(-ip) Padding size (in base pairs) to add to each interval you are including (default: None)

--gvcf

Defaults to False.

Generate variant calls in gVCF format. When using this option, the --out-variants file should end with g.vcf or g.vcf.gz. If the --out-variants file ends in gz, the tool generates a compressed gVCF (g.vcf.gz) and an index for it.
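The naming rule for --gvcf output can be checked before launching a run. The sketch below is a minimal pre-flight check; the function name is_gvcf_name and the file name sample.g.vcf.gz are assumptions made for illustration, and the command is only printed as a dry run.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of Parabricks): with --gvcf,
# the --out-variants path should end in g.vcf or g.vcf.gz.
is_gvcf_name() {
    case "$1" in
        *g.vcf|*g.vcf.gz) return 0 ;;
        *) return 1 ;;
    esac
}

out_variants=sample.g.vcf.gz   # illustrative file name
if is_gvcf_name "$out_variants"; then
    # Dry run: print the command instead of executing it.
    echo pbrun haplotypecaller --ref Ref/Homo_sapiens_assembly38.fasta \
        --in-bam mark_dups_gpu.bam --out-variants "$out_variants" --gvcf
else
    echo "error: $out_variants does not end in g.vcf or g.vcf.gz" >&2
fi
```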

--batch

Given an input list of BAMs, run the variant calling of each BAM using one GPU, and process BAMs in parallel based on how many GPUs the system has.

--disable-read-filter

Disable the read filters for bam entries. Currently supported read filters that can be disabled are: MappingQualityAvailableReadFilter, MappingQualityReadFilter, and NotSecondaryAlignmentReadFilter. This option can be repeated multiple times.

--max-alternate-alleles

Maximum number of alternate alleles to genotype (default: None)

--annotation-group

(-G) Which groups of annotations to add to the output variant calls. Currently supported annotation groups: StandardAnnotation, StandardHCAnnotation, AS_StandardAnnotation (default: None)

--gvcf-gq-bands

(-GQB) Exclusive upper bounds for reference confidence GQ bands. Must be in the range [1, 100] and specified in increasing order (default: None)
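The constraints on GQ bands can be validated with a small script before a run. The sketch below assumes "increasing order" means strictly increasing, and the function name valid_gq_bands is hypothetical. How multiple bands are passed on the pbrun command line is not stated here, so the sketch only validates the values.

```shell
#!/bin/sh
# Hypothetical validator (not part of Parabricks): each band must be an
# integer in [1, 100], and bands must be strictly increasing (assumption).
valid_gq_bands() {
    prev=0
    for band in "$@"; do
        # Reject empty or non-numeric values.
        case "$band" in
            ''|*[!0-9]*) return 1 ;;
        esac
        if [ "$band" -lt 1 ] || [ "$band" -gt 100 ] || [ "$band" -le "$prev" ]; then
            return 1
        fi
        prev="$band"
    done
    return 0
}

if valid_gq_bands 10 20 60; then
    echo "bands are valid"
else
    echo "bands are invalid" >&2
fi
```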

--dont-use-soft-clipped-bases

Don't use soft-clipped bases for variant calling.

--haplotypecaller-options

Pass supported haplotype caller options as one string. Currently supported original haplotypecaller options:

-min-pruning <int>

-standard-min-confidence-threshold-for-calling <int>

-max-reads-per-alignment-start <int>

-min-dangling-branch-length <int>

-pcr-indel-model <NONE, HOSTILE, AGGRESSIVE, CONSERVATIVE>


e.g. --haplotypecaller-options="-min-pruning 4 -standard-min-confidence-threshold-for-calling 30"

--rna

Run haplotypecaller optimized for RNA Data.

--read-from-tmp-dir

Read from the temporary files generated by fq2bam. Only supported on ampere containers (default: None)

--num-gpus NUM_GPUS

Number of GPUs to use for a run. GPUs 0..(NUM_GPUS-1) will be used. If you are using flexera, please include --gpu-devices too.

--gpu-devices GPU_DEVICES

Which GPU devices to use for a run. By default, all GPU devices will be used. To use specific GPU devices enter a comma-separated list of GPU device numbers. Possible device numbers can be found by examining the output of the nvidia-smi command. For example, using --gpu-devices 0,1 would only use the first two GPUs.
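To show how --num-gpus and --gpu-devices relate, the sketch below builds the comma-separated device list 0..(NUM_GPUS-1). The helper name gpu_device_list is hypothetical, and the command is only printed as a dry run.

```shell
#!/bin/sh
# Hypothetical helper (not part of Parabricks): print device numbers
# 0..(N-1) as a comma-separated list suitable for --gpu-devices.
gpu_device_list() {
    list=""
    i=0
    while [ "$i" -lt "$1" ]; do
        list="${list:+$list,}$i"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

num_gpus=2
# Dry run: print the command instead of executing it.
echo pbrun haplotypecaller --ref Ref/Homo_sapiens_assembly38.fasta \
    --in-bam mark_dups_gpu.bam --out-variants result.vcf \
    --num-gpus "$num_gpus" --gpu-devices "$(gpu_device_list "$num_gpus")"
```

To restrict a run to a different subset of devices, the list can be written by hand using the device numbers reported by nvidia-smi.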

--tmp-dir TMP_DIR

Full path to the directory where temporary files will be stored.

--seccomp-override

Do not override seccomp options for docker

--with-petagene-dir WITH_PETAGENE_DIR

Full path to the PetaGene installation directory where bin/ and species/ folders are located.

--keep-tmp

Do not delete the directory storing temporary files after completion.

--license-file LICENSE_FILE

Path to license file license.bin if not in installation directory.

--version

View compatible software versions.

GPU accelerated mutect2.

mutectcaller supports tumor or tumor-normal variant calling. The figure below shows the high level functionality of mutectcaller. All dotted boxes indicate optional data, with some constraints.

[Figure: high-level functionality of mutectcaller]

QUICK START

$ pbrun mutectcaller --ref Ref/Homo_sapiens_assembly38.fasta \
    --in-tumor-bam tumor.bam \
    --tumor-name foobar \
    --out-vcf output.vcf


COMPATIBLE GATK4 COMMAND

The command below is the GATK4 counterpart of the Parabricks command above. It produces the same results as the Parabricks command. See the Output Comparison page for how to compare the results.

gatk Mutect2 -R Ref/Homo_sapiens_assembly38.fasta --input tumor.bam --tumor-sample foobar --output result.vcf


OPTIONS

--ref

(required) The reference genome in fasta format. We assume that the indexing required to run bwa has been completed by the user.

--in-tumor-bam

(required) Path of BAM/CRAM file for tumor reads.

--tumor-name

(required) Name of sample for tumor reads.

--out-vcf

(required) Path to the VCF output file.

--in-tumor-recal-file

Path of BQSR report for tumor sample.

--in-normal-bam

Path of BAM/CRAM file for normal reads.

--in-normal-recal-file

Path of BQSR report for normal sample.

--normal-name

Name of sample for normal reads.
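The tumor-only and tumor-normal modes differ only in the normal-sample options. The sketch below assembles both command lines; the helper name mutect_cmd and the sample name normal_sample are hypothetical, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Parabricks): print a mutectcaller
# command line. Usage: mutect_cmd TUMOR_BAM TUMOR_NAME [NORMAL_BAM NORMAL_NAME]
mutect_cmd() {
    cmd="pbrun mutectcaller --ref Ref/Homo_sapiens_assembly38.fasta --in-tumor-bam $1 --tumor-name $2"
    # Add the normal sample only when both its BAM and name are given.
    if [ "$#" -ge 4 ]; then
        cmd="$cmd --in-normal-bam $3 --normal-name $4"
    fi
    printf '%s\n' "$cmd --out-vcf output.vcf"
}

# Tumor-only calling (dry run).
mutect_cmd tumor.bam foobar
# Tumor-normal calling (dry run); normal_sample is an illustrative name.
mutect_cmd tumor.bam foobar normal.bam normal_sample
```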

--ploidy

Ploidy assumed for the input file. Currently only haploid (ploidy 1) and diploid (ploidy 2) are supported.

--interval-file

Path to an interval file for BQSR step with possible formats: Picard-style (.interval_list or .picard), GATK-style (.list or .intervals), or BED file (.bed). This option can be used multiple times (default: None)

--interval

(-L) Interval within which to call variants from the input reads. All intervals will have a padding of 100 to get read records and overlapping intervals will be combined. Interval files should be passed using the --interval-file option. This option can be used multiple times.

e.g. "-L chr1 -L chr2:10000 -L chr3:20000+ -L chr4:10000-20000" (default: None)

--interval-padding

(-ip) Padding size (in base pairs) to add to each interval you are including (default: None)

--mutectcaller-options

Pass supported mutectcaller options as one string. Currently supported original mutectcaller options:

-pcr-indel-model <NONE, HOSTILE, AGGRESSIVE, CONSERVATIVE>


e.g. --mutectcaller-options="-pcr-indel-model HOSTILE" (default: None)

--max-mnp-distance

Two or more phased substitutions separated by this distance or less are merged into MNPs. (default: 1)

--num-gpus NUM_GPUS

Number of GPUs to use for a run. GPUs 0..(NUM_GPUS-1) will be used. If you are using flexera, please include --gpu-devices too.

--gpu-devices GPU_DEVICES

Which GPU devices to use for a run. By default, all GPU devices will be used. To use specific GPU devices enter a comma-separated list of GPU device numbers. Possible device numbers can be found by examining the output of the nvidia-smi command. For example, using --gpu-devices 0,1 would only use the first two GPUs.

--tmp-dir TMP_DIR

Full path to the directory where temporary files will be stored.

--seccomp-override

Do not override seccomp options for docker

--with-petagene-dir WITH_PETAGENE_DIR

Full path to the PetaGene installation directory where bin/ and species/ folders are located.

--keep-tmp

Do not delete the directory storing temporary files after completion.

--license-file LICENSE_FILE

Path to license file license.bin if not in installation directory.

--version

View compatible software versions.

Accelerated Somatic Sniper.

Somatic Sniper supports tumor-normal variant calling. Parabricks provides Somatic Sniper as a standalone tool (somaticsniper), or you can use the Somatic Sniper workflow (somaticsniper_workflow) to generate a VCF file from BAM/CRAM.

QUICK START

$ pbrun somaticsniper --ref Ref/Homo_sapiens_assembly38.fasta --in-tumor-bam tumor.bam --in-normal-bam normal.bam --out-file output.vcf


COMPATIBLE CPU COMMAND

The command below is the CPU counterpart of the Parabricks command above. It produces the same results as the Parabricks command. See the Output Comparison page for how to compare the results.

bam-somaticsniper -q 1 -G -L -F vcf -f Ref/Homo_sapiens_assembly38.fasta tumor.bam normal.bam output.vcf


OPTIONS

--ref

Path to the reference file (default: None)

--in-tumor-bam

Path of BAM file for tumor reads. Path can be a Google Cloud Storage object. CRAM is not yet supported (default: None)

--in-normal-bam

Path of BAM file for normal reads. Path can be a Google Cloud Storage object. CRAM is not yet supported (default: None)

--out-file

Path of output file (default: None)

--num-threads

Number of threads for worker (default: 1)

--min-mapq

Filtering reads with mapping quality less than this value (default: 0)

--out-format

Type of output format. Possible values are {classic, vcf} (default: classic)

--correct

Fix baseline bugs. If this option is not passed, the same output will be generated as baseline (default: None)

--no-gain

Do not report Gain of Reference variants as determined by genotypes (default: None)

--no-loh

Do not report LOH variants as determined by genotypes (default: None)

--tmp-dir TMP_DIR

Full path to the directory where temporary files will be stored.

--seccomp-override

Do not override seccomp options for docker

--with-petagene-dir WITH_PETAGENE_DIR

Full path to the PetaGene installation directory where bin/ and species/ folders are located.

--keep-tmp

Do not delete the directory storing temporary files after completion.

--license-file LICENSE_FILE

Path to license file license.bin if not in installation directory.

--version

View compatible software versions.

Somatic sniper workflow to generate VCF from BAM/CRAM input files.

[Figure: Somatic Sniper workflow]

QUICK START

$ pbrun somaticsniper_workflow --ref Ref/Homo_sapiens_assembly38.fasta \
    --in-tumor-bam tumor.bam \
    --in-normal-bam normal.bam \
    --out-prefix output


COMPATIBLE CPU COMMAND

The commands below are the CPU counterpart of the Parabricks command above. They produce the same results as the Parabricks command. See the Output Comparison page for how to compare the results.

bam-somaticsniper -q 1 -G -L -F vcf -f Ref/Homo_sapiens_assembly38.fasta tumor.bam normal.bam output.vcf
bcftools mpileup -A -B -d 2147483647 -Ou -f Ref/Homo_sapiens_assembly38.fasta tumor.bam |
    bcftools call -c |
    vcfutils.pl varFilter -Q 20 |
    awk 'NR > 55 {print}' > output.indel_pileup_Tum.pileup
perl snpfilter.pl --snp-file output.vcf --indel-file output.indel_pileup_Tum.pileup
perl prepare_for_readcount.pl --snp-file output.vcf.SNPfilter
bam-readcount -b 15 -f Ref/Homo_sapiens_assembly38.fasta -l output.vcf.SNPfilter.pos tumor.bam > output.readcounts.rc
perl fpfilter.pl -snp-file output.vcf.SNPfilter -readcount-file output.readcounts.rc
perl highconfidence.pl -snp-file output.vcf.SNPfilter.fp_pass.vcf


OPTIONS

--ref

(required) The reference genome in fasta format. We assume that the indexing required to run bwa has been completed by the user.

--in-tumor-bam

(required) Path of BAM file for tumor reads. CRAM is not yet supported.

--in-normal-bam

Path of BAM file for normal reads. CRAM is not yet supported.

--out-prefix

Prefix filename for output data (default: None)

--num-threads

Number of threads for worker (default: 1)

--min-mapq

Filtering reads with mapping quality less than this value (default: 1)

--tmp-dir TMP_DIR

Full path to the directory where temporary files will be stored.

--seccomp-override

Do not override seccomp options for docker

--with-petagene-dir WITH_PETAGENE_DIR

Full path to the PetaGene installation directory where bin/ and species/ folders are located.

--keep-tmp

Do not delete the directory storing temporary files after completion.

--license-file LICENSE_FILE

Path to license file license.bin if not in installation directory.

--version

View compatible software versions.

Run GPU-accelerated deepvariant algorithm.

Parabricks has accelerated Google Deepvariant to extensively use GPUs and finish 30x WGS analysis in 25 minutes instead of hours. The Parabricks flavor of Deepvariant is more like other command line tools that users are familiar with. It takes the BAM and reference as inputs and produces variants as outputs. Currently, Deepvariant is supported for T4, V100, and A100 GPUs.

QUICK START

$ pbrun deepvariant --ref Ref/Homo_sapiens_assembly38.fasta \
    --in-bam mark_dups_gpu.bam \
    --out-variants output.vcf


COMPATIBLE GOOGLE DEEPVARIANT COMMANDS

The commands below are the Google DeepVariant counterpart of the Parabricks command above. They produce the same results as the Parabricks command. See the Output Comparison page for how to compare the results.

# Run make_examples in parallel
seq 0 $((N_SHARDS-1)) | \
    parallel --eta --halt 2 --joblog "${LOGDIR}/log" --res "${LOGDIR}" \
    sudo docker run \
    -v ${HOME}:${HOME} \
    gcr.io/deepvariant-docker/deepvariant:"${BIN_VERSION}" \
    /opt/deepvariant/bin/make_examples \
    --mode calling \
    --ref "${REF}" \
    --reads "${BAM}" \
    --examples "${OUTPUT_DIR}/examples.tfrecord@${N_SHARDS}.gz" \
    --task {}

# Run call_variants in parallel
sudo docker run \
    -v ${HOME}:${HOME} \
    gcr.io/deepvariant-docker/deepvariant:"${BIN_VERSION}" \
    /opt/deepvariant/bin/call_variants \
    --outfile "${CALL_VARIANTS_OUTPUT}" \
    --examples "${OUTPUT_DIR}/examples.tfrecord@${N_SHARDS}.gz" \
    --checkpoint "${MODEL}"

# Run postprocess_variants in parallel
sudo docker run \
    -v ${HOME}:${HOME} \
    gcr.io/deepvariant-docker/deepvariant:"${BIN_VERSION}" \
    /opt/deepvariant/bin/postprocess_variants \
    --ref "${REF}" \
    --infile "${CALL_VARIANTS_OUTPUT}" \
    --outfile "${FINAL_OUTPUT_VCF}"


OPTIONS

--ref

(required) The reference genome in fasta format.

--in-bam

(required) Path to the input BAM/CRAM file.

--out-variants

(required) Name of output VCF file.

--pb-model-file

Path of a non-default parabricks model file for deepvariant.

--mode

Value can be one of [shortread, pacbio, ont]. By default, it is shortread. If mode is set to pacbio, the following defaults are used: --norealign-reads, --alt-aligned-pileup diff_channels, --vsc-min-fraction-indels 0.12. If mode is set to ont, the following defaults are used: --norealign-reads, --variant-caller VCF_CANDIDATE_IMPORTER (default: shortread)
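A wrapper script can reject an unsupported --mode value before starting a long run. The sketch below is one way to do that; the function name valid_deepvariant_mode is hypothetical, and the command is only printed as a dry run.

```shell
#!/bin/sh
# Hypothetical check (not part of Parabricks): accept only the three
# documented values of --mode.
valid_deepvariant_mode() {
    case "$1" in
        shortread|pacbio|ont) return 0 ;;
        *) return 1 ;;
    esac
}

mode=pacbio
if valid_deepvariant_mode "$mode"; then
    # Dry run: print the command instead of executing it.
    echo pbrun deepvariant --ref Ref/Homo_sapiens_assembly38.fasta \
        --in-bam mark_dups_gpu.bam --out-variants output.vcf --mode "$mode"
else
    echo "error: unsupported mode '$mode'" >&2
fi
```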

--proposed-variants

Path of vcf file which has proposed variants for make examples stage (default: None)

--interval-file

Path to an interval file for BQSR step with possible formats: Picard-style (.interval_list or .picard), GATK-style (.list or .intervals), or BED file (.bed). This option can be used multiple times (default: None)

--interval

(-L) Interval within which to call variants from the input reads. Overlapping intervals will be combined. Interval files should be passed using the --interval-file option. This option can be used multiple times.

e.g. "-L chr1 -L chr2:10000 -L chr3:20000+ -L chr4:10000-20000" (default: None)

--disable-use-window-selector-model

Change the window selector model from Allele Count Linear to Variant Reads. This option will increase the accuracy and run time (default: Allele Count Linear)

--gvcf

Generate variant calls in gVCF format.

--norealign-reads

Do not locally realign reads before calling variants. Reads longer than 500 bp are never realigned (default: None)

--sort-by-haplotypes

Reads are sorted by haplotypes (using HP tag) (default: None)

--keep-duplicates

Keep reads that are duplicate (default: None)

--vsc-min-count-snps

SNP alleles occurring at least this many times in our AlleleCount will be advanced as candidates (default: 2)

--vsc-min-count-indels

Indel alleles occurring at least this many times in our AlleleCount will be advanced as candidates (default: 2)

--vsc-min-fraction-snps

SNP alleles occurring at least this fraction of all counts in our AlleleCount will be advanced as candidates (default: 0.12)

--vsc-min-fraction-indels

Indel alleles occurring at least this fraction of all counts in our AlleleCount will be advanced as candidates (default: None)

--min-mapping-quality

By default, reads with any mapping quality are kept. Setting this field to a positive integer i will only keep reads that have a MAPQ >= i. Note this only applies to aligned reads (default: 5)

--min-base-quality

Minimum base quality. This field indicates that we are enforcing a minimum base quality score for alternate alleles. Alternate alleles will only be considered if all bases in the allele have a quality greater than min_base_quality (default: 10)

--alt-aligned-pileup

Value can be one of [none, diff_channels]. Include alignments of reads against each candidate alternate allele in the pileup image. Default is none which turns this feature off (default: None)

--variant-caller

Value can be one of [VERY_SENSITIVE_CALLER, VCF_CANDIDATE_IMPORTER]. The caller to use to make examples. If you use VCF_CANDIDATE_IMPORTER, it implies force calling. Default is VERY_SENSITIVE_CALLER

--num-gpus NUM_GPUS

Number of GPUs to use for a run. GPUs 0..(NUM_GPUS-1) will be used. If you are using flexera, please include --gpu-devices too.

--gpu-devices GPU_DEVICES

Which GPU devices to use for a run. By default, all GPU devices will be used. To use specific GPU devices enter a comma-separated list of GPU device numbers. Possible device numbers can be found by examining the output of the nvidia-smi command. For example, using --gpu-devices 0,1 would only use the first two GPUs.

--tmp-dir TMP_DIR

Full path to the directory where temporary files will be stored.

--seccomp-override

Do not override seccomp options for docker

--with-petagene-dir WITH_PETAGENE_DIR

Full path to the PetaGene installation directory where bin/ and species/ folders are located.

--keep-tmp

Do not delete the directory storing temporary files after completion.

--license-file LICENSE_FILE

Path to license file license.bin if not in installation directory.

--version

View compatible software versions.

CPU accelerated copy number variant calling. You need to pass "--extra-tools" to the installer to use this tool.

Run CNVkit with accelerated coverage calculation from read depths.

QUICK START

$ pbrun cnvkit --ref Ref/Homo_sapiens_assembly38.fasta \
    --in-bam mark_dups_gpu.bam --out-file output.vcf


OPTIONS

--ref

(required) Path to the reference file.

--in-bam

(required) Path to the BAM/CRAM file.

--output-dir

Path to the directory that will contain all of the generated files.

--cnvkit-options

Pass supported cnvkit options as one string. Currently supported options are --count-reads and --drop-low-coverage.

e.g. --cnvkit-options="--count-reads --drop-low-coverage".

--generate-vcf

Export the output cns to VCF after running batch (default: None)

--tmp-dir TMP_DIR

Full path to the directory where temporary files will be stored.

--seccomp-override

Do not override seccomp options for docker

--with-petagene-dir WITH_PETAGENE_DIR

Full path to the PetaGene installation directory where bin/ and species/ folders are located.

--keep-tmp

Do not delete the directory storing temporary files after completion.

--license-file LICENSE_FILE

Path to license file license.bin if not in installation directory.

--version

View compatible software versions.

Call and genotype SVs for short reads using smoove (Original Smoove Project). This tool is not accelerated; the original precompiled binary runs on the server.

QUICK START

$ pbrun smoove --ref Ref/Homo_sapiens_assembly38.fasta \
    --in-bam in.bam \
    --output-dir output \
    --name SM


COMPATIBLE CPU COMMAND

The command below is the original CPU counterpart of the Parabricks command above. It produces the same results as the Parabricks command. See the Output Comparison page for how to compare the results.

./smoove call --fasta Ref/Homo_sapiens_assembly38.fasta \
    --outdir output \
    --name SM \
    in.bam


OPTIONS

--ref

(required) Path to the reference file (default: None)

--in-bam

(required) Path to the input BAM/CRAM file for variant calling (default: None)

--output-dir

(required) Path to the directory that will contain all of the generated files (default: None)

--name

(required) Input sample name (default: None)

--smoove-options

Pass supported smoove options as one string. e.g. --smoove-options="--excludechroms chr4 --noextrafilters" (default: None)

--tmp-dir TMP_DIR

Full path to the directory where temporary files will be stored.

--seccomp-override

Do not override seccomp options for docker

--with-petagene-dir WITH_PETAGENE_DIR

Full path to the PetaGene installation directory where bin/ and species/ folders are located.

--keep-tmp

Do not delete the directory storing temporary files after completion.

--license-file LICENSE_FILE

Path to license file license.bin if not in installation directory.

--version

View compatible software versions.

Call variants with high sensitivity, predicting variants below the average base-call quality (Original Lofreq Project). The call part is accelerated.

QUICK START

$ pbrun lofreq --ref Ref/Homo_sapiens_assembly38.fasta \
    --in-tumor-bam tumor.bam \
    --in-normal-bam normal.bam \
    --output-dir output


COMPATIBLE CPU COMMAND

The command below is the original CPU counterpart of the Parabricks command above. It produces the same results as the Parabricks command. See the Output Comparison page for how to compare the results.

lofreq somatic -n normal.bam -t tumor.bam \
    -o output -f /data/Ref/GRCh38.d1.vd1.fa \
    --baq-off --no-src-qual --call-rlx-extra-args "@d 2147483647"


OPTIONS

--ref

(required) Path to the reference file (default: None)

--in-tumor-bam

(required) Path of BAM file for tumor reads. CRAM is not yet supported (default: None)

--in-normal-bam

(required) Path of BAM file for normal reads. CRAM is not yet supported (default: None)

--output-dir

(required) Directory for output data (default: None)

--in-dbsnp-file

Path to an input dbsnp file containing known germline variants. Must be in vcf.gz format with its tabix index (default: None)

--ignore-vcf

Path to an input VCF file containing variants that will be ignored for source quality computation in tumor. If this option is not used, stringently filtered predictions in normal sample will be used by default (default: None)

--num-threads

Number of threads per GPU for each call (default: 4)

--tumor-mtc

Type of multiple testing correction for tumor. Possible values are {bonf, holm-bonf, fdr} (default: bonf)

--tumor-mtc-alpha

Multiple testing correction alpha for tumor (default: 1.0)

--min-cov MIN_COV

Minimum coverage for somatic calls (default: 7)

--germline

Also list germline calls in separate file (default: None)

--use-orphan

Use orphaned/anomalous reads from pairs in all samples (default: None)

--baq-off

Switch use of BAQ off in all samples (default: None)

--no-src-qual

Disable use of source quality in tumor (default: None)

--num-gpus NUM_GPUS

Number of GPUs to use for a run. GPUs 0..(NUM_GPUS-1) will be used. If you are using flexera, please include --gpu-devices too.

--gpu-devices GPU_DEVICES

Which GPU devices to use for a run. By default, all GPU devices will be used. To use specific GPU devices enter a comma-separated list of GPU device numbers. Possible device numbers can be found by examining the output of the nvidia-smi command. For example, using --gpu-devices 0,1 would only use the first two GPUs.

--tmp-dir TMP_DIR

Full path to the directory where temporary files will be stored.

--seccomp-override

Do not override seccomp options for docker

--with-petagene-dir WITH_PETAGENE_DIR

Full path to the PetaGene installation directory where bin/ and species/ folders are located.

--keep-tmp

Do not delete the directory storing temporary files after completion.

--license-file LICENSE_FILE

Path to license file license.bin if not in installation directory.

--version

View compatible software versions.

Structural variant (SV) and indel caller from mapped paired-end sequencing reads. This tool is not accelerated; the original precompiled binary runs on the server.

QUICK START

$ pbrun manta --ref Ref/Homo_sapiens_assembly38.fasta \
    --in-tumor-bam tumor.bam \
    --in-normal-bam normal.bam \
    --out-prefix output


OPTIONS

--ref

Path to the reference file (default: None)

--in-tumor-bam

Path of BAM/CRAM file for tumor reads (default: None)

--in-normal-bam

Path of BAM/CRAM file for normal reads. This option can be used multiple times (default: None)

--bed

Optional bgzip-compressed/tabix-indexed BED file containing the set of regions to call (default: None)

--out-prefix

Prefix filename for output data (default: None)

--num-threads

Number of threads for worker (default: 1)

--manta-options

Pass supported manta options as one string. e.g. --manta-options="--rna --unstrandedRNA" (default: None)

--tmp-dir TMP_DIR

Full path to the directory where temporary files will be stored.

--seccomp-override

Do not override seccomp options for docker

--with-petagene-dir WITH_PETAGENE_DIR

Full path to the PetaGene installation directory where bin/ and species/ folders are located.

--keep-tmp

Do not delete the directory storing temporary files after completion.

--license-file LICENSE_FILE

Path to license file license.bin if not in installation directory.

--version

View compatible software versions.

SNP and indel caller from mapped paired-end sequencing reads. This tool is not accelerated; the original precompiled binary runs on the server.

QUICK START

$ pbrun strelka --ref Ref/Homo_sapiens_assembly38.fasta \
    --in-tumor-bam tumor.bam \
    --in-normal-bam normal.bam \
    --indel-candidates candidates.vcf \
    --out-prefix output


OPTIONS

--ref

Path to the reference file (default: None)

--in-tumor-bam

Path of BAM/CRAM file for tumor reads (default: None)

--in-normal-bam

Path of BAM/CRAM file for normal reads. This option can be used multiple times (default: None)

--indel-candidates

Path to a VCF of candidate indel alleles. Must be in vcf/vcf.gz format. This option can be used multiple times (default: None)

--bed

Optional bgzip-compressed/tabix-indexed BED file containing the set of regions to call (default: None)

--out-prefix

Prefix filename for output data (default: None)

--num-threads

Number of threads for worker (default: 1)

--strelka-options

Pass supported strelka options as one string. e.g. --strelka-options="--exome" (default: None)

--tmp-dir TMP_DIR

Full path to the directory where temporary files will be stored.

--seccomp-override

Do not override seccomp options for docker

--with-petagene-dir WITH_PETAGENE_DIR

Full path to the PetaGene installation directory where bin/ and species/ folders are located.

--keep-tmp

Do not delete the directory storing temporary files after completion.

--license-file LICENSE_FILE

Path to license file license.bin if not in installation directory.

--version

View compatible software versions.

Strelka workflow to generate VCF from BAM/CRAM input files.

[Figure: Strelka workflow]

QUICK START

$ pbrun strelka_workflow --ref Ref/Homo_sapiens_assembly38.fasta \
    --in-tumor-bam tumor.bam \
    --in-normal-bam normal.bam \
    --out-prefix output


COMPATIBLE CPU COMMAND

The commands below are the CPU counterpart of the Parabricks command above. They produce the same results as the Parabricks command. See the Output Comparison page for how to compare the results.

mkdir -p manta_work
python $MANTA_DIR/bin/configManta.py --referenceFasta Ref/Homo_sapiens_assembly38.fasta \
    --normalBam ${NORMAL} --tumorBam tumor.bam \
    --runDir manta_work
cd manta_work
python ./runWorkflow.py -m local -j ${MAX_NUM_PROCESSORS}
cd ..

mkdir -p strelka_work
python $STRELKA_PATH/configureStrelkaSomaticWorkflow.py \
    --referenceFasta Ref/Homo_sapiens_assembly38.fasta \
    --normalBam normal.bam --tumorBam tumor.bam \
    --indelCandidates ${WORK_PATH}/manta_work/results/variants/candidateSmallIndels.vcf.gz \
    --runDir strelka_work
cd strelka_work
python ./runWorkflow.py -m local -j ${MAX_NUM_PROCESSORS}
cd ..


OPTIONS

--ref

(required) The reference genome in fasta format. We assume that the indexing required to run bwa has been completed by the user.

--in-tumor-bam

(required) Path of BAM/CRAM file for tumor reads.

--in-normal-bam

Path of BAM/CRAM file for normal reads.

--out-prefix

Prefix filename for output data (default: None)

--num-threads

Number of threads for worker (default: 1)

--tmp-dir TMP_DIR

Full path to the directory where temporary files will be stored.

--seccomp-override

Do not override seccomp options for docker

--with-petagene-dir WITH_PETAGENE_DIR

Full path to the PetaGene installation directory where bin/ and species/ folders are located.

--keep-tmp

Do not delete the directory storing temporary files after completion.

--license-file LICENSE_FILE

Path to license file license.bin if not in installation directory.

--version

View compatible software versions.

© Copyright 2020, NVIDIA. Last updated on Jun 28, 2021.