bqsr

Performs Base Quality Score Recalibration (BQSR) in a standalone fashion.

Quick Start

$ pbrun bqsr \
    --ref Ref/Homo_sapiens_assembly38.fasta \
    --in-bam mark_dups_gpu.bam \
    --knownSites Ref/Homo_sapiens_assembly38.known_indels.vcf.gz \
    --out-recal-file recal_gpu.txt

Compatible GATK4 Command

The GATK4 command below is the counterpart of the Parabricks command above and produces identical output.

$ gatk BaseRecalibrator \
    --java-options -Xmx30g \
    --input mark_dups_gpu.bam \
    --output recal_cpu.txt \
    --known-sites Ref/Homo_sapiens_assembly38.known_indels.vcf.gz \
    --reference Ref/Homo_sapiens_assembly38.fasta

bqsr Reference

Run bqsr on a BAM file to generate a BQSR report.

Input/Output file options

--ref REF

Path to the reference file. (default: None)

Option is required.

--in-bam IN_BAM

Path to the BAM file. (default: None)

Option is required.

--knownSites KNOWNSITES

Path to a known indels file. Must be in VCF format (.vcf or .vcf.gz). This option can be used multiple times. (default: None)

Option is required.
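Because --knownSites can be repeated, a wrapper script may want to build the flag list from a set of files. The following is a minimal sketch in POSIX shell, not part of the tool itself: the second file name is hypothetical, and the script only prints the assembled command rather than running it.

```shell
# Known-sites files; the second path is a hypothetical example.
known_sites="Ref/Homo_sapiens_assembly38.known_indels.vcf.gz Ref/second_known_sites.vcf.gz"

# Start with the fixed arguments from the Quick Start command.
set -- pbrun bqsr \
    --ref Ref/Homo_sapiens_assembly38.fasta \
    --in-bam mark_dups_gpu.bam \
    --out-recal-file recal_gpu.txt

# Append one --knownSites flag per file (assumes paths contain no spaces).
for vcf in $known_sites; do
    set -- "$@" --knownSites "$vcf"
done

# Print the assembled command for review.
bqsr_cmd="$*"
echo "$bqsr_cmd"
```

Once the input files exist, replacing the final echo with "$@" would run the command.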

--interval-file INTERVAL_FILE

Path to an interval file with possible formats: Picard-style (.interval_list or .picard), GATK-style (.list or .intervals), or BED file (.bed). This option can be used multiple times. (default: None)

--out-recal-file OUT_RECAL_FILE

Path to the output recalibration report file. (default: None)

Option is required.

Options specific to this tool

-L INTERVAL, --interval INTERVAL

Interval within which to run bqsr on the input reads. All intervals are padded by 100 to get read records, and overlapping intervals are combined. Interval files should be passed with the --interval-file option instead. This option can be used multiple times, e.g. "-L chr1 -L chr2:10000 -L chr3:20000+ -L chr4:10000-20000". (default: None)
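Since interval strings go to -L and interval files go to --interval-file, a script that accepts both may need to choose the flag per argument. The sketch below is one way to do that, assuming the file type can be recognized from the extensions listed for --interval-file. The helper name interval_flag and the file name targets.bed are hypothetical.

```shell
# Hypothetical helper: pick the flag for one interval argument.
# The extensions are those listed for --interval-file above.
interval_flag() {
    case "$1" in
        *.interval_list|*.picard|*.list|*.intervals|*.bed) echo "--interval-file" ;;
        *) echo "-L" ;;
    esac
}

# Mix of interval strings and a hypothetical BED file name.
interval_args=""
for item in chr1 chr4:10000-20000 targets.bed; do
    interval_args="$interval_args $(interval_flag "$item") $item"
done
interval_args="${interval_args# }"   # drop the leading space

echo "$interval_args"
```

The resulting string can be appended to a pbrun bqsr command line. Matching on the extension is a simplification: a contig whose name happens to end in one of these suffixes would be misclassified.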

-ip INTERVAL_PADDING, --interval-padding INTERVAL_PADDING

Amount of padding (in base pairs) to add to each interval you are including. (default: None)

Common options:

--logfile LOGFILE

Path to the log file. If not specified, messages will only be written to the standard error output. (default: None)

--tmp-dir TMP_DIR

Full path to the directory where temporary files will be stored.

--with-petagene-dir WITH_PETAGENE_DIR

Full path to the PetaGene installation directory. By default, this should have been installed at /opt/petagene. Use of this option also requires that the PetaLink library has been preloaded by setting the LD_PRELOAD environment variable. Optionally, set the PETASUITE_REFPATH and PGCLOUD_CREDPATH environment variables, which are used for data and credentials. (default: None)

--keep-tmp

Do not delete the directory storing temporary files after completion.

--license-file LICENSE_FILE

Path to license file license.bin if not in the installation directory.

--no-seccomp-override

Do not override seccomp options for Docker. (default: None)

--version

View compatible software versions.

GPU options:

--num-gpus NUM_GPUS

Number of GPUs to use for a run. GPUs 0..(NUM_GPUS-1) will be used.

--gpu-devices GPU_DEVICES

GPU devices to use for a run. By default, all GPU devices will be used. To use specific GPU devices, enter a comma-separated list of GPU device numbers. Possible device numbers can be found by examining the output of the nvidia-smi command. For example, using --gpu-devices 0,1 would only use the first two GPUs.
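To make the relationship between the two GPU options concrete: per the descriptions above, --num-gpus N uses GPUs 0..(N-1), which is the same set that a --gpu-devices list of those numbers would name. The sketch below expands N into that comma-separated form. The helper name gpu_list is hypothetical and not part of Parabricks.

```shell
# Hypothetical helper: print the device numbers 0..(N-1) as a
# comma-separated list, the form that --gpu-devices accepts.
gpu_list() {
    n="$1"
    i=0
    list=""
    while [ "$i" -lt "$n" ]; do
        list="${list:+$list,}$i"   # add a comma only after the first entry
        i=$((i + 1))
    done
    echo "$list"
}

gpu_list 2   # prints 0,1
gpu_list 4   # prints 0,1,2,3
```

To select a non-contiguous set, write the list by hand (for example, --gpu-devices 0,2) after checking the device numbers reported by nvidia-smi.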