NVIDIA Clara Parabricks v4.0.1

Sort BAM files.

This tool can sort the reads within a BAM file in a variety of ways, including by position in the genome (coordinate) or read name (queryname). This enables compatibility with the requirements of different downstream tools.

Five sort modes are supported:

  • coordinate (Picard-compatible)

  • coordinate (fgbio-compatible)

  • queryname (Picard-compatible)

  • queryname (fgbio-compatible)

  • template coordinate sort (fgbio-compatible)

Allowed values for --sort-order are as follows:

  • coordinate [default]

  • queryname

  • templatecoordinate

Allowed values for --sort-compatibility are as follows:

  • picard [default]

  • fgbio

coordinate and queryname sorting can be done in either picard or fgbio mode. templatecoordinate can only be done in fgbio mode.
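The mapping from the two options to the five sort modes can be sketched as a small shell helper. This is only an illustration of the rules stated above: `resolve_sort_mode` is a hypothetical name, not part of Parabricks, and the sketch assumes that a templatecoordinate request always resolves to fgbio regardless of the compatibility value given.

```shell
# Hypothetical helper (not part of Parabricks): prints the sort mode that a
# --sort-order / --sort-compatibility pair selects, following the rules above.
resolve_sort_mode() {
    local order="${1:-coordinate}"   # documented default sort order
    local compat="${2:-picard}"      # documented default compatibility

    case "$compat" in
        picard|fgbio) ;;
        *) echo "unknown sort compatibility: $compat" >&2; return 1 ;;
    esac

    case "$order" in
        coordinate|queryname)
            echo "$order ($compat)" ;;
        templatecoordinate)
            # Assumption of this sketch: this order is always fgbio.
            echo "templatecoordinate (fgbio)" ;;
        *) echo "unknown sort order: $order" >&2; return 1 ;;
    esac
}
```

For example, `resolve_sort_mode queryname fgbio` prints `queryname (fgbio)`, and an unrecognized sort order returns a non-zero status.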


# This command assumes all the inputs are in INPUT_DIR and all the outputs go to OUTPUT_DIR.
$ docker run --rm --gpus all --volume INPUT_DIR:/workdir --volume OUTPUT_DIR:/outputdir \
    --workdir /workdir \
    \
    pbrun bamsort \
    --ref /workdir/${REFERENCE_FILE} \
    --in-bam /workdir/${INPUT_BAM} \
    --out-bam /outputdir/${OUTPUT_BAM} \
    --sort-order coordinate

The command below is the Picard counterpart of the Parabricks command above. The output from this command will be identical to the output from the above command.


java -Xmx30g -jar picard.jar SortSam \
    I=<INPUT_DIR>/${INPUT_BAM} \
    O=<OUTPUT_DIR>/${OUTPUT_BAM} \
    SORT_ORDER=coordinate
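One way to spot-check that two sorted files agree is to compare their decoded alignment records rather than the files byte for byte, since the raw files may differ for reasons unrelated to record order (for example header lines or compression). The following is a minimal sketch, not an official verification procedure: `same_records` and the `DECODE_CMD` variable are hypothetical names, and the default decoder assumes `samtools` is installed.

```shell
# Hypothetical helper: succeeds when two alignment files decode to the same
# record stream in the same order. DECODE_CMD is an assumption of this sketch;
# it defaults to "samtools view", which prints records without the header.
same_records() {
    local decode="${DECODE_CMD:-samtools view}"
    local sum_a sum_b
    # cksum reads the decoded records from stdin and prints checksum and size.
    sum_a=$($decode "$1" | cksum)
    sum_b=$($decode "$2" | cksum)
    [ "$sum_a" = "$sum_b" ]
}
```

A call such as `same_records parabricks_sorted.bam picard_sorted.bam` (file names are placeholders) returns status 0 when the record streams match.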

Sort BAM files. There are five modes: coordinate sort (Picard-compatible), coordinate sort (fgbio-compatible), queryname sort (Picard-compatible), queryname sort (fgbio-compatible), and template coordinate sort (fgbio-compatible).

Input/Output file options:

--in-bam IN_BAM

Path to the BAM/CRAM file to sort. (default: None)

Option is required.

--out-bam OUT_BAM

Path to the sorted output BAM file. (default: None)

Option is required.

--ref REF

Path to the reference file. (default: None)

Option is required.

Pipeline Options:

--num-zip-threads NUM_ZIP_THREADS

Number of CPUs to use for zipping BAM files in a run. If not specified, 16 are used for coordinate sorts and 10 otherwise. (default: None)

--num-sort-threads NUM_SORT_THREADS

Number of CPUs to use for sorting in a run. If not specified, 10 are used for coordinate sorts and 16 otherwise. (default: None)
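The thread defaults for the two options mirror each other depending on the sort order. As a rough illustration, the documented defaults can be written as two lookup functions; `default_zip_threads` and `default_sort_threads` are hypothetical names used only for this sketch.

```shell
# Hypothetical helpers: print the documented default thread counts that apply
# when --num-zip-threads / --num-sort-threads are not given.
default_zip_threads() {
    case "$1" in
        coordinate) echo 16 ;;   # coordinate sorts
        *)          echo 10 ;;   # queryname and templatecoordinate
    esac
}

default_sort_threads() {
    case "$1" in
        coordinate) echo 10 ;;   # coordinate sorts
        *)          echo 16 ;;   # queryname and templatecoordinate
    esac
}
```

In both cases the two defaults add up to 26 CPUs, which may be a useful reference point when choosing explicit values on a smaller machine.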

--max-records-in-ram MAX_RECORDS_IN_RAM

Maximum number of records in RAM when using a queryname or template coordinate sort mode; lowering this number will decrease maximum memory usage. (default: 65000000)
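Because this option is described only for the queryname and template coordinate sort modes, a wrapper script might add it only for those sort orders. The sketch below assumes exactly that; `max_records_option` is a hypothetical helper, and omitting the option for coordinate sorts is a choice of this sketch rather than documented tool behavior.

```shell
# Hypothetical helper: prints a --max-records-in-ram option for the sort
# orders it is documented for, and nothing for coordinate sorts.
max_records_option() {
    local order="$1"
    local records="${2:-65000000}"   # documented default
    case "$order" in
        queryname|templatecoordinate)
            echo "--max-records-in-ram $records" ;;
        *)
            echo "" ;;
    esac
}
```

For instance, `max_records_option queryname 20000000` prints `--max-records-in-ram 20000000`; the value 20000000 is an arbitrary example of a lower limit, not a recommendation.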

--sort-order SORT_ORDER

Type of sort to be done. Possible values are {coordinate,queryname,templatecoordinate}. (default: coordinate)

--sort-compatibility SORT_COMPATIBILITY

Sort comparator to use for compatibility with other tools. Possible values are {picard,fgbio}. The templatecoordinate sort order always uses fgbio. (default: picard)

Common options:

--logfile LOGFILE

Path to the log file. If not specified, messages will only be written to the standard error output. (default: None)

--tmp-dir TMP_DIR

Full path to the directory where temporary files will be stored.

--with-petagene-dir WITH_PETAGENE_DIR

Full path to the PetaGene installation directory. By default, this should have been installed at /opt/petagene. Use of this option also requires that the PetaLink library has been preloaded by setting the LD_PRELOAD environment variable. Optionally, set the PETASUITE_REFPATH and PGCLOUD_CREDPATH environment variables, which are used for data and credentials. (default: None)


Do not delete the directory storing temporary files after completion.


Do not override seccomp options for docker (default: None).


View compatible software versions.

GPU options:

--num-gpus NUM_GPUS

Number of GPUs to use for a run. GPUs 0..(NUM_GPUS-1) will be used.
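The range 0..(NUM_GPUS-1) can be made concrete with a short helper that lists the device indices a given value would select. `gpu_indices` is a hypothetical name for this sketch and is not part of Parabricks.

```shell
# Hypothetical helper: prints the comma-separated GPU indices
# 0..(NUM_GPUS-1) that a given --num-gpus value refers to.
gpu_indices() {
    local n="$1"
    local i=0
    local out=""
    while [ "$i" -lt "$n" ]; do
        out="${out:+$out,}$i"   # append with a comma once out is non-empty
        i=$((i + 1))
    done
    echo "$out"
}
```

With `--num-gpus 4`, for example, `gpu_indices 4` prints `0,1,2,3`.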

© Copyright 2023, Nvidia. Last updated on Jun 28, 2023.