fq2ubam

Convert FASTQs to an unaligned BAM file.

Quick Start

$ pbrun fq2ubam \
    --in-fq Data/sample_1.fq.gz Data/sample_2.fq.gz \
    --sample-name sample_1 \
    --out-bam output.bam \
    --sort-order unsorted

Compatible CPU-based GATK4 Command

$ gatk FastqToSam -F1 Data/sample_1.fq.gz -F2 Data/sample_2.fq.gz -SM sample_1 -O output.bam

fq2ubam Reference

Run fq2ubam to convert FASTQs to an unaligned BAM file.

Input/Output file options

--in-fq IN_FQ

Paths to the paired-end FASTQ files. Files can be in fastq or fastq.gz format. (default: None)

Option is required.

--ref REF

Path to the reference file. Required if --sort-order is not unsorted. (default: None)

--out-bam OUT_BAM

Path to the output BAM file. (default: None)

Option is required.

Tool Options:

--sample-name SAMPLE_NAME

Sample name to insert into the read group header. (default: None)

Option is required.

--num-threads NUM_THREADS

Number of worker threads. (default: 6)

--comment COMMENT

Comment(s) to include in the merged output file's header. Option can be used more than once. (default: None)

--description DESCRIPTION

Description to insert into the read group header. (default: None)

--library-name LIBRARY_NAME

The library name to place into the LB attribute in the read group header. (default: None)

--platform PLATFORM

The platform type (e.g. illumina, solid) to insert into the read group header. (default: None)

--platform-model PLATFORM_MODEL

Platform model to insert into the read group header (free-form text providing further details of the platform/technology used). (default: None)

--platform-unit PLATFORM_UNIT

The platform unit (often run_barcode.lane) to insert into the read group header. (default: None)

--predicted-insert-size PREDICTED_INSERT_SIZE

Predicted median insert size, to insert into the read group header. (default: None)

--program-group PROGRAM_GROUP

Program group to insert into the read group header. (default: None)

--read-group-name READ_GROUP_NAME

Read group name to insert into the read group header and to add as an attribute to each output read. (default: A)

--run-date RUN_DATE

Date the run was produced, to insert into the read group header. Must be in ISO 8601 format (YYYY-MM-DD). (default: None)
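
The date format above can be checked before launching a run. The following is a minimal sketch, not part of the tool itself: is_iso_date is a hypothetical helper name, and it only checks the YYYY-MM-DD shape, not whether the month and day are valid calendar values.

```shell
# Hypothetical helper: succeed if $1 has the YYYY-MM-DD shape.
# Only the digit/hyphen layout is checked, not calendar validity.
is_iso_date() {
    case $1 in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Today's date in the expected format, using the standard date utility.
run_date=$(date +%Y-%m-%d)
```

A value that passes the check can then be supplied as --run-date "$run_date".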

--sequencing-center SEQUENCING_CENTER

The sequencing center from which the data originated. (default: None)

--sort-order SORT_ORDER

The sort order for the output BAM file. Possible values are {unsorted,queryname,coordinate}. (default: queryname)
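
Because --ref is documented as required whenever --sort-order is not unsorted, a wrapper script can enforce that pairing before calling the tool. The sketch below only prints the command line rather than running it. The function name build_fq2ubam_cmd and the path Ref/genome.fa are placeholders chosen for illustration, and the simple string concatenation assumes paths without spaces.

```shell
# Hypothetical helper: print (do not run) an fq2ubam command line.
# Usage: build_fq2ubam_cmd SORT_ORDER [REF]
# Assumes file paths contain no spaces, since the command is built as a string.
build_fq2ubam_cmd() {
    sort_order=$1
    ref=${2:-}
    cmd="pbrun fq2ubam --in-fq Data/sample_1.fq.gz Data/sample_2.fq.gz"
    cmd="$cmd --sample-name sample_1 --out-bam output.bam --sort-order $sort_order"
    case $sort_order in
        unsorted)
            # No reference needed for unsorted output.
            ;;
        queryname|coordinate)
            # Per the --ref description, a reference is required here.
            if [ -z "$ref" ]; then
                echo "error: --ref is required when --sort-order is $sort_order" >&2
                return 1
            fi
            cmd="$cmd --ref $ref"
            ;;
        *)
            echo "error: unknown sort order: $sort_order" >&2
            return 1
            ;;
    esac
    printf '%s\n' "$cmd"
}
```

For example, build_fq2ubam_cmd coordinate Ref/genome.fa prints a command ending in --sort-order coordinate --ref Ref/genome.fa, while build_fq2ubam_cmd queryname with no reference reports an error.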

--quality-format QUALITY_FORMAT

A value describing how the quality values are encoded in the input FASTQ file. Possible values are Solexa (phred scaling + 66), Illumina (phred scaling + 64), or Standard (phred scaling + 33). (default: Standard)
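
When the encoding of a FASTQ file is unknown, the lowest quality character in the file gives a useful hint. The sketch below is an illustration only. It assumes plain four-line FASTQ records and the common convention that offset-64 encodings never use characters below ASCII 59, so any lower character implies Standard. The function name guess_quality_format is hypothetical, and the label "Illumina-or-Solexa" is a hint for the reader, not a value accepted by --quality-format.

```shell
# Hypothetical helper: read FASTQ on stdin and print a hint about its
# quality encoding based on the smallest quality character seen.
# Assumes four lines per record (header, sequence, '+', qualities).
guess_quality_format() {
    awk '
        BEGIN {
            # Map each printable ASCII character to its numeric code.
            for (i = 33; i < 127; i++) ord[sprintf("%c", i)] = i
            min = 127
        }
        NR % 4 == 0 {
            seen = 1
            n = length($0)
            for (j = 1; j <= n; j++) {
                ch = substr($0, j, 1)
                if (!(ch in ord)) continue   # skip stray characters such as CR
                if (ord[ch] < min) min = ord[ch]
            }
        }
        END {
            if (!seen)         print "unknown"
            else if (min < 59) print "Standard"
            else               print "Illumina-or-Solexa"
        }
    '
}
```

A compressed input can be streamed in with gzip -dc Data/sample_1.fq.gz | guess_quality_format. A result of "Illumina-or-Solexa" is inherently ambiguous, because a Standard file containing only high qualities produces the same character range.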

--min-q MIN_Q

Minimum quality allowed in the input FASTQ. Value must be >= 0. (default: 0)

--max-q MAX_Q

Maximum quality allowed in the input FASTQ. Value must be <= 93. (default: 93)

--num-zip-threads NUM_ZIP_THREADS

Number of CPUs to use for zipping BAM files in a run. If not specified, 16 are used for coordinate sorts and 10 otherwise. (default: None)

--num-sort-threads NUM_SORT_THREADS

Number of CPUs to use for sorting in a run. If not specified, 10 are used for coordinate sorts and 16 otherwise. (default: None)

--max-records-in-ram MAX_RECORDS_IN_RAM

Maximum number of records in RAM when using a queryname or template coordinate sort mode; lowering this number will decrease maximum memory usage. (default: 65000000)

Common options:

--logfile LOGFILE

Path to the log file. If not specified, messages will only be written to the standard error output. (default: None)

--tmp-dir TMP_DIR

Full path to the directory where temporary files will be stored.

--with-petagene-dir WITH_PETAGENE_DIR

Full path to the PetaGene installation directory. By default, this should have been installed at /opt/petagene. Use of this option also requires that the PetaLink library has been preloaded by setting the LD_PRELOAD environment variable. Optionally, set the PETASUITE_REFPATH and PGCLOUD_CREDPATH environment variables, which are used for data and credentials. (default: None)

--keep-tmp

Do not delete the directory storing temporary files after completion.

--license-file LICENSE_FILE

Path to license file license.bin if not in the installation directory.

--no-seccomp-override

Do not override seccomp options for Docker. (default: None)

--version

View compatible software versions.

Note

The --in-fq option takes the names of two FASTQ files, optionally followed by a quoted read group. The FASTQ filenames must not start with a hyphen.
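
Because a FASTQ filename that starts with a hyphen is not accepted, a relative path of that kind can be rewritten with a leading ./ before it is passed to --in-fq. This is a small sketch of that workaround. The function name safe_fastq_path is hypothetical, and prefixing with ./ is a general shell convention rather than something the tool documents.

```shell
# Hypothetical helper: print the path unchanged, or prefixed with ./
# when it begins with a hyphen, so it cannot be mistaken for an option.
safe_fastq_path() {
    case $1 in
        -*) printf './%s\n' "$1" ;;
        *)  printf '%s\n' "$1" ;;
    esac
}
```

For example, safe_fastq_path -reads_1.fq.gz prints ./-reads_1.fq.gz, which refers to the same file in the current directory.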