Accelerated version of the GATK SplitNCigarReads tool. It splits reads that contain Ns in their CIGAR string (for example, reads spanning splicing events in RNA-seq data).
$ pbrun splitncigar \
--ref Ref.fa \
--in-bam in.bam \
--out-bam out.bam
The GATK commands below are the counterpart of the Parabricks command above. Their output will be identical to the output of the Parabricks command.
gatk SplitNCigarReads --reference Ref.fa --input in.bam --output tmp.bam
gatk SortSam --java-options -Xmx30g --MAX_RECORDS_IN_RAM=5000000 -I=tmp.bam \
-O=out.bam --SORT_ORDER=coordinate --TMP_DIR=/raid/myrun
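One way to check the identical-output claim on your own data is to compare the alignment records of the two BAM files. The following is a minimal sketch, not part of either tool. It assumes samtools is installed and that the two outputs were written to the hypothetical paths out_pbrun.bam and out_gatk.bam. Only the records are compared, since samtools view omits the header unless asked for it, and headers may legitimately differ (for example in the @PG lines that record which program wrote the file).

```shell
#!/bin/sh
# Hypothetical file names (assumptions, not from the documentation):
#   out_pbrun.bam = output of the pbrun command
#   out_gatk.bam  = output of the two GATK commands

# Print a checksum of the alignment records only.
# "samtools view" without -h or -H prints records and no header.
records_digest() {
    samtools view "$1" | cksum
}

# Succeed if both files contain the same records in the same order.
same_records() {
    [ "$(records_digest "$1")" = "$(records_digest "$2")" ]
}

# Run the comparison only when samtools and both files are present.
if command -v samtools >/dev/null 2>&1 \
    && [ -f out_pbrun.bam ] && [ -f out_gatk.bam ]; then
    if same_records out_pbrun.bam out_gatk.bam; then
        echo "records match"
    else
        echo "records differ"
    fi
fi
```

The comparison is order-sensitive, so it is only meaningful when both files are coordinate-sorted, as they are in the commands above.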
Split reads in a BAM file that contain Ns in their CIGAR string.
Input/Output file options
- --ref REF
  Path to the reference file. (default: None)
  Option is required.
- --in-bam IN_BAM
  Path to the BAM file. (default: None)
  Option is required.
- --knownSites KNOWNSITES
  Path to a known indels file. Must be in vcf/vcf.gz format. This option can be used multiple times. (default: None)
- --out-bam OUT_BAM
  Output BAM file. (default: None)
  Option is required.
- --out-recal-file OUT_RECAL_FILE
  Path of report file after Base Quality Score Recalibration. (default: None)
Options specific to this tool
- --num-cpu-threads NUM_CPU_THREADS
  Number of CPU threads to traverse separate chromosomes in splitncigar. (default: 6)
- --no-ignore-mark
  Do not ignore marked reads in sorted output. (default: None)
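As an illustration of --num-cpu-threads, the sketch below caps the requested thread count at the number of online CPUs before building the command line. The capping rule and the helper name pick_threads are assumptions made for this example, not documented behavior of the tool. Ref.fa, in.bam and out.bam are the placeholder names used in the example at the top of this page. The script prints the command instead of running it.

```shell
#!/bin/sh
# Requested thread count; 6 is the documented default.
requested=6

# Number of online CPUs (falls back to 1 if getconf cannot report it).
available=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# pick_threads REQUESTED AVAILABLE
# Hypothetical helper: print the smaller of the two values.
pick_threads() {
    if [ "$1" -gt "$2" ]; then
        echo "$2"
    else
        echo "$1"
    fi
}

threads=$(pick_threads "$requested" "$available")

# Print the resulting command; remove "echo" to actually run it.
echo pbrun splitncigar \
    --ref Ref.fa \
    --in-bam in.bam \
    --out-bam out.bam \
    --num-cpu-threads "$threads"
```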
Common options:
- --logfile LOGFILE
  Path to the log file. If not specified, messages will only be written to the standard error output. (default: None)
- --tmp-dir TMP_DIR
  Full path to the directory where temporary files will be stored.
- --with-petagene-dir WITH_PETAGENE_DIR
  Full path to the PetaGene installation directory. By default, this should have been installed at /opt/petagene. Use of this option also requires that the PetaLink library has been preloaded by setting the LD_PRELOAD environment variable. Optionally, set the PETASUITE_REFPATH and PGCLOUD_CREDPATH environment variables that are used for data and credentials. (default: None)
- --keep-tmp
  Do not delete the directory storing temporary files after completion.
- --license-file LICENSE_FILE
  Path to the license file license.bin if it is not in the installation directory.
- --no-seccomp-override
  Do not override seccomp options for Docker. (default: None)
- --version
  View compatible software versions.
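The common options can be combined to keep the log and the temporary files of a run together. The sketch below builds such a command line from a single run directory. The directory layout, the RUN_DIR variable and the helper name splitncigar_cmd are assumptions for illustration. Whether the temporary directory must already exist is not stated above, so the sketch does not create it. The script prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical run directory; --tmp-dir expects a full path,
# so the default is built from $PWD, which is absolute.
run_dir=${RUN_DIR:-$PWD/splitncigar_run}

# splitncigar_cmd RUN_DIR
# Print a pbrun command line that writes the log to RUN_DIR and
# keeps temporary files under RUN_DIR/tmp after completion.
splitncigar_cmd() {
    printf 'pbrun splitncigar --ref Ref.fa --in-bam in.bam --out-bam out.bam --tmp-dir %s --logfile %s --keep-tmp\n' \
        "$1/tmp" "$1/splitncigar.log"
}

splitncigar_cmd "$run_dir"
```

Because --keep-tmp leaves the temporary directory in place, it is mainly useful for debugging a failed run and the directory should be cleaned up by hand afterwards.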
GPU options:
- --num-gpus NUM_GPUS
  Number of GPUs to use for a run. GPUs 0..(NUM_GPUS-1) will be used.
- --gpu-devices GPU_DEVICES
  GPU devices to use for a run. By default, all GPU devices will be used. To use specific GPU devices, enter a comma-separated list of GPU device numbers. Possible device numbers can be found by examining the output of the nvidia-smi command. For example, using --gpu-devices 0,1 would only use the first two GPUs.
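The comma-separated list expected by --gpu-devices can be derived from nvidia-smi output. The sketch below is one possible way to do this, assuming nvidia-smi is on the PATH. The helper name first_gpus is hypothetical. It selects the first two device indices reported by nvidia-smi and prints the resulting pbrun command instead of running it.

```shell
#!/bin/sh
# first_gpus N
# Hypothetical helper: read GPU indices from stdin (one per line)
# and print the first N of them as a comma-separated list.
first_gpus() {
    head -n "$1" | paste -sd, -
}

# Only query the driver when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    # --query-gpu=index with csv,noheader prints one device index per line.
    devices=$(nvidia-smi --query-gpu=index --format=csv,noheader | first_gpus 2)

    # Print the resulting command; remove "echo" to actually run it.
    echo pbrun splitncigar \
        --ref Ref.fa \
        --in-bam in.bam \
        --out-bam out.bam \
        --gpu-devices "$devices"
fi
```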