splitncigar
An accelerated version of the SplitNCigarReads functionality from GATK. This tool splits reads that contain Ns in their CIGAR string (for example, reads that span splicing events in RNA-seq data).
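To check whether a BAM file contains such reads at all, the CIGAR column (field 6) of its SAM records can be scanned for the N operator. The sketch below is illustrative only: count_n_cigar is a hypothetical helper, and the samtools call shown in the comment assumes samtools is installed. Neither is part of Parabricks.

```shell
# Count SAM records whose CIGAR string (field 6) contains an N operator.
# Reads SAM text on standard input; header lines starting with "@" are skipped.
# count_n_cigar is a hypothetical helper name, not a Parabricks command.
count_n_cigar() {
    awk -F'\t' '!/^@/ && $6 ~ /N/ { n++ } END { print n + 0 }'
}

# Possible usage, assuming samtools is available:
#   samtools view in.bam | count_n_cigar
```

A count of 0 suggests there are no reads for splitncigar to split in that file.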
$ pbrun splitncigar \
--ref Ref.fa \
--in-bam in.bam \
--out-bam out.bam
The commands below are the GATK counterpart of the Parabricks command above. The output from these commands will be identical to the output from the above command.
gatk SplitNCigarReads --reference Ref.fa --input in.bam --output tmp.bam
gatk SortSam --java-options -Xmx30g --MAX_RECORDS_IN_RAM=5000000 -I=tmp.bam \
-O=out.bam --SORT_ORDER=coordinate --TMP_DIR=/raid/myrun
Split reads in a BAM file that contain Ns in their CIGAR string.
Input/Output file options:
- --ref REF
  Path to the reference file. (default: None) Option is required.
- --in-bam IN_BAM
  Path to the BAM file. (default: None) Option is required.
- --knownSites KNOWNSITES
  Path to a known indels file. The file must be in vcf.gz format. This option can be used multiple times. (default: None)
- --out-bam OUT_BAM
  Output BAM file. (default: None) Option is required.
- --out-recal-file OUT_RECAL_FILE
  Path of the report file after Base Quality Score Recalibration. (default: None)
Tool options:
- --num-cpu-threads NUM_CPU_THREADS
  Number of CPU threads to traverse separate chromosomes in splitncigar. (default: 6)
- --no-ignore-mark
  Do not ignore marked reads in sorted output. (default: None)
Common options:
- --logfile LOGFILE
  Path to the log file. If not specified, messages will only be written to the standard error output. (default: None)
- --tmp-dir TMP_DIR
  Full path to the directory where temporary files will be stored.
- --with-petagene-dir WITH_PETAGENE_DIR
  Full path to the PetaGene installation directory. By default, this should have been installed at /opt/petagene. Use of this option also requires that the PetaLink library has been preloaded by setting the LD_PRELOAD environment variable. Optionally set the PETASUITE_REFPATH and PGCLOUD_CREDPATH environment variables that are used for data and credentials. (default: None)
- --keep-tmp
  Do not delete the directory storing temporary files after completion.
- --license-file LICENSE_FILE
  Path to license file license.bin if not in the installation directory.
- --no-seccomp-override
  Do not override seccomp options for docker (default: None).
- --version
  View compatible software versions.
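As a rough illustration of how the required options combine with a few of the optional ones above, the sketch below prints a complete command line without running it. build_splitncigar_cmd is a hypothetical helper, all paths are placeholders, and naming the log file after the output BAM is an assumption made for this example only.

```shell
# Print (but do not execute) a pbrun splitncigar command line.
# Arguments: $1 reference, $2 input BAM, $3 output BAM, $4 temporary directory.
# build_splitncigar_cmd is a hypothetical helper; the values are placeholders.
build_splitncigar_cmd() {
    # --num-cpu-threads 6 repeats the documented default explicitly.
    # The log file name is derived from the output BAM name (an assumption).
    printf 'pbrun splitncigar --ref %s --in-bam %s --out-bam %s --num-cpu-threads 6 --tmp-dir %s --logfile %s\n' \
        "$1" "$2" "$3" "$4" "${3%.bam}.log"
}

# Possible usage:
#   build_splitncigar_cmd Ref.fa in.bam out.bam /raid/myrun
```

Printing the command first makes it easy to review the paths before launching the run.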
GPU options:
- --num-gpus NUM_GPUS
  Number of GPUs to use for a run. GPUs 0..(NUM_GPUS-1) will be used.
- --gpu-devices GPU_DEVICES
  GPU devices to use for a run. By default, all GPU devices will be used. To use specific GPU devices, enter a comma-separated list of GPU device numbers. Possible device numbers can be found by examining the output of the nvidia-smi command. For example, using --gpu-devices 0,1 would only use the first two GPUs.
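Going by the two descriptions above, --num-gpus N and a --gpu-devices list of 0 through N-1 should select the same devices. The sketch below builds that list. gpu_list is a hypothetical helper written for this example, not a Parabricks command.

```shell
# Print the comma-separated device list 0,1,...,N-1 for a given GPU count N.
# gpu_list is a hypothetical helper, not part of Parabricks.
gpu_list() {
    n=$1
    i=0
    list=
    while [ "$i" -lt "$n" ]; do
        # Append a comma only when the list already has an entry.
        list="${list:+$list,}$i"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

# Possible usage (device numbers can be listed with: nvidia-smi -L):
#   pbrun splitncigar --ref Ref.fa --in-bam in.bam --out-bam out.bam \
#       --gpu-devices "$(gpu_list 2)"
```

To pick devices that are not the first N, write the list by hand instead, for example --gpu-devices 1,3.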