Troubleshoot NeMo Retriever Extraction
Use this documentation to troubleshoot issues that arise when you use NeMo Retriever extraction.
Can't process malformed input files
When you run a job you might see errors similar to the following:
- Failed to process the message
- Failed to extract image
- File may be malformed
- Failed to format paragraph
These errors can occur when your input file is malformed. Verify or fix the format of your input file, and try resubmitting your job.
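If your inputs are PDFs, a cheap local pre-check can catch obviously damaged files before you submit a job. The sketch below is an illustration only and is not part of NeMo Retriever extraction. The helper names `looks_like_pdf` and `check_file` are hypothetical. The check only looks for the `%PDF-` header and an `%%EOF` marker near the end of the file, so it can miss deeper corruption.

```python
from pathlib import Path


def looks_like_pdf(data: bytes) -> bool:
    """Heuristic: PDF header at the start and an EOF marker near the end."""
    if not data.startswith(b"%PDF-"):
        return False
    # The %%EOF marker normally sits within the last kilobyte of a PDF.
    return b"%%EOF" in data[-1024:]


def check_file(path: str) -> bool:
    """Read a file from disk and apply the heuristic above."""
    return looks_like_pdf(Path(path).read_bytes())
```

A file that fails this check is worth opening in a PDF viewer or re-exporting before you resubmit the job.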
Can't start new thread error
In rare cases, when you run a job, you might see an error similar to can't start new thread.
This error occurs when the maximum number of processes available to a single user is too low.
To resolve the issue, set or raise the maximum number of processes (-u) by using the ulimit command.
Before you change the -u setting, consider the following:
- Apply the -u setting directly to the user (or the Docker container environment) that runs your ingest service.
- For -u, we recommend 10,000 as a baseline, but you might need to raise or lower it based on your actual usage and system configuration.
ulimit -u 10000
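To confirm which limit a process actually sees, you can read it from inside that process. The following is a minimal sketch that uses Python's standard `resource` module (Unix only). The `describe_limit` helper and the `BASELINE` constant are hypothetical names, and the value 10000 mirrors the baseline suggested above.

```python
import resource

BASELINE = 10_000  # baseline suggested above; adjust for your system


def describe_limit(soft: int, baseline: int = BASELINE) -> str:
    """Classify a soft limit as 'unlimited', 'ok', or 'low'."""
    if soft == resource.RLIM_INFINITY:
        return "unlimited"
    return "ok" if soft >= baseline else "low"


# RLIMIT_NPROC is the per-user process limit that `ulimit -u` reports.
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
print(f"max user processes: soft={soft} hard={hard} ({describe_limit(soft)})")
```

Run it as the same user, or in the same container, as the ingest service. A limit set in a different shell does not apply to that service.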
Extract method nemoretriever-parse doesn't support image files
Currently, extraction with nemoretriever-parse doesn't support image files, only scanned PDFs.
To work around this issue, convert image files to PDFs before you use extract_method="nemoretriever_parse".
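One possible way to do the conversion is with the third-party Pillow library. This is an assumption for illustration; any tool that produces a PDF should work. The helper names and the list of suffixes below are hypothetical, and the sketch writes a single-page PDF next to the source image.

```python
from pathlib import Path

# Assumed set of image suffixes to convert; extend it to match your data.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tiff", ".bmp"}


def needs_conversion(path: str) -> bool:
    """Return True if the file suffix marks it as an image."""
    return Path(path).suffix.lower() in IMAGE_SUFFIXES


def image_to_pdf(image_path: str) -> Path:
    """Write a single-page PDF next to the image and return its path."""
    # Imported here so the suffix check works without Pillow installed.
    from PIL import Image  # third-party: pip install pillow

    src = Path(image_path)
    dst = src.with_suffix(".pdf")
    with Image.open(src) as img:
        # Pillow's PDF writer does not accept an alpha channel, so use RGB.
        img.convert("RGB").save(dst, "PDF")
    return dst
```

After conversion, submit the resulting .pdf file in place of the original image.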
Too many open files error
In rare cases, when you run a job, you might see an error similar to too many open files or max open file descriptor.
This error occurs when the open file descriptor limit for your service user account is too low.
To resolve the issue, set or raise the maximum number of open file descriptors (-n) by using the ulimit command.
Before you change the -n setting, consider the following:
- Apply the -n setting directly to the user (or the Docker container environment) that runs your ingest service.
- For -n, we recommend 10,000 as a baseline, but you might need to raise or lower it based on your actual usage and system configuration.
ulimit -n 10000
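If you launch work from a Python process that you control, you can also raise that process's soft limit programmatically, up to the hard limit. The sketch below uses the standard `resource` module (Unix only). The function names are hypothetical, and the change affects only the current process and its children, not a service that is already running.

```python
import resource


def target_soft_limit(wanted: int, hard: int) -> int:
    """Cap the requested soft limit at the hard limit."""
    if hard == resource.RLIM_INFINITY:
        return wanted
    return min(wanted, hard)


def raise_nofile(wanted: int = 10_000) -> int:
    """Raise the soft open-file limit if it is below `wanted`.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != resource.RLIM_INFINITY and soft < wanted:
        new_soft = target_soft_limit(wanted, hard)
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

An unprivileged process cannot raise its hard limit. If the hard limit is below the value you need, change it at the user or container level instead.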
Triton server INFO messages incorrectly logged as errors
Sometimes informational Triton server messages are incorrectly logged as errors. When this happens, you can ignore the ERROR label and treat the messages as information. For example, you might see log messages similar to the following:
ERROR 2025-04-24 22:49:44.266 nimutils.py:68] tritonserver: /usr/local/lib/libcurl.so.4: ...
ERROR 2025-04-24 22:49:44.268 nimutils.py:68] I0424 22:49:44.265292 98 cache_manager.cc:480] "Create CacheManager with cache_dir: '/opt/tritonserver/caches'"
ERROR 2025-04-24 22:49:44.431 nimutils.py:68] I0424 22:49:44.431796 98 pinned_memory_manager.cc:277] "Pinned memory pool is created at '0x7f8e4a000000' with size 268435456"
ERROR 2025-04-24 22:49:44.432 nimutils.py:68] I0424 22:49:44.432036 98 cuda_memory_manager.cc:107] "CUDA memory pool is created on device 0 with size 67108864"
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] I0424 22:49:44.433448 98 model_config_utils.cc:753] "Server side auto-completed config: "
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] name: "yolox"
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] platform: "tensorrt_plan"
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] max_batch_size: 32
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] input {
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] name: "input"
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] data_type: TYPE_FP32
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] dims: 3
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] dims: 1024
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] dims: 1024
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] }
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] output {
ERROR 2025-04-24 22:49:44.434 nimutils.py:68] name: "output"
ERROR 2025-04-24 22:49:44.434 nimutils.py:68] data_type: TYPE_FP32
ERROR 2025-04-24 22:49:44.434 nimutils.py:68] dims: 21504
ERROR 2025-04-24 22:49:44.434 nimutils.py:68] dims: 9
ERROR 2025-04-24 22:49:44.434 nimutils.py:68] }
...
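When you scan logs like these, it can help to read the severity from the embedded Triton prefix instead of the outer label. The sketch below assumes the layout shown in the sample above and that the leading letter of the embedded prefix (for example, the I in I0424) marks severity. The regular expressions and the `effective_level` helper are hypothetical and are not part of any NVIDIA tool.

```python
import re
from typing import Optional

# Outer prefix as in the sample, e.g.
# "ERROR 2025-04-24 22:49:44.268 nimutils.py:68] "
WRAPPER = re.compile(
    r"^(?P<level>[A-Z]+) \d{4}-\d{2}-\d{2} [\d:.]+ \S+\] (?P<body>.*)$"
)
# Embedded prefix, e.g. "I0424 22:49:44.265292 98 cache_manager.cc:480] "
TRITON = re.compile(r"^(?P<sev>[IWE])\d{4} [\d:.]+ \d+ \S+\] ")

SEVERITY = {"I": "INFO", "W": "WARNING", "E": "ERROR"}


def effective_level(line: str) -> Optional[str]:
    """Return the embedded severity if present, else the outer level.

    Returns None for lines that do not match the outer prefix at all.
    """
    outer = WRAPPER.match(line)
    if outer is None:
        return None
    inner = TRITON.match(outer.group("body"))
    if inner:
        return SEVERITY[inner.group("sev")]
    return outer.group("level")
```

Continuation lines such as name: "yolox" carry no embedded prefix, so this helper leaves them at the outer level. A fuller filter could carry forward the severity of the preceding prefixed line.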