Troubleshoot Data Designer

Learn how to troubleshoot common issues with the NeMo Data Designer microservice deployed using Docker Compose.

Check Service Health

  1. View running containers:

    docker ps --filter "name=nemo-microservices-"
    

    All services should be running. The STATUS column should show one of the following:

    • Up X minutes (healthy) for services with health checks

    • Up X minutes for services without explicit health checks

  2. Check for failed containers:

    docker ps -a --filter "name=nemo-microservices-" --filter "status=exited"
    

    This shows NeMo containers that have stopped running. Check the exit code, shown in the STATUS column as Exited (N): a non-zero value indicates the container failed.

  3. View all NeMo services (running and stopped):

    docker ps -a --filter "name=nemo-microservices-"
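
    The checks above can be combined into a small filter that lists only the containers needing attention. This is a minimal sketch, not part of the product: the function name flag_problem_containers is hypothetical, and it assumes the status strings Docker normally prints (Up ..., Up ... (healthy), Up ... (unhealthy), Up ... (health: starting), Exited (N) ...).

```shell
# Hypothetical helper: print containers that are not simply "Up" or "Up ... (healthy)".
# Reads "name|status" lines on stdin, one container per line.
flag_problem_containers() {
  awk -F '|' '
    # Running containers that are healthy, or that have no health check, are skipped.
    $2 ~ /^Up/ && $2 !~ /\(unhealthy\)/ && $2 !~ /\(health: starting\)/ { next }
    # Everything else (exited, restarting, unhealthy, still starting) is reported.
    { print $1 ": " $2 }
  '
}

# Example usage (requires Docker; left as a comment so the function can be sourced safely):
#   docker ps -a --filter "name=nemo-microservices-" \
#     --format '{{.Names}}|{{.Status}}' | flag_problem_containers
```

    Empty output means every matching container is running and none reports an unhealthy or still-starting health check.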
    

Check Service Logs

  1. View logs for a specific service:

    docker logs <container-name>
    

    Replace <container-name> with the actual container name (for example, nemo-microservices-data-designer-1).

  2. Follow logs in real time:

    docker logs -f <container-name>
    
  3. View recent logs with timestamps:

    docker logs --since=10m -t <container-name>
    
  4. Check logs for all Data Designer services:

    docker compose --profile data-designer logs
    
  5. Follow all service logs:

    docker compose --profile data-designer logs -f
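
    When the logs are long, it can help to narrow them to lines that look like errors. The following is a minimal sketch: the function name filter_error_lines and the keyword list are assumptions, so adjust the pattern to the messages your services actually emit.

```shell
# Hypothetical helper: keep only log lines that look like errors (case-insensitive).
# The keyword list is an assumption, not something defined by Data Designer.
filter_error_lines() {
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -iE 'error|exception|traceback|fatal' || true
}

# Example usage (requires Docker; 2>&1 is needed because docker logs
# replays the container's stderr stream on stderr):
#   docker logs --since=10m -t <container-name> 2>&1 | filter_error_lines
#   docker compose --profile data-designer logs 2>&1 | filter_error_lines
```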
    

Common Issues

  • Port conflicts: Ensure port 8080 is not in use by other applications.

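  • Memory issues: A container with status Exited (137) was terminated by SIGKILL (137 = 128 + 9). This most often means the out-of-memory killer stopped it because the host, or the container's memory limit, was exhausted. Verify available system resources or reduce parallel load.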

  • Authentication: Confirm you are logged in to the NGC registry with valid credentials.

  • Network connectivity: Ensure Docker can access external registries and LLM endpoints.

  • No healthy upstream (503): A 503 response to an API request may mean that some services have not finished starting. Confirm that all containers are up (and healthy, where health checks exist), then retry the request.
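
To interpret the exit code of a stopped container, such as the 137 mentioned under memory issues, a small helper can apply the usual shell convention that codes above 128 mean "terminated by signal (code minus 128)". This is a sketch under that assumption; the function name explain_exit_code is hypothetical and it expects a numeric argument.

```shell
# Hypothetical helper: translate a container exit code into a short hint.
explain_exit_code() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo "exited cleanly"
  elif [ "$code" -gt 128 ]; then
    # By convention, 128 + N means the process was terminated by signal N.
    echo "killed by signal $((code - 128))"
  else
    echo "application error (exit code $code)"
  fi
}

# Example usage (requires Docker). OOMKilled reports whether Docker recorded
# an out-of-memory kill for the container:
#   docker inspect --format '{{.State.ExitCode}} {{.State.OOMKilled}}' <container-name>
#   explain_exit_code 137
```

Signal 9 is SIGKILL. If OOMKilled is false for a container that exited with 137, the kill may have come from elsewhere, for example a manual docker kill.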