Clients may fail to start, returning ERROR_OUT_OF_MEMORY when the first CUDA context is created, even though there are fewer client contexts than the hard limit of 16.
Comments: When creating a context, the client tries to reserve virtual address space for the Unified Virtual Addressing memory range. On certain systems, this can clash with the system linker and the dynamic shared libraries loaded by it. Ensure that CUDA initialization (e.g., cuInit() , or any cuda*() Runtime API function) is one of the first functions called in your code. To provide a hint to the linker and to the Linux kernel that you want your dynamic shared libraries higher up in the VA space (where it won’t clash with CUDA’s UVA range), compile your code as PIC (Position Independent Code) and PIE (Position Independent Executable). Refer to your compiler manual for instructions on how to achieve this.
Memory allocation API calls (including context creation) may fail with the following message in the server log: MPS Server failed to create/open SHM segment.
Comments: This is most likely due to exhausting the file descriptor limit on your system. Check the maximum number of open file descriptors allowed on your system and increase if necessary. We recommend setting it to 16384 and higher. Typically this information can be checked via the command ‘ulimit –n’; refer to your operating system instructions on how to change the limit.