On NVIDIA H100, creation of multiple compute instances after deletion of existing compute instances fails
Description
After compute instances have been created and deleted on an NVIDIA H100 GPU, an attempt to create multiple compute instances in a single nvidia-smi command fails. For example, the command nvidia-smi mig -cci 0,1,2 fails with the following error messages:
Unable to create a compute instance on GPU 0 GPU instance ID 0 using profile 0: Invalid Argument
Failed to create compute instances: Invalid Argument
Workaround
Create each compute instance in a separate nvidia-smi command, for example:
$ nvidia-smi mig -cci 0
$ nvidia-smi mig -cci 1
$ nvidia-smi mig -cci 2
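When many profiles are involved, the one-command-per-instance workaround can be scripted. The following is a minimal sketch, not an NVIDIA-provided tool: the function name create_compute_instances and the NVIDIA_SMI override variable are hypothetical and exist only for illustration. It assumes that the profile IDs are passed as arguments and that stopping at the first failure is the desired behavior.

```shell
# Sketch: create each compute instance with its own nvidia-smi invocation.
# NVIDIA_SMI is a hypothetical override variable (not an nvidia-smi feature);
# when it is unset, the nvidia-smi found on PATH is used.
create_compute_instances() {
  for profile in "$@"; do
    # One profile per command, mirroring the workaround above.
    # Stop at the first failure so that later profiles are not attempted.
    "${NVIDIA_SMI:-nvidia-smi}" mig -cci "$profile" || return 1
  done
}
```

Calling create_compute_instances 0 1 2 then runs the same three commands as the workaround, in order.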
Status
Open
Ref. #
3829786