Frequently Asked Questions#
What are the different task variants available in Legate?#
Legate offers three different task variants: CPU, OMP, and GPU. A task variant determines the type of processor Legate chooses to perform the computations.
What is the difference between Legate and cuPyNumeric?#
Legate is a task-based runtime software stack that enables development of scalable and composable libraries for distributed and accelerated computing.
cuPyNumeric is one of the foundational libraries built using Legate and aspires to be a distributed and accelerated drop-in replacement library for NumPy, an array programming library widely used in scientific computing. cuPyNumeric scales idiomatic NumPy programs to multiple GPUs and CPUs and seamlessly interoperates with other Legate libraries.
Check out this blog post to learn more about cuPyNumeric.
When to use python vs legate?#
The legate
launcher affords comman line options for configurtion, while
using python
requires configuring via LEGATE_CONFIG
. When running
local applications, it is mostly a matter of preference. When running in
multi-node situations, legate
has some additional command line options
that may make usage simpler.
What if I don’t have a GPU?#
If you don’t have a GPU, you can either use the CPU or the OMP variant. See Resource allocation for informations on how to use the respective variants.
What does this warning mean?#
RuntimeWarning: cuPyNumeric has not implemented <API> and is falling back to canonical NumPy. You may notice significantly decreased performance for this function call.
This means that the NumPy <API> has not been implemented in cuPyNumeric and that the Legate runtime is falling back to using NumPy’s implementation which will be single-threaded execution and can lead to decreased performance for that function call.
[0 - 7f0524da9740] 0.000028 {4}{threads}: reservation ('dedicated worker (generic) #1') cannot be satisfied
or
[0 - 7fe90fa7d740] 0.000029 {4}{threads}: reservation ('utility proc 1d00000000000001') cannot be satisfied
This indicates that the runtime was unable to pin threads onto available cores, which usually means that the available CPU cores were oversubscribed because the user has requested more cores than is available.
If the user does not specify which type of processor to run on, legate will use
4 CPUs to execute the program. Legate will also need one core to perform the
dependency analysis and schedule the tasks. If there are fewer than five cores
on the machine, try reducing the number of cores (--cpus
) passed to legate.
This warning is currently expected on MacOS.
How to determine available memory?#
On Linux, running the following command will display the amount of available system memory:
cat /proc/meminfo | grep MemAvailable
Available GPU memory (for each GPU) can be displayed by running:
nvidia-smi --query-gpu memory.free --format=csv
Both of these represent the available amount of memory, which may be shared with other processes or libraries. You may need to reduce these amounts to account for these, or to reflect the actual size of your problem more closely.
If you do not have access to run the commands above, then refer to published machine specs or cluster documentation.
How to handle Out-Of-Memory errors?#
[0 - 7fda18f26000] 0.805182 {5}{cunumeric.mapper}: Failed to allocate 8388608 bytes on memory 1e00000000000000 (of kind SYSTEM_MEM) for region requirement(s) 1 of Task cupynumeric::BinaryOpTask[oom.py:24] (UID 18)
The above error indicates that the application ran out of memory during execution. More granular details on the type of memory, the task that triggered the error, and what was using up the available memory are provided in the error message. If possible, try increasing the amount of system memory or framebuffer memory allocated to the program, or decrease the problem size.
Reducing the --eager-alloc-percentage
to, say, 10 or less can also help
since this reduces the amount of available memory available to the eager memory
pool and will consequently increase the memory reserved for the deferred memory
pool.
Why are the results different from NumPy?#
While a majority of the APIs will give the same result as NumPy, some APIs might be implemented differently from that of NumPy which might lead to differences in results. One such example is, Reshape returns a copy instead of view, which returns a copy of the array in cuPyNumeric but returns a view in NumPy. Another example is astype which does not return a copy by default, where NumPy does.
Such differences in implementation are noted in the documentation of the cuPyNumeric APIs, please review them before opening an issue on the cuPyNumeric issue tracker.
Why doesn’t Legate use my GPU?#
If you explicitly asked legate to use the GPU but find that the GPU is not
being used, it is possible that your problem size is too small to be run on
GPU and be performant. Either increase your problem size significantly or set
the environment variable LEGATE_TEST
to 1 and run. Setting this environment
variable tells Legate to always use the prescribed resources regardless of the
problem size.
What are the anti-patterns in a NumPy code?#
Check out our Best practices to avoid some of the anti-patterns commonly encountered in applications.
How do I time the execution of my application?#
Check out the Performance Benchmarking section for information on how to accurately measure cuPyNumeric execution.
Why is cuPyNumeric slower than NumPy on my laptop?#
For small problem sizes, cuPyNumeric might be slower than NumPy. We suggest you increase the problem size and correspondingly increase the resources needed for the problem size as described in the Usage section. Take a look at our Best practices on how to do that.
Why is cuPyNumeric slower than CuPy on my laptop?#
For small problem sizes, cuPyNumeric might be slower than CuPy. We suggest you increase the problem size and correspondingly increase the resources needed for the problem size as described in the Usage section. Take a look at performance Best practices.
How do I use Jupyter Notebooks?#
How to pass Legion and Realm arguments?#
See Advanced topics.
What is the version of legate?#
Use legate-issue
to know more about the version of Legate, Legion and
several other key packages.
You can also run legate –verbose ./script.py <script-options>
to get
verbose output.
What are the defaults?#
The default values for several input arguments to Legate are mentioned in Legate’s documentation.
Where I can read more about cuPyNumeric?#
Check out this blog post or this tutorial to learn more about cuPyNumeric.
Questions?#
For technical questions about cuPyNumeric and Legate-based tools, please visit the community discussion forum.
If you have other questions, please contact us at legate@nvidia.com.