This section describes the JSON configuration parameters used by GDS.
When GDS is installed, the /etc/cufile.json parameter file is installed with default values. The implementation allows for generic GDS settings and parameters specific to a file system or storage partner.
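The colon-separated parameter names below mirror the nesting of the JSON file. As a rough sketch of that layout, using only keys from the table that follows (the log directory path is illustrative; when logging:dir is unset, the log file goes to the current working directory):

```json
{
    "logging": {
        "dir": "/var/log/gds",
        "level": "ERROR"
    },
    "profile": {
        "nvtx": false,
        "cufile_stats": 0
    },
    "properties": {
        "max_direct_io_size_kb": 16384,
        "use_poll_mode": false,
        "allow_compat_mode": false
    },
    "fs": {
        "generic": {
            "posix_unaligned_writes": false
        }
    }
}
```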
| Parameter | Default Value | Description |
| --- | --- | --- |
| logging:dir | CWD | Location of the GDS log file. |
| logging:level | ERROR | Verbosity of logging. |
| profile:nvtx | false | Boolean which, if set to true, generates NVTX traces for profiling. |
| profile:cufile_stats | 0 | Enables cuFile IO statistics. Level 0 means no cuFile statistics. |
| profile:io_batchsize | 128 | Maximum size of the batch allowed. |
| properties:max_direct_io_size_kb | 16384 | Maximum IO chunk size (4K aligned) used by cuFile for each IO request, in KB. |
| properties:max_device_cache_size_kb | 131072 | Maximum device memory size (4K aligned) for reserving bounce buffers for the entire GPU, in KB. |
| properties:max_device_pinned_mem_size_kb | 33554432 | Maximum per-GPU memory size, in KB, that can be pinned, including the memory for the internal bounce buffers. |
| properties:use_poll_mode | false | Boolean that indicates whether the cuFile library uses polling or a synchronous wait for the storage to complete IO. Polling can be useful for small IO transactions. Refer to Poll Mode below. |
| properties:poll_mode_max_size_kb | 4 | Maximum IO request size (4K aligned), in KB, at or below which the library polls for IO completion. |
| properties:allow_compat_mode | false | If true, enables compatibility mode, which allows cuFile to issue POSIX read/writes. Set this to false to use GDS-enabled IO. Refer to Compatibility Mode below. |
| properties:rdma_dev_addr_list | empty | Provides the list of IPv4 addresses for all the interfaces that can be used for RDMA. |
| properties:rdma_load_balancing_policy | RoundRobin | Specifies the load balancing policy for RDMA memory registration. The default is RoundRobin. Other valid values: MaxMinFit, which tries to assign peers so that there is the least sharing and is suitable when all GPUs are loaded uniformly; RoundRobinMaxMin, which is similar to RoundRobin but uses the peers with the least sharing; and Randomized, which uses only the NICs closest to the GPU for memory registration, in a randomized fashion. |
| properties:rdma_dynamic_routing | false | Boolean parameter applicable only to network-based file systems. Can be enabled on platforms where GPUs and NICs do not share a common PCIe root port. |
| properties:rdma_dynamic_routing_order | | Applies only if rdma_dynamic_routing is enabled. Specifies an ordered list of routing policies, selected on a first-fit basis when routing an IO. |
| fs:generic:posix_unaligned_writes | false | If true, forces the use of a POSIX write instead of cuFileWrite for unaligned writes. |
| fs:lustre:posix_gds_min_kb | 4 | Applicable only to the EXAScaler filesystem, for both reads and writes. IO threshold (4K aligned), in KB, at or below which cuFile uses a POSIX read/write. |
| fs:lustre:rdma_dev_addr_list | empty | Provides the list of IPv4 addresses for all the interfaces that a single Lustre mount can use. Used by the cuFile dynamic routing feature to infer preferred RDMA devices. |
| fs:lustre:mount_table | empty | Specifies a dictionary of IPv4 mount addresses against a Lustre mount point. Used by the cuFile dynamic routing feature. Refer to the default cufile.json for sample usage. |
| fs:nfs:rdma_dev_addr_list | empty | Provides the list of IPv4 addresses for all the interfaces that a single NFS mount can use. Used by the cuFile dynamic routing feature to infer preferred RDMA devices. |
| fs:nfs:mount_table | empty | Specifies a dictionary of IPv4 mount addresses against an NFS mount point. Used by the cuFile dynamic routing feature. Refer to the default cufile.json for sample usage. |
| fs:weka:rdma_write_support | false | If true, cuFileWrite uses RDMA writes instead of falling back to POSIX writes for a WekaFS mount. |
| fs:weka:rdma_dev_addr_list | empty | Provides the list of IPv4 addresses for all the interfaces that a single WekaFS mount can use. Also used by the cuFile dynamic routing feature to infer preferred RDMA devices. |
| fs:weka:mount_table | empty | Specifies a dictionary of IPv4 mount addresses against a WekaFS mount point. Used by the cuFile dynamic routing feature. Refer to the default cufile.json for sample usage. |
| denylist:drivers | | Administrative setting that disables supported storage drivers on the node. |
| denylist:devices | | Administrative setting that disables specific supported block devices on the node. Not applicable for DFS. |
| denylist:mounts | | Administrative setting that disables specific mounts in the supported GDS-enabled filesystems on the node. |
| denylist:filesystems | | Administrative setting that disables specific supported GDS-ready filesystems on the node. |
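As an illustration of the denylist section, a sketch with hypothetical entries (the driver, device, mount, and filesystem names below are examples only, not recommendations; consult the default cufile.json for the exact expected format):

```json
{
    "denylist": {
        "drivers": [ "nvme" ],
        "devices": [ "/dev/nvme0n1" ],
        "mounts": [ "/mnt/scratch" ],
        "filesystems": [ "ext4" ]
    }
}
```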
The cuFile API set includes an interface to put the driver in poll mode. Refer to cuFileDriverSetPollMode() in the cuFile API Reference Guide for more information. When poll mode is set, a read or write request whose size is less than or equal to properties:poll_mode_max_size_kb (4 KB by default) causes the library to poll for IO completion rather than block (sleep). For workloads with small IO sizes, enabling poll mode may reduce latency.
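For example, a cufile.json fragment that enables polling for requests of 8 KB or less might look like this (the 8 KB threshold is illustrative):

```json
{
    "properties": {
        "use_poll_mode": true,
        "poll_mode_max_size_kb": 8
    }
}
```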
There are several scenarios where GDS might be unavailable or unsupported: for example, when the GDS software is not installed, the target file system is not GDS supported, or O_DIRECT cannot be enabled on the target file. When compatibility mode is enabled and GDS is not functional for the IO target, code that uses the cuFile APIs falls back to the standard POSIX read/write path. To learn more about compatibility mode, refer to cuFile Compatibility Mode.
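Enabling that fallback behavior is a one-line change in cufile.json:

```json
{
    "properties": {
        "allow_compat_mode": true
    }
}
```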
From a benchmarking and performance perspective, the default settings work well across a variety of IO loads and use cases. We recommend that you use the default values for max_direct_io_size_kb, max_device_cache_size_kb, and max_device_pinned_mem_size_kb unless a storage provider has a specific recommendation, or analysis and testing show better performance after you change one or more of the defaults.
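If a storage vendor does recommend different sizing, these three parameters can be overridden together; the fragment below simply restates the shipped defaults as a starting point:

```json
{
    "properties": {
        "max_direct_io_size_kb": 16384,
        "max_device_cache_size_kb": 131072,
        "max_device_pinned_mem_size_kb": 33554432
    }
}
```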
The cufile.json file is designed to be extensible: parameters can either apply to all supported file systems (fs:generic) or be file system specific (fs:lustre). The fs:generic:posix_unaligned_writes parameter enables the use of the POSIX write path when unaligned writes are encountered. Unaligned writes are generally suboptimal because they can require read-modify-write operations.
If the target workload generates unaligned writes, you might want to set posix_unaligned_writes to true, as the POSIX path for handling unaligned writes might be more performant, depending on the target filesystem and underlying storage. Also, in this case, the POSIX path will write to the page cache (system memory).
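For a workload dominated by unaligned writes, the override looks like this:

```json
{
    "fs": {
        "generic": {
            "posix_unaligned_writes": true
        }
    }
}
```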
The fs:lustre:posix_gds_min_kb setting routes IO through the POSIX read/write path, rather than the cuFile path, when the IO size is less than or equal to posix_gds_min_kb. On Lustre, for small IO sizes, the POSIX path can have better (lower) latency.
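For instance, to send all IO of 4 KB or less on a Lustre mount through the POSIX path (the threshold value is illustrative):

```json
{
    "fs": {
        "lustre": {
            "posix_gds_min_kb": 4
        }
    }
}
```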
The GDS parameters are among several elements that factor into delivered storage IO performance. It is advisable to start with the defaults and only make changes based on recommendations from a storage vendor or based on empirical data obtained during testing and measurements of the target workload.