Buffer Allocation
Every hardware engine inside NVIDIA hardware can have different buffer constraints, depending on how the engine interprets the buffer. Hence, sharing a buffer across various engines requires that the allocated buffer satisfy the constraints of all engines that will access that buffer. The existing allocation APIs provided by NvMedia or CUDA only consider the constraints of the engines managed by them.
NvSciBuf is a buffer allocation module that can enable applications to allocate a buffer shareable across various hardware engines that are managed by different engine APIs.
Memory Buffer Basics
The following sections describe basic information about memory buffers.
Allocation Model
The buffer allocation model of NvSciBuf is summarized as follows:
If two or more hardware engines want to access a common buffer (e.g., one engine is writing data into the buffer and the other engine is reading from the buffer), then:
1. Applications create an attribute list for each accessor.
2. Set the attributes to define the properties of the buffer they intend to create in the respective attribute list.
For NvMedia, applications must set all datatype attributes (e.g., the NvMediaImage datatype attributes) using the NvMedia-NvSciBuf APIs and must also set the NvSciBuf General attributes directly in the NvMedia attribute list.
For CUDA, applications must set all the required attributes.
Applications must ensure they set the NvSciBufGeneralAttrKey_GpuId attribute on the CUDA side to specify the IDs of all the GPUs that access the buffer.
3. Reconcile these multiple attribute lists. The process of reconciliation guarantees that a common buffer is allocated that satisfies the constraints of all the accessors.
4. Allocate the buffer using the reconciled attribute list. The reconciled attribute list used for allocating the object remains associated with the object for the lifetime of the object.
5. Share the buffer with all the accessors.
Types of Buffers
The hardware engine constraints depend on the type of buffer allocated. NvSciBuf supports the following buffer types, and applications can choose to allocate any one of them:
RawBuffer: Raw memory that is used by an application for storing data.
Image: Memory used to store image data.
ImagePyramid: Memory used to store ImagePyramid, a group of images arranged in multiple levels, with each level of image scaled to a specific scaling factor.
NvSciBufArray: Memory used to store a group of units, where each unit represents data of various basic types like int, float etc.
Tensor: Memory used to store Tensor data.
Memory Domain Allocation
Applications can choose to allocate the memory from the following domains:
System memory
Types of Buffer Attributes
NvSciBuf attributes can be categorized into the following types:
Datatype attributes: Attributes that are specific to one of the buffer types mentioned in the previous section. If the buffer type in the attribute list is one type, then setting the attributes of another type returns an error.
General attributes: Attributes that are not specific to any buffer type and describe the general properties of the buffer. Some examples:
NvSciBufGeneralAttrKey_Types: Defines the type of the buffer.
NvSciBufGeneralAttrKey_NeedCpuAccess: Defines whether the buffer is accessed by the CPU.
NvSciBufGeneralAttrKey_RequiredPerm: Defines the access permissions expected by this buffer accessor.
NvSciBuf Module
You must open an NvSciBufModule before invoking any other NvSciBuf API. The NvSciBuf module is the library instance created for that application. All NvSciBuf resources created within an application are associated with the NvSciBufModule of the application.
NvSciBufModule module = NULL;
NvSciError err;
err = NvSciBufModuleOpen(&module);
if (err != NvSciError_Success) {
    goto fail;
}
/* ... */
Attribute Lists
NvSciBuf attribute lists are categorized into the following types:
Unreconciled Attribute List
An application can create an unreconciled attribute list and perform set/get operations per attribute on it.
The set operation is allowed only one time per attribute.
The attribute list returned by the NvSciBufAttrListCreate and NvSciBufAttrListIpcImportUnreconciled APIs is an unreconciled attribute list.
Applications cannot use unreconciled attribute lists to allocate an NvSciBuf object.
Reconciled Attribute List
A reconciled attribute list is the outcome of reconciliation (i.e., merging and validation) of various unreconciled lists; it defines the final layout/allocation properties of the buffer. The application can use this list to allocate an NvSciBuf object. Refer to the Reconciliation section below for more details about the reconciliation process.
The attribute list returned by a successful NvSciBufAttrListReconcile, NvSciBufAttrListIpcImportReconciled, or NvSciBufObjGetAttrList is a reconciled attribute list.
Applications are not allowed to do set operations on a reconciled attribute list.
Conflict Attribute List
The attribute list returned by an unsuccessful call to the NvSciBufAttrListReconcile API is a conflict attribute list.
Applications are not allowed to do set/get operations on a conflict attribute list.
Applications can use a conflict attribute list only to dump its contents using the NvSciBufAttrListDump API.
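The rules for the three kinds of attribute list can be pictured with a small model. The sketch below is plain C and does not call NvSciBuf: ToyListKind, ToyAttrList, toy_set, toy_get, and toy_can_allocate are hypothetical names, and the fixed key count is an assumption made to keep the example short. It only mirrors the rules stated above: set once per key and only on unreconciled lists, no set/get on conflict lists, and allocation only from reconciled lists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_KEY_COUNT 4u  /* assumed number of attribute keys in this toy model */

/* The three list kinds described above (hypothetical enum, not an NvSciBuf type). */
typedef enum {
    TOY_LIST_UNRECONCILED,
    TOY_LIST_RECONCILED,
    TOY_LIST_CONFLICT
} ToyListKind;

/* Toy attribute list: a kind plus one value slot per key. */
typedef struct {
    ToyListKind kind;
    bool        is_set[TOY_KEY_COUNT];
    uint64_t    value[TOY_KEY_COUNT];
} ToyAttrList;

/* Set is allowed only on an unreconciled list, and only once per key. */
static bool toy_set(ToyAttrList *list, unsigned key, uint64_t value)
{
    assert(list != NULL);
    if (list->kind != TOY_LIST_UNRECONCILED) {
        return false;
    }
    if (key >= TOY_KEY_COUNT || list->is_set[key]) {
        return false;
    }
    list->is_set[key] = true;
    list->value[key] = value;
    return true;
}

/* Get is allowed on unreconciled and reconciled lists, not on conflict lists. */
static bool toy_get(const ToyAttrList *list, unsigned key, uint64_t *out)
{
    assert(list != NULL && out != NULL);
    if (list->kind == TOY_LIST_CONFLICT) {
        return false;
    }
    if (key >= TOY_KEY_COUNT || !list->is_set[key]) {
        return false;
    }
    *out = list->value[key];
    return true;
}

/* Only a reconciled list can be used to allocate a buffer object. */
static bool toy_can_allocate(const ToyAttrList *list)
{
    assert(list != NULL);
    return list->kind == TOY_LIST_RECONCILED;
}
```

In this model, a second toy_set call on the same key of an unreconciled list fails and leaves the first value in place, which is the behavior an application should plan for when several components fill the same list.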
An application can initiate reconciliation of one or more attribute lists by invoking the NvSciBufAttrListReconcile API. Reconciliation of NvSciBuf attribute lists proceeds as follows:
1. Merging: Values from multiple attribute lists are merged. The process of merging is explained in the flow-chart below.
Datatype(List): Datatype of the attribute list List.
List[i]: Attribute key named i of the attribute list List.
Value(List[i]): Value corresponding to attribute key i in the attribute list List.
UNSPECIFIED: Value of an attribute key is ignored and hence unspecified.
KEYCOUNT(datatype): Number of attribute keys for the given datatype.
MERGEVALUES(value1, value2): This function merges value1 and value2.
For most attributes, merging is successful only if value1 equals value2.
For attributes like alignment, if value1 is not equal to value2, then the maximum of the two is used as the merged value, provided both are powers of 2.
2. Validation: After merging attributes from multiple lists successfully, validation of merged attributes occurs on the reconciled attribute list, which validates whether all the required attributes are set and the values of all the attributes are valid. For example, after the merging of attributes, if plane-count is set to 2 but the reconciled list contains more values for plane color-format, then validation is unsuccessful.
3. Output Attributes Computation: Post validation, output attributes like size, alignment, etc. are computed in the reconciled attribute list.
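The merging and validation steps above can be sketched in plain C. This is a toy model under stated assumptions, not the NvSciBuf implementation: ToyMergePolicy, toy_merge_values, ToyMergedImageAttrs, and toy_validate are hypothetical names, the handling of UNSPECIFIED values is not modeled, and the validation check treats any mismatch between plane count and the number of color-format values as a failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical merge policies; these names are not part of NvSciBuf. */
typedef enum {
    TOY_MERGE_EQUAL,    /* most attributes: values must match */
    TOY_MERGE_MAX_POW2  /* alignment-like attributes: take the larger power of 2 */
} ToyMergePolicy;

/* Returns true if v is a nonzero power of two. */
static bool toy_is_pow2(uint64_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Models MERGEVALUES(value1, value2). On success, stores the merged
 * value in *merged and returns true; otherwise returns false. */
static bool toy_merge_values(ToyMergePolicy policy, uint64_t value1,
                             uint64_t value2, uint64_t *merged)
{
    assert(merged != NULL);
    if (value1 == value2) {
        *merged = value1;
        return true;
    }
    if (policy == TOY_MERGE_MAX_POW2 &&
        toy_is_pow2(value1) && toy_is_pow2(value2)) {
        *merged = (value1 > value2) ? value1 : value2;
        return true;
    }
    return false;
}

/* Hypothetical subset of merged image attributes used for validation. */
typedef struct {
    bool   plane_count_set;     /* required attribute present? */
    size_t plane_count;
    size_t color_format_count;  /* number of per-plane color-format values */
} ToyMergedImageAttrs;

/* Models the validation step: required attributes must be set and the
 * number of color-format values must match the plane count. */
static bool toy_validate(const ToyMergedImageAttrs *attrs)
{
    assert(attrs != NULL);
    if (!attrs->plane_count_set) {
        return false;
    }
    return attrs->color_format_count == attrs->plane_count;
}
```

For instance, merging alignments of 4096 and 8192 under the alignment-like policy yields 8192 in this model, while the same pair under the equality policy is a merge failure.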
NvSciBufType bufType = NvSciBufType_RawBuffer;
uint64_t rawsize = (128 * 1024); /* Allocate a 128 KB raw buffer */
uint64_t align = (4 * 1024); /* Buffer alignment of 4 KB */
bool cpuaccess_flag = false;
NvSciBufAttrKeyValuePair rawbuffattrs[] = {
    { NvSciBufGeneralAttrKey_Types, &bufType, sizeof(bufType) },
    { NvSciBufRawBufferAttrKey_Size, &rawsize, sizeof(rawsize) },
    { NvSciBufRawBufferAttrKey_Align, &align, sizeof(align) },
    { NvSciBufGeneralAttrKey_NeedCpuAccess, &cpuaccess_flag,
      sizeof(cpuaccess_flag) },
};
/* Created attrlist1 will be associated with bufmodule */
err = NvSciBufAttrListCreate(bufmodule, &attrlist1);
if (err != NvSciError_Success) {
    goto fail;
}
/* Set all the attributes on the newly created list in one call */
err = NvSciBufAttrListSetAttrs(attrlist1, rawbuffattrs,
        sizeof(rawbuffattrs) / sizeof(NvSciBufAttrKeyValuePair));
if (err != NvSciError_Success) {
    goto fail;
}
NvSciBuf Reconciliation
NvSciBufAttrList unreconciledList[2] = {NULL};
NvSciBufAttrList reconciledList = NULL;
NvSciBufAttrList ConflictList = NULL;
unreconciledList[0] = AttrList1;
unreconciledList[1] = AttrList2;
/* Reconciliation will be successful if and only if all the
 * unreconciledLists belong to the same NvSciBufModule. The
 * outputs of this API (i.e., either the reconciled attribute list
 * or the conflict list) will also be associated with the same
 * module to which the input unreconciled lists belong.
 */
err = NvSciBufAttrListReconcile(
        unreconciledList, /* array of unreconciled lists */
        2, /* size of this array */
        &reconciledList, /* output reconciled list */
        &ConflictList); /* conflict description filled in case of reconciliation failure */
if (err != NvSciError_Success) {
    goto fail;
}
/* ... */
NvSciBufAttrListFree(reconciledList); /* In case of successful reconciliation. */
NvSciBufAttrListFree(ConflictList); /* In case of failed reconciliation. */
Buffer Management
The following sections describe buffer management.
Applications can use the reconciled attribute list to create any number of NvSciBufObj instances. Each NvSciBufObj represents a buffer, and the reconciled attribute list is associated with each object until the object is freed. Applications can create NvMedia/CUDA datatypes out of the allocated buffer using the NvMedia/CUDA APIs and can operate on the buffers by submitting them to the NvMedia/CUDA hardware engines using the appropriate APIs. Applications that need to access the buffer from the CPU can set the NvSciBufGeneralAttrKey_NeedCpuAccess attribute to true and get the CPU address using either the NvSciBufObjGetCpuPtr or the NvSciBufObjGetConstCpuPtr API.
Invoking NvSciBufObjGetCpuPtr on a read-only buffer or a buffer that doesn't have CPU access returns an error.
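The CPU access rule above can be illustrated with a toy model. The sketch below is plain C and does not call NvSciBuf: ToyBufObj, ToyStatus, toy_get_cpu_ptr, and toy_get_const_cpu_ptr are hypothetical names. The behavior of the const variant (allowed on read-only buffers, rejected when CPU access was not requested) is an assumption made for this sketch and is not stated in the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch (not NvSciError values). */
typedef enum { TOY_SUCCESS, TOY_ERROR } ToyStatus;

/* Toy buffer object carrying the two properties that matter here. */
typedef struct {
    bool  need_cpu_access;  /* models NvSciBufGeneralAttrKey_NeedCpuAccess */
    bool  read_only;        /* models a read-only access permission */
    void *cpu_mapping;      /* CPU address of the buffer, if mapped */
} ToyBufObj;

/* Models the writable getter: fails on a read-only buffer or on a
 * buffer without CPU access. */
static ToyStatus toy_get_cpu_ptr(const ToyBufObj *obj, void **ptr)
{
    assert(obj != NULL && ptr != NULL);
    if (!obj->need_cpu_access || obj->read_only) {
        return TOY_ERROR;
    }
    *ptr = obj->cpu_mapping;
    return TOY_SUCCESS;
}

/* Models the const getter (assumed behavior): fails only when the
 * buffer has no CPU access. */
static ToyStatus toy_get_const_cpu_ptr(const ToyBufObj *obj, const void **ptr)
{
    assert(obj != NULL && ptr != NULL);
    if (!obj->need_cpu_access) {
        return TOY_ERROR;
    }
    *ptr = obj->cpu_mapping;
    return TOY_SUCCESS;
}
```

The practical consequence is that a consumer holding a read-only buffer should request a const pointer rather than a writable one, and should check the returned status before dereferencing.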
NvSciBuf Object
/* Allocate a buffer using the reconciled attribute list. The
 * created NvSciBufObj will be associated with the module to
 * which reconciledAttrlist belongs.
 */
err = NvSciBufObjAlloc(reconciledAttrlist, &nvscibufobj);
if (err != NvSciError_Success) {
    goto fail;
}
/* ..... */
/* Get the associated reconciled attrlist of the object.
 * objReconciledAttrlist is an illustrative name for the output list.
 */
err = NvSciBufObjGetAttrList(nvscibufobj, &objReconciledAttrlist);
if (err != NvSciError_Success) {
    goto fail;
}
/* ..... */
err = NvSciBufObjGetCpuPtr(nvscibufobj, &va_ptr);
if (err != NvSciError_Success) {
    goto fail;
}
/* ..... */
If applications involve multiple processes, the exchange of NvSciBuf structures must only go through NvSciIpc channels. Each application must open its own NvSciIpc endpoint.
NvSciIpc Init
NvSciIpcEndpoint ipcEndpoint = 0;
err = NvSciIpcInit();
if (err != NvSciError_Success) {
    goto fail;
}
err = NvSciIpcOpenEndpoint("ipc_endpoint", &ipcEndpoint);
if (err != NvSciError_Success) {
    goto fail;
}
/* ... */
Applications connected through an NvSciIpcEndpoint can exchange NvSciBuf structures (NvSciBufAttrList or NvSciBufObj) using the export/import APIs provided by NvSciBuf.
NvSciBuf Export APIs return an appropriate export descriptor for the specified NvSciIpcEndpoint.
Applications are responsible for transporting the export descriptor returned by NvSciBuf using the same NvSciIpcEndpoint.
NvSciBuf Import APIs return the respective NvSciBuf structure for the specified export descriptor.
NvSciBuf provides different APIs for transporting reconciled and unreconciled attribute lists. For reconciled attribute lists, applications can optionally validate the reconciled list against one or more unreconciled attribute lists to ensure that the reconciled attribute list satisfies the parameters of the importing process's unreconciled lists. This can be done either by passing the unreconciled lists to the NvSciBufAttrListIpcImportReconciled API while importing or by invoking the NvSciBufAttrListValidateReconciled API after importing.
Export/Import NvSciBuf AttrLists
/* --------------------------App Process1 ----------------------------------*/
NvSciBufAttrList AttrList1 = NULL;
NvSciBufAttrList reconciledAttrList = NULL;
void* ListDesc = NULL;
size_t ListDescSize = 0U;
/* creation of the attribute list, receiving other lists from other listeners */
err = NvSciBufAttrListIpcExportUnreconciled(
        &AttrList1, /* array of unreconciled lists to be exported */
        1, /* size of the array */
        ipcEndpoint, /* valid and opened NvSciIpcEndpoint intended to send the descriptor through */
        &ListDesc, /* The descriptor buffer to be allocated and filled in */
        &ListDescSize ); /* size of the newly created buffer */
if (err != NvSciError_Success) {
    goto fail;
}
/* send the descriptor to process 2 */
/* wait for process 2 to reconcile and export the reconciled list */
err = NvSciBufAttrListIpcImportReconciled(
        module, /* NvSciBuf module with which this attrlist is to be imported */
        ipcEndpoint, /* valid and opened NvSciIpcEndpoint on which the descriptor is received */
        ListDesc, /* The descriptor buffer to be imported */
        ListDescSize, /* size of the descriptor buffer */
        &AttrList1, /* array of unreconciled lists to be used for validating the reconciled list */
        1, /* Number of unreconciled lists */
        &reconciledAttrList); /* Imported reconciled list */
if (err != NvSciError_Success) {
    goto fail;
}
/* --------------------------App Process2 ----------------------------------*/
void* ListDesc = NULL;
size_t ListDescSize = 0U;
NvSciBufAttrList unreconciledList[2] = {NULL};
NvSciBufAttrList reconciledList = NULL;
NvSciBufAttrList newConflictList = NULL;
NvSciBufAttrList AttrList2 = NULL;
NvSciBufAttrList importedUnreconciledAttrList = NULL;
/* create the local AttrList */
/* receive the descriptor from the other process */
err = NvSciBufAttrListIpcImportUnreconciled(module, ipcEndpoint,
        ListDesc, ListDescSize,
        &importedUnreconciledAttrList);
if (err != NvSciError_Success) {
    goto fail;
}
/* gather all the lists into an array and reconcile */
unreconciledList[0] = AttrList2;
unreconciledList[1] = importedUnreconciledAttrList;
err = NvSciBufAttrListReconcile(unreconciledList, 2, &reconciledList,
        &newConflictList);
if (err != NvSciError_Success) {
    goto fail;
}
err = NvSciBufAttrListIpcExportReconciled(
        reconciledList, /* reconciled list to be exported */
        ipcEndpoint, /* valid and opened NvSciIpcEndpoint intended to send the descriptor through */
        &ListDesc, /* The descriptor buffer to be allocated and filled in */
        &ListDescSize ); /* size of the newly created buffer */
if (err != NvSciError_Success) {
    goto fail;
}
Export/Import NvSciBufObj
/* process1 */
void* objAndList;
size_t objAndListSize;
err = NvSciBufIpcExportAttrListAndObj(
bufObj, /* bufObj to be exported (the reconciled list is inside it) */
NvSciBufAccessPerm_ReadOnly, /* permissions we want the receiver to have */
ipcEndpoint, /* IpcEndpoint via which the object is to be exported */
&objAndList, /* descriptor of the object and list to be communicated */
&objAndListSize); /* size of the descriptor */
/* send via Ipc */
/* process2 */
void* objAndList;
size_t objAndListSize;
err = NvSciBufIpcImportAttrListAndObj(
module, /* NvSciBufModule used to create the original unreconciled lists in this process */
ipcEndpoint, /* ipcEndpoint from which the descriptor was received */
objAndList, /* the descriptor of the buf obj and associated reconciled attribute list received from the exporting process */
objAndListSize, /* size of the descriptor */
&AttrList1, /* the array of original unreconciled lists prepared in this process */
1, /* size of the array */
NvSciBufAccessPerm_ReadOnly, /* permissions expected by this process */
10000U, /* timeout in microseconds. Some primitives might require time to transport all needed resources */
&bufObj); /* buf object generated from the descriptor */
/* use the buf object */
NvSciBuf API
For information about the NvSciBuf API, see Buffer Allocation APIs.
UMD Access
The following sections describe UMD access.
NvMedia supports different datatypes, such as NvMediaImage, NvMediaTensor, NvMediaArray, etc. NvMedia provides a set of interfaces for each data type to interact with NvSciBuf.
This section describes NvMediaImage and NvSciBuf interaction. The following steps show the typical flow to allocate an NvSciBuf object, which can be mapped into an NvMediaImage:
1. The application creates an NvSciBufAttrList.
2. The application queries NvMedia to fill in the NvSciBufAttrList by passing a set of NvMediaImage allocation attributes and NvMediaSurfaceType as input to the NvMediaImageFillNvSciBufAttrs API.
3. The application may choose to set any of the public NvSciBufAttributes, which are not set by NvMedia.
4. If the same NvSciBuf object is shared with other UMDs, then the application can get the corresponding NvSciBufAttrList from the respective UMD.
5. The application asks NvSciBuf to reconcile all the filled NvSciBufAttrLists and then allocates an NvSciBuf object.
6. The application then queries NvMedia to create an NvMediaImage from the allocated NvSciBuf object by calling the NvMediaImageCreateFromNvSciBuf API.
7. An NvMediaImage can be passed as input/output to any of the NvMedia APIs that accept NvMediaImage as an argument.
Usage of NvSciBuf and NvMediaImage API
NvMediaDevice *device ;
NvMediaStatus status;
NvSciError err;
NvSciBufModule module;
NvSciBufAttrList attrlist, conflictlist;
NvSciBufObj bufObj;
NvMediaImage *image;
NvMediaSurfaceType nvmsurfacetype;
NvMediaSurfAllocAttr surfAllocAttrs[8];
/*NvMedia related initialization */
device = NvMediaDeviceCreate();
status = NvMediaImageNvSciBufInit();
/*NvSciBuf related initialization*/
err = NvSciBufModuleOpen(&module);
/*Create NvSciBuf attribute list*/
err = NvSciBufAttrListCreate(module, &attrlist);
/*Initialize surfFormatAttrs and surfAllocAttrs as required */
/* Get NvMediaSurfaceType */
nvmsurfacetype = NvMediaSurfaceFormatGetType(surfFormatAttrs, NVM_SURF_FMT_ATTR_MAX);
/*Ask NvMedia to fill NvSciBufAttrs corresponding to nvmsurfacetype and surfAllocAttrs*/
status = NvMediaImageFillNvSciBufAttrs(device, nvmsurfacetype, surfAllocAttrs, numsurfallocattrs, 0, attrlist);
/*Reconcile the NvSciBufAttrs and then allocate a NvSciBufObj */
err = NvSciBufAttrListReconcileAndObjAlloc(&attrlist, 1, &bufObj, &conflictlist);
/*Create NvMediaImage from NvSciBufObj */
status = NvMediaImageCreateFromNvSciBuf(device, bufObj, &image);
/*Free the NvSciBufAttrList which is no longer required */
NvSciBufAttrListFree(attrlist);
/*Use the image as input or output for any of the NvMedia APIs */
/*Free the resources after use*/
/*Destroy NvMediaImage */
/*NvMedia related Deinit*/
/*NvSciBuf related deinit*/
CUDA supports the import of an NvSciBufObj into CUDA as CUDA external memory of type NvSciBuf. Once imported, use CUDA API to get a CUDA pointer/array from the imported memory object, which can be passed to CUDA kernels. Applications must query NvSciBufObj for the attributes required to fill descriptors, which are passed as parameters to the import/map APIs.
If the NvSciBuf object imported into CUDA is also mapped by other drivers, then the application must use CUDA external semaphore APIs described here as appropriate barriers to maintain coherence between CUDA and the other drivers.
The CUDA API spec for the external memory interfaces is available here.
NvSciBufObj as a CUDA Pointer/Array
The following section describes the steps required to use NvSciBufObj as a CUDA pointer/array.
1. Allocate NvSciBufObj.
The application creates NvSciBufAttrList and sets the NvSciBufGeneralAttrKey_GpuId attribute to specify the ID of the GPU that shares the buffer, along with other attributes.
If the same object is used by other UMDs, the corresponding attribute lists must be created and reconciled. The reconciled list must be used to allocate the NvSciBufObj.
The attribute list and NvSciBuf objects must be maintained by the application.
2. Query NvSciBufObj attributes to fill CUDA descriptors.
The application must query the allocated NvSciBufObj for required attributes to fill the CUDA external memory descriptor.
3. NvSciBuf object registration with CUDA.
cudaImportExternalMemory() must be used to register the allocated NvSciBuf object with CUDA by filling up cudaExternalMemoryHandleDesc for type cudaExternalMemoryHandleTypeNvSciBuf.
cudaDestroyExternalMemory() API must be used to free the CUDA external memory. CUDA mappings created from external memory must be freed before invoking this API.
4. Getting CUDA pointer/array from imported external memory.
cudaExternalMemoryGetMappedBuffer() maps a buffer onto an imported memory object and returns a CUDA device pointer. The properties of the buffer must be described in the CUDA ExternalMemory buffer description by querying attributes from NvSciBufObj. The returned device pointer must be freed using cudaFree.
cudaExternalMemoryGetMappedMipmappedArray() maps a CUDA mipmapped array onto an external object and returns a handle to it. The properties of the buffer must be described in the CUDA ExternalMemory MipmappedArray desc by querying attributes from NvSciBufObj. The returned CUDA mipmapped array must be freed using cudaFreeMipmappedArray.
All the APIs mentioned in the sections above are CUDA-runtime APIs. Each of them has an equivalent driver API. The syntax and usage of both versions are the same.
NvSciBuf-CUDA interop
/*********** Allocate NvSciBuf object ************/
// Raw Buffer Attributes for CUDA
NvSciBufType bufType = NvSciBufType_RawBuffer;
uint64_t rawsize = SIZE;
uint64_t align = 0;
bool cpuaccess_flag = true;
NvSciBufAttrValAccessPerm perm = NvSciBufAccessPerm_ReadWrite;
CUuuid uuid;
NvSciRmGpuId gpuid;
cuDeviceGetUuid(&uuid, dev);
/* Copy the CUDA device UUID into the GPU ID attribute value */
memcpy(&gpuid.bytes, &uuid.bytes, sizeof(uuid.bytes));
// Fill in values
NvSciBufAttrKeyValuePair rawbuffattrs[] = {
    { NvSciBufGeneralAttrKey_Types, &bufType, sizeof(bufType) },
    { NvSciBufRawBufferAttrKey_Size, &rawsize, sizeof(rawsize) },
    { NvSciBufRawBufferAttrKey_Align, &align, sizeof(align) },
    { NvSciBufGeneralAttrKey_NeedCpuAccess, &cpuaccess_flag,
      sizeof(cpuaccess_flag) },
    { NvSciBufGeneralAttrKey_RequiredPerm, &perm, sizeof(perm) },
    { NvSciBufGeneralAttrKey_GpuId, &gpuid, sizeof(gpuid) },
};
// Create the list, then set the attributes
// (module is the NvSciBufModule opened by the application)
NvSciBufAttrListCreate(module, &attrListBuffer);
err = NvSciBufAttrListSetAttrs(attrListBuffer, rawbuffattrs,
        sizeof(rawbuffattrs) / sizeof(NvSciBufAttrKeyValuePair));
// Reconcile And Allocate
NvSciBufAttrListReconcile(&attrListBuffer, 1, &attrListReconciledBuffer, &attrListConflictBuffer);
NvSciBufObjAlloc(attrListReconciledBuffer, &bufferObjRaw);
/*************** Query NvSciBuf Object **************/
NvSciBufAttrKeyValuePair bufattrs[] = {
    { NvSciBufRawBufferAttrKey_Size, NULL, 0 },
};
// Get the reconciled attribute list associated with the object
NvSciBufObjGetAttrList(bufferObjRaw, &retList);
NvSciBufAttrListGetAttrs(retList, bufattrs, sizeof(bufattrs)/sizeof(NvSciBufAttrKeyValuePair));
ret_size = *(static_cast<const uint64_t*>(bufattrs[0].value));
/*************** NvSciBuf Registration With CUDA **************/
cudaExternalMemoryHandleDesc memHandleDesc;
memset(&memHandleDesc, 0, sizeof(memHandleDesc));
memHandleDesc.type = cudaExternalMemoryHandleTypeNvSciBuf;
memHandleDesc.handle.nvSciBufObject = bufferObjRaw;
memHandleDesc.size = ret_size;
cudaImportExternalMemory(&extMemBuffer, &memHandleDesc);
/************** Mapping to CUDA ******************************/
cudaExternalMemoryBufferDesc bufferDesc;
memset(&bufferDesc, 0, sizeof(bufferDesc));
bufferDesc.offset = 0;
bufferDesc.size = ret_size;
cudaExternalMemoryGetMappedBuffer(&dptr, extMemBuffer, &bufferDesc);
/************** CUDA Kernel ***********************************/
// Run CUDA Kernel on dptr
/*************** Free CUDA mappings *****************************/
/***************** Free NvSciBuf **********************************/