cuPHY
0.1
CUDA PHY Layer Acceleration Library
This section describes the downlink rate matching functions of the cuPHY application programming interface.
Data Structures | |
struct | PdschPerTbParams |
Struct that tracks configuration information at per-TB (transport block) granularity for the physical downlink shared channel (PDSCH). | |
struct | PerTbParams |
Struct that tracks configuration information at per-TB (transport block) granularity. | |
Functions | |
cuphyStatus_t | cuphyDlRateMatchingGetDescrInfo (size_t *pDescrSizeBytes, size_t *pDescrAlignBytes) |
Compute the descriptor buffer size and alignment for rate matching. | |
size_t | cuphyDlRateMatchingWorkspaceSize (int num_TBs) |
Return the workspace size, in bytes, needed for all configuration parameters of the rate matching component. Does not allocate any space. | |
cuphyStatus_t | cuphySetTBParams (PdschPerTbParams *tb_params_struct, uint32_t cfg_rv, uint32_t cfg_Qm, uint32_t cfg_bg, uint32_t cfg_Nl, uint32_t cfg_num_CBs, uint32_t cfg_Zc, uint32_t cfg_G, uint32_t cfg_F, uint32_t cfg_cinit, uint32_t cfg_Nref) |
Update the PdschPerTbParams struct that tracks configuration information at per-TB granularity, and check that the configuration values are valid. | |
cuphyStatus_t | cuphySetupDlRateMatching (cuphyDlRateMatchingLaunchConfig_t dlRateMatchingLaunchConfig, const uint32_t *d_rate_matching_input, uint32_t *d_rate_matching_output, uint32_t *d_restructure_rate_matching_output, void *d_modulation_output, void *d_xtf_re_map, uint16_t max_PRB_BWP, int num_TBs, int num_layers, uint8_t enable_scrambling, uint8_t enable_layer_mapping, uint8_t enable_modulation, uint8_t precoding, uint8_t restructure_kernel, uint8_t batching, uint32_t *h_workspace, uint32_t *d_workspace, PdschPerTbParams *h_params, PdschPerTbParams *d_params, PdschDmrsParams *d_dmrs_params, PdschUeGrpParams *d_ue_grp_params, void *cpu_desc, void *gpu_desc, uint8_t enable_desc_async_copy, cudaStream_t strm) |
Set up the rate matching component, including the kernel node parameters for rate matching (with scrambling and layer mapping) and for rate-matching output restructuring (if enabled). If enable_modulation is set, this component also performs modulation. | |
cuphyStatus_t cuphyDlRateMatchingGetDescrInfo | ( | size_t * | pDescrSizeBytes, |
size_t * | pDescrAlignBytes | ||
) |
[in,out] | pDescrSizeBytes | Size in bytes of descriptor |
[in,out] | pDescrAlignBytes | Alignment in bytes of descriptor |
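The size and alignment reported above are meant to drive the allocation of the descriptor buffers later passed as cpu_desc and gpu_desc. The following is a minimal sketch of honoring such a (size, alignment) pair on the host. It does not call cuPHY: the helper names roundUp and allocDescriptor are hypothetical, and plain host memory from std::aligned_alloc (C++17) stands in for whatever pinned host or device allocation the application actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: round `size` up to the next multiple of `align`.
// `align` must be non-zero.
inline std::size_t roundUp(std::size_t size, std::size_t align) {
    return ((size + align - 1) / align) * align;
}

// Hypothetical helper: allocate a host buffer that satisfies a reported
// (size, alignment) pair. std::aligned_alloc expects the size to be a
// multiple of the alignment, so the size is rounded up first.
// Returns nullptr for a zero size or an alignment that is not a power of two.
// Release the buffer with std::free.
inline void* allocDescriptor(std::size_t sizeBytes, std::size_t alignBytes) {
    const bool powerOfTwo =
        alignBytes != 0 && (alignBytes & (alignBytes - 1)) == 0;
    if (sizeBytes == 0 || !powerOfTwo) {
        return nullptr;
    }
    return std::aligned_alloc(alignBytes, roundUp(sizeBytes, alignBytes));
}
```

In a real application the two arguments would be the values written to pDescrSizeBytes and pDescrAlignBytes, and the same pair would typically be used for both the CPU-side and the GPU-side descriptor.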
size_t cuphyDlRateMatchingWorkspaceSize | ( | int | num_TBs | ) |
[in] | num_TBs | number of Transport blocks (TBs) to be processed within a kernel launch |
cuphyStatus_t cuphySetTBParams | ( | PdschPerTbParams * | tb_params_struct, |
uint32_t | cfg_rv, | ||
uint32_t | cfg_Qm, | ||
uint32_t | cfg_bg, | ||
uint32_t | cfg_Nl, | ||
uint32_t | cfg_num_CBs, | ||
uint32_t | cfg_Zc, | ||
uint32_t | cfg_G, | ||
uint32_t | cfg_F, | ||
uint32_t | cfg_cinit, | ||
uint32_t | cfg_Nref | ||
) |
[in,out] | tb_params_struct | pointer to a PdschPerTbParams configuration struct |
[in] | cfg_rv | redundancy version |
[in] | cfg_Qm | modulation order |
[in] | cfg_bg | base graph |
[in] | cfg_Nl | number of layers per TB (at most MAX_DL_LAYERS_PER_TB for downlink) |
[in] | cfg_num_CBs | number of code blocks |
[in] | cfg_Zc | lifting factor |
[in] | cfg_G | number of rate-matched bits available for TB transmission |
[in] | cfg_F | number of filler bits |
[in] | cfg_cinit | seed used for scrambling sequence |
[in] | cfg_Nref | reference buffer length; determines the circular buffer length Ncb when it is smaller than N, the full coded length of a code block |
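Several of the parameters above combine in the bit-selection step of 3GPP TS 38.212 (section 5.4.2.1). The sketch below shows how Ncb follows from cfg_bg, cfg_Zc and cfg_Nref, and how cfg_G is split into per-code-block lengths Er using cfg_Nl, cfg_Qm and cfg_num_CBs. It follows the specification rather than cuPHY's source, so it is an illustration, not the library's implementation. The function names are hypothetical. Treating Nref == 0 as "no buffer limit" is an assumption of this sketch, and all code blocks are assumed to be scheduled for transmission.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Full coded length N of one LDPC code block:
// 66 * Zc for base graph 1, 50 * Zc for base graph 2.
inline uint32_t fullCodedLen(uint32_t bg, uint32_t Zc) {
    return (bg == 1 ? 66u : 50u) * Zc;
}

// Circular buffer length Ncb = min(N, Nref).
// Assumption of this sketch: Nref == 0 means the buffer is not limited.
inline uint32_t computeNcb(uint32_t bg, uint32_t Zc, uint32_t Nref) {
    const uint32_t N = fullCodedLen(bg, Zc);
    return (Nref == 0) ? N : std::min(N, Nref);
}

// Rate-matched length Er of code block r (0-based), assuming every code
// block is transmitted. G is shared out in units of Nl * Qm bits: the first
// (numCBs - rem) blocks get the floor share, the remaining blocks the ceiling.
inline uint32_t computeEr(uint32_t r, uint32_t G, uint32_t Nl, uint32_t Qm,
                          uint32_t numCBs) {
    const uint32_t units = G / (Nl * Qm);   // modulation symbols per layer
    const uint32_t lo    = units / numCBs;  // floor share per code block
    const uint32_t rem   = units % numCBs;  // blocks that get one extra unit
    const uint32_t share = (r < numCBs - rem) ? lo : lo + 1;
    return Nl * Qm * share;
}
```

For example, with G = 80, Nl = 2, Qm = 4 and three code blocks, the blocks receive 24, 24 and 32 bits, which sum back to G.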
cuphyStatus_t cuphySetupDlRateMatching | ( | cuphyDlRateMatchingLaunchConfig_t | dlRateMatchingLaunchConfig, |
const uint32_t * | d_rate_matching_input, | ||
uint32_t * | d_rate_matching_output, | ||
uint32_t * | d_restructure_rate_matching_output, | ||
void * | d_modulation_output, | ||
void * | d_xtf_re_map, | ||
uint16_t | max_PRB_BWP, | ||
int | num_TBs, | ||
int | num_layers, | ||
uint8_t | enable_scrambling, | ||
uint8_t | enable_layer_mapping, | ||
uint8_t | enable_modulation, | ||
uint8_t | precoding, | ||
uint8_t | restructure_kernel, | ||
uint8_t | batching, | ||
uint32_t * | h_workspace, | ||
uint32_t * | d_workspace, | ||
PdschPerTbParams * | h_params, | ||
PdschPerTbParams * | d_params, | ||
PdschDmrsParams * | d_dmrs_params, | ||
PdschUeGrpParams * | d_ue_grp_params, | ||
void * | cpu_desc, | ||
void * | gpu_desc, | ||
uint8_t | enable_desc_async_copy, | ||
cudaStream_t | strm | ||
) |
[in] | dlRateMatchingLaunchConfig | Pointer to cuphyDlRateMatchingLaunchConfig. |
[in] | d_rate_matching_input | LDPC encoder's output; device buffer, previously allocated. |
[out] | d_rate_matching_output | rate-matching output, with scrambling and layer-mapping, if enabled; device pointer, preallocated. |
[out] | d_restructure_rate_matching_output | d_rate_matching_output restructured for modulation. There are Er bits per code block. Each layer starts at a uint32_t-aligned boundary. |
[out] | d_modulation_output | pointer to the output tensor (preallocated). Each symbol is a complex number using half-precision for the real and imaginary parts. Update: no longer used; the cell_output_tensor_addr field of PdschDmrsParams is used instead. |
[in] | d_xtf_re_map | RE (resource element) map array, relevant when CSI-RS symbols overlap with TB allocations. Can be set to nullptr if there is no such overlap. |
[in] | max_PRB_BWP | maximum number of downlink PRBs for all cells whose TBs are processed here. Used to index into the d_xtf_re_map array. |
[in] | num_TBs | number of TBs handled in a kernel launch |
[in] | num_layers | number of layers |
[in] | enable_scrambling | enable scrambling when 1, no scrambling when 0 |
[in] | enable_layer_mapping | enable layer mapping when 1, no layer mapping when 0 |
[in] | enable_modulation | run a fused rate matching and modulation kernel when 1; used in PDSCH pipeline. |
[in] | precoding | 1 if any TB has precoding enabled; 0 otherwise. |
[in] | restructure_kernel | set up kernel node params for the restructure kernel when 1. |
[in] | batching | when enabled, the TBs from this kernel launch can belong to different cells |
[in] | h_workspace | pinned host memory for temporary buffers |
[in] | d_workspace | device memory for h_workspace. The H2D copy from h_workspace to d_workspace happens within cuphySetupDlRateMatching if enable_desc_async_copy is set. |
[in] | h_params | pointer to an array of num_TBs PdschPerTbParams structs; pinned host memory |
[in] | d_params | pointer to device memory for h_params. The H2D copy from h_params to d_params happens outside cuphySetupDlRateMatching. |
[in] | d_dmrs_params | pointer to PdschDmrsParams parameters on the device. |
[in] | d_ue_grp_params | pointer to PdschUeGrpParams parameters on the device. |
[in] | cpu_desc | Pointer to descriptor in CPU memory |
[in] | gpu_desc | Pointer to descriptor in GPU memory |
[in] | enable_desc_async_copy | async copy CPU descriptor into GPU if set. |
[in] | strm | CUDA stream for async copy |
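The description of d_restructure_rate_matching_output states that each layer starts at a uint32_t-aligned boundary. The sketch below illustrates what that alignment implies for buffer sizing: each layer's bit count is rounded up to whole 32-bit words before the next layer begins. The helper names are hypothetical, and storing layers back to back in index order is an assumption of this sketch; the actual layout used by cuPHY is not specified here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of 32-bit words needed to hold `bits` bits.
inline uint32_t wordsForBits(uint32_t bits) {
    return (bits + 31u) / 32u;
}

// Hypothetical helper: starting word offset of each layer when every layer
// begins on a uint32_t boundary and layers are stored back to back.
// The extra final element is the total number of words required.
inline std::vector<uint32_t> layerWordOffsets(
        const std::vector<uint32_t>& bitsPerLayer) {
    std::vector<uint32_t> offsets;
    offsets.reserve(bitsPerLayer.size() + 1);
    uint32_t next = 0;
    for (uint32_t bits : bitsPerLayer) {
        offsets.push_back(next);     // this layer starts on a word boundary
        next += wordsForBits(bits);  // padding bits fill the last word
    }
    offsets.push_back(next);
    return offsets;
}
```

With layers of 40, 32 and 1 bits, the layers start at words 0, 2 and 3, and the buffer needs 4 words in total.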