core.export.trtllm.trtllm_layers#

Module Contents#

Classes#

TRTLLMLayers

TRTLLM Layer names

Functions#

get_layer_name_without_prefix

Get TRTLLM layer name without prefix

Data#

NON_TRANSFORMER_LAYERS_NAMES

API#

class core.export.trtllm.trtllm_layers.TRTLLMLayers(*args, **kwds)#

Bases: enum.Enum

TRTLLM Layer names

This enum is used to map input model layer names to TRTLLM layer names.

Initialization

position_embedding#

‘transformer.position_embedding.weight’

vocab_embedding#

‘transformer.vocab_embedding.weight’

lm_head#

‘lm_head.weight’

final_layernorm_weight#

‘transformer.ln_f.weight’

final_layernorm_bias#

‘transformer.ln_f.bias’

input_layernorm_weight#

‘transformer.layers.input_layernorm.weight’

input_layernorm_bias#

‘transformer.layers.input_layernorm.bias’

attention_qkv_weight#

‘transformer.layers.attention.qkv.weight’

attention_qkv_bias#

‘transformer.layers.attention.qkv.bias’

attention_dense_weight#

‘transformer.layers.attention.dense.weight’

attention_dense_bias#

‘transformer.layers.attention.dense.bias’

attention_linear_weight#

‘transformer.layers.attention.weight’

mlp_fc_weight#

‘transformer.layers.mlp.fc.weight’

mlp_fc_bias#

‘transformer.layers.mlp.fc.bias’

post_layernorm_weight#

‘transformer.layers.post_layernorm.weight’

post_layernorm_bias#

‘transformer.layers.post_layernorm.bias’

mlp_projection_weight#

‘transformer.layers.mlp.proj.weight’

mlp_projection_bias#

‘transformer.layers.mlp.proj.bias’

ffn_fc_weight#

‘transformer.layers.ffn.fc.weight’

ffn_projection_weight#

‘transformer.layers.ffn.proj.weight’

ffn_linear_weight#

‘transformer.layers.ffn.weight’

mlp_router_weight#

‘transformer.layers.mlp.router.weight’

mlp_fc_weight_mixture_of_experts#

‘transformer.layers.mlp.fc.weight.expert’

mlp_projection_weight_mixture_of_experts#

‘transformer.layers.mlp.proj.weight.expert’
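For orientation, a minimal sketch of how these members might appear in a conversion dictionary. The import path (megatron.core.export.trtllm.trtllm_layers) and the Megatron-style key names on the left are assumptions for illustration; each member's value is the TRTLLM layer name string listed above.

```python
# Hypothetical sketch: mapping Megatron-style layer names (illustrative keys)
# to TRTLLMLayers members. Import path is assumed.
from megatron.core.export.trtllm.trtllm_layers import TRTLLMLayers

example_conversion_dict = {
    "decoder.layers.self_attention.linear_qkv.weight": TRTLLMLayers.attention_qkv_weight,
    "decoder.layers.mlp.linear_fc1.weight": TRTLLMLayers.mlp_fc_weight,
    "decoder.layers.mlp.linear_fc2.weight": TRTLLMLayers.mlp_projection_weight,
}

# Each member's value is the corresponding TRTLLM layer name string.
print(TRTLLMLayers.attention_qkv_weight.value)
# transformer.layers.attention.qkv.weight
```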

static return_layer_name_and_number(
layer_name: str,
) → Tuple[str, int]#

Helper function to return the layer name and layer number. Given an input layer name, e.g. decoder.layers.2.self_attention.linear_qkv.weight, this function returns decoder.layers.self_attention.linear_qkv.weight and layer number 2. If no layer number is present, it returns None for the layer number.

Parameters:

layer_name (str) – The input layer name

Returns:

The layer name and the layer number (the layer number may be None)

Return type:

Tuple[str, int]
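A short usage sketch based on the example in the docstring above; the import path is assumed.

```python
# Split a layer name into its name-without-number and its layer number.
from megatron.core.export.trtllm.trtllm_layers import TRTLLMLayers

name, number = TRTLLMLayers.return_layer_name_and_number(
    "decoder.layers.2.self_attention.linear_qkv.weight"
)
# name   -> "decoder.layers.self_attention.linear_qkv.weight"
# number -> 2

name, number = TRTLLMLayers.return_layer_name_and_number("lm_head.weight")
# number -> None (no layer number in the name)
```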

static rename_input_layer_names_to_trtllm_layer_names(
model_state_dict: dict,
trtllm_conversion_dict: dict,
state_dict_split_by_layer_numbers: bool = True,
) → dict#

Helper function to rename model layer names to TRTLLM Layer names

We go through each layer (key) in the model state dict and map it to the equivalent TRTLLMLayers name (megatron/core/export/trtllm/trtllm). If a layer number is associated with the layer, we extract it, map the original layer name to the equivalent TRTLLM layer name, and add the layer number back. CPU conversion passes in the model state dict without layer numbers (i.e. decoder.layers.mlp.linear_fc1.weight of shape [num_layers, hidden_dim, 4 * hidden_dim]). GPU conversion passes in the model state dict with each layer separated (i.e. decoder.layers.2.mlp.linear_fc1.weight of shape [hidden_dim, 4 * hidden_dim]).

Parameters:
  • model_state_dict (dict) – The original model state dict

  • trtllm_conversion_dict (dict) – The conversion dictionary mapping input model layer names to trtllm layer names

  • state_dict_split_by_layer_numbers (bool, optional) – Whether the model layers are split by layer number in the state dict. For example, mlp.fc1.weight can be represented as a single mlp.fc1.weight of shape [num_layers, hidden_dim, ffn_hidden_dim], or as mlp.fc1.layers.0.weight of shape [hidden_dim, ffn_hidden_dim], then mlp.fc1.layers.1.weight, and so on for all layers. If you use the second representation, set this to True. Defaults to True

Raises:

ValueError – If the keys don't match TRTLLM keys, or if not all model layers are mapped to equivalent TRTLLM keys

Returns:

The model state dict with each key (i.e. the original model layer name) replaced by the TRTLLM layer name

Return type:

dict
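A hedged sketch of the GPU-style (split-by-layer-number) case. The tensors, shapes, and conversion-dict entries are placeholders rather than a real checkpoint or the library's default conversion dict, the import path is assumed, and the expected output keys simply follow the contract described above.

```python
import torch

from megatron.core.export.trtllm.trtllm_layers import TRTLLMLayers

# Placeholder state dict with per-layer keys (GPU-style conversion).
model_state_dict = {
    "decoder.layers.0.mlp.linear_fc1.weight": torch.zeros(8, 32),
    "decoder.layers.1.mlp.linear_fc1.weight": torch.zeros(8, 32),
}

# Illustrative conversion dict: keys are layer names without layer numbers.
trtllm_conversion_dict = {
    "decoder.layers.mlp.linear_fc1.weight": TRTLLMLayers.mlp_fc_weight,
}

renamed = TRTLLMLayers.rename_input_layer_names_to_trtllm_layer_names(
    model_state_dict=model_state_dict,
    trtllm_conversion_dict=trtllm_conversion_dict,
    state_dict_split_by_layer_numbers=True,
)
# Per the contract above, keys such as
# "transformer.layers.0.mlp.fc.weight" and "transformer.layers.1.mlp.fc.weight"
# are expected in the renamed dict.
```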

core.export.trtllm.trtllm_layers.NON_TRANSFORMER_LAYERS_NAMES#

None

core.export.trtllm.trtllm_layers.get_layer_name_without_prefix(
layer: core.export.trtllm.trtllm_layers.TRTLLMLayers,
) → str#

Get TRTLLM layer name without prefix

Given a layer, e.g. TRTLLMLayers.attention_qkv_weight, this returns ‘attention.qkv.weight’

Parameters:

layer (TRTLLMLayers) – The TRTLLMLayers member

Returns:

The TRTLLMLayers suffix (i.e. removing transformer.layers. from the layer name)

Return type:

str
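An illustrative call matching the example above; the import path is assumed.

```python
from megatron.core.export.trtllm.trtllm_layers import (
    TRTLLMLayers,
    get_layer_name_without_prefix,
)

# Strip the 'transformer.layers.' prefix from the enum member's value.
print(get_layer_name_without_prefix(TRTLLMLayers.attention_qkv_weight))
# attention.qkv.weight
```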