transformer_encoder_layer_spec (ModuleSpec) – transformer layer customization specs for encoder

transformer_decoder_layer_spec (ModuleSpec) – transformer layer customization specs for decoder

max_sequence_length (int) – Maximum sequence length. Used to size the positional embedding.

fp16_lm_cross_entropy (bool, optional) – When True, the unreduced cross-entropy loss for the LM head is computed in fp16. Defaults to False.

share_embeddings_and_output_weights (bool) – When True, input embeddings and output logit weights are shared. Defaults to False.

rotary_percent (float) – Percent of rotary dimension to use for rotary position embeddings. Defaults to 1.0 (100%). Ignored unless position_embedding_type is ‘rope’.
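
To make the effect of rotary_percent concrete, here is a minimal sketch of how a fractional value might translate into a channel count. It assumes the rotary dimension is the per-head dimension scaled by rotary_percent and truncated to an integer. The function name rotary_dim and the argument head_dim are hypothetical and not part of the documented API.

```python
def rotary_dim(head_dim: int, rotary_percent: float = 1.0) -> int:
    """Return how many per-head channels would receive rotary embeddings.

    Assumption (not confirmed by the parameter list above): the count is
    head_dim scaled by rotary_percent, truncated toward zero.
    """
    if not 0.0 < rotary_percent <= 1.0:
        raise ValueError("rotary_percent must be in the interval (0, 1]")
    if rotary_percent < 1.0:
        return int(head_dim * rotary_percent)
    return head_dim


# Hypothetical head sizes, chosen only for illustration.
full = rotary_dim(64)           # default 1.0: every channel is rotated
half = rotary_dim(64, 0.5)      # only half of the channels are rotated
quarter = rotary_dim(128, 0.25)
```

Under this assumption, the remaining channels of each head pass through without a positional rotation, which is why the setting only matters when position_embedding_type is ‘rope’.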