LGAI-EXAONE
/

EXAONE-4.0-32B

Text Generation

Model card Files Files and versions

LG-AI-EXAONE commited on Jul 18

Commit

b1f8731

·

1 Parent(s): 0640ef5

Update README.md

Files changed (1) hide show

README.md +1 -1

README.md CHANGED Viewed

@@ -32,7 +32,7 @@ The EXAONE 4.0 model series consists of two sizes: a mid-size **32B** model opti
 In the EXAONE 4.0 architecture, we apply new architectural changes compared to previous EXAONE models as below:
 1. **Hybrid Attention**: For the 32B model, we adopt hybrid attention scheme, which combines *Local attention (sliding window attention)* with *Global attention (full attention)* in a 3:1 ratio. We do not use RoPE (Rotary Positional Embedding) for global attention for better global context understanding.
-2. **QK-Reorder-Norm**: We adopt the Post-LN (LayerNorm) scheme for transformer blocks instead of Pre-LN, and we add RMS normalization right after the Q and K projection. It helps yield better performance on downstream tasks despite consuming more computation.
 For more details, please refer to our [technical report](https://arxiv.org/abs/2507.11407), [blog](https://www.lgresearch.ai/blog/view?seq=576), and [GitHub](https://github.com/LG-AI-EXAONE/EXAONE-4.0).

 In the EXAONE 4.0 architecture, we apply new architectural changes compared to previous EXAONE models as below:
 1. **Hybrid Attention**: For the 32B model, we adopt hybrid attention scheme, which combines *Local attention (sliding window attention)* with *Global attention (full attention)* in a 3:1 ratio. We do not use RoPE (Rotary Positional Embedding) for global attention for better global context understanding.
+2. **QK-Reorder-Norm**: We reorder the LayerNorm position from the traditional Pre-LN scheme by applying LayerNorm directly to the attention and MLP outputs, and we add RMS normalization right after the Q and K projection. It helps yield better performance on downstream tasks despite consuming more computation.
 For more details, please refer to our [technical report](https://arxiv.org/abs/2507.11407), [blog](https://www.lgresearch.ai/blog/view?seq=576), and [GitHub](https://github.com/LG-AI-EXAONE/EXAONE-4.0).