Update README.md
README.md CHANGED
@@ -20,6 +20,8 @@ tags:
## Model Overview
Huihui-MoE-23B-A4B is a **Mixture of Experts (MoE)** language model developed by **huihui.ai**, built upon the **[Qwen/Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B)** base model. It enhances the standard Transformer architecture by replacing MLP layers with MoE layers, each containing 8 experts, to achieve high performance with efficient inference. The model is designed for natural language processing tasks, including text generation, question answering, and conversational applications.
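The snippet below is a minimal sketch (not taken from this card) of loading the model for chat-style generation with Hugging Face `transformers`; the repository id `huihui-ai/Huihui-MoE-23B-A4B`, the prompt, and the generation settings are assumptions for illustration.

```python
# Minimal sketch: load Huihui-MoE-23B-A4B and run one chat turn.
# Repository id, dtype/device choices and prompt are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "huihui-ai/Huihui-MoE-23B-A4B"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick bf16/fp16 automatically when available
    device_map="auto",    # spread the 23B parameters across available devices
)

messages = [{"role": "user", "content": "Give me a short introduction to Mixture of Experts models."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```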

+The corresponding abliterated version is [huihui-ai/Huihui-MoE-23B-A4B-abliterated](https://huggingface.co/huihui-ai/Huihui-MoE-23B-A4B-abliterated).
+
**Note**:
The number of activated experts can be set anywhere from 1 to 8, and the model can hold normal conversations at any of these settings.
You can change the activation parameter with `/num_experts_per_tok <number>`; after the parameter is modified, the model is reloaded.
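
The `/num_experts_per_tok` command appears to belong to an interactive chat script; the sketch below only illustrates the underlying idea of reloading the model with a different number of activated experts. The `reload_with_experts` helper is hypothetical, and the `num_experts_per_tok` config field is assumed to exist as in Qwen-style MoE configurations.

```python
# Rough sketch (assumption, not the card's code): reload the model with a
# different number of activated experts, mirroring /num_experts_per_tok <number>.
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "huihui-ai/Huihui-MoE-23B-A4B"  # assumed repository id


def reload_with_experts(num_experts_per_tok: int):
    """Reload the model so that `num_experts_per_tok` experts (1-8) are activated per token."""
    if not 1 <= num_experts_per_tok <= 8:
        raise ValueError("num_experts_per_tok must be between 1 and 8")
    config = AutoConfig.from_pretrained(model_id)
    config.num_experts_per_tok = num_experts_per_tok  # assumed config field name
    return AutoModelForCausalLM.from_pretrained(
        model_id, config=config, torch_dtype="auto", device_map="auto"
    )


# e.g. the user typed "/num_experts_per_tok 2" in the chat loop
model = reload_with_experts(2)
```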