This is the distilled educational version of Qwen2.5-7B-Instruct, trained on data constructed with EduBench.
- [paper](https://arxiv.org/abs/2505.16160)
- [github](https://github.com/DIRECT-BIT/EduBench)

## Model Details

**Model Name**: EDU-Qwen2.5-7B

**Model Type**: Distilled instruction-tuned language model (7B parameters)

**Base Model**: [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)
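
Below is a minimal inference sketch using the 🤗 Transformers chat API. The repository ID is a placeholder (adjust it to the actual Hugging Face path of this model), and since the base model is Qwen2.5-7B-Instruct, the standard Qwen chat template is assumed to apply.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository ID -- replace with the actual Hugging Face path of EDU-Qwen2.5-7B.
model_id = "DIRECT-BIT/EDU-Qwen2.5-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Illustrative educational prompt; the model is assumed to follow the Qwen2.5-Instruct chat format.
messages = [
    {"role": "system", "content": "You are a helpful teaching assistant."},
    {"role": "user", "content": "Explain the Pythagorean theorem to a middle-school student."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
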
## Training Data

To fully leverage the strengths of different response generation models across scenarios, we adopt a multi-source distillation pipeline.
For each task, we select the best-performing model on the test set as the response generator and use it to answer educational-domain questions, thereby constructing the training dataset for the distilled model.
Through this distillation pipeline, we obtain a training set of 17,000 samples covering the subtasks of all 9 educational scenarios.

More details are provided in Appendix K of our [paper](https://arxiv.org/abs/2505.16160).
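
As an illustration only, the per-task response generation described above can be sketched as follows; the task names, generator names, and helper functions are hypothetical placeholders and not the actual EduBench pipeline code.

```python
# Hypothetical mapping from each educational task to the model that performed
# best on that task's test set (illustrative names, not the actual selection).
best_generator_per_task = {
    "qa": "generator_model_for_qa",
    "error_correction": "generator_model_for_ec",
    "personalized_support": "generator_model_for_ps",
}

def generate_response(model_name: str, question: str) -> str:
    """Placeholder: query the selected generator model for an answer."""
    raise NotImplementedError("Call the chosen model's inference API here.")

def build_distillation_set(questions_by_task: dict[str, list[str]]) -> list[dict]:
    """Collect (instruction, response) pairs from the best generator for each task."""
    samples = []
    for task, questions in questions_by_task.items():
        generator = best_generator_per_task[task]
        for question in questions:
            samples.append({
                "task": task,
                "instruction": question,
                "response": generate_response(generator, question),
            })
    return samples
```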

## Performance
<div align="center">
  <img src="performance.png" alt="Performance" width="1200"/>
  <br>
</div>

## 🫣Citation
If you find our benchmark, evaluation pipeline, or models useful or interesting, please cite our paper.

```bibtex
@misc{xu2025edubenchcomprehensivebenchmarkingdataset,
      title={EduBench: A Comprehensive Benchmarking Dataset for Evaluating Large Language Models in Diverse Educational Scenarios}, 
      author={Bin Xu and Yu Bai and Huashan Sun and Yiguan Lin and Siming Liu and Xinyue Liang and Yaolin Li and Yang Gao and Heyan Huang},
      year={2025},
      eprint={2505.16160},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.16160}, 
}
```