EDU-Qwen2.5-7B is a distilled educational model built on Qwen2.5-7B-Instruct using training data constructed with EduBench.
- [paper](https://arxiv.org/abs/2505.16160)
- [github](https://github.com/DIRECT-BIT/EduBench)
## Model Details
- **Model Name**: EDU-Qwen2.5-7B
- **Model Type**: Distilled instruction-tuned language model (7B parameters)
- **Base Model**: [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)
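The card does not list a published repository path, so the following is only a minimal usage sketch with the standard `transformers` chat-template workflow used for Qwen2.5-style instruct models; the `model_id` is a hypothetical placeholder to replace with the actual Hugging Face repo ID.

```python
# Minimal usage sketch; the repo ID below is a hypothetical placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DIRECT-BIT/EDU-Qwen2.5-7B"  # hypothetical: replace with the actual repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful teaching assistant."},
    {"role": "user", "content": "Explain the Pythagorean theorem to a middle-school student."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```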
## Training Data
To fully leverage the strengths of different response generation models across various scenarios, we adopt a multi-source distillation pipeline.
For each task, we select the model that performs best on the test set as the response generator and use it to answer educational-domain questions, building the training data for the distilled model.
Through this distillation pipeline, we obtain a training set of 17,000 samples covering the subtasks across all 9 educational scenarios.
More details are provided in Appendix K of our [paper](https://arxiv.org/abs/2505.16160).
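To make the per-task selection step concrete, here is an illustrative Python sketch of how the multi-source distillation data could be assembled; the score table, model names, task names, and the `generate` callable are hypothetical placeholders, not the authors' actual pipeline code or results.

```python
# Illustrative sketch of multi-source distillation data construction.
# All model names, tasks, and scores below are hypothetical placeholders.
from typing import Callable

# Per-task test-set scores for each candidate response generator (hypothetical numbers).
test_scores = {
    "question_answering": {"teacher-a": 8.7, "teacher-b": 8.9, "teacher-c": 8.5},
    "error_correction":   {"teacher-a": 9.1, "teacher-b": 8.8, "teacher-c": 8.6},
    # ... one entry per subtask across the 9 educational scenarios
}

def build_distillation_set(questions_by_task: dict[str, list[str]],
                           generate: Callable[[str, str], str]) -> list[dict]:
    """For each task, pick the best-scoring teacher and use it to answer that task's questions."""
    samples = []
    for task, questions in questions_by_task.items():
        best_model = max(test_scores[task], key=test_scores[task].get)
        for question in questions:
            samples.append({
                "task": task,
                "instruction": question,
                "response": generate(best_model, question),  # call the selected teacher model
            })
    return samples  # ~17,000 samples in the paper's setting
```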
## Performance
<div align="center">
<img src="performance.png" alt="Performance" width="1200"/>
<br>
</div>
## 🫣Citation
If you find our benchmark, evaluation pipeline, or models useful or interesting, please cite our paper.
```bibtex
@misc{xu2025edubenchcomprehensivebenchmarkingdataset,
      title={EduBench: A Comprehensive Benchmarking Dataset for Evaluating Large Language Models in Diverse Educational Scenarios},
      author={Bin Xu and Yu Bai and Huashan Sun and Yiguan Lin and Siming Liu and Xinyue Liang and Yaolin Li and Yang Gao and Heyan Huang},
      year={2025},
      eprint={2505.16160},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.16160},
}
```