DreamW1ngs
/

AdaR-qwen-2.5-math-7b-9k

Question Answering


DreamW1ngs committed on Oct 11

Commit f8f33cf · verified · 1 Parent(s): 0775f0c

Update README.md

Files changed (1)

README.md +36 -1


@@ -9,4 +9,39 @@ metrics:
 base_model:
 - Qwen/Qwen2.5-Math-7B
 pipeline_tag: question-answering
----

+---
+# Making Mathematical Reasoning Adaptive
+<p align="center">
+  <a href="https://arxiv.org/abs/2510.04617"> 📃 Paper</a> |
+  <a href="https://github.com/NJUNLP/AdaR"> ⚙️ Code</a> |
+  <a href="https://huggingface.co/collections/DreamW1ngs/adar-68e648e59b2c9aec1208b5ef"> 🤖 Project</a> |
+  <a href="https://resume.laizj.fun/"> 📭 Contact</a>
+</p>
+---
+## 🌱 Overview
+Large Language Models (LLMs) have shown impressive reasoning capabilities, yet they often rely on **spurious reasoning**: producing answers from superficial features, which undermines robustness and generalization.
+We propose the **AdaR** framework to enable adaptive reasoning, in which models rely on problem-solving logic to produce answers. **AdaR** synthesizes logically equivalent queries by varying variable values, then trains models with Reinforcement Learning with Verifiable Rewards (RLVR) on this data to penalize spurious logic and encourage adaptive logic.
+The framework integrates *data synthesis* and *RLVR training* to enhance both **robustness (in-domain)** and **generalization (out-of-domain)**.
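
The data-synthesis idea described above can be illustrated with a minimal sketch. It is not the AdaR implementation: the query template, the `solve` function, the value ranges, and the `synthesize` helper are all hypothetical names chosen for illustration. The sketch only shows the general pattern of keeping the problem-solving logic fixed, varying the variable values, and obtaining each gold answer by executing that logic.

```python
import random

# Hypothetical query template; the placeholders are the perturbable variables.
TEMPLATE = (
    "A shop sells pencils at {price} dollars each. "
    "How much do {count} pencils cost?"
)


def solve(price: int, count: int) -> int:
    # Executable problem-solving logic shared by every variant (assumed form).
    return price * count


def synthesize(n: int, seed: int = 0) -> list:
    # Build n logically equivalent queries that differ only in variable values.
    rng = random.Random(seed)
    variants = []
    for _ in range(n):
        # Value ranges are arbitrary choices for this illustration.
        values = {"price": rng.randint(1, 20), "count": rng.randint(2, 50)}
        variants.append(
            {
                "query": TEMPLATE.format(**values),
                "values": values,
                # The gold answer comes from running the logic, not from a model.
                "answer": solve(**values),
            }
        )
    return variants


variants = synthesize(3)
for v in variants:
    print(v["query"], "->", v["answer"])
```

Because every variant shares one `solve` function, a model that has only memorized the answer to a single instance cannot answer the others correctly, which is what makes such data useful for separating spurious from adaptive reasoning.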
+![AdaR Process Framework](./figs/process.png)
+> **Figure 1.**
+> *Subfigure I:* Three reasoning modes — direct inference (black), spurious reasoning (red), adaptive reasoning (green).
+> *Subfigure II:* Logic-preserving variable perturbation and gold-answer generation via executable logic.
+> *Subfigure III:* RLVR optimization encouraging adaptive reasoning through comparative feedback.
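
As a rough illustration of how a verifiable reward can separate the two reasoning modes, the sketch below scores a response against the gold answer computed by executable logic. The binary reward, the `\boxed{...}` answer format, the tolerance, and the function names are assumptions made for this example; the README does not specify the reward used by AdaR.

```python
import re
from typing import Optional


def extract_final_answer(response: str) -> Optional[str]:
    # Return the content of the last \boxed{...} span, or None if absent.
    matches = re.findall(r"\\boxed\{([^{}]*)\}", response)
    return matches[-1].strip() if matches else None


def verifiable_reward(response: str, gold: float, tol: float = 1e-6) -> float:
    # Assumed binary outcome reward: 1.0 if the final answer matches gold.
    predicted = extract_final_answer(response)
    if predicted is None:
        return 0.0
    try:
        value = float(predicted.replace(",", ""))
    except ValueError:
        return 0.0
    return 1.0 if abs(value - gold) <= tol else 0.0


# A response that repeats the answer of the original query (3 * 12 = 36).
memorized = r"The answer is \boxed{36}."
print(verifiable_reward(memorized, gold=36.0))  # original query -> 1.0
print(verifiable_reward(memorized, gold=60.0))  # perturbed values (5 * 12) -> 0.0
```

Under this assumed reward, a response that reuses a memorized answer is rewarded on the original query but not on its perturbed variant, so only logic that adapts to the new values is reinforced across the whole group.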
+## 📈 Highlights
+- 🚀 **+8.5 Average Improvement** across in-domain robustness tasks and out-of-domain tasks.
+- 🧮 **Only 9K synthetic examples** needed for significant gains.
+- ⚖️ **Enables algebraic thinking** and improves stability under scaling.
+- 🔁 **Generalizable framework** applicable to instruct models.
+---