DreamW1ngs committed on
Commit
f8f33cf
·
verified ·
1 Parent(s): 0775f0c

Update README.md

Files changed (1)
  1. README.md +36 -1
README.md CHANGED
@@ -9,4 +9,39 @@ metrics:
  base_model:
  - Qwen/Qwen2.5-Math-7B
  pipeline_tag: question-answering
- ---
+ ---
+
+ # Making Mathematical Reasoning Adaptive
+
+ <p align="center">
+ <a href="https://arxiv.org/abs/2510.04617"> 📃 Paper</a> |
+ <a href="https://github.com/NJUNLP/AdaR"> ⚙️ Code</a> |
+ <a href="https://huggingface.co/collections/DreamW1ngs/adar-68e648e59b2c9aec1208b5ef"> 🤖 Project</a> |
+ <a href="https://resume.laizj.fun/"> 📭 Contact</a>
+ </p>
+
+ ---
+
+ ## 🌱 Overview
+
+ Large Language Models (LLMs) have shown impressive reasoning capabilities, yet they often rely on **spurious reasoning**: they produce answers from superficial features of a problem, which undermines both robustness and generalization.
+
+ We propose the **AdaR** framework to enable adaptive reasoning, in which models rely on problem-solving logic to produce answers. **AdaR** synthesizes logically equivalent queries by varying variable values, then trains models on this data with RLVR (reinforcement learning with verifiable rewards) to penalize spurious logic while encouraging adaptive logic.
+
+ The framework integrates *data synthesis* and *RLVR training* to enhance both **robustness (in-domain)** and **generalization (out-of-domain)**.
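The synthesis step described above can be illustrated with a minimal sketch. It assumes a problem can be written as a query template plus a small piece of executable logic; the template, `solve`, and `synthesize_variants` are hypothetical names chosen for illustration and are not taken from the AdaR code base.

```python
import random

# Hypothetical seed problem: a query template with named variables.
TEMPLATE = "A shop sells pens at {price} dollars each. How much do {count} pens cost?"


def solve(price: int, count: int) -> int:
    """Executable problem-solving logic, identical for every variant."""
    return price * count


def synthesize_variants(n: int, seed: int = 0) -> list[dict]:
    """Create n logically equivalent queries by resampling variable values.

    The gold answer is produced by running the logic, never by hand.
    """
    rng = random.Random(seed)
    variants = []
    for _ in range(n):
        values = {"price": rng.randint(2, 20), "count": rng.randint(2, 50)}
        variants.append(
            {
                "query": TEMPLATE.format(**values),
                "gold": solve(**values),
                "values": values,
            }
        )
    return variants


if __name__ == "__main__":
    for variant in synthesize_variants(3):
        print(variant["query"], "->", variant["gold"])
```

Because every variant shares the same `solve` logic, a model that has only memorized the surface form of one instance will not answer the others correctly, which is the property the training signal relies on.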
+
+ ![AdaR Process Framework](./figs/process.png)
+
+ > **Figure 1.**
+ > *Subfigure I:* Three reasoning modes: direct inference (black), spurious reasoning (red), and adaptive reasoning (green).
+ > *Subfigure II:* Logic-preserving variable perturbation and gold-answer generation via executable logic.
+ > *Subfigure III:* RLVR optimization encouraging adaptive reasoning through comparative feedback.
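As a rough illustration of why a verifiable reward can separate spurious from adaptive reasoning, the sketch below scores responses against gold answers for two variants of one problem. It is an assumption-laden toy, not the paper's reward: `extract_final_answer` and `verifiable_reward` are hypothetical helpers, and taking the last number in a response as the final answer is a simplification.

```python
import re


def extract_final_answer(response: str):
    """Return the last number in a response as a float, or None if absent."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", response)
    return float(matches[-1]) if matches else None


def verifiable_reward(response: str, gold: float, tol: float = 1e-6) -> float:
    """Binary reward: 1.0 if the extracted answer matches gold, else 0.0."""
    predicted = extract_final_answer(response)
    if predicted is None:
        return 0.0
    return 1.0 if abs(predicted - gold) <= tol else 0.0


if __name__ == "__main__":
    # Two variants that share the same logic (price * count).
    variants = [
        {"query": "Pens cost 3 dollars each. How much do 4 pens cost?", "gold": 12.0},
        {"query": "Pens cost 5 dollars each. How much do 7 pens cost?", "gold": 35.0},
    ]
    memorized = ["The answer is 12.", "The answer is 12."]  # ignores the new values
    adaptive = ["3 * 4 = 12", "5 * 7 = 35"]  # re-applies the logic
    for name, responses in [("memorized", memorized), ("adaptive", adaptive)]:
        rewards = [verifiable_reward(r, v["gold"]) for r, v in zip(responses, variants)]
        print(name, rewards)
```

Under this toy setup, only the response that re-applies the logic is rewarded on every variant, while the memorized answer is rewarded on the original instance alone.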
+
+ ## 📈 Highlights
+
+ - 🚀 **+8.5 average improvement** across in-domain robustness tasks and out-of-domain tasks.
+ - 🧮 **Only 9K synthetic data points** needed for significant gains.
+ - ⚖️ **Enables algebraic thinking** and improves stability under scaling.
+ - 🔁 **Generalizable framework** applicable to instruct models.
+
+ ---