Update pipeline tag to `graph-ml` and add HF paper link
This PR improves the model card for `PKU-ML/G1-CoT-SFT-7B` by:
* Updating the `pipeline_tag` to `graph-ml` for more precise categorization and discoverability on the Hugging Face Hub (https://huggingface.co/models?pipeline_tag=graph-ml).
* Adding a direct link to the official Hugging Face paper page for easy access to the associated research.
Please review these changes.
README.md
CHANGED

````diff
@@ -1,48 +1,49 @@
 ---
+base_model:
+- Qwen/Qwen2.5-7B-Instruct
 datasets:
 - PKU-ML/Erdos-CoT
 language:
 - en
+library_name: transformers
+license: apache-2.0
 metrics:
 - accuracy
-base_model:
-- Qwen/Qwen2.5-7B-Instruct
-pipeline_tag: text-generation
+pipeline_tag: graph-ml
 tags:
 - graph
 - chat
-library_name: transformers
 ---
 
 # G1-CoT-SFT-7B
 
+This model is part of the G1 series of large language models for graph reasoning tasks, as presented in the paper [G1: Teaching LLMs to Reason on Graphs with Reinforcement Learning](https://huggingface.co/papers/2505.18499).
+
 ## Introduction
 
 G1 is the series of large language models trained on our benchmark [Erdos](https://huggingface.co/datasets/PKU-ML/Erdos) for solving graph reasoning tasks, based on Qwen2.5-Instruct.
+We apply Group Relative Policy Optimization (GRPO) for reinforcement learning with supervised finetuning as a preliminary step.
 
 G1 brings the following improvements:
 
+- **Significant improvement on graph reasoning**: G1 models achieve up to 46% improvement over baselines on Erdős, with the 7B variant matching OpenAI’s o3-mini and the 3B model surpassing Qwen2.5-72B-Instruct by notable margins.
+- **Strong Generalization to unseen graph tasks**: G1 exhibits zero-shot generalization on unseen graph tasks, improving performance on *other graph reasoning benchmarks* (GraphWiz, GraphArena) and *real-world graphs* (Cora, PubMed).
+- **NO Compromise on general reasoning**: Crucially, G1 preserves general reasoning ability (GSM8K, MATH, MMLU-Pro), proving its versatility.
 
 
 **This repo contains the G1-CoT-SFT-7B model**, which has the following features:
+- Type: Causal Language Models
+- Training Stage: SFT
+- Architecture: the same as Qwen2.5-Instruct
+- Number of Parameters: 7.62B
+- Context Length: Full 32,768 tokens and generation 8,192 tokens
 
 For more details, please refer to our [paper](https://arxiv.org/pdf/2505.18499) and [GitHub](https://github.com/PKU-ML/G1/tree/main).
 
 
 ## Requirements
 
+The model is trained based on Qwen/Qwen2.5-7B-Instruct. The code of Qwen2.5 has been in the latest Hugging Face `transformers` and we advise you to use the latest version of `transformers`.
 
 With `transformers<4.37.0`, you will encounter the following error:
 ```
@@ -72,10 +73,18 @@ model = AutoModelForCausalLM.from_pretrained(
 )
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 
-prompt = "The task is to determine the degree centrality of a node in the graph
-…
-"
+prompt = "The task is to determine the degree centrality of a node in the graph.\n"\
+    "Degree centrality for a node is the fraction of nodes it is connected to.\n"\
+    "Here is an undirected graph containing nodes from 1 to 15. The edges are: (1, 15), (15, 11), (2, 3), (2, 6), (3, 6), (3, 7), (6, 7), (6, 8), (7, 8), (7, 14), (4, 10), (10, 5), (10, 12), (8, 14), (8, 9), (12, 11), (12, 13).\n"\
+    "Question: What is the degree centrality of node 2 in the graph?\n"\
     "You need to format your answer as a float number."
 messages = [
     {"role": "user", "content": INSTRUCTION_TEMPLATE.format(instruction=prompt)}
````