---
library_name: rkllm
pipeline_tag: text-generation
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-Math-7B-Instruct
tags:
- rkllm
- rk3588
- rockchip
- edge-ai
- llm
- math
- chat
---

# Qwen2.5-Math-7B-Instruct — RKLLM build for RK3588 boards

**Author:** @jamescallander
**Source model:** [Qwen/Qwen2.5-Math-7B-Instruct · Hugging Face](https://huggingface.co/Qwen/Qwen2.5-Math-7B-Instruct)

> This repository hosts a **conversion** of `Qwen2.5-Math-7B-Instruct` for use on Rockchip RK3588 single-board computers (Orange Pi 5 Plus, Radxa Rock 5B+, Banana Pi M7, etc.). The conversion was performed with the [RKNN-LLM toolkit](https://github.com/airockchip/rknn-llm).

#### Conversion details

- RKLLM-Toolkit version: v1.2.1
- NPU driver: v0.9.8
- Python: 3.12
- Quantization: `w8a8_g128`
- Output: single-file `.rkllm` artifact
- Tokenizer: not required at runtime (UI handles prompt I/O)
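
For reference, a conversion like this one generally follows the RKLLM-Toolkit Python flow sketched below. This is a minimal sketch, not the exact script used for this build; the argument values are assumptions inferred from the details above, so consult the toolkit documentation for your version.

```python
# Minimal RKLLM-Toolkit conversion sketch (assumed flow, not the exact
# script used for this build). Runs on an x86 host with rkllm-toolkit
# v1.2.1 installed; paths and arguments are illustrative.
from rkllm.api import RKLLM

llm = RKLLM()

# Load the source Hugging Face model (local path or hub ID).
if llm.load_huggingface(model="Qwen/Qwen2.5-Math-7B-Instruct") != 0:
    raise RuntimeError("model load failed")

# Quantize for the RK3588 NPU; 'w8a8_g128' matches this repository's build.
if llm.build(do_quantization=True,
             quantized_dtype="w8a8_g128",
             target_platform="rk3588") != 0:
    raise RuntimeError("build failed")

# Export the single-file .rkllm artifact.
if llm.export_rkllm("./Qwen2.5-Math-7B-Instruct_w8a8_g128_rk3588.rkllm") != 0:
    raise RuntimeError("export failed")
```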

## ⚠️ Math reasoning disclaimer

🛑 **This model may make calculation or reasoning errors.**

- It is intended for **educational and experimental purposes only**.
- Always **double-check results** with trusted methods, calculators, or domain experts.
- Outputs should not be used as the sole basis for academic, financial, or scientific decisions.
- Use responsibly and verify correctness before relying on results.

## Intended use

- On-device math reasoning and step-by-step problem solving.
- Qwen2.5-Math-7B-Instruct is tuned for **mathematics and quantitative reasoning tasks** (problem solving, proofs, step-by-step derivations).

## Limitations

- Requires roughly 8 GB of free memory to load the model.
- The quantized build (`w8a8_g128`) may show small quality differences vs. the full-precision upstream model.
- Tested on a Radxa Rock 5B+; other devices may require different driver/toolkit versions.
- Generated output should always be reviewed before use in production systems.
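
Before starting the server, you can confirm that enough memory is available on the board:

```bash
free -h   # look for roughly 8 GB available before loading the model
```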

## Quick start (RK3588)

### 1) Install runtime

The RKNN-LLM toolkit and installation instructions can be found on your development board manufacturer's website or on [airockchip's GitHub page](https://github.com/airockchip).

Download and install the required packages as per the toolkit's instructions.
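
For example, fetching the toolkit repository also gives you the server demo used in the next step (the layout may vary between toolkit versions):

```bash
git clone https://github.com/airockchip/rknn-llm.git
ls rknn-llm/examples/rkllm_server_demo   # contains flask_server.py
```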

### 2) Simple Flask server deployment

The simplest way to deploy the converted `.rkllm` model is with the example script provided in the toolkit at `rknn-llm/examples/rkllm_server_demo`:

```bash
python3 <TOOLKIT_PATH>/rknn-llm/examples/rkllm_server_demo/flask_server.py \
    --rkllm_model_path <MODEL_PATH>/Qwen2.5-Math-7B-Instruct_w8a8_g128_rk3588.rkllm \
    --target_platform rk3588
```

### 3) Sending a request

The basic format of a chat request is:

```json
{
  "model": "Qwen2.5-Math-7B-Instruct",
  "messages": [
    {"role": "user", "content": "<YOUR_PROMPT_HERE>"}
  ],
  "stream": false
}
```

Example request using `curl`:

```bash
curl -s -X POST <SERVER_IP_ADDRESS>:8080/rkllm_chat \
    -H 'Content-Type: application/json' \
    -d '{"model":"Qwen2.5-Math-7B-Instruct","messages":[{"role":"user","content":"How is sample standard deviation calculated?"}],"stream":false}'
```
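
The same request can be sent programmatically. This is a minimal sketch using Python's `requests` library; the server address is a placeholder:

```python
# Minimal Python client for the demo server's rkllm_chat endpoint
# (sketch; replace the address with your board's IP).
import requests

url = "http://<SERVER_IP_ADDRESS>:8080/rkllm_chat"
payload = {
    "model": "Qwen2.5-Math-7B-Instruct",
    "messages": [{"role": "user",
                  "content": "How is sample standard deviation calculated?"}],
    "stream": False,
}

resp = requests.post(url, json=payload, timeout=300)
resp.raise_for_status()

# The reply text lives in choices[0].message.content (see the schema below).
print(resp.json()["choices"][0]["message"]["content"])
```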

The response is formatted in the following way:

```json
{
  "choices": [{
    "finish_reason": "stop",
    "index": 0,
    "logprobs": null,
    "message": {
      "content": "<MODEL_REPLY_HERE>",
      "role": "assistant"}}],
  "created": null,
  "id": "rkllm_chat",
  "object": "rkllm_chat",
  "usage": {
    "completion_tokens": null,
    "prompt_tokens": null,
    "total_tokens": null}
}
```

Example response:

```json
{"choices":[{"finish_reason":"stop","index":0,"logprobs":null,"message":{"content":"To calculate the sample standard deviation, follow these steps: 1. **Calculate the mean (average) of the sample:** \[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \] where \( x_i \) are the individual data points and \( n \) is the number of data points. 2. **Calculate the squared differences from the mean for each data point:** \[ (x_i - \bar{x})^2 \] 3. **Sum the squared differences:** \[ \sum_{i=1}^{n} (x_i - \bar{x})^2 \] 4. **Divide the sum of the squared differences by \( n-1 \) (this is called the Bessel's correction):** \[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} \] where \( s^2 \) is the sample variance. 5. **Take the square root of the sample variance to get the sample standard deviation:** \[ s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} \] So, the formula for the sample standard deviation is: \[ \boxed{s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}} \]","role":"assistant"}}],"created":null,"id":"rkllm_chat","object":"rkllm_chat","usage":{"completion_tokens":null,"prompt_tokens":null,"total_tokens":null}}
```
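
As the disclaimer above advises, it is worth cross-checking the model's answers with a trusted tool. For instance, the sample standard deviation formula in the reply can be verified against Python's standard library (the data values below are just an illustration):

```python
# Cross-check the model's formula against Python's statistics module,
# which also uses the n-1 (Bessel-corrected) denominator for samples.
import math
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Formula from the model's reply: s = sqrt(sum((x - mean)^2) / (n - 1))
mean = sum(data) / len(data)
s_manual = math.sqrt(sum((x - mean) ** 2 for x in data) / (len(data) - 1))

assert math.isclose(s_manual, statistics.stdev(data))
print(s_manual)  # ~2.138 for this sample
```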

### 4) UI compatibility

This server exposes an **OpenAI-compatible Chat Completions API**.

You can connect it to any OpenAI-compatible client or UI (for example, [Open WebUI](https://github.com/open-webui/open-webui)).

- Configure your client with the API base `http://<SERVER_IP_ADDRESS>:8080` and use the endpoint `/rkllm_chat`.
- Make sure the `model` field matches the converted model's name, for example:

```json
{
  "model": "Qwen2.5-Math-7B-Instruct",
  "messages": [{"role":"user","content":"Hello!"}],
  "stream": false
}
```
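
The request schema also includes a `"stream"` field, so streaming output should be possible by setting it to `true`. The exact chunk format depends on the toolkit version, so treat this as an untested sketch:

```bash
# -N disables curl's output buffering so chunks print as they arrive
curl -s -N -X POST <SERVER_IP_ADDRESS>:8080/rkllm_chat \
    -H 'Content-Type: application/json' \
    -d '{"model":"Qwen2.5-Math-7B-Instruct","messages":[{"role":"user","content":"What is 17*23?"}],"stream":true}'
```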

# License

This conversion follows the license of the source model: [LICENSE · Qwen/Qwen2.5-Math-7B-Instruct at main](https://huggingface.co/Qwen/Qwen2.5-Math-7B-Instruct/blob/main/LICENSE)