Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


THaLLE-0.1-7B-fa - GGUF
- Model creator: https://huggingface.co/KBTG-Labs/
- Original model: https://huggingface.co/KBTG-Labs/THaLLE-0.1-7B-fa/

| Name | Quant method | Size |
| ---- | ---- | ---- |
| [THaLLE-0.1-7B-fa.Q2_K.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q2_K.gguf) | Q2_K | 2.81GB |
| [THaLLE-0.1-7B-fa.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.IQ3_XS.gguf) | IQ3_XS | 3.12GB |
| [THaLLE-0.1-7B-fa.IQ3_S.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.IQ3_S.gguf) | IQ3_S | 3.26GB |
| [THaLLE-0.1-7B-fa.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q3_K_S.gguf) | Q3_K_S | 3.25GB |
| [THaLLE-0.1-7B-fa.IQ3_M.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.IQ3_M.gguf) | IQ3_M | 3.33GB |
| [THaLLE-0.1-7B-fa.Q3_K.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q3_K.gguf) | Q3_K | 3.55GB |
| [THaLLE-0.1-7B-fa.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q3_K_M.gguf) | Q3_K_M | 3.55GB |
| [THaLLE-0.1-7B-fa.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q3_K_L.gguf) | Q3_K_L | 3.81GB |
| [THaLLE-0.1-7B-fa.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.IQ4_XS.gguf) | IQ4_XS | 3.96GB |
| [THaLLE-0.1-7B-fa.Q4_0.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q4_0.gguf) | Q4_0 | 4.13GB |
| [THaLLE-0.1-7B-fa.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.IQ4_NL.gguf) | IQ4_NL | 4.16GB |
| [THaLLE-0.1-7B-fa.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q4_K_S.gguf) | Q4_K_S | 4.15GB |
| [THaLLE-0.1-7B-fa.Q4_K.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q4_K.gguf) | Q4_K | 4.36GB |
| [THaLLE-0.1-7B-fa.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q4_K_M.gguf) | Q4_K_M | 4.36GB |
| [THaLLE-0.1-7B-fa.Q4_1.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q4_1.gguf) | Q4_1 | 4.54GB |
| [THaLLE-0.1-7B-fa.Q5_0.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q5_0.gguf) | Q5_0 | 4.95GB |
| [THaLLE-0.1-7B-fa.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q5_K_S.gguf) | Q5_K_S | 4.95GB |
| [THaLLE-0.1-7B-fa.Q5_K.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q5_K.gguf) | Q5_K | 5.07GB |
| [THaLLE-0.1-7B-fa.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q5_K_M.gguf) | Q5_K_M | 5.07GB |
| [THaLLE-0.1-7B-fa.Q5_1.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q5_1.gguf) | Q5_1 | 5.36GB |
| [THaLLE-0.1-7B-fa.Q6_K.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q6_K.gguf) | Q6_K | 5.82GB |
| [THaLLE-0.1-7B-fa.Q8_0.gguf](https://huggingface.co/RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf/blob/main/THaLLE-0.1-7B-fa.Q8_0.gguf) | Q8_0 | 7.54GB |
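
These are standard GGUF files, so any llama.cpp-based runtime should be able to load them. Below is a minimal sketch of fetching one quant and running a chat completion, assuming `huggingface_hub` and `llama-cpp-python` are installed; the Q4_K_M pick and the prompt are arbitrary illustrations, not part of the original card:

```python
# Hedged sketch: download one quant and run it with llama-cpp-python.
# Assumes `pip install huggingface_hub llama-cpp-python`; Q4_K_M is an arbitrary choice.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Resolve (and cache) the GGUF file locally.
model_path = hf_hub_download(
    repo_id="RichardErkhov/KBTG-Labs_-_THaLLE-0.1-7B-fa-gguf",
    filename="THaLLE-0.1-7B-fa.Q4_K_M.gguf",
)

llm = Llama(model_path=model_path, n_ctx=4096)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What does CFA stand for?"}],  # illustrative prompt
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```
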
Original model description:
---
license: apache-2.0
pipeline_tag: text-generation
language:
- en
tags:
- finance
---
# THaLLE: Text Hyperlocally Augmented Large Language Extension

**❗NOTICE❗**: `KBTG-Labs/THaLLE-0.1-7B-fa` is a work-in-progress model checkpoint, distributed so that the results in our [Technical Report](https://arxiv.org/abs/2406.07505) can be reproduced.

## Training details

This model is [Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct) fine-tuned with LoRA on our internal CFA Mock Exam 2009-2019, which contains 9,426 questions.

### Vocab Config Patching

Prior to training, we patched the `bos_token` field of Qwen/Qwen2-7B-Instruct's `tokenizer_config.json` from `null` to the start token `"<|im_start|>"`.

```json
{
  ...
  "bos_token": "<|im_start|>"
  ...
}
```
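
If you want to apply the same patch yourself, here is a minimal sketch using only the standard library; the config path is illustrative, so point it at your local snapshot of the tokenizer:

```python
# Hedged sketch: set bos_token in a local copy of tokenizer_config.json.
# The path is illustrative; adjust it to your own snapshot of Qwen/Qwen2-7B-Instruct.
import json
from pathlib import Path

config_path = Path("Qwen2-7B-Instruct/tokenizer_config.json")
config = json.loads(config_path.read_text(encoding="utf-8"))
config["bos_token"] = "<|im_start|>"  # was null
config_path.write_text(json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8")
```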

## Results

For more details see our [Technical Report](https://arxiv.org/abs/2406.07505).

| Model | Internal 2020 | Internal 2024 | Flare CFA* |
| --------------------------------------- | ------------- | ------------- | ---------- |
| APIs | | | |
| `gpt-3.5-turbo-0125` | 0.5458 | 0.5027 | 0.6366 |
| `gemini-1.5-flash-001` | 0.6271 | 0.6278 | 0.7355 |
| `gemini-1.5-pro-001` | 0.6780 | 0.6444 | 0.7829 |
| `gpt-4o-2024-05-13` | **0.8000** | **0.8055** | **0.8789** |
| HF models | | | |
| `"meta-llama/Llama-2-7b-chat-hf"` | 0.3774 | 0.3639 | 0.4264 |
| `"google/gemma-7b-it"` | 0.5107 | 0.5333 | 0.6027 |
| `"meta-llama/Meta-Llama-3-8B-Instruct"` | 0.5424 | 0.5222 | 0.6386 |
| `"Qwen/Qwen2-7B-Instruct"` | 0.5740 | 0.5583 | 0.6831 |
| `"KBTG-Labs/THaLLE-0.1-7B-fa"` | **0.6678** | **0.6500** | **0.7171** |

[*] Flare CFA is `"ChanceFocus/flare-cfa"`

## Usage

### Requirements

Since `KBTG-Labs/THaLLE-0.1-7B-fa` is a fine-tune of Qwen2-7B-Instruct, you will need to install `transformers>=4.37.0`.
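
As a quick smoke test before running the full benchmark below, a minimal generation sketch following the standard Qwen2 chat workflow; the prompt is illustrative, not from the original card:

```python
# Minimal smoke test: one chat turn with the fine-tuned checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("KBTG-Labs/THaLLE-0.1-7B-fa")
model = AutoModelForCausalLM.from_pretrained(
    "KBTG-Labs/THaLLE-0.1-7B-fa", torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "In one sentence, what is duration in fixed income?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=64, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```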

### Reproducing results

Running the script below should give you this output:

```
Progress: 1032/1032 | Correct: 740 (71.71%)
```

```python
import re
from typing import Literal, Optional

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID: str = "KBTG-Labs/THaLLE-0.1-7B-fa"
SYSTEM_PROMPT: str = """You are a CFA (chartered financial analyst) taking a test to evaluate your knowledge of finance. You will be given a question along with three possible answers (A, B, and C).
Indicate the correct answer (A, B, or C)."""
QUESTION_TEMPLATE: str = """Question:
{question}
A. {choice_a}
B. {choice_b}
C. {choice_c}"""


def format_flare_cfa(text: str) -> dict[str, str]:
    """Split a flare-cfa example into its question and three choices."""
    text = re.sub(r"\s+", " ", text)

    pattern = r"Q:\s*(.*?),\s*CHOICES:\s*A:\s*(.*?),\s*B:\s*(.*?),\s*C:\s*(.*)"
    match = re.search(pattern, text)
    if match:
        question, choice_a, choice_b, choice_c = match.groups()
        return {
            "question": question.strip(),
            "choice_a": choice_a.strip(),
            "choice_b": choice_b.strip(),
            "choice_c": choice_c.strip(),
        }
    else:
        raise ValueError("Input text does not match the expected format.")


def load_benchmark_dataset() -> list[dict[str, str]]:
    dataset = load_dataset("ChanceFocus/flare-cfa")["test"]
    prepared_dataset = []
    for d in dataset:
        entry = format_flare_cfa(d["text"])
        entry["answer"] = str(d["answer"]).upper()
        prepared_dataset.append(entry)
    return prepared_dataset


def extract_choice(
    response_text: str, choice_a: str, choice_b: str, choice_c: str
) -> Optional[Literal["A", "B", "C"]]:
    """Heuristically pull the chosen letter out of the model's free-form answer."""

    def clean(text: str) -> str:
        return text.replace("–", "-").strip().replace("\n", "")

    # Phrases such as "The correct answer is A" or "Answer: B".
    find_choice = re.findall(
        r"([T|t]he correct answer is[.|:]? [ABC]|[A|a]nswer[.|:]?[is]?\W+?\n?[ABC]\s)",
        response_text,
    )

    if find_choice:
        return clean(find_choice[0])[-1]

    # A bare single-letter reply.
    if len(response_text) == 1 and response_text in "ABC":
        return response_text

    # A letter followed by a period, e.g. "B. ...".
    find_choice = re.findall(r"[ABC][.]\s?", response_text)
    if find_choice:
        return find_choice[0][0]

    # Fall back to matching the full text of one of the choices.
    choice = {"A": choice_a, "B": choice_b, "C": choice_c}

    for ch, content in choice.items():
        if clean(content) in clean(response_text):
            return ch

    return None


def inference(messages: list[dict[str, str]], model, tokenizer) -> str:
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

    # Greedy decoding; the sampling parameters are set to None to silence warnings.
    generated_ids = model.generate(
        model_inputs.input_ids,
        max_new_tokens=768,
        do_sample=False,
        temperature=None,
        top_p=None,
        top_k=None,
    )
    # Strip the prompt tokens, keeping only the newly generated completion.
    generated_ids = [
        output_ids[len(input_ids) :]
        for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return response


def run_benchmark(dataset: list[dict[str, str]], model, tokenizer):
    total_correct = 0

    for i, problem in enumerate(dataset, start=1):
        messages = [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": QUESTION_TEMPLATE.format(**problem)},
        ]
        output_text = inference(messages, model, tokenizer)
        prediction = extract_choice(
            output_text,
            problem["choice_a"],
            problem["choice_b"],
            problem["choice_c"],
        )

        correct = problem["answer"] == prediction
        total_correct += correct
        percent = total_correct / i * 100

        print(
            f"Progress: {i}/{len(dataset)} | Correct: {total_correct} ({percent:.2f}%)",
            end="\r",
        )


if __name__ == "__main__":
    dataset = load_benchmark_dataset()
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )

    run_benchmark(dataset, model, tokenizer)
```

## Citation

If you find our work useful, please cite:

```bibtex
@misc{labs2024thalle,
      title={THaLLE: Text Hyperlocally Augmented Large Language Extension -- Technical Report},
      author={KBTG Labs and Danupat Khamnuansin and Atthakorn Petchsod and Anuruth Lertpiya and Pornchanan Balee and Thanawat Lodkaew and Tawunrat Chalothorn and Thadpong Pongthawornkamol and Monchai Lertsutthiwong},
      year={2024},
      eprint={2406.07505},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```