AlekseyCalvin committed 37d0949 (verified, parent c9eadea): Upload README.md with huggingface_hub
---
library_name: transformers
tags:
- reasoning
license: apache-2.0
datasets:
- attn-signs/gromov-0
language:
- ru
base_model:
- yandex/YandexGPT-5-Lite-8B-pretrain
---
# GPT Reasoner (Base model)

- [EN]
Reasoning model adapted for Russian text generation.
**Based on YandexGPT-5-Lite-8B-pretrain.**
- [RU]
Модель рассуждений, адаптированная для генерации русскоязычного текста.
**Построена на YandexGPT-5-Lite-8B-pretrain.**

## Model Details / Детализация модели
- [EN]
**Cold-start SFT version** that invokes general reasoning capabilities via a specific system prompt.
This model **IS ONLY USED** as a starting point for further GRPO optimization; at this iteration it cannot yet generate coherent Russian text.
- [RU]
**Версия cold-start SFT обучения** для возможностей размышления и глубокого понимания запроса.
Эта модель **ИСПОЛЬЗУЕТСЯ ТОЛЬКО** для дальнейших стадий обучения с GRPO.
Модель не может генерировать когерентный текст русского языка на этой итерации.


### Model Description / Описание модели

- **Developed by:** Reisen Raumberg (Attention Signs team)
- **Language(s) (NLP):** RU/EN
- **SFT from model:** YandexGPT-5-Lite-8B-pretrain

Training utilized the HuggingFace Accelerator.
**GPU hours**: ~3h on an NVIDIA A100

Для обучения использовался HuggingFace Accelerator.
**GPU часы**: ~3 часа NVIDIA A100

### Training Framework
**GPTR was trained using the MyLLM framework (by Attention Signs):**
[MyLLM](https://github.com/Raumberg/myllm)

### Model configuration (MyLLM Framework)
Full SFT finetuning:
```toml
[model]
model_name_or_path = "yandex/YandexGPT-5-Lite-8B-pretrain"

[datasets]
dataset = "attn-signs/gromov-0"
conversation_field = "conversation"
generate_eval_examples = false
evaluation_strategy = "steps"
eval_steps = 100
dataloader_num_workers = 2
remove_unused_columns = true
test_size = 0.05

[run]
save_strategy = "steps"
save_steps = 300
save_total_limit = 3
run_name = "sft-gptr-8-run2"
report_to = "wandb"
logging_first_step = true
logging_steps = 1
output_dir = "models/attn-signs-gptr-8-run2"
project_name = "sft-gptr"

[training]
train_only_on_completions = true
per_device_train_batch_size = 1
per_device_eval_batch_size = 1
num_train_epochs = 3
learning_rate = 0.000009
max_seq_length = 8192
gradient_accumulation_steps = 8
gradient_checkpointing = true
warmup_steps = 10
bf16 = true
seed = 42
use_peft = false

[fusion]
attn_implementation = "flash_attention_2"

[tokenizer]
assistant_message_template = "<s>assistant\n"
eos_token = "</s>"
pad_token = "<unk>"
chat_template = "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<s>' + message['role'] + '\n' + message['content'] + '</s>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<s>assistant\n' }}{% endif %}"
force_chat_template = true
added_special_tokens = [
    "<think>",
    "</think>"
]
system_prompt = """
[MODE: Reflection]
"""
```
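The `chat_template` above wraps each message as `<s>role\ncontent</s>` and optionally appends an open assistant turn. As a plain-Python sketch of that Jinja logic (for illustration only, not part of the MyLLM config):

```python
def render_chat(messages, add_generation_prompt=False):
    """Render messages the way the chat_template above does:
    each turn becomes '<s>role\ncontent</s>\n', and an open
    '<s>assistant\n' turn is appended when generation is requested."""
    text = ""
    for m in messages:
        text += "<s>" + m["role"] + "\n" + m["content"] + "</s>" + "\n"
    if add_generation_prompt:
        text += "<s>assistant\n"
    return text
```

This makes it easy to see why `assistant_message_template = "<s>assistant\n"` is the marker used for completion-only loss masking (`train_only_on_completions = true`): it is exactly the prefix of every assistant turn.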

### Using the model / Как запустить?

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = 'attn-signs/GPTR-8-base'

model = AutoModelForCausalLM.from_pretrained(repo)
tokenizer = AutoTokenizer.from_pretrained(repo)

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model.to(device)

user_prompt = '''
У уравнений x**2 + 2019ax + b = 0 и x**2 + 2019bx + a = 0 есть один общий корень. Чему может быть равен этот корень, если известно, что a != b?
'''
system_prompt = "[MODE: Reflection]"
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=4096
)
# Strip the prompt tokens, keeping only the newly generated completion
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(response)
```
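Since the tokenizer registers `<think>` and `</think>` as special tokens, the model's reasoning trace can be separated from its final answer. A hypothetical helper for that (note it requires decoding with `skip_special_tokens=False`, unlike the example above, so the tags survive) might look like:

```python
def split_reasoning(response: str):
    """Split a generation into (reasoning, answer) using the
    <think>...</think> special tokens from the tokenizer config.
    Returns ("", response) when no complete think block is present."""
    start, end = "<think>", "</think>"
    if start in response and end in response:
        before, _, rest = response.partition(start)
        reasoning, _, answer = rest.partition(end)
        return reasoning.strip(), (before + answer).strip()
    return "", response.strip()
```

The helper name and exact output layout are assumptions for illustration; adapt it to however the downstream GRPO stage expects reasoning to be extracted.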