Update README.md
README.md
CHANGED
@@ -52,7 +52,7 @@ datasets:
 ---
 # Granite-3.1-Earthen-v0.3-1B-A400M-QLoRA
 
-[`ibm-granite/granite-3.1-1b-a400m-instruct`](https://huggingface.co/ibm-granite/granite-3.1-1b-a400m-instruct) was trained at 8K with batch size 4 gradient accumulation 4, so each step was 131,072 tokens (including any padding tokens). It was trained for
+[`ibm-granite/granite-3.1-1b-a400m-instruct`](https://huggingface.co/ibm-granite/granite-3.1-1b-a400m-instruct) was trained at 8K with batch size 4 gradient accumulation 4, so each step was 131,072 tokens (including any padding tokens). It was trained for 560 steps, adding up to a total of 73,400,320 unique tokens seen.
 
 This is a small test run. A larger version is planned.
 
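
The token counts quoted in the changed README line can be checked with a few lines of arithmetic. This is a minimal sketch, assuming that "8K" means a sequence length of 8,192 tokens and that every sequence is filled to that length (the README notes padding is included). The variable names are illustrative and do not come from the actual training config.

```python
# Assumption: "8K" is read as 8 * 1024 = 8,192 tokens per sequence.
SEQ_LEN = 8 * 1024
BATCH_SIZE = 4          # sequences per micro-batch, as stated in the README
GRAD_ACCUM_STEPS = 4    # micro-batches accumulated per optimizer step
TRAIN_STEPS = 560       # optimizer steps, as stated in the README

# Tokens processed per optimizer step, padding included.
tokens_per_step = SEQ_LEN * BATCH_SIZE * GRAD_ACCUM_STEPS

# Tokens processed over the whole run.
total_tokens = tokens_per_step * TRAIN_STEPS

print(f"{tokens_per_step:,} tokens per step")
print(f"{total_tokens:,} tokens in total")
```

Under these assumptions the two printed values are 131,072 and 73,400,320, which agree with the figures in the README line.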