k050506koch/GPT4-dev-177M-1511

k050506koch committed 24 days ago

Commit 0668470 · verified · 1 Parent(s): 9b5b518

Include Evals information

README.md +3 -0

@@ -25,6 +25,9 @@ All core training and inference code lives in this repository (see `train.py`, `
 - **Objective:** Next-token prediction on web text (causal language modeling).
 - **Use cases:** General text generation, experimentation, and as a base for future instruction-tuned models.
 - **Status:** Undertrained research checkpoint – expect rough edges and occasional incoherence. I didn't stop training so more checkpoints will be published in the future.
+- **Evals:** 29.03% on MMLU
+More details are in `EVALS.md`.
 I plan to continue training and to release instruction-tuned variants based on this model in the future.
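
For context on the eval figure: MMLU questions are four-option multiple choice, so uniform random guessing scores about 25%. The sketch below is only an illustration of how such an accuracy is computed and compared against chance. The helper `multiple_choice_accuracy` and the toy predictions are hypothetical and are not taken from this repository or from `EVALS.md`; only the 29.03% figure comes from the README.

```python
# Hypothetical helper: not taken from this repository's code or EVALS.md.
def multiple_choice_accuracy(predictions, answers):
    """Return the fraction of predicted option letters that match the gold answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("need at least one question")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


NUM_OPTIONS = 4                    # MMLU questions offer four options (A-D)
CHANCE_ACCURACY = 1 / NUM_OPTIONS  # 0.25 expected from uniform random guessing

reported = 0.2903                  # the MMLU score stated in the README
margin_over_chance = reported - CHANCE_ACCURACY

# Toy illustration with made-up predictions (not real model outputs).
toy_predictions = ["A", "C", "B", "D"]
toy_answers = ["A", "B", "B", "C"]
toy_accuracy = multiple_choice_accuracy(toy_predictions, toy_answers)

print(f"toy accuracy: {toy_accuracy:.2f}")
print(f"reported margin over chance: {margin_over_chance * 100:.2f} points")
```

Read this way, the reported score sits roughly four points above the random baseline, which is consistent with the "undertrained research checkpoint" status noted above.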