---
license: apache-2.0
datasets:
- N8Programs/CreativeGPT
base_model:
- Qwen/Qwen3-14B
---


# VellumMini-0.1-Qwen3-14B
Just a sneak peek of what I'm cooking in a little project called Vellum. This model was made to evaluate the quality of the CreativeGPT dataset, and how well Qwen3 trains on it. This is just one of many datasets that the final model will be trained on (which will also be using a different base model). 

This got pretty good results compared to the regular instruct in my testing so thought I would share. I trained for 3 epochs, but both checkpoints at 2 epoch and 3 epoch were too overbaked. This checkpoint, at 1 epoch performed best. 

I'm pretty surprised how decent this came out since Qwen models aren't that great at writing, especially at this size.

### Usage

Use with thinking/chain-of-thought disabled. Use ChatML prompt format.

Qwen suggested sampler settings are recommended. 

Temperature: 0.7

Top_P: 0.8

Top_K: 20

Min_P: 0

## Quants

### GGUFs

#### iMatrix

These are reccommended.

- bartowski - https://huggingface.co/bartowski/lemon07r_VellumMini-0.1-Qwen3-14B-GGUF
- mradermacher - https://huggingface.co/mradermacher/VellumMini-0.1-Qwen3-14B-i1-GGUF

#### Static

- mradermacher - https://huggingface.co/mradermacher/VellumMini-0.1-Qwen3-14B-GGUF
- Q4_K_M Only - https://huggingface.co/lemon07r/VellumMini-0.1-Qwen3-14B-Q4_K_M-GGUF

## Special Thanks

Big thanks to everyone over at the KoboldAI discord. The members there have helped me a ton with various things over the long while I've been there.

## Training Details
### Parent Model
https://huggingface.co/Qwen/Qwen3-14B

### Training Method
Full fine-tune - SFT

### Dataset(s)
https://huggingface.co/datasets/N8Programs/CreativeGPT

### Training Hyperparameters
```
Batch size
4

Learning rate
0.00001

Number of epochs
3

Warmup ratio
0.05

Weight decay
0.02

Max gradient norm
1

Packing
No

```
### Training Results

![Screenshot_20251005_020153](https://cdn-uploads.huggingface.co/production/uploads/65751ccd1488186315b841e6/TBtH-6CD7gnbZQVdlfRpW.webp)