DuckyBlender/diegogpt-v2-mlx-bf16

This model DuckyBlender/diegogpt-v2-mlx-bf16 is a full fine-tune of Qwen/Qwen3-0.6B-MLX-bf16, trained on the complete set of public replies from a specific individual.

Training was conducted using mlx-lm version 0.26.0. It ran for 15 steps with a batch size of 16, completing in a few seconds on a MacBook Pro M1 Pro (8-core CPU, 16GB RAM). Peak memory usage was 8.3GB. The dataset contained 225 low-quality training pairs (240 lines trained total).
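
A full fine-tune like this is typically launched through mlx-lm's fine-tuning entry point with --fine-tune-type full; the sketch below is an assumption of what a comparable run looks like, not the exact original command, and ./data is a placeholder for a directory containing train.jsonl/valid.jsonl in mlx-lm's chat format:

mlx_lm.lora \
  --model Qwen/Qwen3-0.6B-MLX-bf16 \
  --train \
  --fine-tune-type full \
  --data ./data \
  --batch-size 16 \
  --iters 15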

Run with system prompt /no_think and the following generation parameters:

  • --temp 0.7
  • --top-p 0.8
  • --top-k 20
  • --min-p 0

Example usage:

pip install mlx-lm

from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

model, tokenizer = load("DuckyBlender/diegogpt-v2-mlx-bf16")

user_input = "are you red hat hacker?"

if tokenizer.chat_template is not None:
    messages = [
        {"role": "user", "content": user_input}
    ]
    # enable_thinking=False disables Qwen3's thinking mode, matching the /no_think system prompt
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False, enable_thinking=False)
else:
    prompt = user_input

sampler = make_sampler(temp=0.7, top_p=0.8, top_k=20, min_p=0)

response = generate(
    model,
    tokenizer,
    prompt=prompt,
    sampler=sampler,
    verbose=True
)
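
For token-by-token output, mlx-lm also exposes stream_generate; a minimal sketch reusing the prompt and sampler from above (assuming the same mlx-lm version as listed):

from mlx_lm import stream_generate

# Print tokens as they are produced instead of waiting for the full reply
for chunk in stream_generate(model, tokenizer, prompt=prompt, sampler=sampler):
    print(chunk.text, end="", flush=True)
print()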

Or directly via CLI:

mlx_lm.generate \
  --model "DuckyBlender/diegogpt-v2-mlx-bf16" \
  --temp 0.7 \
  --top-p 0.8 \
  --top-k 20 \
  --min-p 0 \
  --system "/no_think" \
  --prompt "are you red hat hacker?"

Model uses ~1.25GB RAM during inference.
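
To check this on your own machine, MLX keeps a peak-memory counter; a rough sketch, assuming a recent MLX release (older versions expose the same counter as mx.metal.get_peak_memory):

import mlx.core as mx

# After running generate() above, report peak memory in GB (the counter returns bytes)
print(f"peak memory: {mx.get_peak_memory() / 1e9:.2f} GB")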
