LFM2-2.6B comparIA-votes DPO

This repository contains a fine-tuned version of the LiquidAI/LFM2-2.6B model, aligned with French user preferences from the comparia-votes dataset using Direct Preference Optimization (DPO).

Training Details

This model is a first experiment, so it was fine-tuned for a single epoch only. Key training metrics are as follows (a sketch of the training setup appears after the list):

  • Training Time: 2 hours, 31 minutes, 5 seconds
  • Training Loss: 0.6179
  • Validation Loss: 0.6095
  • Rewards/Accuracies: 0.6660
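
The card does not publish the training script, but a run like this could be set up with trl's DPOTrainer. The sketch below is illustrative only: the dataset hub id (`ministere-culture/comparia-votes`), the column preprocessing, and every hyperparameter except the single epoch are assumptions, not the actual configuration used.

```python
# Hedged sketch of a DPO run like the one described above, using trl.
# Only "LiquidAI/LFM2-2.6B" and the single epoch come from the card itself;
# the dataset id, column mapping, and hyperparameters are assumptions.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_model = "LiquidAI/LFM2-2.6B"
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# comparia-votes pairwise votes, assumed already preprocessed into the
# prompt/chosen/rejected columns that DPOTrainer expects.
dataset = load_dataset("ministere-culture/comparia-votes", split="train")

config = DPOConfig(
    output_dir="LFM2-2.6B-French-ComparIA-Votes-DPO",
    num_train_epochs=1,             # single epoch, as reported above
    per_device_train_batch_size=2,  # assumed
    learning_rate=5e-7,             # assumed
    beta=0.1,                       # DPO KL-penalty strength (trl default)
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```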

How to Use

You can load and use this model with the Hugging Face transformers library or vLLM; for more details, follow the base model's usage guide.
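
A minimal transformers example might look like the following. It assumes a recent transformers release with LFM2 support; the prompt and generation settings are illustrative, not recommended defaults.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "monsimas/LFM2-2.6B-French-ComparIA-Votes-DPO"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="bfloat16",  # weights are stored in BF16
    device_map="auto",
)

# Build a chat prompt with the model's chat template.
messages = [{"role": "user", "content": "Explique la photosynthèse en deux phrases."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.3)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

With vLLM, the model can be served the same way as the base LFM2-2.6B checkpoint, pointing the engine at this repository id instead.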
