---
library_name: transformers
tags:
- trl
- dpo
datasets:
- ministere-culture/comparia-votes
language:
- fr
base_model:
- LiquidAI/LFM2-2.6B
---
# LFM2 2.6B comparIA-votes DPO
This repository contains a fine-tuned version of the LiquidAI/LFM2-2.6B model, aligned with French user preferences from the ministere-culture/comparia-votes dataset using Direct Preference Optimization (DPO).
## Training Details
This model was a first experiment and was therefore fine-tuned for a single epoch; a sketch of the training setup follows the metrics below. Key training metrics:
- Training Time: 2 hours, 31 minutes, 5 seconds
- Training Loss: 0.6179
- Validation Loss: 0.6095
- Rewards/Accuracies: 0.6660
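The exact training script is not included in this repository. The following is a minimal sketch of how such a run could look with TRL's `DPOTrainer`; the hyperparameters are illustrative rather than the ones actually used, and it assumes the comparia-votes pairs have been mapped to the `prompt`/`chosen`/`rejected` format that `DPOTrainer` expects.

```python
# Hypothetical DPO setup, assuming TRL; not the actual training script.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_model = "LiquidAI/LFM2-2.6B"
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Assumption: the dataset has already been converted to columns
# "prompt", "chosen", "rejected" as required by DPOTrainer.
dataset = load_dataset("ministere-culture/comparia-votes", split="train")

training_args = DPOConfig(
    output_dir="lfm2-2.6b-comparia-dpo",
    num_train_epochs=1,              # single epoch, as reported above
    per_device_train_batch_size=2,   # illustrative value
    learning_rate=5e-6,              # illustrative value
)

trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```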
## How to Use
You can load and use this model with the Hugging Face transformers library or vLLM; for more details, follow the running guide of the base LiquidAI/LFM2-2.6B model.
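A minimal transformers example is sketched below. The repository id is a placeholder to replace with this model's actual id, and the French prompt is illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-username/lfm2-2.6b-comparia-dpo"  # placeholder: use this repo's id
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="bfloat16")
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# Illustrative French prompt, formatted with the model's chat template.
messages = [
    {"role": "user", "content": "Explique la différence entre une comète et un astéroïde."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```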