---
library_name: transformers
datasets:
- kurakurai/luth-sft
language:
- fr
- en
base_model:
- LiquidAI/LFM2-350M
pipeline_tag: text-generation
license: other
license_name: lfm1.0
license_link: LICENSE
tags:
- liquid
- lfm2
- luth
---

![Luth x LFM2](media/logo_collab.png)

# Luth-LFM2-350M

**Luth-LFM2-350M** is a French fine-tuned version of [LFM2-350M](https://huggingface.co/LiquidAI/LFM2-350M), developed in collaboration with Liquid AI and trained on the [Luth-SFT](https://huggingface.co/datasets/kurakurai/luth-sft) dataset. Fine-tuning improved the model's French capabilities in instruction following, math, and general knowledge, while its English capabilities remained stable.

Our evaluation, training, and data scripts are available on [GitHub](https://github.com/kurakurai/Luth), along with the [blog post](https://huggingface.co/blog/MaxLSB/luth) we wrote to detail our recipe.

![Luth-LFM2 graph](media/lfm2-luth.png)

## Model Details

The model was trained with full fine-tuning on the Luth-SFT dataset using [Axolotl](https://github.com/axolotl-ai-cloud/axolotl). The resulting model was then merged back with LFM2-350M. This process retained the model's English capabilities while improving its performance in French.

## Benchmark Results

We used LightEval for evaluation, with custom tasks for the French benchmarks. All models were evaluated with `temperature=0`.

### French Benchmark Scores

| Model                 | IFEval French | GPQA-Diamond French | MMLU French | Math500 French | Arc-Challenge French | Hellaswag French |
| --------------------- | ------------- | ------------------- | ----------- | -------------- | -------------------- | ---------------- |
| **Luth-LFM2-350M**    | 38.26         | 26.40               | 39.15       | 23.00          | 34.13                | 43.39            |
| LFM2-350M             | 31.55         | 28.93               | 38.63       | 18.00          | 33.36                | 39.13            |
| SmolLM2-360M-Instruct | 21.50         | 28.43               | 26.14       | 3.20           | 26.60                | 32.94            |

### English Benchmark Scores

| Model                 | IFEval English | GPQA-Diamond English | MMLU English | Math500 English | Arc-Challenge English | Hellaswag English |
| --------------------- | -------------- | -------------------- | ------------ | --------------- | --------------------- | ----------------- |
| **Luth-LFM2-350M**    | 57.05          | 28.28                | 44.36        | 23.20           | 34.81                 | 45.92             |
| LFM2-350M             | 56.81          | 27.27                | 44.79        | 20.87           | 34.27                 | 45.07             |
| SmolLM2-360M-Instruct | 33.95          | 20.71                | 26.18        | 3.00            | 35.41                 | 52.17             |

## Code Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("kurakurai/Luth-LFM2-350M")
model = AutoModelForCausalLM.from_pretrained("kurakurai/Luth-LFM2-350M")

# A single-turn conversation with a French prompt.
messages = [
    {"role": "user", "content": "Quelle est la capitale de la France?"},
]

# Apply the chat template and move the tensors to the model's device.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=100)

# Decode only the newly generated tokens, skipping the prompt.
print(
    tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1] :], skip_special_tokens=True
    )
)
```

## Citation

```bibtex
@misc{luth2025kurakurai,
  title        = {Luth: Efficient French Specialization for Small Language Models and Cross-Lingual Transfer},
  author       = {Lasbordes, Maxence and Gad, Sinoué},
  year         = {2025},
  howpublished = {\url{https://arxiv.org/abs/2510.05846}},
  note         = {arXiv:2510.05846}
}
```
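
## Appendix: Merge Sketch

The Model Details section says the fine-tuned model was merged back with LFM2-350M, but this card does not state which merge method or ratio was used. The sketch below illustrates one common option, linear interpolation of weights, purely as an assumption. The function name `linear_merge`, the `alpha` parameter, and the toy weight dictionaries are all hypothetical and stand in for real checkpoint tensors.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flattened values.
Weights = Dict[str, List[float]]


def linear_merge(base: Weights, tuned: Weights, alpha: float) -> Weights:
    """Interpolate two checkpoints parameter by parameter.

    alpha=0.0 returns the base weights, alpha=1.0 returns the fine-tuned ones.
    This is an assumed method, not necessarily the one used for this model.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if base.keys() != tuned.keys():
        raise ValueError("checkpoints must have the same parameter names")

    merged: Weights = {}
    for name, base_values in base.items():
        tuned_values = tuned[name]
        if len(base_values) != len(tuned_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Weighted average of each pair of corresponding values.
        merged[name] = [
            (1.0 - alpha) * b + alpha * t
            for b, t in zip(base_values, tuned_values)
        ]
    return merged


# Toy values only; real checkpoints hold hundreds of millions of parameters.
base_weights = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
tuned_weights = {"layer.weight": [1.0, 4.0], "layer.bias": [3.0]}

merged_weights = linear_merge(base_weights, tuned_weights, alpha=0.5)
print(merged_weights)
```

The intuition is that weights closer to the base model tend to preserve its original behavior (here, English), while weights closer to the fine-tuned model carry more of the new specialization (here, French). The value of `alpha` is a tuning choice and is not given in this card.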