Liquid AI

🇯🇵 LFM2.5-1.2B-JP-202606

LFM2.5-1.2B-JP-202606 is our latest general-purpose Japanese chat model. It delivers significant improvements in knowledge, instruction following, math, code, and tool use over both models of comparable size and the earlier LFM2.5-1.2B-JP, and it sets a new state of the art for Japanese language understanding at this model size. It is ideal for developers building Japanese-language applications where cultural and linguistic nuance matters.

Find more information about LFM2.5 in our blog post.

📊 Performance

We compared LFM2.5-1.2B-JP-202606 with relevant sub-2B models on a diverse suite of benchmarks.

| Model | Size | JMMLU-ProX | JMMLU | JCulture | JGPQA | Knowledge Avg | J-MIFEval | JFBench ¹ | Instruction Following Avg | J-GSM8K | J-MATH500 | Math Avg | JHumanEval+ (Code) | J-BFCLv3 ² (Tool Use) | Domain Avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LFM2.5-1.2B-JP-202606 | 1.2B | 36.23 | 54.19 | 35.77 | 28.69 | 38.72 | 79.08 | 54.77 | 66.93 | 62.20 | 62.80 | 62.50 | 49.39 | 48.00 | 53.11 |
| LFM2.5-1.2B-Instruct | 1.2B | 31.42 | 47.61 | 28.42 | 31.72 | 34.79 | 40.44 | 36.67 | 38.56 | 50.20 | 50.00 | 50.10 | 28.66 | 46.29 | 39.68 |
| Qwen3-1.7B (Instruct) | 1.7B | 30.78 | 47.67 | 33.33 | 26.26 | 34.51 | 40.29 | 36.61 | 38.45 | 46.00 | 56.40 | 51.20 | 47.56 | 52.45 | 44.83 |
| Granite-4.0-1B | 1.5B | 15.32 | 33.93 | 34.38 | 24.44 | 27.02 | 27.56 | 31.26 | 29.41 | 42.80 | 25.40 | 34.10 | 51.22 | 50.57 | 38.46 |
| Llama-3.2-1B-Instruct | 1.2B | 15.91 | 33.97 | 22.52 | 32.32 | 26.18 | 24.10 | 21.78 | 22.94 | 25.20 | 11.40 | 18.30 | 17.68 | 21.06 | 21.23 |
| Gemma-3-1B-it | 1.0B | 14.12 | 34.45 | 23.42 | 24.24 | 24.06 | 26.31 | 31.15 | 28.73 | 33.60 | 15.60 | 24.60 | 25.00 | 17.26 | 23.93 |
| sarashina2.2-1b-instruct-v0.1 | 1.4B | 18.3 | 40.24 | 25.53 | 26.26 | 27.58 | 21.9 | 27.41 | 24.66 | 44.4 | 24.8 | 34.60 | 21.95 | 13.86 | 24.53 |
| TinySwallow-1.5B-Instruct | 1.5B | 21.51 | 47.98 | 31.17 | 29.29 | 32.49 | 36.55 | 34.25 | 35.40 | 47.2 | 22.4 | 34.80 | 26.83 | 11.7 | 28.24 |
| llm-jp-3.1-1.8b-instruct4 | 1.9B | 17.44 | 43.05 | 27.42 | 17.68 | 26.40 | 33.77 | 30.92 | 32.35 | 52.8 | 17.0 | 34.90 | 35.37 | 11.76 | 28.16 |
| RakutenAI-2.0-mini-instruct | 1.5B | 11.46 | 31.84 | 29.67 | 22.22 | 23.80 | 28.06 | 24.66 | 26.36 | 24.8 | 11.4 | 18.10 | 28.6 | 11.85 | 21.74 |

¹ JFBench is evaluated using single-instruction prompts.
² quickTestingOSSHandler is used for models that do not support function calling (sarashina2.2-1b-instruct-v0.1, TinySwallow-1.5B-Instruct, llm-jp-3.1-1.8b-instruct4, and RakutenAI-2.0-mini-instruct).
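
The card does not say how the Avg columns are computed. The reported numbers appear consistent with unweighted means rounded half-up to two decimals: each category Avg is the mean of its benchmarks, and Domain Avg is the mean of the five category scores. The sketch below checks that assumption against the LFM2.5-1.2B-JP-202606 row; the `mean` helper and variable names are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

def mean(values):
    """Unweighted mean, rounded half-up to two decimals (assumed rounding rule)."""
    total = sum((Decimal(v) for v in values), Decimal(0))
    return (total / len(values)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# Scores copied from the LFM2.5-1.2B-JP-202606 row of the table.
knowledge = mean(["36.23", "54.19", "35.77", "28.69"])
instruction_following = mean(["79.08", "54.77"])
math_avg = mean(["62.20", "62.80"])
code = Decimal("49.39")      # single benchmark, so no averaging
tool_use = Decimal("48.00")  # single benchmark, so no averaging

domain_avg = mean([knowledge, instruction_following, math_avg, code, tool_use])
print(knowledge, instruction_following, math_avg, domain_avg)
# prints: 38.72 66.93 62.50 53.11
```

Because Domain Avg weights the five categories equally, the single-benchmark Code and Tool Use categories each count as much as the four-benchmark Knowledge category.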

🗒️ Model Details

| Model | Parameters | Description |
|---|---|---|
| LFM2.5-1.2B-Base | 1.2B | Pre-trained base model for fine-tuning |
| LFM2.5-1.2B-Instruct | 1.2B | General-purpose instruction-tuned model |
| LFM2.5-1.2B-Thinking | 1.2B | General-purpose reasoning model |
| LFM2.5-1.2B-JP-202606 | 1.2B | Japanese-capable chat model |
| LFM2.5-VL-1.6B | 1.6B | Vision-language model with fast inference |
| LFM2.5-Audio-1.5B | 1.5B | Audio-language model for speech and text I/O |
| LFM2.5-Audio-1.5B-JP | 1.5B | Japanese-capable audio model for speech and text I/O |

LFM2.5-1.2B-JP-202606 is a general-purpose text-only model with the following features:

  • Number of parameters: 1.17B
  • Number of layers: 16 (10 double-gated LIV convolution blocks + 6 GQA blocks)
  • Training budget: 31.5T tokens
  • Context length: 32,768 tokens
  • Vocabulary size: 65,536
  • Knowledge cutoff: Mid-2024
  • Languages: English, Japanese
  • Generation parameters:
    • temperature: 0.1
    • top_k: 50
    • repetition_penalty: 1.05
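
To give a feel for what the three generation parameters do, here is a minimal pure-Python sketch of one decoding step using the conventional formulations (divide-or-multiply repetition penalty, temperature scaling, top-k filtering, softmax). It is an illustration under those assumptions, not the code any inference framework actually runs; `next_token_probs` and the toy logits are made up for this example.

```python
import math

def next_token_probs(logits, previous_ids, temperature=0.1, top_k=50,
                     repetition_penalty=1.05):
    """Return sampling probabilities for one decoding step over a toy vocabulary."""
    scores = list(logits)
    # Repetition penalty: lower the scores of tokens that were already generated.
    for token_id in set(previous_ids):
        if scores[token_id] > 0:
            scores[token_id] /= repetition_penalty
        else:
            scores[token_id] *= repetition_penalty
    # Temperature: values below 1 sharpen the distribution.
    scores = [s / temperature for s in scores]
    # Top-k: keep the k highest-scoring tokens (ties at the cutoff are kept too).
    k = min(top_k, len(scores))
    cutoff = sorted(scores, reverse=True)[k - 1]
    scores = [s if s >= cutoff else -math.inf for s in scores]
    # Softmax over the surviving tokens; filtered tokens get probability 0.
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

toy_logits = [2.0, 1.9, 0.5, -1.0]  # made-up values for a 4-token vocabulary
probs = next_token_probs(toy_logits, previous_ids=[0], top_k=2)
print([round(p, 3) for p in probs])
```

With temperature 0.1, a gap of 1.0 between two logits becomes a gap of 10 after scaling, so sampling with the recommended settings is close to greedy decoding.
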

| Model | Description |
|---|---|
| LFM2.5-1.2B-JP-202606 | Original model checkpoint in native format. Best for fine-tuning or inference with Transformers and vLLM. |
| LFM2.5-1.2B-JP-202606-GGUF | Quantized format for llama.cpp and compatible tools. Optimized for CPU inference and local deployment with reduced memory usage. |
| LFM2.5-1.2B-JP-202606-ONNX | ONNX Runtime format for cross-platform deployment. Enables hardware-accelerated inference across diverse environments (cloud, edge, mobile). |
| LFM2.5-1.2B-JP-202606-MLX | MLX format for Apple Silicon. Optimized for fast inference on Mac devices using the MLX framework. |

We recommend LFM2.5-1.2B-JP-202606 for agentic workflows, tool use, structured outputs, bilingual English-Japanese assistants, and on-device personal-assistant applications. It is not recommended for knowledge-intensive tasks. The model performs best when given clear, explicit instructions that define the task, expected behavior, and output format.
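
One way to keep instructions explicit is to assemble the system prompt from the three parts named above and to validate structured replies before using them. The sketch below is only an illustration: `build_messages`, `parse_json_reply`, the prompt wording, and the sample reply are all hypothetical and are not taken from Liquid AI's documentation.

```python
import json

def build_messages(task, behavior, output_format, user_input):
    """Build a chat message list whose system prompt states task, behavior, and format."""
    system_prompt = "\n".join([
        f"Task: {task}",
        f"Expected behavior: {behavior}",
        f"Output format: {output_format}",
    ])
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input},
    ]

def parse_json_reply(reply, required_keys):
    """Parse a JSON reply and fail loudly if a required key is missing."""
    data = json.loads(reply)
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"reply is missing keys: {missing}")
    return data

messages = build_messages(
    task="Extract the city and date from the user's message.",
    behavior="Answer only from the message; use null for anything not stated.",
    output_format='A single JSON object with the keys "city" and "date".',
    user_input="来週の金曜日に大阪で会議があります。",
)

# Hand-written stand-in for a model reply, used only to exercise the parser.
sample_reply = '{"city": "大阪", "date": null}'
print(parse_json_reply(sample_reply, required_keys=["city", "date"]))
```

The resulting `messages` list can be passed to tokenizer.apply_chat_template() in the same way as any other conversation.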

Chat Template

LFM2.5 uses a ChatML-like format. See the Chat Template documentation for details. Example:

<|startoftext|><|im_start|>system
You are a helpful assistant trained by Liquid AI.<|im_end|>
<|im_start|>user
日本の首都は?<|im_end|>
<|im_start|>assistant

You can use tokenizer.apply_chat_template() to format your messages automatically.
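
To make the layout concrete, the following hand-written sketch reproduces the example string above from a list of messages. `render_chatml` is a hypothetical helper, not part of Transformers, and it covers only plain system/user/assistant turns; the template shipped with the tokenizer remains authoritative and may differ in details this sketch ignores.

```python
BOS = "<|startoftext|>"
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"

def render_chatml(messages, add_generation_prompt=True):
    """Render messages in the ChatML-like layout shown in the example above."""
    parts = [BOS]
    for message in messages:
        # Each turn is: <|im_start|>role, newline, content, <|im_end|>, newline.
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant trained by Liquid AI."},
    {"role": "user", "content": "日本の首都は?"},
]
prompt_text = render_chatml(messages)
print(prompt_text)
```

In real code, prefer tokenizer.apply_chat_template(), which also handles tokenization.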

Tool Use

LFM2.5 supports function calling as follows:

  1. Function definition: We recommend providing the list of tools as a JSON object in the system prompt. You can also use the tokenizer.apply_chat_template() function with tools.
  2. Function call: By default, LFM2.5 writes Pythonic function calls (a Python list between <|tool_call_start|> and <|tool_call_end|> special tokens), as the assistant answer. You can override this behavior by asking the model to output JSON function calls in the system prompt.
  3. Function execution: The function call is executed, and the result is returned in a message with the "tool" role.
  4. Final answer: LFM2.5 interprets the outcome of the function call to address the original user prompt in plain text.

See the Tool Use documentation for the full guide. Example:

<|startoftext|><|im_start|>system
List of tools: [{"name": "get_candidate_status", "description": "採用プロセスにおける候補者の現在のステータスを取得します", "parameters": {"type": "object", "properties": {"candidate_id": {"type": "string", "description": "候補者の一意の識別子"}}, "required": ["candidate_id"]}}]<|im_end|>
<|im_start|>user
候補者ID 12345 の現在のステータスは何ですか?<|im_end|>
<|im_start|>assistant
<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>候補者ID 12345 の現在のステータスを確認しています。<|im_end|>
<|im_start|>tool
[{"candidate_id": "12345", "status": "Interview Scheduled", "position": "Clinical Research Associate", "date": "2023-11-20"}]<|im_end|>
<|im_start|>assistant
ID 12345 の候補者は現在、Clinical Research Associate のポジションで「面接予定」の段階にあり、面接日は 2023年11月20日に設定されています。<|im_end|>
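
Steps 1 to 3 can be sketched on the application side as follows. This is a minimal illustration that assumes the default Pythonic call format with keyword arguments and literal values only; `extract_tool_calls`, `run_tool_calls`, and the `TOOLS` registry are hypothetical names, and `get_candidate_status` is a stub standing in for a real backend.

```python
import ast
import json
import re

TOOL_CALL_RE = re.compile(r"<\|tool_call_start\|>(.*?)<\|tool_call_end\|>", re.DOTALL)

def extract_tool_calls(assistant_text):
    """Return (function_name, kwargs) pairs parsed from the assistant text."""
    calls = []
    for block in TOOL_CALL_RE.findall(assistant_text):
        tree = ast.parse(block.strip(), mode="eval")
        if not isinstance(tree.body, ast.List):
            raise ValueError("expected a Python list of calls")
        for node in tree.body.elts:
            if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
                raise ValueError("expected plain function calls")
            if node.args or any(kw.arg is None for kw in node.keywords):
                raise ValueError("only keyword arguments are supported in this sketch")
            # literal_eval accepts AST nodes and rejects anything but literals.
            kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
            calls.append((node.func.id, kwargs))
    return calls

def get_candidate_status(candidate_id):
    # Stub for illustration; a real tool would query an applicant-tracking system.
    return {"candidate_id": candidate_id, "status": "Interview Scheduled"}

TOOLS = {"get_candidate_status": get_candidate_status}

def run_tool_calls(assistant_text):
    """Execute every parsed call and wrap the results in a "tool" role message."""
    results = [TOOLS[name](**kwargs) for name, kwargs in extract_tool_calls(assistant_text)]
    return {"role": "tool", "content": json.dumps(results, ensure_ascii=False)}

# Step 1: tool definitions go into the system prompt as JSON.
tool_specs = [{
    "name": "get_candidate_status",
    "description": "採用プロセスにおける候補者の現在のステータスを取得します",
    "parameters": {
        "type": "object",
        "properties": {"candidate_id": {"type": "string", "description": "候補者の一意の識別子"}},
        "required": ["candidate_id"],
    },
}]
system_message = {
    "role": "system",
    "content": "List of tools: " + json.dumps(tool_specs, ensure_ascii=False),
}

# Steps 2 and 3: parse the assistant turn from the example above and run the call.
assistant_text = (
    '<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>'
    "候補者ID 12345 の現在のステータスを確認しています。"
)
tool_message = run_tool_calls(assistant_text)
print(tool_message)
```

Parsing with `ast` instead of `eval` means the model's output is never executed directly; only functions registered in `TOOLS` can run. Appending `tool_message` to the conversation and generating again yields the final answer of step 4.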

🏃 Inference

LFM2.5 is supported by many inference frameworks. See the Inference documentation for the full list.

| Name | Description | Docs | Notebook |
|---|---|---|---|
| Transformers | Simple inference with direct access to model internals. | Link | Colab link |
| vLLM | High-throughput production deployments with GPU. | Link | Colab link |
| llama.cpp | Cross-platform inference with CPU offloading. | Link | Colab link |
| MLX | Apple's machine learning framework optimized for Apple Silicon. | Link | |
| LM Studio | Desktop application for running LLMs locally. | Link | |

Here's a quick start example with Transformers:

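from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Hugging Face repo ID (organization/model name)
model_id = "LiquidAI/LFM2.5-1.2B-JP-202606"

# Load the weights in bfloat16 and place them on the available device(s)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
    # attn_implementation="flash_attention_2",  # uncomment on a compatible GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Print tokens as they are generated, without echoing the prompt or special tokens
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt = "日本の首都は?"

# Apply the chat template and tokenize in one step;
# return_dict=True yields input_ids and attention_mask together
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
    return_dict=True,
).to(model.device)

# Sample with the recommended generation parameters
output = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.1,
    top_k=50,
    repetition_penalty=1.05,
    max_new_tokens=512,
    streamer=streamer,
)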

🔧 Fine-Tuning

We recommend fine-tuning LFM2.5 for your specific use case to achieve the best results.

| Name | Description | Docs | Notebook |
|---|---|---|---|
| CPT (Unsloth) | Continued Pre-Training using Unsloth for text completion. | Link | Colab link |
| CPT (Unsloth) | Continued Pre-Training using Unsloth for translation. | Link | Colab link |
| SFT (Unsloth) | Supervised Fine-Tuning with LoRA using Unsloth. | Link | Colab link |
| SFT (TRL) | Supervised Fine-Tuning with LoRA using TRL. | Link | Colab link |
| DPO (TRL) | Direct Preference Optimization with LoRA using TRL. | Link | Colab link |
| GRPO (Unsloth) | GRPO with LoRA using Unsloth. | Link | Colab link |
| GRPO (TRL) | GRPO with LoRA using TRL. | Link | Colab link |

📬 Contact

Citation

@article{liquidai2025lfm2,
  title={LFM2 Technical Report},
  author={Liquid AI},
  journal={arXiv preprint arXiv:2511.23404},
  year={2025}
}