# K2-Inhale 🫁 (LoRA Adapter for LLM360/K2-Think)

- **Author / fine-tune:** Sutan Rifky Tedjasukmana (@SutanRifkyt)
- **Base model:** LLM360/K2-Think (credit to LLM360)
- **Method:** QLoRA (4-bit base) with PEFT adapters
- **Domain:** lung CT & PET/CT findings (nodules, consolidation, FDG uptake, possible staging hints)
- **Languages:** English + Bahasa Indonesia (output is patient-friendly, non-radiologist tone)
- **Intended use:** patient-facing explanation + triage suggestion
- **Not intended for:** final diagnosis, treatment planning, or replacing licensed clinicians


πŸ” What this model does

K2-Inhale is a lightweight LoRA adapter trained on top of LLM360/K2-Think to:

1. Rewrite lung CT / PET-CT findings into patient-friendly explanations.
2. Give a plain-language assessment of how worrying the finding is for lung cancer.
3. Suggest a next step (follow-up CT, PET-CT, tissue biopsy, urgent oncologist referral, etc.).

Target audience:

- patients who just got an imaging report and are anxious,
- junior clinicians who want a first draft of a patient-facing summary.

⚠️ This model is NOT a medical device and should NOT be used for autonomous diagnosis.


## 🧠 How to load (recommended path: base model + LoRA)

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model_id = "LLM360/K2-Think"
adapter_id    = "SutanRifkyt/K2-Inhale"

tokenizer = AutoTokenizer.from_pretrained(
    base_model_id,
    use_fast=False,
    trust_remote_code=False,
)

# Load the base model in bf16, then attach the LoRA adapter on top.
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=False,
)

model = PeftModel.from_pretrained(
    model,
    adapter_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = """<|user|>:
Explain this chest CT finding in simple language for the patient, assess how concerning it is for lung cancer, and say what should happen next.

Clinical findings:
Spiculated 1.8 cm nodule in the right upper lobe with irregular margins and increased FDG uptake on PET.
<|assistant|>:
"""

# model.device is more robust than hard-coding "cuda" when device_map="auto".
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=300,
        temperature=0.2,
        do_sample=True,
    )

print(tokenizer.decode(output[0], skip_special_tokens=True))
```
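Since the adapter was trained with QLoRA, the base can also be loaded in 4-bit to fit on smaller GPUs. A minimal sketch, assuming `bitsandbytes` is installed; the NF4 settings below mirror common QLoRA defaults and are an assumption, not the exact training config:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_model_id = "LLM360/K2-Think"
adapter_id = "SutanRifkyt/K2-Inhale"

# Assumed NF4 4-bit config, mirroring common QLoRA defaults.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model_id, use_fast=False)

# Quantize the base on load, then attach the LoRA adapter as before.
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_id)
```

Generation then works exactly as in the bf16 snippet above.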
## ⚡ Quantized version

For easier inference on smaller GPUs / single consumer cards, a quantized export is included under `quantized/`.

`quantized/` is an experimental merged-model snapshot intended for local testing / demos. Quality may be lower than the full base + LoRA path above.

Basic usage (example, adjust to your runtime):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "SutanRifkyt/K2-Inhale"

# The snapshot lives in the quantized/ subfolder of the adapter repo;
# pass subfolder= rather than appending it to the repo id.
tokenizer = AutoTokenizer.from_pretrained(
    repo_id,
    subfolder="quantized",
    use_fast=False,
    trust_remote_code=False,
)

model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    subfolder="quantized",
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=False,
)
```


Note: if `quantized/` contains GGUF / AWQ / bitsandbytes files, load them with the matching loader for that format; a GGUF example is sketched below.
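For example, if the snapshot ships as a GGUF file, llama-cpp-python can run it on CPU or a small GPU. A minimal sketch; the filename `K2-Inhale-Q4_K_M.gguf` is hypothetical, use whatever file actually sits in `quantized/`:

```python
from llama_cpp import Llama

# Hypothetical filename: replace with the real GGUF file from quantized/.
llm = Llama(model_path="K2-Inhale-Q4_K_M.gguf", n_ctx=4096)

prompt = (
    "<|user|>:\n"
    "Explain this chest CT finding in simple language for the patient.\n\n"
    "Clinical findings:\n"
    "Spiculated 1.8 cm nodule in the right upper lobe with increased FDG uptake on PET.\n"
    "<|assistant|>:\n"
)

out = llm(prompt, max_tokens=300, temperature=0.2)
print(out["choices"][0]["text"])
```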

## 📚 Training data (high-level)

~8k supervised instruction-style pairs constructed from:

- public lung CT and PET/CT descriptions (incl. TCIA-like oncology cohorts),
- synthetic expansions of impression/assessment text,
- staged "what happens next" counseling scripts.

Each sample follows this shape (a hypothetical record is sketched below):

- **instruction:** "Explain this finding for the patient, include cancer concern level, and next step"
- **input:** actual CT/PET-CT-style text (nodule size, FDG uptake, etc.)
- **output:** step-by-step reasoning and a final recommendation in plain language
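A hypothetical record in that shape (the finding and answer text are illustrative, not taken from the training set):

```json
{
  "instruction": "Explain this finding for the patient, include cancer concern level, and next step",
  "input": "Spiculated 1.8 cm nodule in the right upper lobe with increased FDG uptake on PET.",
  "output": "There is a small spot (about 1.8 cm) in the upper part of your right lung. Its irregular shape and activity on the PET scan make it concerning enough that doctors will want a closer look. The usual next step is a tissue biopsy and a prompt appointment with a lung specialist."
}
```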

## 🚨 Safety & limitations

- This model is for triage / education, not diagnosis.
- It may sound confident even when uncertain.
- It has not been clinically validated.
- Always involve a radiologist / oncologist for real decisions.

## 📚 Citation / credit

Base model LLM360/K2-Think is released by the LLM360 team.
This repository only publishes LoRA/PEFT adapter weights and an optional quantized snapshot, fine-tuned by Sutan Rifky Tedjasukmana (@SutanRifkyt) for lung imaging triage.

Li, P., Wang, S., Li, T., Lu, J., HuangFu, Y., & Wang, D. (2020). A Large-Scale CT and PET/CT Dataset for Lung Cancer Diagnosis (Lung-PET-CT-Dx) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.2020.NNC2-0461

Montagna, S., et al. (2024). LLM-based Solutions for Healthcare Chatbots: a Comparative Analysis. 2024 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops), Biarritz, France, pp. 346–351. https://doi.org/10.1109/PerComWorkshops59983.2024.10503257

Baharoon, M., Luo, L., Moritz, M., Kumar, A., Kim, S.E., Zhang, X., Zhu, M., Alabbad, M.H., Alhazmi, M.S., Mistry, N.P., & Kleinschmidt, K.R. (2025). ReXGroundingCT: A 3D Chest CT Dataset for Segmentation of Findings from Free-Text Reports. arXiv preprint arXiv:2507.22030.

Faiyazuddin, M., Rahman, S.J.Q., Anand, G., Siddiqui, R.K., Mehta, R., Khatib, M.N., Gaidhane, S., Zahiruddin, Q.S., Hussain, A., & Sah, R. (2025). The Impact of Artificial Intelligence on Healthcare: A Comprehensive Review of Advancements in Diagnostics, Treatment, and Operational Efficiency. Health Science Reports, 8, e70312. https://doi.org/10.1002/hsr2.70312

Bedi, S., Liu, Y., Orr-Ewing, L., Dash, D., Koyejo, S., Callahan, A., Fries, J.A., Wornow, M., Swaminathan, A., Lehmann, L.S., Hong, H.J., Kashyap, M., Chaurasia, A.R., Shah, N.R., Singh, K., Tazbaz, T., Milstein, A., Pfeffer, M.A., & Shah, N.H. (2025). Testing and Evaluation of Health Care Applications of Large Language Models: A Systematic Review. JAMA.
Shool, S., Adimi, S., Saboori Amleshi, R., et al. (2025). A systematic review of large language model (LLM) evaluations in clinical medicine. BMC Medical Informatics and Decision Making, 25, 117. https://doi.org/10.1186/s12911-025-02954-4

Cheng, Z., Fan, R., Hao, S., Killian, T.W., Li, H., Sun, S., Ren, H., Moreno, A., Zhang, D., Zhong, T., & Xiong, Y. (2025). K2-Think: A Parameter-Efficient Reasoning System. arXiv preprint arXiv:2509.07604.

License: Apache-2.0 for adapter weights.
Underlying medical text sources may include portions of CC BY 4.0 datasets and synthetic expansions derived from them.