Minh Khoe Tue Y LLM (MKTY-3B-Chat)


๐ŸŒ Documentation Language

Chinese Simplified (็ฎ€ไฝ“ไธญๆ–‡) | English | Vietnamese (Tiแบฟng Viแป‡t)

Please note that the English and Vietnamese versions of this document were translated from the Chinese version by an LLM and then manually proofread; discrepancies may nevertheless remain. In case of any inconsistency between the English or Vietnamese version and the Chinese version, the Chinese version shall prevail.

Full Project Title: Minh Khoe Tue Y (Chinese Simplified: ๆ˜Žๅบทๆ…งๅŒป; Vietnamese: Minh Khแปe Tuแป‡ Y; Nom Script: ๆ˜ŽๅŠธๆ…ง้†ซ) โ€” Design and Implementation of a Health Management and Diagnostic Assistance System Based on LLMs and Multimodal Artificial Intelligence. Abbreviation: MKTY Smart Healthcare System

๐Ÿ“– Model Overview

This model is a component of the "Minh Khoe Tue Y - Design and Implementation of a Health Management and Assisted Diagnosis System Based on LLM and Multimodal Artificial Intelligence" project (the Minh Khoe Tue Y Smart Healthcare System). It was developed as my 2025 undergraduate graduation project at the Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences). The project is open-sourced at: https://github.com/duyu09/MKTY-System.

This model has been fine-tuned and optimized for medicine, healthcare, and biology, and outperforms its base model, Qwen2.5-3B-Instruct, in these domains. Fine-tuning uses the LoRA algorithm and proceeds in two stages, targeting the Chinese language only. In the first stage (incremental pretraining), the model is trained on medical textbooks, medical records, and healthcare-related articles. In the second stage, Supervised Fine-Tuning (SFT) is performed on corpora that include symptoms with corresponding medical records, doctor-patient dialogues (symptom descriptions and diagnoses), medical knowledge Q&A, and dialogues built around the "LLM Discussion Mechanism." The total training data volume is approximately 2.88 GB.
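For readers unfamiliar with LoRA, the following is a minimal sketch of the kind of adapter configuration involved, using the peft library. The rank, alpha, and target modules shown are illustrative assumptions, not the project's actual hyperparameters:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Illustrative LoRA setup on the base model; r/alpha/targets are assumptions.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B-Instruct", torch_dtype="auto")
lora_config = LoraConfig(
    r=16,                     # adapter rank (illustrative)
    lora_alpha=32,            # scaling factor (illustrative)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trained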

Notably, the model has been optimized for the "LLM Discussion Mechanism," which works as follows: for each question, the model generates multiple answers from different contexts, simulating a scenario in which "multiple individuals express their viewpoints." A "moderator" role summarizes the viewpoints from each round of discussion. All participants then conduct the next round based on the original question, the moderator's summary, and their respective contexts. This iterates until the discussion converges (i.e., the answers become semantically consistent) or a preset maximum number of rounds is reached.
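The card does not specify how semantic convergence is measured. Below is a minimal sketch of one possible stopping check, using lexical similarity from the Python standard library as a crude stand-in for a semantic comparison; the threshold value is illustrative, not taken from the project:

import difflib

def discussion_converged(prev_summary, curr_summary, threshold=0.9):
    # Crude proxy for "the semantics become consistent": treat two successive
    # moderator summaries as converged when their lexical overlap is high.
    if prev_summary is None:
        return False  # no previous round to compare against
    ratio = difflib.SequenceMatcher(None, prev_summary, curr_summary).ratio()
    return ratio >= threshold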

๐Ÿ”ง Hardware Requirements

For GPU inference, at least 7GB of VRAM is required. If VRAM is insufficient or no dedicated GPU is available, the MKTY-3B model can also run on a CPU with at least 7GB of system RAM.
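As a minimal CPU-only loading sketch (assuming the published repository id Duyu/MKTY-3B-Chat):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Without device_map, transformers loads onto the CPU by default.
# bfloat16 keeps the ~3B parameters within roughly 6-7GB of RAM.
model = AutoModelForCausalLM.from_pretrained(
    "Duyu/MKTY-3B-Chat",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,  # reduces peak RAM while loading weights
)
tokenizer = AutoTokenizer.from_pretrained("Duyu/MKTY-3B-Chat")
model.eval()  # inference mode; no gradients needed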

๐Ÿš€ Usage Example

MKTY-3B-Chat is based on the Tongyi Qianwen (Chinese: ้€šไน‰ๅƒ้—ฎ) Qwen2.5-3B-Instruct model and can be quickly loaded and launched using the transformers library.

Model Loading

from transformers import AutoModelForCausalLM, AutoTokenizer

def load_model_and_tokenizer(model_name):
    # Load weights in their native dtype and place them automatically
    # (GPU if available, otherwise CPU).
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype="auto",
        device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    return model, tokenizer

def generate_response(prompt, messages, model, tokenizer, max_new_tokens=2000):
    # Append the user turn, render the conversation with the chat template,
    # and generate a reply; the history list is updated in place.
    messages.append({"role": "user", "content": prompt})
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=max_new_tokens
    )
    # Strip the prompt tokens so only the newly generated completion is decoded.
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    messages.append({"role": "assistant", "content": response})
    return response
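Not part of the original example, but for interactive use the same flow can stream tokens to the console as they are generated, using transformers' built-in TextStreamer:

from transformers import TextStreamer

def generate_response_streaming(prompt, messages, model, tokenizer, max_new_tokens=2000):
    # Same chat-template flow as generate_response above, but prints tokens as they arrive.
    messages.append({"role": "user", "content": prompt})
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
    streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    generated_ids = model.generate(**model_inputs, streamer=streamer, max_new_tokens=max_new_tokens)
    # Decode the completion (minus the prompt) so it can also be stored in the history.
    response = tokenizer.batch_decode(
        [generated_ids[0][model_inputs.input_ids.shape[1]:]], skip_special_tokens=True
    )[0]
    messages.append({"role": "assistant", "content": response})
    return response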

Standard Q&A Mode

if __name__ == "__main__":
    model_name = "MKTY-3B-Chat"  # local path or Hugging Face repo id
    messages = []  # running chat history, shared across turns
    model, tokenizer = load_model_and_tokenizer(model_name)
    while True:
        prompt = input("User> ")
        if prompt == "exit":
            break
        response = generate_response(prompt, messages, model, tokenizer)
        print("MKTY>", response)

LLM Discussion Mode (Example language: Chinese Simplified)

def clear_history(messages_arr):
    # Reset every agent's dialogue history between rounds to keep the context short.
    # (Defined here because the original snippet called it without defining it.)
    for messages in messages_arr:
        messages.clear()

if __name__ == "__main__":
    model_name = "MKTY-3B-Chat"
    discuss_rounds = 3  # maximum number of discussion rounds
    agent_number = 3    # number of LLM "participants"
    model, tokenizer = load_model_and_tokenizer(model_name)
    messages_arr = [[] for _ in range(agent_number)]  # one chat history per agent
    while True:
        prompt = input("User> ")
        if prompt == "exit":
            break
        moderator_opinion = "ๆš‚ๆ— "  # "none yet": no moderator summary before round 1
        for i in range(discuss_rounds):
            responses_arr = []
            # Per-round agent prompt: "Question: ... / Moderator's opinion from the previous
            # round: ... / Considering the moderator's opinion, give a detailed view on the
            # medical question above; you may raise doubts and explain your reasons."
            prompt_per_round = "- ้—ฎ้ข˜๏ผš\n" + prompt + "\n - ไธŠ่ฝฎ่ฎจ่ฎบไธปๆŒไบบๆ„่ง๏ผš\n" + moderator_opinion + "\n - ่ฏทไฝ ็ป“ๅˆไธปๆŒไบบๆ„่ง๏ผŒๅฏนไธŠ่ฟฐๅŒป็–—ๆˆ–ๅŒปๅญฆไธ“ไธš็š„้—ฎ้ข˜ๅ‘่กจ่ฏฆ็ป†่ง‚็‚น๏ผŒๅฏไปฅ่ดจ็–‘ๅนถ่ฏดๆ˜Ž็†็”ฑใ€‚\n"
            for j in range(agent_number):
                messages = messages_arr[j]
                response = generate_response(prompt_per_round, messages, model, tokenizer)
                responses_arr.append(response)
                print(f"็ฌฌ{i + 1}่ฝฎ่ฎจ่ฎบ๏ผŒLLM {j + 1}่ง‚็‚น>\n", response)  # "Round i, view of LLM j"
                print("-------------------")
            # Moderator prompt: the question plus every agent's view, followed by an
            # instruction to synthesize them into a thorough, well-reasoned judgment.
            moderator_prompt = "- ้—ฎ้ข˜๏ผš\n" + prompt + "\n\n"
            for res_index in range(len(responses_arr)):
                moderator_prompt = moderator_prompt + f"- LLM {res_index + 1}่ง‚็‚น๏ผš\n" + responses_arr[res_index] + "\n\n"
            moderator_prompt = moderator_prompt + "ๅฏนไบŽ็ป™ๅฎš็š„ๅŒป็–—็›ธๅ…ณ้—ฎ้ข˜๏ผŒ่ฏท็ปผๅˆๅ„LLM่ง‚็‚น๏ผŒ็ป“ๅˆ่‡ช่บซ็Ÿฅ่ฏ†๏ผŒๅพ—ๅ‡บไฝ ่‡ชๅทฑ็š„ๅˆคๆ–ญ๏ผŒๅฐฝๅฏ่ƒฝ่ฏฆๅฐฝ๏ผŒๅ…จ้ƒจ้ƒฝๅˆ†ๆžๅˆฐไฝ๏ผŒ่ฟ˜่ฆๅ……ๅˆ†่ฏดๆ˜Ž็†็”ฑใ€‚\n"
            moderator_opinion = generate_response(moderator_prompt, [], model, tokenizer)
            print(f"็ฌฌ{i + 1}่ฝฎ่ฎจ่ฎบ๏ผŒไธปๆŒไบบ็š„ๆ„่ง>\n", moderator_opinion)  # "Round i, moderator's opinion"
            print("-------------------")
            clear_history(messages_arr)
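generate_response above uses the model's default decoding settings. For the discussion mechanism to yield genuinely distinct viewpoints across agents, stochastic sampling can be enabled explicitly. A sketch follows; the temperature and top_p values are illustrative, not taken from the project:

def generate_response_sampled(prompt, messages, model, tokenizer,
                              max_new_tokens=2000, temperature=0.9, top_p=0.95):
    # Identical to generate_response, but with stochastic decoding so that
    # different agents (and rounds) produce diverse viewpoints.
    messages.append({"role": "user", "content": prompt})
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,          # enable sampling instead of greedy decoding
        temperature=temperature, # higher values -> more varied opinions
        top_p=top_p,             # nucleus sampling cutoff
    )
    generated_ids = [out[len(inp):] for inp, out in zip(model_inputs.input_ids, generated_ids)]
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    messages.append({"role": "assistant", "content": response})
    return response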

๐ŸŽ“ Authors

โ–ˆโ–ˆ\      โ–ˆโ–ˆ\     โ–ˆโ–ˆ\   โ–ˆโ–ˆ\   โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ\  โ–ˆโ–ˆ\     โ–ˆโ–ˆ\
โ–ˆโ–ˆโ–ˆ\    โ–ˆโ–ˆโ–ˆ |    โ–ˆโ–ˆ | โ–ˆโ–ˆ  |  \__โ–ˆโ–ˆ  __| \โ–ˆโ–ˆ\   โ–ˆโ–ˆ  |
โ–ˆโ–ˆโ–ˆโ–ˆ\  โ–ˆโ–ˆโ–ˆโ–ˆ |    โ–ˆโ–ˆ |โ–ˆโ–ˆ  /      โ–ˆโ–ˆ |     \โ–ˆโ–ˆ\ โ–ˆโ–ˆ  /
โ–ˆโ–ˆ\โ–ˆโ–ˆ\โ–ˆโ–ˆ โ–ˆโ–ˆ |    โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ  /       โ–ˆโ–ˆ |      \โ–ˆโ–ˆโ–ˆโ–ˆ  /
โ–ˆโ–ˆ \โ–ˆโ–ˆโ–ˆ  โ–ˆโ–ˆ |    โ–ˆโ–ˆ  โ–ˆโ–ˆ<        โ–ˆโ–ˆ |       \โ–ˆโ–ˆ  /
โ–ˆโ–ˆ |\โ–ˆ  /โ–ˆโ–ˆ |    โ–ˆโ–ˆ |\โ–ˆโ–ˆ\       โ–ˆโ–ˆ |        โ–ˆโ–ˆ |
โ–ˆโ–ˆ | \_/ โ–ˆโ–ˆ |โ–ˆโ–ˆ\ โ–ˆโ–ˆ | \โ–ˆโ–ˆ\ โ–ˆโ–ˆ\  โ–ˆโ–ˆ |โ–ˆโ–ˆ\     โ–ˆโ–ˆ |โ–ˆโ–ˆ\
\__|     \__|\__|\__|  \__|\__| \__|\__|    \__|\__|

This model was created for a 2025 graduation project at the Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), and is intended solely for academic exchange. Neither I nor my advisors are responsible for any consequences arising from use of this model.

  • ๐Ÿง‘โ€๐Ÿ’ป Project Author:

    • DU Yu (Chinese: ๆœๅฎ‡; Vietnamese: ฤแป— Vลฉ; Email: [email protected]), undergraduate student at the Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences)
  • ๐Ÿซ Thesis Advisors:

    • Academic Advisor: JIANG Wenfeng (Chinese: ๅงœๆ–‡ๅณฐ; Vietnamese: Khฦฐฦกng Vฤƒn Phong), Associate Professor, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences)
    • Industry Advisor: LI Jun (Chinese: ๆŽๅ›; Vietnamese: Lรฝ Quรขn), Shandong Strong (Shichuang) Software Training College, Ambow Education Group (NYSE: AMBO)

The complete project is open-sourced at https://github.com/duyu09/MKTY-System. Downloads and discussion are welcome.

๐Ÿ“„ Citation

@software{du_2025_17444889,
  author       = {Du, Yu},
  title        = {Minh Khoe Tue Y Smart Healthcare System},
  month        = oct,
  year         = 2025,
  publisher    = {Zenodo},
  version      = {v1.1.2},
  doi          = {10.5281/zenodo.17444889},
  url          = {https://github.com/duyu09/MKTY-System},
  swhid        = {swh:1:dir:a633243bf04e6ba18e2d5ffcf92ea57f73566f43;origin=https://doi.org/10.5281/zenodo.17444888;visit=swh:1:snp:37dc91d2c166a07c7dc8ebac0b4be97961b0267b;anchor=swh:1:rel:a88f82a5ca10d278bcc10734f5cfa560286a8b47;path=duyu09-MKTY-System-8edd0c9},
}