RareSeek-R1: A specialized language model for rare disease diagnosis and reasoning

RareSeek-R1 is a domain-specialized large language model for rare-disease diagnostic reasoning, developed with a Progressive Parameter-Efficient Transfer Learning framework. The model is first instruction-tuned on RareMed-Corpus, a large, clinically grounded, multi-source dataset that integrates medical textbooks, guidelines, biomedical literature, and real-world EHR narratives. It is then fine-tuned on RareMed-CoT, a high-fidelity corpus designed to instill explicit, stepwise clinical reasoning aligned with real diagnostic workflows. To further improve factual reliability, GraphRAG anchors the model's inference to up-to-date variant–gene–phenotype–disease relationships. This retrieval augmentation substantially reduces hallucinations, improves factual calibration, and yields notable performance gains, particularly when EHR narratives are combined with prioritized genetic variants. Taken together, RareSeek-R1 reasons directly over full-length EHRs, leverages graph-grounded retrieval, and demonstrably augments clinician-level diagnostic accuracy, advancing a reliable and scalable AI paradigm for rare-disease diagnosis.
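To make the graph-grounded retrieval idea concrete, here is a minimal sketch of how variant–gene–phenotype–disease facts could be retrieved and prepended to a prompt. It is not the RareSeek-R1 pipeline: the triples, relation names (`in_gene`, `associated_with`, `has_phenotype`), function names, and prompt layout are all hypothetical placeholders chosen for illustration, and the entity names are deliberately synthetic rather than real biomedical facts.

```python
from collections import defaultdict

# Toy knowledge graph as (head, relation, tail) triples.
# All names are synthetic placeholders, not real biomedical facts.
TRIPLES = [
    ("VARIANT_1", "in_gene", "GENE_A"),
    ("VARIANT_2", "in_gene", "GENE_B"),
    ("GENE_A", "associated_with", "DISEASE_X"),
    ("GENE_B", "associated_with", "DISEASE_Y"),
    ("DISEASE_X", "has_phenotype", "PHENOTYPE_1"),
    ("DISEASE_X", "has_phenotype", "PHENOTYPE_2"),
    ("DISEASE_Y", "has_phenotype", "PHENOTYPE_3"),
]


def build_index(triples):
    """Index tails by (head, relation) for constant-time neighbor lookup."""
    index = defaultdict(list)
    for head, relation, tail in triples:
        index[(head, relation)].append(tail)
    return index


def retrieve_facts(index, variants, phenotypes):
    """Walk variant -> gene -> disease and rank diseases by phenotype overlap."""
    observed = set(phenotypes)
    candidates = []
    for variant in variants:
        for gene in index.get((variant, "in_gene"), []):
            for disease in index.get((gene, "associated_with"), []):
                known = set(index.get((disease, "has_phenotype"), []))
                overlap = sorted(known & observed)
                candidates.append((len(overlap), variant, gene, disease, overlap))
    # Highest phenotype overlap first; disease name breaks ties.
    candidates.sort(key=lambda c: (-c[0], c[3]))
    return [
        f"{variant} lies in {gene}; {gene} is associated with {disease}; "
        f"matching phenotypes: {', '.join(overlap) or 'none'}"
        for _, variant, gene, disease, overlap in candidates
    ]


def build_prompt(ehr_text, facts):
    """Prepend retrieved facts to the patient record (hypothetical layout)."""
    lines = ["Retrieved graph facts:"]
    lines += [f"- {fact}" for fact in facts]
    lines += ["", "Patient record:", ehr_text, "",
              "List the most likely diagnoses with reasoning."]
    return "\n".join(lines)


INDEX = build_index(TRIPLES)
FACTS = retrieve_facts(
    INDEX,
    variants=["VARIANT_1", "VARIANT_2"],
    phenotypes=["PHENOTYPE_1", "PHENOTYPE_2"],
)
PROMPT = build_prompt("<EHR narrative goes here>", FACTS)
print(PROMPT)
```

In this toy setup the candidate whose known phenotypes best match the observed ones is listed first, so the language model sees the most relevant graph evidence at the top of its context. A production system would replace the in-memory triples with a curated knowledge graph and a more careful ranking scheme.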

RareSeek-R1 Teaser Image

RareMedData: https://huggingface.co/datasets/TaoMedAI/RareMedData

Model size: 71B params (Safetensors)
Tensor type: BF16
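As a rough guide to hardware requirements, the listed parameter count and tensor type imply a lower bound on the memory needed just to hold the weights. The short calculation below assumes exactly 71 billion parameters (the card's figure is rounded) at 2 bytes each for BF16, and ignores activations, the KV cache, and framework overhead, so real usage will be higher.

```python
PARAMS = 71_000_000_000  # "71B params" from the model card (rounded figure)
BYTES_PER_PARAM = 2      # BF16 stores each value in 16 bits

weight_bytes = PARAMS * BYTES_PER_PARAM
weight_gb = weight_bytes / 1e9      # decimal gigabytes
weight_gib = weight_bytes / 2**30   # binary gibibytes

print(f"{weight_gb:.0f} GB ({weight_gib:.1f} GiB) for weights alone")
# prints "142 GB (132.2 GiB) for weights alone"
```

At this size the weights will not fit on a single 80 GB accelerator in BF16, so multi-GPU sharding or quantization would likely be needed.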