m3rg-iitd committed · verified
Commit 1929191 · Parent(s): 0d2334c

Update README.md

Files changed (1):
  1. README.md +63 -15
README.md CHANGED
@@ -14,49 +14,97 @@ tags:
- table understanding
- table data parsing
---

# Model Card for LLaMat-3-Chat

## Overview

**LLaMat-3-Chat** is a specialized large language model designed to serve as an AI copilot for materials research. Finetuned from **LLaMat-3**, it is adapted for tasks such as information extraction from material science text and tabular data, and assists researchers in analyzing and interpreting material science literature, reports, and datasets.

For more details, refer to our paper: [Foundational Large Language Models for Materials Research](https://arxiv.org/abs/2412.09560).
### Model Details

- **Model Type:** Large Language Model (LLM)
- **Base Model:** LLaMat-3 (continued pretraining of LLaMA-3 on material science data)
- **Language:** English
- **License:** LLaMA-3 License
- **Tags:** Material Science, Domain Adaptation, Table Understanding, Scientific Data Parsing, Materials Copilot
- **Developed by:** [M3RG, IIT Delhi](https://github.com/M3RG-IITD/) & [DAIR, IIT Delhi](https://github.com/dair-iitd)
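
The model can be loaded with the Hugging Face Transformers stack listed under Software Stack below. A minimal loading sketch follows; the hub id `m3rg-iitd/llamat-3-chat` is an assumption based on this repository's location, so adjust it to the actual model path.

```python
# Minimal sketch: load LLaMat-3-Chat with Hugging Face Transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m3rg-iitd/llamat-3-chat"  # assumed hub path; adjust as needed

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # fits on a single A100 80GB, matching the inference setup below
    device_map="auto",           # requires the `accelerate` package
)
```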

---

## Intended Use

LLaMat-3-Chat is designed to assist researchers, scientists, and industry professionals in:
- Extracting structured information from material science texts and tables (see the sketch after this section).
- Analyzing experimental results and processing large datasets.
- Assisting in literature review and knowledge discovery.
- Supporting research-driven natural language queries related to material science.

This model is intended for academic and industrial research purposes.
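
As an illustration of the extraction use case above, here is a hedged sketch of a structured-extraction query. It assumes the `model` and `tokenizer` objects from the loading sketch under Model Details, and that the tokenizer ships a LLaMA-3-style chat template; the prompt and example sentence are illustrative, not taken from the paper.

```python
# Sketch: ask LLaMat-3-Chat to extract compositions from a materials sentence.
messages = [
    {
        "role": "user",
        "content": (
            "Extract every material composition mentioned in the following text "
            "and return them as a JSON list: 'The glass series (1-x)SiO2-xNa2O "
            "was prepared for x = 0.1, 0.2, and 0.3.'"
        ),
    }
]

# Build the prompt with the tokenizer's chat template (assumed to be present).
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```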

---

## Technical Specifications

### Hardware Infrastructure

- **Pretraining:** 2 Cerebras CS-2 Wafer-Scale Engines (WSE-2)
- **Finetuning:** 8 NVIDIA A100 80GB GPUs
- **Inference:** 1 NVIDIA A100 80GB GPU

### Software Stack

- **Frameworks:** PyTorch, Hugging Face Transformers

---

## Training Data

LLaMat-3-Chat was trained on a curated corpus of material science literature, scientific papers, structured datasets, and technical reports. The training set includes:

- Material science research papers published in Elsevier and Springer journals
- Material science community discourse
- The RedPajama dataset
- The OpenOrca instruction-finetuning dataset
- The MathQA dataset
- The MatSciNLP benchmark dataset
- Task-specific datasets (listed in Table A.2 of [Foundational Large Language Models for Materials Research](https://arxiv.org/abs/2412.09560))

---

## Results

Detailed results and comparisons with existing models are reported in [Foundational Large Language Models for Materials Research](https://arxiv.org/abs/2412.09560).

---

## Development and Support

- **Developed by:** [M3RG, IIT Delhi](https://github.com/M3RG-IITD/) & [DAIR, IIT Delhi](https://github.com/dair-iitd)
- **Compute Support:**
  - **IIT Delhi High-Performance Computing Cluster:** Supported the fine-tuning and inference stages.
  - **Edinburgh International Data Facility (EIDF):** Provided access to [Cerebras CS-2 clusters](https://edinburgh-international-data-facility.ed.ac.uk/services/computing/cerebras-cs) for pretraining.

---

## Repository

Training and evaluation code is available in the [LLaMat-3 repository on GitHub](https://github.com/M3RG-IITD/llamat).

---

## Citation

If you use LLaMat-3-Chat in your research, please cite our work:

```bibtex
@article{LLaMat-3,
  author  = {Vaibhav Mishra and Somaditya Singh and Dhruv Ahlawat and Mohd Zaki and Vaibhav Bihani and Hargun Singh Grover and Biswajit Mishra and Santiago Miret and Mausam and N. M. Anoop Krishnan},
  title   = {Foundational Large Language Models for Materials Research},
  journal = {arXiv preprint arXiv:2412.09560},
  year    = {2024},
  url     = {https://arxiv.org/abs/2412.09560}
}
```