# Model Card for Astro-HEP-BERT
**Astro-HEP-BERT** is a bidirectional transformer designed primarily to generate contextualized word embeddings for computational conceptual analysis in astrophysics and high-energy physics (HEP). Built on Google's `bert-base-uncased`, the model was trained for three additional epochs on the <a target="_blank" rel="noopener noreferrer" href="https://huggingface.co/datasets/arnosimons/astro-hep-corpus">Astro-HEP Corpus</a>, which contains 21.84 million paragraphs drawn from more than 600,000 scholarly articles on arXiv, all pertaining to astrophysics and/or high-energy physics. The sole training objective was **Masked Language Modeling (MLM)**.
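
Since Astro-HEP-BERT is a standard BERT checkpoint, it can be used directly with the Hugging Face `transformers` library. Below is a minimal sketch of extracting a contextualized embedding for a single target word; the model ID `arnosimons/astro-hep-bert` (inferred from the corpus namespace), the example sentence, and the use of the last hidden layer are illustrative assumptions, not a prescribed workflow.

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "arnosimons/astro-hep-bert"  # assumed ID; adjust if it differs

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

sentence = "The event horizon hides the interior of the black hole."
target = "horizon"

inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state[0]  # (seq_len, 768)

# Find the target's subword span by matching its token ids in context.
# This simple search assumes the word tokenizes the same way in isolation.
target_ids = tokenizer(target, add_special_tokens=False)["input_ids"]
ids = inputs["input_ids"][0].tolist()
start = next(
    i for i in range(len(ids) - len(target_ids) + 1)
    if ids[i : i + len(target_ids)] == target_ids
)

# Average the subword vectors into one contextualized word embedding.
embedding = hidden[start : start + len(target_ids)].mean(dim=0)
print(embedding.shape)  # torch.Size([768])
```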
To optimize the model's ability to embed domain-specific language, **training was conducted exclusively on entire paragraphs**, rather than on sequences packed with as many sentences as possible, as BERT tutorials often suggest. This "full-paragraphs format" preserves sentences within their original context, which matters especially in academic writing, where a paragraph typically develops a single idea.
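
To make the contrast with the packed-sequence recipe concrete, here is a sketch of paragraph-level tokenization. It assumes the corpus exposes one paragraph per row in a `text` column; the column name and the truncation length are assumptions.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
corpus = load_dataset("arnosimons/astro-hep-corpus", split="train")

def tokenize_paragraph(batch):
    # One example per paragraph: truncate at BERT's 512-token limit,
    # but never concatenate or split paragraphs to pack sequences,
    # so each training example keeps its original discursive context.
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = corpus.map(
    tokenize_paragraph, batched=True, remove_columns=corpus.column_names
)
```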
The Astro-HEP-BERT project demonstrates the general feasibility of training a customized bidirectional transformer for computational conceptual analysis in the history, philosophy, and sociology of science as an open-source endeavor that does not require a substantial budget. Leveraging only freely available code, weights, and text inputs, the entire training process was conducted on a single MacBook Pro (M2, 96 GB).
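
For orientation, a continued-pretraining run along these lines can be assembled with the standard `Trainer` API, reusing the `tokenized` dataset from the sketch above. The hyperparameters below are illustrative placeholders, not the settings actually used for Astro-HEP-BERT.

```python
from transformers import (
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Standard BERT MLM objective: mask 15% of tokens at random.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="astro-hep-bert",
    num_train_epochs=3,              # three epochs, as described above
    per_device_train_batch_size=16,  # illustrative, not the authors' setting
    learning_rate=5e-5,              # illustrative, not the authors' setting
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    data_collator=collator,
    train_dataset=tokenized,
)
trainer.train()  # on Apple silicon, recent PyTorch uses the MPS backend automatically
```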
For further insights into the model, the corpus, and the underlying research project (<a target="_blank" rel="noopener noreferrer" href="https://doi.org/10.3030/101044932">Network Epistemology in Practice</a>), please refer to the following three papers:
1) <a target="_blank" rel="noopener noreferrer" href="https://arxiv.org/abs/2411.14877">Simons, A (2024). Astro-HEP-BERT: A bidirectional language model for studying the meanings of concepts in astrophysics and high energy physics. arXiv:2411.14877.</a>
2) <a target="_blank" rel="noopener noreferrer" href="https://arxiv.org/abs/2411.14073">Simons, A (2024). Meaning at the Planck scale? Contextualized word embeddings for doing history, philosophy, and sociology of science. arXiv:2411.14073.</a>
3) <a target="_blank" rel="noopener noreferrer" href="https://arxiv.org/abs/2506.12242">Simons, A; Zichert, M; and Wüthrich, A (2025). Large Language Models for History, Philosophy, and Sociology of Science: Interpretive Uses, Methodological Challenges, and Critical Perspectives. arXiv:2506.12242.</a>
## Model Details