Biomedical Language Models are Robust to Sub-optimal Tokenization
Paper: arXiv:2306.17649 (BioNLP @ ACL 2023)
We replicate the PubMedBERT model using the same data, hardware, and code as our new BioVocabBERT model to ensure a fair comparison between the two.
Details about our pre-training procedure and downstream results can be found in our BioNLP @ ACL 2023 paper.
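Both checkpoints are standard BERT-style masked language models, so they can be loaded with the Hugging Face `transformers` library. The sketch below is a minimal, hedged example; the model ID shown is a placeholder, not a confirmed repository name.

```python
# Minimal sketch: loading a BERT-style biomedical checkpoint with transformers.
# NOTE: "your-org/BioVocabBERT" is a placeholder model ID, not a confirmed repo name.
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "your-org/BioVocabBERT"  # placeholder; substitute the released checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Example: score a masked biomedical sentence
text = "The patient was treated with [MASK] for hypertension."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch, sequence_length, vocab_size)
```

The same snippet works for the PubMedBERT replica by swapping the model ID, since both models share the architecture and training code.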