Difference between this and v1.0
#1, opened by matous-volf
What is the difference between this model and version 1.0? This one doesn't have its model card filled in. Is it ready to use? And what about the ModernBERT versions? Thanks!
The ModernBERT versions are ready to use. However, they perform slightly worse than the DeBERTa models, and we recommend them only on a GPU with flash attention. Without flash attention, they will also be slightly slower than the DeBERTa models.
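As a rough illustration of that recommendation, here is a minimal sketch that checks for a CUDA GPU and the `flash_attn` package and picks a model family accordingly. The helper names `choose_model_family` and `detect_environment` are hypothetical and not part of any library. The thread does not name the model repositories, so none are filled in here. The only real API assumed is the `attn_implementation="flash_attention_2"` keyword that `transformers` accepts in `from_pretrained`.

```python
import importlib.util
from typing import Dict, Optional, Tuple


def choose_model_family(has_cuda: bool, has_flash_attn: bool) -> Dict[str, Optional[str]]:
    """Hypothetical helper: pick a family following the advice in this thread."""
    if has_cuda and has_flash_attn:
        # ModernBERT is only recommended on a GPU with flash attention.
        return {"family": "ModernBERT", "attn_implementation": "flash_attention_2"}
    # Otherwise fall back to DeBERTa and leave the attention setting at its default.
    return {"family": "DeBERTa", "attn_implementation": None}


def detect_environment() -> Tuple[bool, bool]:
    """Report (CUDA available, flash_attn installed) without requiring torch."""
    has_flash_attn = importlib.util.find_spec("flash_attn") is not None
    has_cuda = False
    if importlib.util.find_spec("torch") is not None:
        import torch  # imported lazily so the sketch runs without torch installed

        has_cuda = torch.cuda.is_available()
    return has_cuda, has_flash_attn


if __name__ == "__main__":
    choice = choose_model_family(*detect_environment())
    print(choice)
```

If the choice is ModernBERT, the returned `attn_implementation` value could be passed on to `from_pretrained` together with whichever model id you are using.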
You are free to use version 1.1, but you should not notice any significant difference between 1.0 and 1.1. We trained the 1.1 versions primarily to collect more data on overfitting for the paper's reviewers. If we don't need to make any additional changes before publication, we will update the model cards for the 1.1 models.