Update README.md
README.md (changed)
@@ -43,14 +43,13 @@ I've found the `cross-encoders/roberta-large-stsb` model to be very useful in cr
 They're simple to use, fast and very accurate.
 
 The Ettin series followed up with new encoders trained on the ModernBERT architecture, with a range of sizes, starting at 17M.
-
-however, the reduced parameters and computationally efficient interleaved local/global attention layers make this a very fast model,
+The reduced parameters and computationally efficient interleaved local/global attention layers make this a very fast model,
 which can easily process a few hundred sentence pairs per second on CPU, and a few thousand per second on my A6000.
 
 ---
 
 ## Features
-- **High performing:** Achieves **Pearson: 0.
+- **High performing:** Achieves **Pearson: 0.8316** and **Spearman: 0.8211** on the STS-Benchmark test set.
 - **Efficient architecture:** Based on the Ettin-encoder design (17M parameters), offering very fast inference speeds.
 - **Extended context length:** Processes sequences up to 8192 tokens, great for LLM output evals.
 - **Diversified training:** Pretrained on `dleemiller/wiki-sim` and fine-tuned on `sentence-transformers/stsb`.

@@ -65,7 +64,7 @@ which can easily process a few hundred sentence pairs per second on CPU, and a f
 | `ModernCE-base-sts` | **0.9162** | **0.9122** | **8192** | 149M | **Fast** |
 | `stsb-roberta-large` | 0.9147 | - | 512 | 355M | Slow |
 | `stsb-distilroberta-base` | 0.8792 | - | 512 | 82M | Fast |
-| `EttinX-sts-xxs` |
+| `EttinX-sts-xxs` | 0.8316 | 0.8211 | **8192** | 17M | **Very Fast** |
 
 
 ---

@@ -107,8 +106,8 @@ Fine-tuning was performed on the [`sentence-transformers/stsb`](https://huggingf
 
 ### Validation Results
 The model achieved the following test set performance after fine-tuning:
-- **Pearson Correlation:** 0.
-- **Spearman Correlation:** 0.
+- **Pearson Correlation:** 0.8316
+- **Spearman Correlation:** 0.8211
 
 ---
 
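For reference, the model this README describes is used through the standard `sentence-transformers` cross-encoder API: you pass sentence pairs and get back similarity scores. A minimal sketch, assuming the checkpoint is published as `dleemiller/EttinX-sts-xxs` (the exact model id is not stated in the diff; substitute the real one):

```python
from sentence_transformers import CrossEncoder

# Assumed model id for the 17M EttinX STS cross-encoder; replace with the actual checkpoint.
model = CrossEncoder("dleemiller/EttinX-sts-xxs")

# Score sentence pairs; an STS-B-style regression head returns similarity roughly in [0, 1].
pairs = [
    ("A man is playing a guitar.", "Someone is performing music on a guitar."),
    ("A man is playing a guitar.", "A chef is cooking pasta in a kitchen."),
]
scores = model.predict(pairs)
for (a, b), s in zip(pairs, scores):
    print(f"{s:.3f}  {a!r} <-> {b!r}")
```

Because inference is a single forward pass per pair, batching many pairs into one `predict` call is what makes the throughput quoted above (a few hundred pairs per second on CPU) attainable.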
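The validation numbers should be checkable against the test split of the `sentence-transformers/stsb` dataset named in the README. A sketch of that evaluation, under the same assumed model id; `sentence1`, `sentence2`, and `score` are the columns of that dataset, with gold scores normalized to [0, 1]:

```python
from datasets import load_dataset
from scipy.stats import pearsonr, spearmanr
from sentence_transformers import CrossEncoder

model = CrossEncoder("dleemiller/EttinX-sts-xxs")  # assumed model id

# Test split of the fine-tuning dataset; gold scores are normalized to [0, 1].
test = load_dataset("sentence-transformers/stsb", split="test")
preds = model.predict(list(zip(test["sentence1"], test["sentence2"])))

print(f"Pearson:  {pearsonr(preds, test['score'])[0]:.4f}")
print(f"Spearman: {spearmanr(preds, test['score'])[0]:.4f}")
```

If the checkpoint matches the README, this should print values close to 0.8316 and 0.8211.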