Update README.md
README.md
CHANGED
@@ -66,6 +66,8 @@ For more details, including benchmark evaluation, hardware requirements, and inf
 | WritingBench | 83.2 | 78.4 | 85.3 | 83.1 | 79.1 | 80.3 | **88.3** |
 | **Agent** | | | | | | | |
 | BFCL-v3 | 63.8 | 67.2 | **72.4** | 67.2 | 61.8 | 70.8 | 71.9 |
+| TAU1-Retail | 63.9 | 71.8 | 73.9 | **74.8** | - | 54.8 | 67.8 |
+| TAU1-Airline | **53.5** | 49.2 | 52.0 | 52.0 | - | 26.0 | 46.0 |
 | TAU2-Retail | 64.9 | 71.0 | **76.3** | 71.3 | - | 40.4 | 71.9 |
 | TAU2-Airline | 60.0 | 59.0 | **70.0** | 60.0 | - | 30.0 | 58.0 |
 | TAU2-Telecom | 33.3 | 42.0 | **60.5** | 37.4 | - | 21.9 | 45.6 |