We invite scrutiny of and feedback on our benchmark.
Obviously, this project was inspired by the commendable work of the teams behind RTEB and MTEB.
Rather than having a legal split of a multilingual benchmark, we think it makes the most sense to have full domain coverage in a single language first. In theory, someone could build, for instance, an MLEB-F (French) and use a mix of French, Belgian, Swiss, and other French-language legal documents to get full-spectrum coverage in the French language.
If you are interested in doing something like that, reach out to us and we'd love to exchange notes and guidance :).
Anyways, happy benchmarking!
@fzliu
@KennethEnevoldsen
@Samoed
@isaacchung
@tomaarsen
@fzoll
@Muennighoff
@nouamanetazi
@loicmagne
@nreimers
@clem