gsarti/opus-mt-tc-base-en-ja

text2text-generation


Add multilingual to the language tag (#1)

355f253 almost 3 years ago


	---
	language:
	- en
	- ja
	- multilingual
	license: cc-by-4.0
	tags:
	- translation
	- opus-mt-tc
	model-index:
	- name: opus-mt-tc-base-en-ja
	  results:
	  - task:
	      type: translation
	      name: Translation eng-jpn
	    dataset:
	      name: tatoeba-test-v2021-08-07
	      type: tatoeba_mt
	      args: eng-jpn
	    metrics:
	    - type: bleu
	      value: 15.2
	      name: BLEU
	---

	# Opus Tatoeba English-Japanese

	This model was obtained by running the script [convert_marian_to_pytorch.py](https://github.com/huggingface/transformers/blob/master/src/transformers/models/marian/convert_marian_to_pytorch.py). The original models were trained by [Jörg Tiedemann](https://blogs.helsinki.fi/tiedeman/) using the [MarianNMT](https://marian-nmt.github.io/) library. See all available `MarianMTModel` models on the profile of the [Helsinki NLP](https://huggingface.co/Helsinki-NLP) group.

	* dataset: opus+bt
	* model: transformer-align
	* source language(s): eng
	* target language(s): jpn
	* pre-processing: normalization + SentencePiece (spm32k,spm32k)
	* download: [opus+bt-2021-04-10.zip](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-jpn/opus+bt-2021-04-10.zip)
	* test set translations: [opus+bt-2021-04-10.test.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-jpn/opus+bt-2021-04-10.test.txt)
	* test set scores: [opus+bt-2021-04-10.eval.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-jpn/opus+bt-2021-04-10.eval.txt)
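Since the checkpoint was converted for `MarianMTModel`, it can presumably be loaded with the Marian classes from `transformers`. The following is a minimal sketch, not a tested recipe from the model authors: the repository id `gsarti/opus-mt-tc-base-en-ja` is taken from the page header, the helper name `translate` is hypothetical, and default generation settings are assumed.

```python
from typing import List

# Assumption: repository id as shown in the page header.
MODEL_ID = "gsarti/opus-mt-tc-base-en-ja"


def translate(sentences: List[str], model_id: str = MODEL_ID) -> List[str]:
    """Translate English sentences to Japanese (hypothetical helper)."""
    if not sentences:
        return []

    # Imported lazily so the module can be imported without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_id)
    model = MarianMTModel.from_pretrained(model_id)

    # Tokenize the batch; padding lets sentences of different length share a tensor.
    batch = tokenizer(sentences, return_tensors="pt", padding=True)

    # Generate with the model's default decoding settings.
    generated = model.generate(**batch)

    # Convert token ids back to text, dropping padding and end-of-sentence markers.
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(["How are you?"])` downloads the weights on first use and returns a list with one Japanese string. It needs `transformers`, `sentencepiece`, and PyTorch installed.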

	## Benchmarks

	| testset | BLEU | chr-F | #sent | #words | BP |
	|---------|------|-------|-------|--------|----|
	| Tatoeba-test.eng-jpn | 15.2 | 0.258 | 10000 | 99206 | 1.000 |
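The BP column is BLEU's brevity penalty. As a short illustration of why it reads 1.000, here is the standard definition in code: the penalty is 1 when the system output is at least as long as the reference, and `exp(1 - ref_len / hyp_len)` otherwise. The function name is hypothetical, and the hypothesis lengths below are made-up inputs. The card does not say whether `#words` counts reference or hypothesis tokens, so treating 99206 as the reference length is an assumption.

```python
import math


def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty for total hypothesis/reference lengths."""
    if hyp_len <= 0:
        return 0.0  # empty output: BLEU is zero anyway
    if hyp_len >= ref_len:
        return 1.0  # output is long enough, no penalty
    return math.exp(1.0 - ref_len / hyp_len)


# Assumption: 99206 is the reference length from the table above.
REF_LEN = 99206

# An output at least as long as the reference is not penalized.
print(f"{brevity_penalty(REF_LEN, REF_LEN):.3f}")

# A made-up shorter output (90000 tokens) would be penalized (value below 1).
print(f"{brevity_penalty(90000, REF_LEN):.3f}")
```

A BP of 1.000 therefore only indicates that the system output was not shorter than the reference overall. It says nothing about n-gram precision, which is what drives the BLEU score of 15.2.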