Synthetic baselines trained for our paper "Scaling Low-Resource MT via Synthetic Data Generation with LLMs", accepted to the main conference of EMNLP 2025.
			
	
	AI & ML interests
At the University of Helsinki, we focus on:
- NLP for morphologically-rich languages
- Cross-lingual NLP
- NLP in the humanities
	
			Organization Card
		
Helsinki-NLP is the language technology research group at the University of Helsinki. Here we publish various resources related to multilingual NLP, with machine translation and text simplification among the application areas. We focus on wide language coverage, open data sets, and public pre-trained models.
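As a rough illustration of how one of the public pre-trained translation models might be used, here is a minimal sketch with the Hugging Face `transformers` library. It assumes the checkpoint loads with the Marian classes and that multi-target checkpoints pick the output language from a leading `>>xxx<<` token; both should be checked against the individual model card. The helper names `with_target_token` and `translate` are hypothetical and introduced only for this example.

```python
"""Sketch: translating with a multi-target OPUS-MT checkpoint (assumptions noted inline)."""


def with_target_token(text: str, target_lang: str) -> str:
    """Prefix a sentence with a target-language token such as '>>eng<<'.

    Assumption: the checkpoint selects its output language from this
    leading token; the model card lists the valid language codes.
    """
    return f">>{target_lang}<< {text}"


def translate(sentences, target_lang, model_name):
    """Translate a list of sentences; downloads the checkpoint on first use."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)

    # Tokenize the prefixed sentences as one padded batch of PyTorch tensors.
    batch = tokenizer(
        [with_target_token(s, target_lang) for s in sentences],
        return_tensors="pt",
        padding=True,
    )
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


# Example call (not executed here because it downloads the checkpoint).
# The source sentence and the "eng" target are illustrative choices:
# translate(
#     ["Guten Morgen."],
#     "eng",
#     "Helsinki-NLP/opus-mt-tc-bible-big-mul-deu_eng_fra_por_spa",
# )
```

Whether a given source or target language is covered by a particular checkpoint is not stated on this page, so the language codes above are placeholders to be replaced after consulting the model card.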
Multilingual translation models trained on the Tatoeba Translation Challenge dataset (from OPUS) and a massively multilingual Bible corpus:
- Helsinki-NLP/opus-mt-tc-bible-big-aav-fra_ita_por_spa • Translation • 0.2B • 19 • 2
- Helsinki-NLP/opus-mt-tc-bible-big-afa-en • Translation • 0.2B • 12 • 1
- Helsinki-NLP/opus-mt-tc-bible-big-afa-deu_eng_nld • Translation • 0.2B • 6 • 2
- Helsinki-NLP/opus-mt-tc-bible-big-afa-deu_eng_fra_por_spa • Translation • 0.2B • 111
Models (1,534)
- Helsinki-NLP/opus-mt-synthetic-en-eu • 172 • 1
- Helsinki-NLP/opus-mt-synthetic-en-mk • 8
- Helsinki-NLP/opus-mt-synthetic-en-ka • 5
- Helsinki-NLP/opus-mt-synthetic-en-so • 217 • 1
- Helsinki-NLP/opus-mt-synthetic-en-is • 10 • 1
- Helsinki-NLP/opus-mt-synthetic-en-uk • 18
- Helsinki-NLP/opus-mt-synthetic-en-gd • 18
- Helsinki-NLP/simple-finnish-gpt3-xl • Text Generation • 1B • 17 • 1
- Helsinki-NLP/opus-mt-tc-bible-big-deu_eng_fra_por_spa-mul • Translation • 0.2B • 716 • 1
- Helsinki-NLP/opus-mt-tc-bible-big-mul-deu_eng_fra_por_spa • Translation • 0.2B • 22 • 2
Datasets (51)
- Helsinki-NLP/nemotron-cc-translated • Preview • 2.78k
- Helsinki-NLP/fineweb-edu-translated • Preview • 349k • 1
- Helsinki-NLP/OpenSubtitles2024 • Viewer • 570M • 156 • 2
- Helsinki-NLP/shroom • Preview • 4
- Helsinki-NLP/mu-shroom • Viewer • 11.5k • 77 • 4
- Helsinki-NLP/tatoeba_mt_train • Viewer • 13.7B • 1.07k • 1
- Helsinki-NLP/tatoeba_mt • 1.71k • 61
- Helsinki-NLP/un_pc • Viewer • 323M • 2.3k • 23
- Helsinki-NLP/un_ga • Viewer • 1.11M • 146 • 3
- Helsinki-NLP/opus_books • Viewer • 1.25M • 25k • 79
