hfl/cino-large

hfl-rc committed on Oct 23, 2021

Commit

a96c43c

1 Parent(s): b847c69

Update README.md

Files changed (1): README.md

@@ -16,7 +16,15 @@ Multilingual Pre-trained Language Model, such as mBERT, XLM-R, provide multiling
 We have seen rapid progress on building multilingual PLMs in recent year.
 However, there is a lack of contributions on building PLMs on Chines minority languages, which hinders researchers from building powerful NLP systems.
-To address the absence of Chinese minority PLMs, Joint Laboratory of HIT and iFLYTEK Research (HFL) proposes CINO (Chinese-miNOrity pre-trained language model), which is built on XLM-R with additional pre-training using Chinese minority corpus, such as Tibetan, Mongolian (Uighur form), Uyghur, Kazakh (Arabic form), Korean, Zhuang, Cantonese, etc.
+To address the absence of Chinese minority PLMs, Joint Laboratory of HIT and iFLYTEK Research (HFL) proposes CINO (Chinese-miNOrity pre-trained language model), which is built on XLM-R with additional pre-training using Chinese minority corpus, such as
+- Chinese，中文（zh）
+- Tibetan，藏语（bo）
+- Mongolian (Uighur form)，蒙语（mn）
+- Uyghur，维吾尔语（ug）
+- Kazakh (Arabic form)，哈萨克语（kk）
+- Korean，朝鲜语（ko）
+- Zhuang，壮语
+- Cantonese，粤语（yue）
 Please read our GitHub repository for more details (Chinese): https://github.com/ymcui/Chinese-Minority-PLM