---
tags:
- spacy
- token-classification
language:
- ja
license: cc-by-sa-3.0
model-index:
- name: ja_core_news_trf
  results:
  - task:
      name: NER
      type: token-classification
    metrics:
    - name: NER Precision
      type: precision
      value: 0.8227383863
    - name: NER Recall
      type: recall
      value: 0.8465408805
    - name: NER F Score
      type: f_score
      value: 0.8344699318
  - task:
      name: TAG
      type: token-classification
    metrics:
    - name: TAG (XPOS) Accuracy
      type: accuracy
      value: 0.9713282143
  - task:
      name: POS
      type: token-classification
    metrics:
    - name: POS (UPOS) Accuracy
      type: accuracy
      value: 0.979409718
  - task:
      name: MORPH
      type: token-classification
    metrics:
    - name: Morph (UFeats) Accuracy
      type: accuracy
      value: 0
  - task:
      name: LEMMA
      type: token-classification
    metrics:
    - name: Lemma Accuracy
      type: accuracy
      value: 0.9670499959
  - task:
      name: UNLABELED_DEPENDENCIES
      type: token-classification
    metrics:
    - name: Unlabeled Attachment Score (UAS)
      type: f_score
      value: 0.9304880245
  - task:
      name: LABELED_DEPENDENCIES
      type: token-classification
    metrics:
    - name: Labeled Attachment Score (LAS)
      type: f_score
      value: 0.9178365731
  - task:
      name: SENTS
      type: token-classification
    metrics:
    - name: Sentences F-Score
      type: f_score
      value: 0.9507246377
---
Details: https://spacy.io/models/ja#ja_core_news_trf
Japanese transformer pipeline (`Transformer(name='cl-tohoku/bert-base-japanese-char-v2', piece_encoder='char', stride=160, type='bert', width=768, window=216, vocab_size=6144)`). Components: transformer, morphologizer, parser, attribute_ruler, ner.
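
The `window=216` and `stride=160` settings suggest that long inputs are fed to the transformer in overlapping spans. The sketch below only illustrates that idea. It assumes both values are counted in the same unit and that each span starts `stride` items after the previous one; `strided_windows` is a hypothetical helper, not the library's implementation.

```python
def strided_windows(n_items, window=216, stride=160):
    """Return (start, end) offsets of overlapping spans covering n_items.

    Illustrative only: window and stride are taken from the model
    description above; the slicing logic itself is an assumption.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    spans = []
    start = 0
    while start < n_items:
        end = min(start + window, n_items)
        spans.append((start, end))
        if end == n_items:
            # The last span already reaches the end of the input.
            break
        start += stride
    return spans


# With window=216 and stride=160, neighbouring spans share 56 items.
for start, end in strided_windows(500):
    print(start, end)
```

Under these assumptions, any item near the edge of one span also appears closer to the middle of a neighbouring span, so it is seen with context on both sides.
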
| Feature | Description |
|---|---|
| Name | ja_core_news_trf |
| Version | 3.7.2 |
| spaCy | >=3.7.0,<3.8.0 |
| Default Pipeline | transformer, morphologizer, parser, attribute_ruler, ner |
| Components | transformer, morphologizer, parser, attribute_ruler, ner |
| Vectors | 0 keys, 0 unique vectors (0 dimensions) |
| Sources | UD Japanese GSD v2.8 (Omura, Mai; Miyao, Yusuke; Kanayama, Hiroshi; Matsuda, Hiroshi; Wakasa, Aya; Yamashita, Kayo; Asahara, Masayuki; Tanaka, Takaaki; Murawaki, Yugo; Matsumoto, Yuji; Mori, Shinsuke; Uematsu, Sumire; McDonald, Ryan; Nivre, Joakim; Zeman, Daniel)<br />UD Japanese GSD v2.8 NER (Megagon Labs Tokyo)<br />cl-tohoku/bert-base-japanese-char-v2 (Inui Laboratory, Tohoku University) |
| License | CC BY-SA 3.0 |
| Author | Explosion |
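
A minimal usage sketch, assuming the package has been installed (for example with `python -m spacy download ja_core_news_trf`) under a spaCy version in the range listed above. The helper names `load_pipeline`, `token_rows`, `entity_rows` and `demo`, as well as the sample sentence, are illustrative and not part of the package.

```python
def load_pipeline(name="ja_core_news_trf"):
    """Load the installed pipeline package by name."""
    # Imported lazily so the helpers below can be used without spaCy installed.
    import spacy

    return spacy.load(name)


def token_rows(doc):
    """Collect the per-token annotations set by the pipeline components."""
    return [
        (tok.text, tok.pos_, tok.tag_, tok.lemma_, tok.dep_, tok.head.text)
        for tok in doc
    ]


def entity_rows(doc):
    """Collect the named entities predicted by the ner component."""
    return [(ent.text, ent.label_) for ent in doc.ents]


def demo(text="東京大学は東京都文京区にあります。"):
    """Run the pipeline on one sentence and print its annotations."""
    nlp = load_pipeline()
    doc = nlp(text)
    for row in token_rows(doc):
        print(row)
    for row in entity_rows(doc):
        print(row)
```

Calling `demo()` prints one tuple per token (text, UPOS, XPOS, lemma, dependency label, head) followed by one tuple per entity. Because the table lists 0 vectors, similarity methods that rely on static word vectors are not expected to be meaningful with this pipeline.
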
Label Scheme
The label scheme comprises 64 labels for 3 components.
| Component | Labels |
|---|---|
| `morphologizer` | `POS=NOUN`, `POS=ADP`, `POS=VERB`, `POS=SCONJ`, `POS=AUX`, `POS=PUNCT`, `POS=PART`, `POS=DET`, `POS=NUM`, `POS=ADV`, `POS=PRON`, `POS=ADJ`, `POS=PROPN`, `POS=CCONJ`, `POS=SYM`, `POS=NOUN\|Polarity=Neg`, `POS=AUX\|Polarity=Neg`, `POS=INTJ`, `POS=SCONJ\|Polarity=Neg` |
| `parser` | `ROOT`, `acl`, `advcl`, `advmod`, `amod`, `aux`, `case`, `cc`, `ccomp`, `compound`, `cop`, `csubj`, `dep`, `det`, `dislocated`, `fixed`, `mark`, `nmod`, `nsubj`, `nummod`, `obj`, `obl`, `punct` |
| `ner` | `CARDINAL`, `DATE`, `EVENT`, `FAC`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `MOVEMENT`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PET_NAME`, `PHONE`, `PRODUCT`, `QUANTITY`, `TIME`, `TITLE_AFFIX`, `WORK_OF_ART` |
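
The morphologizer labels above pack a part-of-speech tag and optional features into one string, separated by `|`. As a small sketch of how such a label can be taken apart, the hypothetical helper `parse_morph_label` below splits it into a dictionary. The label list is copied from the table; the helper itself is not part of spaCy.

```python
MORPH_LABELS = [
    "POS=NOUN", "POS=ADP", "POS=VERB", "POS=SCONJ", "POS=AUX", "POS=PUNCT",
    "POS=PART", "POS=DET", "POS=NUM", "POS=ADV", "POS=PRON", "POS=ADJ",
    "POS=PROPN", "POS=CCONJ", "POS=SYM", "POS=NOUN|Polarity=Neg",
    "POS=AUX|Polarity=Neg", "POS=INTJ", "POS=SCONJ|Polarity=Neg",
]


def parse_morph_label(label):
    """Split a label such as 'POS=NOUN|Polarity=Neg' into a feature dict."""
    feats = {}
    for pair in label.split("|"):
        key, _, value = pair.partition("=")
        feats[key] = value
    return feats


# Distinct coarse POS tags, and the labels that carry a negative polarity feature.
pos_tags = sorted({parse_morph_label(label)["POS"] for label in MORPH_LABELS})
negated = [label for label in MORPH_LABELS if "Polarity" in parse_morph_label(label)]
print(pos_tags)
print(negated)
```

With the pipeline loaded, the same label sets should also be available from the components themselves through their `labels` property, for example `nlp.get_pipe("ner").labels`.
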
Accuracy
| Type | Score |
|---|---|
| `TOKEN_ACC` | 99.37 |
| `TOKEN_P` | 97.64 |
| `TOKEN_R` | 97.88 |
| `TOKEN_F` | 97.76 |
| `POS_ACC` | 97.94 |
| `MORPH_ACC` | 0.00 |
| `MORPH_MICRO_P` | 34.01 |
| `MORPH_MICRO_R` | 98.04 |
| `MORPH_MICRO_F` | 50.51 |
| `SENTS_P` | 93.18 |
| `SENTS_R` | 97.04 |
| `SENTS_F` | 95.07 |
| `DEP_UAS` | 93.05 |
| `DEP_LAS` | 91.78 |
| `TAG_ACC` | 97.13 |
| `LEMMA_ACC` | 96.70 |
| `ENTS_P` | 82.27 |
| `ENTS_R` | 84.65 |
| `ENTS_F` | 83.45 |
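
Each `_F` score in the table is consistent with the harmonic mean of its `_P` and `_R` values. The sketch below recomputes them as a sanity check. The 0.02 tolerance is an assumption that allows for the table values having been rounded to two decimals, and `f_score` and `check_reported` are illustrative helpers.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) copied from the accuracy table.
REPORTED = {
    "TOKEN": (97.64, 97.88, 97.76),
    "MORPH_MICRO": (34.01, 98.04, 50.51),
    "SENTS": (93.18, 97.04, 95.07),
    "ENTS": (82.27, 84.65, 83.45),
}


def check_reported(tolerance=0.02):
    """Return, per metric, whether the recomputed F matches the reported one."""
    return {
        name: abs(f_score(p, r) - f) <= tolerance
        for name, (p, r, f) in REPORTED.items()
    }


for name, ok in check_reported().items():
    print(name, "consistent" if ok else "mismatch")
```

The `MORPH_MICRO` row shows why the harmonic mean is used: a recall of 98.04 cannot compensate for a precision of 34.01, so the F-score stays close to the weaker of the two.
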