julian-schelb committed
Commit
f091ca6
1 Parent(s): b18e680

Update README.md

Files changed (1)
  1. README.md +10 -4
README.md CHANGED
@@ -24,6 +24,16 @@ datasets:
 
 ## Model description
 
+## Training data
+
+## Evaluation results
+
+This model achieves the following results (measured on the validation portion of the [wikiann](https://huggingface.co/datasets/wikiann) dataset):
+
+| Metric | Value |
+|:------:|:-----:|
+| loss   | 87.6  |
+
 ## About RoBERTa
 
 This model is a fine-tuned version of [XLM-RoBERTa](https://huggingface.co/xlm-roberta-large). The original model was pre-trained on 2.5TB of filtered CommonCrawl data containing 100 languages. It was introduced in the paper [Unsupervised Cross-lingual Representation Learning at Scale](https://arxiv.org/abs/1911.02116) by Conneau et al. and first released in [this repository](https://github.com/pytorch/fairseq/tree/master/examples/xlmr).
@@ -38,10 +48,6 @@ This way, the model learns an inner representation of 100 languages that can the
 
 This model is limited by its training dataset of entity-annotated news articles from a specific span of time. This may not generalize well for all use cases in different domains.
 
-## Training data
-
-## Metrics
-
 ## Usage
 
 You can use this model by using the AutoTokenizer and AutoModelForTokenClassification classes:
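The code block this last context line introduces lies outside the hunk, so the diff does not show it. For reference, a minimal sketch of what such usage typically looks like with the `transformers` API; the model id is an assumption, not something stated in this diff:

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Assumed Hub id for illustration; substitute this repository's actual model id.
model_id = "julian-schelb/roberta-ner-multilingual"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

text = "George Washington went to Washington."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Take the highest-scoring label for each token and print token/label pairs.
predicted_ids = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, label_id in zip(tokens, predicted_ids):
    print(token, model.config.id2label[label_id.item()])
```

The same result is available with less code via `pipeline("token-classification", model=model_id, aggregation_strategy="simple")`; the explicit classes are spelled out here because the sentence above names them.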
 
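For context on the evaluation table added in the first hunk: a validation loss of this kind is typically computed by tokenizing the wikiann validation split, aligning each word-level tag with the first sub-token of that word, and averaging the cross-entropy over labeled tokens. A sketch under stated assumptions (the `"en"` wikiann config and the same assumed model id as above; this is not the author's actual evaluation script):

```python
from datasets import load_dataset
from transformers import (AutoModelForTokenClassification, AutoTokenizer,
                          DataCollatorForTokenClassification, Trainer)

# Assumptions for illustration: the English wikiann config and an assumed model id.
model_id = "julian-schelb/roberta-ner-multilingual"
dataset = load_dataset("wikiann", "en", split="validation")

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

def tokenize_and_align(batch):
    # wikiann provides pre-split words ("tokens") and per-word tags ("ner_tags").
    encoded = tokenizer(batch["tokens"], truncation=True, is_split_into_words=True)
    labels = []
    for i, tags in enumerate(batch["ner_tags"]):
        word_ids = encoded.word_ids(batch_index=i)
        example, previous = [], None
        for word_id in word_ids:
            # Keep the tag on the first sub-token of each word; mask the rest with -100.
            example.append(-100 if word_id is None or word_id == previous else tags[word_id])
            previous = word_id
        labels.append(example)
    encoded["labels"] = labels
    return encoded

validation = dataset.map(tokenize_and_align, batched=True,
                         remove_columns=dataset.column_names)

trainer = Trainer(model=model,
                  data_collator=DataCollatorForTokenClassification(tokenizer))
print(trainer.evaluate(validation))  # the returned dict includes "eval_loss"
```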