matsten committed on
Commit fcb3932 · verified · 1 Parent(s): 5c666cf

Update README.md

Files changed (1)
  1. README.md +10 -1
README.md CHANGED
@@ -4,5 +4,14 @@ language:
  base_model:
  - cis-lmu/glot500-base
  ---
-
- Inuktitut morphological segmenter model reported in the paper: ADD WHEN PUBLISHED
+ The model Glot500-m-iuseg is a fine-tuned version of the Glot500-m model. It was fine-tuned to segment Inuktitut words by morpheme boundaries and is intended to be used as a pre-processing tool for the language.
+
+
+ The model found in this repository is our best performing fine-tuned model described in the paper: "Surface-Level Morphological Segmentation of Low-resource Inuktitut Using Pre-trained Large Language Models" (link will be added when published)
+
+ **Datasets used:**
+ The Nunavut Hansard Inuktitut–English Parallel Corpus 3.0 with Preliminary Machine Translation Results: https://aclanthology.org/2020.lrec-1.312/
+
+ **Method used:**
+ LLMSegm: Surface-level Morphological Segmentation Using Large Language Model: https://aclanthology.org/2024.lrec-main.933/
+
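
For readers unfamiliar with the term, surface-level segmentation means that the segments, joined back together, reproduce the original word exactly. The sketch below only illustrates that idea on a toy English word. It is not this model's API and not the LLMSegm method: the helper name `split_at_boundaries` and the boundary offsets are assumptions made for illustration, standing in for whatever boundary positions a segmenter would predict.

```python
def split_at_boundaries(word, boundaries):
    """Split `word` into surface segments at the given character offsets.

    `boundaries` is a hypothetical list of cut positions (0 < b < len(word)),
    e.g. as a boundary-predicting segmenter might produce.
    """
    # Add the start and end of the word, then slice between consecutive cuts.
    cuts = [0, *sorted(set(boundaries)), len(word)]
    return [word[a:b] for a, b in zip(cuts, cuts[1:]) if a != b]


# Toy example (English, hand-picked offsets; not model output).
segments = split_at_boundaries("unhappiness", [2, 7])
print("-".join(segments))  # un-happi-ness
```

Because the segments are plain slices of the input, concatenating them always yields the original word, which is the property that distinguishes surface segmentation from canonical segmentation.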