update readme
README.md
CHANGED
@@ -36,7 +36,7 @@ The response should be a json of the form:
 The data used to train the classifier comes from the NADI 2021 dataset for Arabic Dialect Identification [(Abdul-Mageed et al., 2021)](#cite-mageed-2021).
 It is a corpus of tweets collected using Twitter's API and labeled thanks to the users location with the country and region.
 
-I used the language model `https://huggingface.co/moussaKam/AraBART` to extract features from the input text by taking the output of its last hidden layer. I used these
+I used the language model `https://huggingface.co/moussaKam/AraBART` to extract features from the input text by taking the output of its last hidden layer. I used these vector embeddings as the input for a Multinomial Logistic Regression to classify the input text into one of the 21 dialects (Countries).
 
 For more detail, please refer to the docs directory.
 
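
The line added in this change describes a two-stage pipeline: embeddings taken from the last hidden layer of AraBART, followed by a multinomial logistic regression over the dialect labels. The sketch below illustrates only the shape of that pipeline and is not the repository's code. It makes several assumptions that the README does not state: the per-token vectors are mean-pooled into one vector per text, the classifier is trained with plain stochastic gradient descent, and small hand-written vectors with 3 classes stand in for real AraBART outputs and the 21 dialects. The names `mean_pool`, `fit`, and `predict_proba` are hypothetical.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the per-token vectors of non-padding tokens (assumed pooling)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    return [sum(column) / len(kept) for column in zip(*kept)]


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, biases, x):
    """Class probabilities of a multinomial logistic regression for one vector."""
    scores = [
        sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    return softmax(scores)


def fit(X, y, n_classes, lr=0.5, epochs=200):
    """Train softmax regression with per-sample gradient descent (a stand-in solver)."""
    dim = len(X[0])
    weights = [[0.0] * dim for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        for x, label in zip(X, y):
            probs = predict_proba(weights, biases, x)
            for k in range(n_classes):
                # Gradient of the cross-entropy loss w.r.t. the score of class k.
                grad = probs[k] - (1.0 if k == label else 0.0)
                biases[k] -= lr * grad
                for j in range(dim):
                    weights[k][j] -= lr * grad * x[j]
    return weights, biases


# Toy 2-dimensional "embeddings" for 3 classes; real inputs would be
# pooled AraBART vectors and the labels would cover 21 countries.
X = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [-1.0, -1.0], [-0.9, -1.1]]
y = [0, 0, 1, 1, 2, 2]
weights, biases = fit(X, y, n_classes=3)

# Pool three token vectors (the last one is padding) and classify the result.
text_vector = mean_pool([[1.0, 0.2], [0.8, -0.2], [5.0, 5.0]], [1, 1, 0])
probs = predict_proba(weights, biases, text_vector)
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted, [round(p, 3) for p in probs])
```

In a real setup the token vectors would come from the model's last hidden state and the classifier would more likely be a library implementation of logistic regression; the pure-Python trainer here only keeps the example self-contained.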