Spaces: zaidmehdi/arabic-dialect-classifier
zaidmehdi committed on Mar 2, 2024

Commit 9c2badc (1 parent: 3e97f18)

update readme

Files changed (1)

README.md CHANGED

@@ -36,7 +36,7 @@ The response should be a json of the form:
 The data used to train the classifier comes from the NADI 2021 dataset for Arabic Dialect Identification [(Abdul-Mageed et al., 2021)](#cite-mageed-2021).
 It is a corpus of tweets collected using Twitter's API and labeled thanks to the users location with the country and region.
-I used the language model `https://huggingface.co/moussaKam/AraBART` to extract features from the input text by taking the output of its last hidden layer. I used these word embeddings as the input for a Multinomial Logistic Regression to classify the input text into one of the 21 dialects (Countries).
+I used the language model `https://huggingface.co/moussaKam/AraBART` to extract features from the input text by taking the output of its last hidden layer. I used these vector embeddings as the input for a Multinomial Logistic Regression to classify the input text into one of the 21 dialects (Countries).
 For more detail, please refer to the docs directory.
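
The pipeline described in the changed line (hidden-layer outputs turned into one vector per text, then a multinomial logistic regression) can be sketched roughly as below. This is an illustration only, not the repository's code: the README does not say how per-token outputs are reduced to a single vector, so mean pooling over non-padding tokens is an assumption, and `mean_pool`, `predict_proba`, and all numbers are hypothetical stand-ins for real AraBART outputs and trained weights.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the hidden-state vectors of non-padding tokens (assumed pooling)."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [value / count for value in total]


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def predict_proba(embedding, weights, biases):
    """Multinomial logistic regression: one linear score per class, then softmax."""
    scores = [
        sum(w * x for w, x in zip(row, embedding)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


# Toy data: 3 tokens (the last one is padding), 2-dimensional hidden states,
# and 3 classes instead of the 21 dialects mentioned in the README.
hidden_states = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
embedding = mean_pool(hidden_states, mask)

weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # hypothetical trained weights
biases = [0.0, 0.0, 0.0]
probs = predict_proba(embedding, weights, biases)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
```

In a real setup the hidden states would come from the language model and the weights from fitting the classifier on the NADI 2021 training split; the sketch only shows how the two stages connect.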