unb-lamfo-nlp-mcti/NLP-Classification-MCTI

English

Classification

science

MarcosDib committed on Dec 12, 2022

Commit a854db5 (1 parent: 0d2996b)

Update README.md

Files changed (1)

README.md +7 -8

@@ -83,7 +83,7 @@ Other 24 smaller models are released afterward.
 The detailed release history can be found [here](https://huggingface.co/unb-lamfo-nlp-mcti).
 | Model                        | #params | Language |
-|------------------------------|--------------------|-------|
 | [`mcti-base-uncased`]        | 110M    | English  |
 | [`mcti-large-uncased`]       | 340M    | English  |
 | [`mcti-base-cased`]          | 110M    | English  |
@@ -91,7 +91,7 @@ The detailed release history can be found on the [here](https://huggingface.co/u
 | [`-base-multilingual-cased`] | 110M    | Multiple |
 | Dataset                              | Compatibility to base* |
-|--------------------------------------|------------------------|
 | Labeled MCTI                         | 100%                   |
 | Full MCTI                            | 100%                   |
 | BBC News Articles                    | 56.77%                 |
@@ -202,14 +202,13 @@ The following assumptions were considered:
 - Preprocessing experiments compare accuracy in a shallow neural network (SNN);
 - Pre-processing was investigated for the classification goal.
-From the Database obtained in Meta 4, stored in the project's [GitHub](github.com/mcti-sefip/mcti-sefip-ppfcd2020/blob/scraps-desenvolvimento/Rotulagem/db_PPF_validacao_para%20UNB_%20FINAL), a Notebook was developed in [Google Colab](colab.research.google.com)
-to implement the [pre-processing code](github.com/mcti-sefip/mcti-sefip-ppfcd2020/blob/pre-
-processamento/Pre_Processamento/MCTI_PPF_Pr%C3%A9_processamento), which also can be found on the project's GitHub.
 Several Python packages were used to develop the preprocessing code:
 |                         Objective                      |   Package    |
-|--------------------------------------------------------|--------------|
 | Resolve contractions and slang usage in text           | [contractions](https://pypi.org/project/contractions) |
 | Natural Language Processing                            | [nltk](https://pypi.org/project/nltk)         |
 | Other data manipulations and calculations included in Python 3.10: io, json, math, re (regular expressions), shutil, time, unicodedata    | [numpy](https://pypi.org/project/numpy)        |
@@ -217,7 +216,7 @@ Several Python packages were used to develop the preprocessing code:
 | http library                                           | [requests](https://pypi.org/project/requests)     |
 | Training model                                         | [scikit-learn](https://pypi.org/project/scikit-learn) |
 | Machine learning                                       | [tensorflow](https://pypi.org/project/tensorflow)   |
-| Machine learning                                       | [keras](keras.io)        |
 | Translation from multiple languages to English         | [translators](https://pypi.org/project/translators)  |
@@ -225,7 +224,7 @@ As detailed in the notebook on [GitHub](https://github.com/mcti-sefip/mcti-sefip
 bases, derived from the base of goal 4, with the application of the methods shown in Figure 2.
 |  Base  |                   Original texts                             |
-|--------|--------------------------------------------------------------|
 | xp1    | Expand contractions                                          |
 | xp2    | Expand contractions + convert text to lowercase              |
 | xp3    | Expand contractions + remove punctuation                     |

 The detailed release history can be found [here](https://huggingface.co/unb-lamfo-nlp-mcti).
 | Model                        | #params | Language |
+|:----------------------------:|:-------:|:--------:|
 | [`mcti-base-uncased`]        | 110M    | English  |
 | [`mcti-large-uncased`]       | 340M    | English  |
 | [`mcti-base-cased`]          | 110M    | English  |
 | [`-base-multilingual-cased`] | 110M    | Multiple |
 | Dataset                              | Compatibility to base* |
+|:------------------------------------:|:----------------------:|
 | Labeled MCTI                         | 100%                   |
 | Full MCTI                            | 100%                   |
 | BBC News Articles                    | 56.77%                 |
 - Preprocessing experiments compare accuracy in a shallow neural network (SNN);
 - Pre-processing was investigated for the classification goal.
+From the database obtained in Goal 4, stored in the project's [GitHub](github.com/mcti-sefip/mcti-sefip-ppfcd2020/blob/scraps-desenvolvimento/Rotulagem/db_PPF_validacao_para%20UNB_%20FINAL.xlsx), a notebook was developed in [Google Colab](colab.research.google.com)
+to implement the [pre-processing code](github.com/mcti-sefip/mcti-sefip-ppfcd2020/blob/pre-processamento/Pre_Processamento/MCTI_PPF_Pr%C3%A9_processamento.ipynb), which can also be found on the project's GitHub.
 Several Python packages were used to develop the preprocessing code:
 |                         Objective                      |   Package    |
+|:------------------------------------------------------:|:------------:|
 | Resolve contractions and slang usage in text           | [contractions](https://pypi.org/project/contractions) |
 | Natural Language Processing                            | [nltk](https://pypi.org/project/nltk)         |
 | Other data manipulations and calculations included in Python 3.10: io, json, math, re (regular expressions), shutil, time, unicodedata    | [numpy](https://pypi.org/project/numpy)        |
 | http library                                           | [requests](https://pypi.org/project/requests)     |
 | Training model                                         | [scikit-learn](https://pypi.org/project/scikit-learn) |
 | Machine learning                                       | [tensorflow](https://pypi.org/project/tensorflow)   |
+| Machine learning                                       | [keras](https://keras.io/)        |
 | Translation from multiple languages to English         | [translators](https://pypi.org/project/translators)  |
 bases, derived from the base of goal 4, with the application of the methods shown in Figure 2.
 |  Base  |                   Original texts                             |
+|:------:|:------------------------------------------------------------:|
 | xp1    | Expand contractions                                          |
 | xp2    | Expand contractions + convert text to lowercase              |
 | xp3    | Expand contractions + remove punctuation                     |
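
The three bases listed above can be illustrated with a minimal sketch. This is not the project's notebook code: the contraction map, the helper names (`expand_contractions`, `remove_punctuation`, `build_variants`), and the sample sentence are all hypothetical, and the README lists the `contractions` package for the expansion step, whereas this sketch uses only the standard library so it runs without extra installs.

```python
import re
import string

# Hypothetical, deliberately tiny contraction map (illustration only).
# The README lists the `contractions` package for this step instead.
CONTRACTIONS = {
    "can't": "cannot",
    "don't": "do not",
    "it's": "it is",
    "we're": "we are",
}

# One alternation over all known contractions, matched on word boundaries.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(key) for key in CONTRACTIONS) + r")\b",
    flags=re.IGNORECASE,
)


def expand_contractions(text: str) -> str:
    """Replace each known contraction with its (lowercase) expansion."""
    return _PATTERN.sub(lambda m: CONTRACTIONS[m.group(0).lower()], text)


def remove_punctuation(text: str) -> str:
    """Drop ASCII punctuation characters; other characters are kept."""
    return text.translate(str.maketrans("", "", string.punctuation))


def build_variants(text: str) -> dict[str, str]:
    """Build the xp1, xp2 and xp3 variants of a single text."""
    expanded = expand_contractions(text)
    return {
        "xp1": expanded,                      # expand contractions
        "xp2": expanded.lower(),              # + convert to lowercase
        "xp3": remove_punctuation(expanded),  # + remove punctuation
    }


variants = build_variants("The MCTI team can't wait, it's ready!")
for name, value in variants.items():
    print(name, "->", value)
```

Each variant starts from the same expanded text, which mirrors how every base in the table includes the contraction step before applying one further method.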