nicolay-r/flan-t5-tsa-thor-base

@@ -10,112 +10,110 @@ pipeline_tag: text2text-generation
 # Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
 ## Model Details
 ### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
 ## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 ### Direct Use
 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
 [More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
 ### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
 ### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
 Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 ## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
 ## Training Details
 ### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
 ### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 #### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 [More Information Needed]
 ## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
 ### Testing Data, Factors & Metrics
 #### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
 #### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
 ### Results
-[More Information Needed]

 # Model Card for Model ID
 ## Model Details
 ### Model Description
+- **Developed by:** Reforged by [nicolay-r](https://github.com/nicolay-r); the original implementation is credited to [scofield7419](https://github.com/scofield7419)
+- **Model type:** [Flan-T5](https://huggingface.co/docs/transformers/en/model_doc/flan-t5)
+- **Language(s) (NLP):** English
+- **License:** [Apache License 2.0](https://github.com/scofield7419/THOR-ISA/blob/main/LICENSE.txt)
+### Model Sources
+- **Repository:** [Reasoning-for-Sentiment-Analysis-Framework](https://github.com/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework)
+- **Paper [optional]:** https://arxiv.org/abs/2404.12342
+- **Demo [optional]:** https://arxiv.org/abs/2404.12342
 ## Uses
 ### Direct Use
 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
 [More Information Needed]
+### Downstream Use
+Please refer to the [related section](https://github.com/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework?tab=readme-ov-file#three-hop-chain-of-thought-thor) of the **Reasoning-for-Sentiment-Analysis** framework.
 ### Out-of-Scope Use
+This model is a version of Flan-T5 fine-tuned on the RuSentNE-2023 dataset.
+Since the dataset uses a three-class label set (`positive`, `negative`, `neutral`),
+the model's general behavior may be biased toward this particular task.
 ### Recommendations
 Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 ## How to Get Started with the Model
+Please proceed with the code from the related [Three-Hop-Reasoning CoT](https://github.com/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework?tab=readme-ov-file#three-hop-chain-of-thought-thor) section,
+or follow the related section of the [Google Colab notebook](https://colab.research.google.com/github/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework/blob/main/Reasoning_for_Sentiment_Analysis_Framework.ipynb).
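As a rough orientation before opening the linked section, the three-hop idea (aspect, then opinion, then polarity) can be sketched as a chain of prompts. This is a minimal sketch under assumptions: the prompt wording, the helper names `three_hop_sentiment` and `make_hf_ask`, and the fallback to `neutral` are all hypothetical and are not taken from the framework; the exact templates live in the linked repository. The only identifier taken from this card is the checkpoint name `nicolay-r/flan-t5-tsa-thor-base`.

```python
from typing import Callable

# Label set named in this model card.
LABELS = ["positive", "negative", "neutral"]


def three_hop_sentiment(sentence: str, target: str, ask: Callable[[str], str]) -> str:
    """Run a three-hop prompt chain: aspect -> opinion -> polarity.

    `ask` is any function that maps a prompt to generated text.
    The prompt wording is illustrative, not the framework's exact templates.
    """
    hop1 = f'Given the sentence "{sentence}", which specific aspect of {target} is possibly mentioned?'
    aspect = ask(hop1)

    hop2 = (f"{hop1} The mentioned aspect is about {aspect}. "
            f"What is the implicit opinion towards the mentioned aspect of {target}, and why?")
    opinion = ask(hop2)

    hop3 = (f"{hop2} The opinion towards the mentioned aspect of {target} is {opinion}. "
            f"What is the sentiment polarity towards {target}? "
            f"Answer with one of: {', '.join(LABELS)}.")
    answer = ask(hop3).strip().lower()

    # Map free-form output onto the label set; falling back to "neutral"
    # is an assumption made for this sketch.
    for label in LABELS:
        if label in answer:
            return label
    return "neutral"


def make_hf_ask(model_id: str = "nicolay-r/flan-t5-tsa-thor-base",
                max_new_tokens: int = 64) -> Callable[[str], str]:
    """Build an `ask` function on top of Hugging Face transformers.

    The import is local so the prompt chain above can be used (and tested)
    without transformers installed.
    """
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)

    def ask(prompt: str) -> str:
        inputs = tokenizer(prompt, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
        return tokenizer.decode(output_ids[0], skip_special_tokens=True)

    return ask
```

Calling `three_hop_sentiment(text, target, make_hf_ask())` downloads the checkpoint on first use. Keeping `ask` as a plain callable makes it easy to swap in a stub or a different backend.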
 ## Training Details
 ### Training Data
+We use the `train` split, which was **automatically translated into English using GoogleTransAPI**.
+The original texts, written in Russian, come from the following repository:
+https://github.com/dialogue-evaluation/RuSentNE-evaluation
+The English translation of the dataset can be downloaded automatically with the following script:
+https://github.com/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework/blob/main/rusentne23_download.py
 ### Training Procedure
+This model was trained using the Three-hop-Reasoning framework proposed in the paper:
+https://arxiv.org/abs/2305.11255
+Training was carried out with the reforged version of this framework:
+https://github.com/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework
+Google Colab notebook for reproduction:
+https://colab.research.google.com/github/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework/blob/main/Reasoning_for_Sentiment_Analysis_Framework.ipynb
+The overall training process took **4 epochs**.
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/64e62d11d27a8292c3637f86/JwCP0EIe6q1VVdNrTzPQl.png)
 #### Training Hyperparameters
+- **Training regime:** All configuration details are given in the related
+ [config](https://github.com/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework/blob/main/config/config.yaml) file.
 [More Information Needed]
 ## Evaluation
 ### Testing Data, Factors & Metrics
 #### Testing Data
+The direct link to the `test` evaluation data:
+https://github.com/dialogue-evaluation/RuSentNE-evaluation/blob/main/final_data.csv
 #### Metrics
+For the model evaluation, two metrics were used:
+1. F1_PN -- F1-measure over `positive` and `negative` classes;
+2. F1_PN0 -- F1-measure over `positive`, `negative`, **and `neutral`** classes;
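The two metrics above can be sketched in plain Python. This sketch assumes macro averaging (the unweighted mean of per-class F1) scaled to 0-100, which this card does not state explicitly; the function names are hypothetical and this is not the evaluation script used for the reported numbers.

```python
from typing import Iterable, Sequence


def f1_for_label(gold: Sequence[str], pred: Sequence[str], label: str) -> float:
    """F1 for a single class, computed from true/false positives and false negatives."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    denom = 2 * tp + fp + fn
    # A class that never appears in gold or predictions scores 0.0 here.
    return 2 * tp / denom if denom else 0.0


def macro_f1(gold: Sequence[str], pred: Sequence[str], labels: Iterable[str]) -> float:
    """Unweighted mean of per-class F1 over `labels`, scaled to 0-100."""
    labels = list(labels)
    return 100.0 * sum(f1_for_label(gold, pred, label) for label in labels) / len(labels)


def f1_pn(gold: Sequence[str], pred: Sequence[str]) -> float:
    """F1 over `positive` and `negative` only (macro averaging assumed)."""
    return macro_f1(gold, pred, ["positive", "negative"])


def f1_pn0(gold: Sequence[str], pred: Sequence[str]) -> float:
    """F1 over `positive`, `negative`, and `neutral` (macro averaging assumed)."""
    return macro_f1(gold, pred, ["positive", "negative", "neutral"])
```

Under this reading, neutral predictions still affect F1_PN indirectly: a `positive` example predicted as `neutral` counts as a false negative for the `positive` class.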
 ### Results
+**Result:** F1_PN = 60.024
+Below is the training log, which shows the final performance on the RuSentNE-2023 `test` set after 4 epochs (lines 5-6):
+```tsv
+    F1_PN  F1_PN0  default   mode
+0  45.523  59.375   59.375  valid
+1  62.345  70.260   70.260  valid
+2  62.722  70.704   70.704  valid
+3  62.721  70.671   70.671  valid
+4  62.357  70.247   70.247  valid
+5  60.024  68.171   68.171   test
+6  60.024  68.171   68.171   test
+```