Fix typos
app.py
CHANGED
@@ -140,31 +140,34 @@ st.info("I crafted this dataset using a more powerful LLM and scripts, no need f
 `https://huggingface.co/datasets/wgcv/website-title-description`
 
 # Models
-
+The objective of the project was to show that a small ML model, built from a larger LLM, can achieve results as good as or better than the original LLM on specific tasks.
 
 Given the substantial volume of data, training a model from scratch was deemed impractical. Instead, our approach focused on evaluating the performance of existing pre-trained models as a baseline. This strategy served as an optimal starting point for developing a custom, lightweight model tailored to our specific use case: enhancing browser tab organization and efficiently summarizing the core concepts of favorited websites.
 
 ### T5-small
-- The [T5-small](https://huggingface.co/wgcv/tidy-tab-model-t5-small) model is a
-- It's a text-to-text model
-- It's a general model for all NLP tasks
-- The task is defined by the input format
-- To perform summarization, prefix the text with 'summarize:'
-- 60.5M parameters
-- Disclaimer:
+- The [T5-small](https://huggingface.co/wgcv/tidy-tab-model-t5-small) model is a fine-tuned version of google-t5/t5-small.
+- It's a text-to-text model.
+- It's a general-purpose model for all NLP tasks.
+- The task is defined by the input format.
+- To perform summarization, prefix the text with 'summarize:'.
+- 60.5M parameters.
+- Disclaimer: the model was retrained once more after poor inference quality was observed.
+
+
+
 
 ### Pegasus-xsum
-- The [Pegasus-xsum](https://huggingface.co/wgcv/tidy-tab-model-pegasus-xsum) model is a
-- It's a text-to-text model
-- It's a specialized summarization model
-- 570M params
+- The [Pegasus-xsum](https://huggingface.co/wgcv/tidy-tab-model-pegasus-xsum) model is a fine-tuned version of google/pegasus-xsum.
+- It's a text-to-text model.
+- It's a specialized summarization model.
+- 570M parameters.
 
 ### Bart-large
-- The [Bart-large](https://huggingface.co/wgcv/tidy-tab-model-bart-large-cnn) model is a
+- The [Bart-large](https://huggingface.co/wgcv/tidy-tab-model-bart-large-cnn) model is a fine-tuned version of facebook/bart-large-cnn.
 - Prior to our fine-tuning, it was fine-tuned on the CNN/Daily Mail dataset.
 - It's a BART model, using a transformer encoder-decoder (seq2seq) architecture.
 - BART models typically perform better with small datasets compared to text-to-text models.
-- 406M params
+- 406M parameters.
 
 
 
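For reference, here is a minimal sketch of how the three fine-tuned checkpoints linked in the section above could be queried with the Hugging Face transformers library. This is not the Space's actual app.py code; the repo ids are taken from the model links, and the sample text and generation settings are illustrative assumptions. It shows the 'summarize:' prefix that T5-small expects and the plain summarization pipeline that works for Pegasus-xsum and Bart-large.

```python
# Illustrative sketch only -- not the code from this Space's app.py.
# Repo ids come from the model links above; the sample text and
# generation settings are assumptions for demonstration purposes.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

page_text = (
    "GitHub is a developer platform that allows developers to create, "
    "store, manage and share their code."
)

# T5-small is text-to-text: the task is selected by the input format,
# so summarization inputs are prefixed with 'summarize:'.
t5_repo = "wgcv/tidy-tab-model-t5-small"
tokenizer = AutoTokenizer.from_pretrained(t5_repo)
model = AutoModelForSeq2SeqLM.from_pretrained(t5_repo)
inputs = tokenizer("summarize: " + page_text, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

# Pegasus-xsum and Bart-large-cnn are dedicated summarization models,
# so the generic summarization pipeline works without any prefix.
for repo_id in ("wgcv/tidy-tab-model-pegasus-xsum",
                "wgcv/tidy-tab-model-bart-large-cnn"):
    summarizer = pipeline("summarization", model=repo_id)
    print(summarizer(page_text, max_length=20, min_length=5)[0]["summary_text"])
```

Each call should return a short, headline-style summary of the page text, which is the kind of output a tab-title or bookmark label would use, assuming the checkpoints expose the standard seq2seq interfaces.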