agentlans/deberta-v3-xsmall-quality

agentlans committed on Sep 27

Commit e45cd6c (1 parent: 08d031a)

Update README.md

Files changed (1)

README.md +14 -1

@@ -6,6 +6,17 @@ tags:
 model-index:
 - name: deberta-v3-xsmall-quality
   results: []
+license: mit
+datasets:
+- agentlans/text-quality
+- allenai/c4
+- HuggingFaceFW/fineweb-edu
+- monology/pile-uncopyrighted
+- agentlans/common-crawl-sample
+- agentlans/wikipedia-paragraphs
+language:
+- en
+pipeline_tag: text-classification
 ---
 # English Text Quality Classifier
@@ -27,7 +38,7 @@ The **deberta-v3-xsmall-quality** model is designed to evaluate text quality by
 ## Training and Evaluation Data
-The model was trained on a dataset comprising **100,000 sentences** sourced from five distinct datasets, with **20,000 sentences** drawn from each of the following:
+The model was trained on the [agentlans/text-quality](https://huggingface.co/datasets/agentlans/text-quality) dataset comprising **100,000 sentences** sourced from five distinct datasets, with **20,000 sentences** drawn from each of the following:
 1. **allenai/c4**
 2. **HuggingFaceFW/fineweb-edu**
@@ -37,6 +48,8 @@ The model was trained on a dataset comprising **100,000 sentences** sourced from
 This diverse dataset enables the model to generalize well across different text types and domains.
+90% of the rows were used for training and the remaining 10% for evaluation.
 ## How to use
 ```python