# TOSRoberta: Terms of Service Analyzer
TOSRoberta is an advanced Terms of Service (ToS) analyzer powered by a fine-tuned RoBERTa-large model. It classifies clauses in ToS documents based on their fairness level, helping users quickly identify potentially unfair terms.

## Features
- Analyzes ToS documents and classifies clauses into three categories:
  - ✅ Clearly Fair
  - ⚠️ Potentially Unfair
  - ❌ Clearly Unfair
- Supports both PDF and text file uploads
- User-friendly web interface built with Streamlit
- Powered by a fine-tuned RoBERTa-large model (CodeHima/Tos-Roberta)
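
The analysis flow described above (split a document into clauses, label each one, group the results) can be sketched as follows. This is a minimal illustration, not the code in `app.py`: the sentence-based splitter and the `keyword_stub` classifier are hypothetical stand-ins so the example runs offline, and a real classifier would wrap the fine-tuned model instead.

```python
import re
from typing import Callable, Dict, List

# The three categories named in the feature list.
CATEGORIES = ("Clearly Fair", "Potentially Unfair", "Clearly Unfair")


def split_into_clauses(text: str) -> List[str]:
    """Naive splitter (an assumption): break after sentence-ending punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def analyze(text: str, classify: Callable[[str], str]) -> Dict[str, List[str]]:
    """Group clauses by the category that `classify` assigns to each one."""
    grouped: Dict[str, List[str]] = {c: [] for c in CATEGORIES}
    for clause in split_into_clauses(text):
        grouped[classify(clause)].append(clause)
    return grouped


def keyword_stub(clause: str) -> str:
    """Hypothetical stand-in for the model so the sketch runs without downloads."""
    lowered = clause.lower()
    if "without notice" in lowered:
        return "Clearly Unfair"
    if "may" in lowered:
        return "Potentially Unfair"
    return "Clearly Fair"


sample = (
    "You can cancel your subscription at any time. "
    "We may change these terms occasionally. "
    "We can delete your account without notice."
)
report = analyze(sample, keyword_stub)
for category in CATEGORIES:
    print(f"{category}: {len(report[category])} clause(s)")
```

Passing the classifier in as a callable keeps the grouping logic independent of the model, so the stub can be swapped for a model-backed function without touching `analyze`.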
## Model Performance
The Tos-Roberta model reached the following accuracy on ToS clause classification:
- **Validation Accuracy**: 89.64% (the best epoch, epoch 3 in the table below)
- **Test Accuracy**: 85.84%

Detailed performance metrics per epoch:

| Epoch | Training Loss | Validation Loss | Accuracy | F1 Score | Precision | Recall |
|-------|---------------|-----------------|----------|----------|-----------|----------|
| 1 | 0.443500 | 0.398950 | 0.874699 | 0.858838 | 0.862516 | 0.874699 |
| 2 | 0.416400 | 0.438409 | 0.853012 | 0.847317 | 0.849916 | 0.853012 |
| 3 | 0.227700 | 0.505879 | 0.896386 | 0.893325 | 0.891521 | 0.896386 |
| 4 | 0.052600 | 0.667532 | 0.891566 | 0.893167 | 0.895115 | 0.891566 |
| 5 | 0.124200 | 0.747090 | 0.884337 | 0.887412 | 0.891807 | 0.884337 |
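
To make the table easier to read, the short sketch below picks the best epoch by validation accuracy and by validation loss. The numbers are copied from the table; the `EpochMetrics` type and variable names are illustrative only and are not taken from the project's training code.

```python
from typing import NamedTuple


class EpochMetrics(NamedTuple):
    epoch: int
    train_loss: float
    val_loss: float
    accuracy: float
    f1: float


# Values copied from the per-epoch table above.
EPOCHS = [
    EpochMetrics(1, 0.443500, 0.398950, 0.874699, 0.858838),
    EpochMetrics(2, 0.416400, 0.438409, 0.853012, 0.847317),
    EpochMetrics(3, 0.227700, 0.505879, 0.896386, 0.893325),
    EpochMetrics(4, 0.052600, 0.667532, 0.891566, 0.893167),
    EpochMetrics(5, 0.124200, 0.747090, 0.884337, 0.887412),
]

best_by_accuracy = max(EPOCHS, key=lambda m: m.accuracy)
best_by_val_loss = min(EPOCHS, key=lambda m: m.val_loss)

# True when validation loss increases at every epoch.
val_loss_rises = all(a.val_loss < b.val_loss for a, b in zip(EPOCHS, EPOCHS[1:]))

print(f"Best accuracy: epoch {best_by_accuracy.epoch} ({best_by_accuracy.accuracy:.2%})")
print(f"Lowest validation loss: epoch {best_by_val_loss.epoch}")
print(f"Validation loss rises every epoch: {val_loss_rises}")
```

Epoch 3 gives the highest validation accuracy, while validation loss is lowest at epoch 1 and climbs afterwards. That pattern is often read as a sign of overfitting in later epochs, so which checkpoint counts as "best" depends on the metric chosen.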
## Project Structure
```
tos-analyzer/
│
├── app.py
├── requirements.txt
├── utils/
│   ├── __init__.py
│   ├── text_processing.py
│   └── model_utils.py
└── README.md
```
## Installation
1. Clone the repository:
```
git clone https://github.com/HimanshuMohanty-Git24/TOSRoberta.git
cd TOSRoberta
```
2. Install the required dependencies:
```
pip install -r requirements.txt
```
3. Run the Streamlit app:
```
streamlit run app.py
```
## Training Visualization
We used Weights & Biases to monitor the training process.

## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgements
- [Hugging Face](https://huggingface.co/) for the Transformers library
- [Streamlit](https://streamlit.io/) for the easy-to-use web app framework
- [Weights & Biases](https://wandb.ai/) for experiment tracking
## Contact
Himanshu Mohanty - [CodingHima](https://x.com/CodingHima) - codehimanshu24@gmail.com
Project Link: [https://github.com/HimanshuMohanty-Git24/TOSRoberta](https://github.com/HimanshuMohanty-Git24/TOSRoberta)
---
⭐ If you find this project useful, please consider giving it a star!