# TOSRoberta: Terms of Service Analyzer 📜🤖

[![GitHub license](https://img.shields.io/github/license/HimanshuMohanty-Git24/TOSRoberta.svg)](https://github.com/HimanshuMohanty-Git24/TOSRoberta/blob/main/LICENSE)
[![GitHub stars](https://img.shields.io/github/stars/HimanshuMohanty-Git24/TOSRoberta.svg)](https://github.com/HimanshuMohanty-Git24/TOSRoberta/stargazers)
[![GitHub forks](https://img.shields.io/github/forks/HimanshuMohanty-Git24/TOSRoberta.svg)](https://github.com/HimanshuMohanty-Git24/TOSRoberta/network)
[![GitHub issues](https://img.shields.io/github/issues/HimanshuMohanty-Git24/TOSRoberta.svg)](https://github.com/HimanshuMohanty-Git24/TOSRoberta/issues)

TOSRoberta is a Terms of Service (ToS) analyzer powered by a fine-tuned RoBERTa-large model. It classifies each clause in a ToS document by fairness level, helping users quickly identify potentially unfair terms.

![image](https://github.com/HimanshuMohanty-Git24/TOSRoberta/assets/94133298/c4f6a293-9109-4e63-86e6-766dc16ad589)


## 🌟 Features

- 📊 Analyzes ToS documents and classifies clauses into three categories:
  - ✅ Clearly Fair
  - ⚠️ Potentially Unfair
  - ❌ Clearly Unfair
- 📁 Supports both PDF and text file uploads
- 💻 User-friendly web interface built with Streamlit
- 🧠 Powered by a fine-tuned RoBERTa-large model (CodeHima/Tos-Roberta)

## 🚀 Model Performance

The Tos-Roberta model achieves the following accuracy on ToS clause classification:

- **Validation Accuracy**: 89.64% (best epoch, epoch 3 in the table below)
- **Test Accuracy**: 85.84%

Training loss and validation metrics per epoch:

| Epoch | Training Loss | Validation Loss | Accuracy | F1 Score | Precision | Recall   |
|-------|---------------|-----------------|----------|----------|-----------|----------|
| 1     | 0.443500      | 0.398950        | 0.874699 | 0.858838 | 0.862516  | 0.874699 |
| 2     | 0.416400      | 0.438409        | 0.853012 | 0.847317 | 0.849916  | 0.853012 |
| 3     | 0.227700      | 0.505879        | 0.896386 | 0.893325 | 0.891521  | 0.896386 |
| 4     | 0.052600      | 0.667532        | 0.891566 | 0.893167 | 0.895115  | 0.891566 |
| 5     | 0.124200      | 0.747090        | 0.884337 | 0.887412 | 0.891807  | 0.884337 |
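
In every row of the table the Recall value equals Accuracy, which is exactly what support-weighted averaging over classes produces. The README does not state which averaging was used, so the sketch below is an assumption: a plain-Python version of accuracy plus weighted precision, recall, and F1, with a hypothetical function name `weighted_metrics`.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall and F1.

    ASSUMPTION: weighted averaging, inferred from Recall == Accuracy
    in the table; the training script itself is not shown.
    """
    n = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        weight = count / n  # class weight = share of true examples
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

With this weighting, recall reduces to the total number of correct predictions divided by the number of examples, so it always matches accuracy, while precision and F1 can differ from it.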

## 📁 Project Structure

```
tos-analyzer/
│
├── app.py
├── requirements.txt
├── utils/
│   ├── __init__.py
│   ├── text_processing.py
│   └── model_utils.py
└── README.md
```
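
The contents of `utils/text_processing.py` are not shown here. Since the model classifies individual clauses, some step has to split an uploaded document into clause-sized pieces; a minimal sketch of one way to do that follows. The function name `split_into_clauses`, the sentence-based splitting rule, and the `min_chars` threshold are all hypothetical.

```python
import re


def split_into_clauses(text, min_chars=20):
    """Split raw ToS text into clause-sized chunks (hypothetical helper)."""
    # Collapse runs of whitespace left by PDF extraction or hard-wrapped files.
    normalized = re.sub(r"\s+", " ", text).strip()
    # Split after sentence-ending punctuation that is followed by whitespace.
    parts = re.split(r"(?<=[.;!?])\s+", normalized)
    # Drop fragments too short to be a meaningful clause.
    return [part for part in parts if len(part) >= min_chars]
```

Each returned string could then be passed to the classifier one at a time. Sentence-level splitting is a simplification: numbered sub-clauses and abbreviations such as "Inc." would need extra handling.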

## 🛠️ Installation

1. Clone the repository:
   ```
   git clone https://github.com/HimanshuMohanty-Git24/TOSRoberta.git
   cd TOSRoberta
   ```

2. Install the required dependencies:
   ```
   pip install -r requirements.txt
   ```

3. Run the Streamlit app:
   ```
   streamlit run app.py
   ```

## 📊 Training Visualization

We used Weights & Biases for monitoring the training process. Here's a glimpse of our training metrics:

![image](https://github.com/HimanshuMohanty-Git24/TOSRoberta/assets/94133298/d28bbd84-9008-4d19-bff1-4b62874a5faa)


## 🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgements

- [Hugging Face](https://huggingface.co/) for the Transformers library
- [Streamlit](https://streamlit.io/) for the easy-to-use web app framework
- [Weights & Biases](https://wandb.ai/) for experiment tracking

## 📬 Contact

Himanshu Mohanty - [CodingHima](https://x.com/CodingHima) - codehimanshu24@gmail.com

Project Link: [https://github.com/HimanshuMohanty-Git24/TOSRoberta](https://github.com/HimanshuMohanty-Git24/TOSRoberta)

---

⭐️ If you find this project useful, please consider giving it a star!