metadata
license: mit
title: TOSRoberta
sdk: streamlit
emoji: 🚀
colorFrom: red
colorTo: yellow

TOSRoberta: Terms of Service Analyzer 📜🤖

TOSRoberta is a Terms of Service (ToS) analyzer powered by a fine-tuned RoBERTa-large model. It classifies each clause in a ToS document by fairness level, helping users quickly spot potentially unfair terms.


🌟 Features

  • 📊 Analyzes ToS documents and classifies clauses into three categories:
    • ✅ Clearly Fair
    • ⚠️ Potentially Unfair
    • ❌ Clearly Unfair
  • 📁 Supports both PDF and text file uploads
  • 💻 User-friendly web interface built with Streamlit
  • 🧠 Powered by a fine-tuned RoBERTa-large model (CodeHima/Tos-Roberta)
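
A minimal sketch of querying the model outside the app, assuming the checkpoint loads through the Hugging Face `transformers` text-classification pipeline. The label strings the checkpoint returns are not documented in this README, so `ASSUMED_LABELS` and its ordering are hypothetical; check the model's `id2label` config before relying on them.

```python
from typing import Dict, List

MODEL_ID = "CodeHima/Tos-Roberta"  # model name given in this README

# Hypothetical mapping: the checkpoint's real label names and order may differ.
ASSUMED_LABELS: Dict[str, str] = {
    "LABEL_0": "Clearly Fair",
    "LABEL_1": "Potentially Unfair",
    "LABEL_2": "Clearly Unfair",
}


def readable_label(raw_label: str) -> str:
    """Map a raw pipeline label to a category name; unknown labels pass through."""
    return ASSUMED_LABELS.get(raw_label, raw_label)


def classify_clauses(clauses: List[str]) -> List[dict]:
    """Classify each clause. Downloads the model on first use."""
    from transformers import pipeline  # imported lazily: heavy dependency

    classifier = pipeline("text-classification", model=MODEL_ID)
    # truncation=True cuts clauses that exceed the model's maximum input length
    results = classifier(clauses, truncation=True)
    return [
        {"clause": clause, "label": readable_label(r["label"]), "score": r["score"]}
        for clause, r in zip(clauses, results)
    ]
```

Calling `classify_clauses(["We may terminate your account at any time."])` would return one dictionary per clause with its predicted category and confidence score.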

🚀 Model Performance

The Tos-Roberta model reaches the following accuracy on ToS clause classification:

  • Validation Accuracy: 89.64% (epoch 3, the best of the five epochs)
  • Test Accuracy: 85.84%

Detailed performance metrics per epoch:

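| Epoch | Training Loss | Validation Loss | Accuracy | F1 Score | Precision | Recall   |
|-------|---------------|-----------------|----------|----------|-----------|----------|
| 1     | 0.443500      | 0.398950        | 0.874699 | 0.858838 | 0.862516  | 0.874699 |
| 2     | 0.416400      | 0.438409        | 0.853012 | 0.847317 | 0.849916  | 0.853012 |
| 3     | 0.227700      | 0.505879        | 0.896386 | 0.893325 | 0.891521  | 0.896386 |
| 4     | 0.052600      | 0.667532        | 0.891566 | 0.893167 | 0.895115  | 0.891566 |
| 5     | 0.124200      | 0.747090        | 0.884337 | 0.887412 | 0.891807  | 0.884337 |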
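
In every row, Recall equals Accuracy. Support-weighted averaging produces exactly that, so the metrics were probably computed as weighted averages, although this README does not say so. The pure-Python sketch below uses made-up labels (not the project's data) to show why the two values coincide.

```python
from collections import Counter
from typing import Sequence, Tuple


def weighted_metrics(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float, float]:
    """Return (accuracy, weighted precision, weighted recall, weighted F1)."""
    n = len(y_true)
    support = Counter(y_true)    # number of true examples per class
    predicted = Counter(y_pred)  # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for cls, count in support.items():
        p = correct[cls] / predicted[cls] if predicted[cls] else 0.0
        r = correct[cls] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n       # class share of the true labels
        precision += weight * p
        recall += weight * r     # weight * r == correct[cls] / n
        f1 += weight * f
    return accuracy, precision, recall, f1


# Toy labels: 0 = clearly fair, 1 = potentially unfair, 2 = clearly unfair
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 2, 2, 2]
acc, prec, rec, f1 = weighted_metrics(y_true, y_pred)
```

Each class contributes `correct / n` to the weighted recall, so the sum is the overall fraction of correct predictions, which is the accuracy.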

📁 Project Structure

tos-analyzer/
│
├── app.py
├── requirements.txt
├── utils/
│   ├── __init__.py
│   ├── text_processing.py
│   └── model_utils.py
└── README.md
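
The contents of `utils/text_processing.py` are not shown in this README. As an illustration only, a clause splitter for uploaded text might look like the sketch below; `split_into_clauses`, its `min_chars` parameter, and its sentence-boundary rule are hypothetical, not the project's code.

```python
import re
from typing import List


def split_into_clauses(text: str, min_chars: int = 20) -> List[str]:
    """Split raw ToS text into sentence-like clauses, dropping short fragments."""
    # Collapse line breaks and repeated spaces left over from file extraction
    normalized = re.sub(r"\s+", " ", text).strip()
    # Break after ., ! or ? when followed by whitespace and an uppercase letter or digit
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z0-9])", normalized)
    return [part for part in parts if len(part) >= min_chars]


sample = (
    "We may terminate your account at any time.\n"
    "You agree to binding arbitration.  OK."
)
clauses = split_into_clauses(sample)
```

A rule this simple will mis-split abbreviations and numbered sub-clauses; a real ToS parser would likely need something more robust.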

🛠️ Installation

  1. Clone the repository:

    git clone https://github.com/HimanshuMohanty-Git24/TOSRoberta.git
    cd TOSRoberta
    
  2. Install the required dependencies:

    pip install -r requirements.txt
    
  3. Run the Streamlit app:

    streamlit run app.py
    

📊 Training Visualization

We used Weights & Biases for monitoring the training process. Here's a glimpse of our training metrics:

[Image: training metrics dashboard from Weights & Biases]
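
The training script is not part of this README, so how the metrics were sent to Weights & Biases is unknown. As a rough sketch, per-epoch values could be logged with `wandb.init` and `Run.log`; the project name `tos-roberta` and the helper names below are assumptions for illustration.

```python
from typing import Callable, Dict, List

# First three rows of the per-epoch table in this README
EPOCH_METRICS: List[Dict[str, float]] = [
    {"epoch": 1, "train_loss": 0.4435, "val_loss": 0.398950, "accuracy": 0.874699},
    {"epoch": 2, "train_loss": 0.4164, "val_loss": 0.438409, "accuracy": 0.853012},
    {"epoch": 3, "train_loss": 0.2277, "val_loss": 0.505879, "accuracy": 0.896386},
]


def log_epochs(
    rows: List[Dict[str, float]], log: Callable[[Dict[str, float]], None]
) -> int:
    """Send each row to the given logging callable; return the number of rows sent."""
    for row in rows:
        log(dict(row))  # copy so the logger cannot mutate the source rows
    return len(rows)


def log_to_wandb(rows: List[Dict[str, float]], project: str = "tos-roberta") -> None:
    """Log rows to a W&B run. Requires `pip install wandb` and a logged-in account."""
    import wandb  # imported lazily so the helper above works without wandb

    run = wandb.init(project=project)  # hypothetical project name
    log_epochs(rows, run.log)
    run.finish()
```

Passing the logger in as a callable keeps `log_epochs` usable with `print` or a list's `append` when W&B is unavailable.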

🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgements

📬 Contact

Himanshu Mohanty - CodingHima - codehimanshu24@gmail.com

Project Link: https://github.com/HimanshuMohanty-Git24/TOSRoberta


⭐️ If you find this project useful, please consider giving it a star!