IMDB Sentiment Analysis Project
Overview
This project implements a sentiment analysis system for IMDB movie reviews using various machine learning and deep learning techniques. It includes a React frontend for user interaction and a Flask backend for processing and analyzing the reviews.
Features
- Sentiment analysis of IMDB movie reviews
- Multiple machine learning models:
  - Naive Bayes (Gaussian NB)
  - Random Forest
  - Logistic Regression
  - LSTM
  - Transformer
- Interactive web interface for real-time analysis
- Visualization of model accuracies and dataset distribution
- User feedback system for continuous improvement
Technologies Used
- Frontend: React, Recharts, Lucide React
- Backend: Flask, NLTK, SpaCy, scikit-learn, TensorFlow/Keras
- Data Processing: Pandas, NumPy
- Machine Learning: scikit-learn, TensorFlow, Keras
- Natural Language Processing: NLTK, SpaCy
Setup Instructions
Prerequisites
- Node.js and npm
- Python 3.7+
- Git
Frontend Setup
- Clone the repository:
git clone https://github.com/saquib34/zensibleInterview.git
- Navigate to the project directory:
cd zensibleInterview
- Install dependencies:
npm install
- Start the development server:
npm start
Backend Setup
- Ensure you're in the project directory
- Install required Python packages:
pip install -r requirements.txt
- Start the Flask server:
python app.py
Usage
- Open your web browser and navigate to http://localhost:3000 (or the port specified by your React setup)
- Enter an IMDB movie review in the text input
- Click "Analyze" to see the sentiment analysis results
- (Optional) Provide feedback on the analysis accuracy
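The backend can presumably also be called without the browser. The sketch below builds a JSON POST request using only Python's standard library. The route `/analyze`, the port 5000 (Flask's default), and the `review` field name are assumptions made for illustration; check `app.py` for the actual route and payload.

```python
import json
import urllib.request

# Assumed backend address: Flask's default port plus a hypothetical /analyze route.
BACKEND_URL = "http://localhost:5000/analyze"


def build_request(review_text, url=BACKEND_URL):
    """Build (but do not send) a JSON POST request carrying one review."""
    # The "review" key is an assumed field name, not taken from the project.
    payload = json.dumps({"review": review_text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(review_text, url=BACKEND_URL):
    """Send the review to the backend and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(review_text, url), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the Flask server running, `analyze("A wonderful film.")` would return whatever JSON the backend produces. `build_request` is kept separate so the payload can be inspected without a live server.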
Project Structure
- /src: React frontend source code
- /public: Public assets for the frontend
- /backend: Flask backend code
- /models: Trained machine learning models
- /data: Dataset and data processing scripts
- requirements.txt: Python dependencies
- package.json: Node.js dependencies
Dataset
This project uses the IMDB Dataset of 50K Movie Reviews, available on Kaggle.
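A quick way to check the dataset distribution is to count the labels. The sketch below assumes the CSV has a `review` column and a `sentiment` column with the values `positive` and `negative`; verify this against the file you download. It runs on a tiny in-memory sample rather than the real data.

```python
import csv
import io
from collections import Counter


def count_labels(csv_file, label_column="sentiment"):
    """Count how many rows carry each label in an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column].strip().lower() for row in reader)


# Tiny in-memory stand-in for the real file, using the assumed column names.
sample = io.StringIO(
    "review,sentiment\n"
    '"A moving, beautifully shot film.",positive\n'
    '"Two hours I will never get back.",negative\n'
    '"Sharp writing and a great cast.",positive\n'
)
counts = count_labels(sample)
print(counts)
```

For the real dataset, open the downloaded file with `open(path, newline="", encoding="utf-8")` and pass the file object to `count_labels`. The path depends on where you place the file under `/data`.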
Models and Performance
| Model | Accuracy |
|---|---|
| Gaussian NB | 0.7379 |
| Random Forest | 0.7997 |
| Logistic Regression | 0.82 |
| LSTM | 0.7424 |
| Transformer | 0.5 |
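Accuracy is the fraction of predictions that match the true label. The sketch below shows the calculation on made-up labels. It assumes the figures in the table were computed this way on a held-out set, which this README does not state. It also shows that a classifier which always predicts the same class scores 0.5 on an evenly split label set.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that equal the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty label list")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels for illustration only (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(accuracy(y_true, y_pred))  # 6 of 8 correct -> 0.75

# A model that always answers "positive" gets exactly half of a balanced set right.
print(accuracy(y_true, [1] * len(y_true)))  # 4 of 8 correct -> 0.5
```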
Contributing
Contributions to this project are welcome. Please fork the repository and submit a pull request with your changes.
License
Contact
- Developer: Saquib
- GitHub: saquib34
Acknowledgments
- IMDB for providing the dataset
- Kaggle for hosting the dataset
- All open-source libraries and tools used in this project