Amazon Review Generator T5
This model is a fine-tuned version of the T5 model designed to generate Amazon product reviews based on the product title and star rating. The fine-tuning process was conducted on a dataset of software product reviews from the "McAuley-Lab/Amazon-Reviews-2023" dataset.
Use Case
The primary use case of this model is to generate realistic and coherent product reviews for Amazon products. It can be particularly useful for generating sample reviews for product listings, producing example text for sentiment analysis work, and experimenting with natural language generation in e-commerce.
Model Architecture
The model is based on the T5 (Text-to-Text Transfer Transformer) architecture, an encoder-decoder transformer that frames every task as mapping an input text string to an output text string.
Training Data
The model was fine-tuned on a dataset of Amazon software product reviews. The data was preprocessed to include only verified purchases with review texts longer than 100 characters. A total of 100,000 samples were used for fine-tuning.
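The filtering described above can be sketched as follows. This is a minimal illustration, not the author's actual preprocessing script: the field names `text` and `verified_purchase` are assumptions about the review record schema, and the helper `filter_reviews` is hypothetical.

```python
from typing import Iterable, Iterator

MIN_CHARS = 100        # keep reviews whose text is longer than this
MAX_SAMPLES = 100_000  # number of samples used for fine-tuning


def filter_reviews(records: Iterable[dict], limit: int = MAX_SAMPLES) -> Iterator[dict]:
    """Yield verified-purchase reviews with text longer than MIN_CHARS, up to `limit`."""
    kept = 0
    for record in records:
        if kept >= limit:
            break
        # Field names here are assumptions about the record schema.
        text = record.get("text") or ""
        if record.get("verified_purchase") and len(text) > MIN_CHARS:
            kept += 1
            yield record


sample = [
    {"text": "x" * 150, "verified_purchase": True},
    {"text": "too short", "verified_purchase": True},
    {"text": "y" * 150, "verified_purchase": False},
]
filtered = list(filter_reviews(sample))
print(len(filtered))
```

Only the first sample record passes both checks; the second is too short and the third is not a verified purchase.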
Training Procedure
The training was performed using the Hugging Face transformers library with the following settings:
- Model: t5-base
- Number of Epochs: 3
- Batch Size: 16 for training, 32 for evaluation
- Optimizer: AdamW
- Learning Rate: Default settings
- Hardware: Training was conducted on GPU (NVIDIA RTX 3060)
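To give a sense of the training length these settings imply, the sketch below computes the number of optimizer steps. It assumes that all 100,000 samples went into the training split, that a single GPU was used, and that there was no gradient accumulation; none of these is confirmed by this card, and the helper `optimizer_steps` is hypothetical.

```python
import math
from typing import Tuple


def optimizer_steps(num_train_samples: int, batch_size: int, epochs: int) -> Tuple[int, int]:
    """Return (steps per epoch, total steps), assuming no gradient accumulation."""
    steps_per_epoch = math.ceil(num_train_samples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


# Assumption: all 100,000 samples are used for training (no held-out split).
per_epoch, total = optimizer_steps(100_000, batch_size=16, epochs=3)
print(per_epoch, total)  # 6250 18750
```

If part of the data was held out for evaluation, the step counts would be proportionally lower.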
Model Performance
Due to the scope of this project, comprehensive evaluation metrics are not provided. However, sample outputs demonstrate the model’s ability to generate coherent and contextually relevant reviews.
Example Usage
Here’s how you can use the model to generate reviews:
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration
# Load the model and tokenizer
model_name = "RSPRIMES1234/Amazon-Review-Generator-T5"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)
# Set up GPU usage (optional)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
# Define the function to generate reviews
def generate_review(product_title, star_rating):
    # Build the prompt in the "review: <title>, <rating> Stars!" format
    input_text = f"review: {product_title}, {star_rating} Stars!"
    inputs = tokenizer(input_text, return_tensors='pt', max_length=128, padding='max_length', truncation=True)
    inputs = {k: v.to(device) for k, v in inputs.items()}
    # Pass the attention mask so that padding tokens are ignored during generation
    outputs = model.generate(
        input_ids=inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        max_length=128,
        no_repeat_ngram_size=3,
        num_beams=6,
        early_stopping=True,
    )
    review = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return review
# Example usage
product_title = "Example Product"
star_rating = 5
print(generate_review(product_title, star_rating))
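Because the model is conditioned on a fixed prompt format, it may help to build and validate the prompt in one place. The helper below is a hypothetical sketch: `build_prompt` is not part of the model's repository, and the 1 to 5 integer range and whitespace cleanup are assumptions based on the example above, not on the training script.

```python
def build_prompt(product_title: str, star_rating: int) -> str:
    """Format a product title and star rating as 'review: <title>, <rating> Stars!'."""
    title = " ".join(product_title.split())  # collapse stray whitespace
    if not title:
        raise ValueError("product_title must not be empty")
    # Assumption: ratings are whole stars from 1 to 5, as in the example above.
    if not isinstance(star_rating, int) or not 1 <= star_rating <= 5:
        raise ValueError("star_rating must be an integer from 1 to 5")
    return f"review: {title}, {star_rating} Stars!"


print(build_prompt("  Example   Product ", 5))  # review: Example Product, 5 Stars!
```

The returned string can be passed to the tokenizer in place of the inline f-string.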
Limitations and Considerations
- Data Bias: The model was trained on reviews for software products, which may bias its performance when generating reviews for other types of products.
- Ethical Use: Generated reviews should be used responsibly and ethically. Misuse of generated content can lead to misinformation and ethical concerns.
Citation
If you use this model in your research or applications, please cite the original T5 paper and provide a link to this model on Hugging Face.