	---
	license: llama3
	inference:
	  parameters:
	    num_beams: 3
	    num_beam_groups: 3
	    num_return_sequences: 1
	    repetition_penalty: 10
	    diversity_penalty: 3.01
	    no_repeat_ngram_size: 2
	    temperature: 0.8
	    max_length: 128
	widget:
	  - text: >-
	      Learn to build generative AI applications with an expert AWS instructor with the 2-day Developing Generative AI Applications on AWS course.
	    example_title: AWS course
	  - text: >-
	      In healthcare, Generative AI can help generate synthetic medical data to train machine learning models, develop new drug candidates, and design clinical trials.
	    example_title: Generative AI
	  - text: >-
	      By leveraging prior model training through transfer learning, fine-tuning
	      can reduce the amount of expensive computing power and labeled data needed
	      to obtain large models tailored to niche use cases and business needs.
	    example_title: Fine Tuning
	---


	# Text Rewriter Paraphraser

	This repository contains a fine-tuned text-rewriting model based on T5-Base (223M parameters).

	## Key Features:

	* Fine-tuned on t5-base: Builds on a pre-trained text-to-text transfer transformer for effective paraphrasing.
	* Large Dataset (430k examples): Trained on a dataset that combines three open-source datasets and was cleaned before training.
	* High Quality Paraphrases: Generates paraphrases that significantly alter sentence structure while maintaining accuracy and factual correctness.
	* Non-AI Detectable: Aims to produce paraphrases that appear natural and indistinguishable from human-written text.

	## Model Performance:

	* Train Loss: 1.0645
	* Validation Loss: 0.8761
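
As a rough way to read the two losses above: assuming they are mean per-token cross-entropy values in nats (a common convention for seq2seq fine-tuning, though this card does not state it), exponentiating a loss gives the corresponding perplexity. The helper name `perplexity` is illustrative only.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Losses reported in this model card
train_loss = 1.0645
validation_loss = 0.8761

print(f"train perplexity:      {perplexity(train_loss):.2f}")
print(f"validation perplexity: {perplexity(validation_loss):.2f}")
```

If the losses were computed differently (for example, summed per sequence), this conversion does not apply.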

	## Getting Started:

	```python
	from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

	# Replace 'YOUR_TOKEN' with your actual Hugging Face access token
	tokenizer = AutoTokenizer.from_pretrained("Ateeqq/Text-Rewriter-Paraphraser", token='YOUR_TOKEN')
	model = AutoModelForSeq2SeqLM.from_pretrained("Ateeqq/Text-Rewriter-Paraphraser", token='YOUR_TOKEN')
	```
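	```python
	text = "Data science is a field that deals with extracting knowledge and insights from data."

	# Tokenize the input and return PyTorch tensors
	inputs = tokenizer(text, return_tensors="pt")

	# Generate a paraphrase of at most 50 tokens
	output = model.generate(**inputs, max_length=50)

	# Decode the first sequence, dropping special tokens such as <pad> and </s>
	print(tokenizer.decode(output[0], skip_special_tokens=True))
	```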
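
The hosted widget uses the `inference.parameters` listed in this card's metadata, not the plain `max_length=50` call above. The sketch below shows one way to pass those settings to `generate`. The names `WIDGET_GENERATION_KWARGS` and `paraphrase` are hypothetical helpers, not part of this repository. `temperature` is left out on purpose, since it only affects sampling and these settings use beam search. Depending on the installed `transformers` version, diverse (group) beam search may need extra options or may not be available in core `generate`, so check the release notes for your version.

```python
# Generation settings copied from `inference.parameters` in this model card.
# `temperature: 0.8` is omitted: it only takes effect when sampling is enabled.
WIDGET_GENERATION_KWARGS = {
    "num_beams": 3,
    "num_beam_groups": 3,       # diverse beam search: 3 groups of 1 beam each
    "num_return_sequences": 1,
    "repetition_penalty": 10.0,
    "diversity_penalty": 3.01,
    "no_repeat_ngram_size": 2,
    "max_length": 128,
}


def paraphrase(text, model, tokenizer, **overrides):
    """Hypothetical helper: paraphrase `text` with the card's widget settings.

    `model` and `tokenizer` are the objects loaded in Getting Started.
    Keyword `overrides` replace individual generation settings.
    """
    kwargs = {**WIDGET_GENERATION_KWARGS, **overrides}
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, **kwargs)
    # One decoded string per returned sequence, without <pad> or </s>
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

With the `model` and `tokenizer` from Getting Started, `paraphrase(text, model, tokenizer)` returns a list with one string. Passing `num_return_sequences=3` asks for one candidate per beam group.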

	## Disclaimer:

	* Limited Use: The model is provided under a non-exclusive, non-transferable license, the same as Llama 3. This means you can't freely share it with others or sell the model itself.
	* Commercial Use Allowed: You can use the model for commercial purposes, but only under the terms of the license agreement.
	* Attribution Required: You need to abide by the agreement's terms regarding attribution. Use the paraphrased text responsibly and ethically, with proper attribution of the original source.

	## Further Development:

	(Mention any ongoing development or areas for future improvement in Discussions.)