	---
	license: ms-pl
	library_name: transformers
	pipeline_tag: text-generation
	tags:
	- marketing
	---
	# PhiMarketing: A Marketing Large Language Model

PhiMarketing is a 3.8B-parameter domain-specific large language model (LLM).
It was adapted to the marketing domain from [Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) through continued pretraining on a curated marketing corpus of more than 43B tokens.
We are releasing this early checkpoint of the model to the AI community.


	![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/660a4d7614fdf5e925104e77/bs7DscWIrb02ycu3CCOO7.jpeg)

	### Model Description

PhiMarketing can help generate marketing content and support research in the field of marketing.

While the model is designed to encode marketing knowledge, this version has not yet been adapted to deliver that knowledge appropriately, safely, or within the constraints of professional practice.
We advise against using PhiMarketing in real-world practice settings.

	### Model Details
	- Developed by: [Marketeam](https://www.marketeam.ai/)
	- Model type: Causal decoder-only transformer language model
- Continued pretraining from: Phi-3-mini-128k-instruct
- Context length: 3K (3,072) tokens
	- Input & Output: Text-only
	- Language: English
	- Knowledge Cutoff: December 2023

	## Uses

PhiMarketing has been developed for further research on LLMs for marketing applications.
Potential use cases range from marketing question answering and general marketing information queries to actions (function calls) on marketing platforms.

PhiMarketing is a Foundation Language Model (FLM) without fine-tuning or instruction-tuning.
For specific downstream tasks, we recommend applying SFT or RLHF, or alternatively using in-context learning with 1,000-1,500 tokens of examples added to the prompt.
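
The in-context learning option can be sketched as a small few-shot prompt builder. This is a minimal illustration, not the authors' method: the `Question:`/`Answer:` template, the example pairs, the helper names, and the whitespace-based token estimate are all assumptions. In practice the real tokenizer should be used to count tokens.

```python
def approx_token_count(text: str) -> int:
    """Rough token estimate from whitespace-separated words (assumption, not the real tokenizer)."""
    return len(text.split())


def build_few_shot_prompt(examples, question, max_example_tokens=1500):
    """Add Q/A examples until the token budget is used up, then append the open question."""
    parts = []
    used = 0
    for q, a in examples:
        block = f"Question: {q}\nAnswer: {a}\n"
        cost = approx_token_count(block)
        if used + cost > max_example_tokens:
            break  # stop before exceeding the in-context budget
        parts.append(block)
        used += cost
    # End with an open "Answer:" so a base model continues with the answer text.
    parts.append(f"Question: {question}\nAnswer:")
    return "\n".join(parts)


# Hypothetical examples, for illustration only.
examples = [
    ("What is a call to action?", "A prompt that tells the audience what step to take next."),
    ("What does CTR stand for?", "Click-through rate: clicks divided by impressions."),
]
prompt = build_few_shot_prompt(examples, "How do I calculate customer lifetime value?")
print(prompt)
```

The resulting string can be passed to the model in place of a bare question. Ending the prompt with an open `Answer:` suits a model that continues text rather than following instructions.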


	## Training Details

	### Training Data

	Marketing data from publicly available and internal sources such as:
	- Blogs
	- Books
	- Websites
	- Podcasts
	- Newsletters
	- Publications
	- Social Media
	- Ad-Campaigns
	- Landing Pages
	- Press Releases
	- Email-Campaigns
	- Brochures & Flyers
- Product Descriptions
	- Testimonials & Reviews
	- ...

In addition, approximately 10% of the training mix is previously seen data, included to avoid catastrophic forgetting.


	### Training Procedure

Training ran on the AWS SageMaker framework, using 4 NVIDIA A100 GPUs on a p4de.24xlarge machine.
Total training time was approximately 250 hours, at a total cost of approximately $10K.
	This is an early checkpoint of the model that we are releasing to the community.
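
As a quick sanity check on the figures above, the implied hourly rates can be worked out directly. The inputs are the approximate values stated in this section, so the results are rough estimates only.

```python
train_hours = 250        # approximate wall-clock training time from the text
total_cost_usd = 10_000  # approximate total training cost from the text
num_gpus = 4             # GPU count from the text

cost_per_hour = total_cost_usd / train_hours
gpu_hours = train_hours * num_gpus
cost_per_gpu_hour = total_cost_usd / gpu_hours

print(f"~${cost_per_hour:.0f}/hour, ~{gpu_hours} GPU-hours, ~${cost_per_gpu_hour:.0f}/GPU-hour")
```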

	#### Training Hyperparameters

| Param         | Value             |
|---------------|-------------------|
| bf16          | true              |
| tf32          | true              |
| lr            | 1e-4              |
| optim         | adamw             |
| epochs        | 1                 |
| lr scheduler  | constant          |
| warmup ratio  | 0.03              |
| max grad norm | 0.3               |
| context len   | 3072              |
| attention     | flash attention 2 |
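
For readers who want to reproduce a similar setup, the table can be expressed as keyword arguments using `transformers.TrainingArguments` parameter names. This is a sketch under assumptions, not the authors' training script: the mapping of `adamw` to `adamw_torch` and the `output_dir` value are guesses.

```python
# Hyperparameters from the table, keyed by transformers.TrainingArguments parameter names.
training_kwargs = {
    "bf16": True,
    "tf32": True,
    "learning_rate": 1e-4,
    "optim": "adamw_torch",           # assumption: the table only says "adamw"
    "num_train_epochs": 1,
    "lr_scheduler_type": "constant",  # as listed in the table
    "warmup_ratio": 0.03,
    "max_grad_norm": 0.3,
}

# Not TrainingArguments fields: applied at model load and tokenization time instead.
model_kwargs = {"attn_implementation": "flash_attention_2"}
max_seq_len = 3072

# Usage (needs a GPU with bf16/tf32 support, so it is left commented out):
# from transformers import TrainingArguments
# args = TrainingArguments(output_dir="out", **training_kwargs)
```

Note that in Transformers the `constant` scheduler applies no warmup, while `constant_with_warmup` honors `warmup_ratio`. Which variant was actually used is not stated here.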


	## How to use

	#### Using Transformers pipeline

```python
import torch
import transformers

model_id = "marketeam/PhiMarketing"
tokenizer_id = "microsoft/Phi-3-mini-128k-instruct"
token = "hf_token"  # placeholder: replace with your Hugging Face access token

# Load the model in bfloat16 and place it automatically on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    token=token,
    device_map="auto",
)

result = pipeline("What are the key components of a digital marketing strategy?")
print(result[0]["generated_text"])
```

	#### Using Transformers generate

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "marketeam/PhiMarketing"
tokenizer_id = "microsoft/Phi-3-mini-128k-instruct"
token = "hf_token"  # placeholder: replace with your Hugging Face access token
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(tokenizer_id, token=token)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, token=token, trust_remote_code=True
).to(device)

message = "How do I calculate customer lifetime value?"
inputs = tokenizer(message, return_tensors="pt").to(device)
# max_new_tokens is an illustrative value; without it the default output is very short.
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```
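
Because the model was trained with a 3,072-token context, long prompts (for example, few-shot prompts) may need trimming so that the prompt plus the generated tokens fit in that window. The helper below is a hypothetical sketch: the function name, the default `max_new_tokens`, and the choice to keep the most recent tokens are assumptions.

```python
def fit_to_context(prompt_ids, context_len=3072, max_new_tokens=256):
    """Keep the most recent prompt tokens so prompt + generation fits in the context window."""
    budget = context_len - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than context_len")
    return prompt_ids[-budget:]


# Dummy token IDs stand in for real tokenizer output.
long_prompt = list(range(4000))
trimmed = fit_to_context(long_prompt)
print(len(trimmed))
```

Dropping the oldest tokens keeps the question at the end of the prompt intact. For few-shot prompts it may be preferable to drop whole examples instead.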


	## Intended Usage

	PhiMarketing is now available for further testing and assessment. Potential use cases include, but are not limited to:
	- Text Generation: This model can produce creative text formats in the marketing domain.
	- Knowledge Exploration: It can assist marketing researchers by generating valuable marketing information or answering questions about marketing-specific topics.
	- Natural Language Processing (NLP) Research: This model can form the basis for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field.


## Contributors

	[Sahar Millis](https://www.linkedin.com/in/sahar-millis/) [Coby Benveniste](https://www.linkedin.com/in/coby-benveniste/) [Nofar Sachs](https://www.linkedin.com/in/nofar-sachs-2146801b3/) [Eran Mazur](https://www.linkedin.com/in/eranmazur/)