|
--- |
|
license: ms-pl |
|
library_name: transformers |
|
pipeline_tag: text-generation |
|
tags: |
|
- marketing |
|
--- |
|
# PhiMarketing: A Marketing Large Language Model |
|
|
|
PhiMarketing is a 3.8B parameter Domain-Specific Large Language Model (LLM). |
|
It was specifically adapted to the marketing domain from [Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) through continuous pretraining on a meticulously curated and comprehensive marketing corpus of more than 43B tokens. |
|
We are releasing this **early checkpoint** of the model to the AI community. |
|
|
|
|
|
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/660a4d7614fdf5e925104e77/bs7DscWIrb02ycu3CCOO7.jpeg) |
|
|
|
### Model Description |
|
|
|
PhiMarketing is a powerful tool for generating high-quality marketing content and supporting research in the marketing domain.

It's a valuable resource for anyone aiming to stay ahead in this fast-changing field.
|
|
|
While the model is designed to encode marketing knowledge, this version is not yet adapted to deliver knowledge appropriately, safely, or within professional actionable constraints. |
|
We advise against using PhiMarketing in real-world practice settings. |
|
|
|
### Model Details |
|
- Developed by: [Marketeam](https://www.marketeam.ai/) |
|
- Model type: Causal decoder-only transformer language model |
|
- Continue-pretrained from model: Phi-3-mini-128k-instruct |
|
- Context length: 3K tokens |
|
- Input & Output: Text-only |
|
- Language: English |
|
- Knowledge Cutoff: December 2023 |
|
|
|
## Uses |
|
|
|
PhiMarketing has been developed for further research on LLMs for marketing applications.

The potential use cases for this tool are diverse, ranging from marketing question answering and general marketing information queries to actions (function calls) on marketing platforms.
|
|
|
PhiMarketing is a Foundation Language Model (FLM) without finetuning or instruction-tuning. |
|
We recommend applying SFT or RLHF tuning for specific downstream tasks, or alternatively using in-context learning with 1,000-1,500 tokens of examples added to the prompt, as sketched below.
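A minimal in-context-learning sketch using the `transformers` pipeline is shown here; the few-shot examples, generation settings, and token budget are illustrative placeholders only, not part of the released model:

```python
import transformers
import torch

model_id = "marketeam/PhiMarketing"
tokenizer_id = "microsoft/Phi-3-mini-128k-instruct"

pipe = transformers.pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Hypothetical few-shot context (expand to ~1,000-1,500 tokens in practice),
# prepended to the query because the base model is not instruction-tuned.
few_shot_context = (
    "Q: What makes a strong subject line for a product-launch email?\n"
    "A: Keep it under 50 characters, lead with the benefit, and create urgency.\n\n"
    "Q: How should I segment an email list for a seasonal campaign?\n"
    "A: Segment by purchase recency, engagement level, and product-category affinity.\n\n"
)

query = "Q: How do I position a premium pricing tier against lower-priced competitors?\nA:"
result = pipe(few_shot_context + query, max_new_tokens=200, do_sample=False, return_full_text=False)
print(result[0]["generated_text"])
```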
|
|
|
|
|
## Training Details |
|
|
|
### Training Data |
|
|
|
Marketing data from publicly available and **internal** sources such as: |
|
- Blogs |
|
- Books |
|
- Websites |
|
- Podcasts |
|
- Newsletters |
|
- Publications |
|
- Social Media |
|
- Ad-Campaigns |
|
- Landing Pages |
|
- Press Releases |
|
- Email-Campaigns |
|
- Brochures & Flyers |
|
- Product Descriptions
|
- Testimonials & Reviews |
|
- ... |
|
We also mixed in ±10% of previously seen data to avoid *catastrophic forgetting*.
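The exact data pipeline is not public; purely as an illustration, mixing roughly 10% replay data into the new domain corpus can be expressed with `datasets.interleave_datasets` (the file names below are hypothetical placeholders):

```python
from datasets import load_dataset, interleave_datasets

# Hypothetical placeholder files; the actual corpora are internal.
marketing_ds = load_dataset("json", data_files="marketing_corpus.jsonl", split="train")
replay_ds = load_dataset("json", data_files="previously_seen_sample.jsonl", split="train")

# ~90% new marketing data, ~10% previously seen data to mitigate catastrophic forgetting.
mixed_ds = interleave_datasets(
    [marketing_ds, replay_ds],
    probabilities=[0.9, 0.1],
    seed=42,
    stopping_strategy="all_exhausted",
)
```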
|
|
|
|
|
### Training Procedure |
|
|
|
Training was run with the AWS SageMaker framework on a p4de.24xlarge instance using 4 NVIDIA A100 GPUs.

Total training time was ±250 hours, at a total cost of ±$10K.
|
This is an **early checkpoint** of the model that we are releasing to the community. |
|
|
|
#### Training Hyperparameters |
|
|
|
| Param | Value | |
|
|---------------|-----------------| |
|
| bf16 | true | |
|
| tf32 | true | |
|
| lr | 1e-4 | |
|
| optim | adamw | |
|
| epochs | 1 | |
|
| lr scheduler | constant | |
|
| warmup ratio | 0.03 | |
|
| max grad norm | 0.3 | |
|
| context len | 3072 | |
|
| attention     | flash attention 2 |
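For orientation, these settings map roughly onto Hugging Face `TrainingArguments` as sketched below; this assumes a standard `Trainer`-style setup and is not the exact training code (the output directory and scheduler naming are assumptions):

```python
import torch
from transformers import AutoModelForCausalLM, TrainingArguments

# Illustrative mapping of the hyperparameters above.
training_args = TrainingArguments(
    output_dir="phimarketing-ckpt",            # placeholder
    bf16=True,
    tf32=True,
    learning_rate=1e-4,
    optim="adamw_torch",
    num_train_epochs=1,
    lr_scheduler_type="constant_with_warmup",  # constant LR with the 3% warmup listed above
    warmup_ratio=0.03,
    max_grad_norm=0.3,
)

# Sequences would be packed/truncated to the 3,072-token context length at tokenization time,
# and the base model loaded with flash attention 2.
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-128k-instruct",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    trust_remote_code=True,
)
```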
|
|
|
|
|
## How to use |
|
|
|
#### Using Transformers pipeline |
|
|
|
```python |
|
import transformers |
|
import torch |
|
|
|
model_id = "marketeam/PhiMarketing" |
|
tokenizer_id = "microsoft/Phi-3-mini-128k-instruct" |
|
token = "hf-token"  # replace with your Hugging Face access token
|
|
|
pipeline = transformers.pipeline("text-generation", model=model_id, model_kwargs={"torch_dtype": torch.bfloat16}, |
|
tokenizer=tokenizer_id, token=token, device_map='auto') |
|
|
|
pipeline("What are the key components of a digital marketing strategy?") |
|
``` |
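Generation parameters such as `max_new_tokens`, `do_sample`, and `temperature` can be passed directly to the pipeline call to control output length and sampling.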
|
|
|
#### Using Transformers generate |
|
|
|
```python |
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
import torch |
|
|
|
model_id = "marketeam/PhiMarketing" |
|
tokenizer_id = "microsoft/Phi-3-mini-128k-instruct" |
|
token = "hf_token" |
|
device = "cuda" if torch.cuda.is_available() else "cpu" |
|
|
|
tokenizer = AutoTokenizer.from_pretrained(tokenizer_id, token=token) |
|
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, token=token, trust_remote_code=True).to(device)
|
|
|
message = "How do I calculate customer lifetime value?" |
|
inputs = tokenizer(message, return_tensors="pt").to(device) |
|
outputs = model.generate(**inputs, max_new_tokens=256)  # set an explicit generation budget; the default limit is very short
|
tokenizer.batch_decode(outputs, skip_special_tokens=True) |
|
``` |
|
|
|
|
|
## Intended Usage |
|
|
|
PhiMarketing is now available for further testing and assessment. Potential use cases include, but are not limited to: |
|
- Text Generation: This model can produce creative text formats in the marketing domain. |
|
- Knowledge Exploration: It can assist marketing researchers by generating valuable marketing information or answering questions about marketing-specific topics. |
|
- Natural Language Processing (NLP) Research: This model can form the basis for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field. |
|
|
|
|
|
## Contributors
|
|
|
[Sahar Millis](https://www.linkedin.com/in/sahar-millis/), [Coby Benveniste](https://www.linkedin.com/in/coby-benveniste/), [Nofar Sachs](https://www.linkedin.com/in/nofar-sachs-2146801b3/), [Eran Mazur](https://www.linkedin.com/in/eranmazur/)