LLM_BENCHMARKS_TEXT = f"""
# 🧰 Train a Model

Intel offers a variety of platforms that can be used to train LLMs, including data center and consumer-grade CPUs, GPUs, and ASICs.

Below, you'll find documentation on how to access free and paid resources to train a model and submit it to the Powered-by-Intel LLM Leaderboard.

## Intel Developer Cloud - Quick Start

The Intel Developer Cloud provides free and paid compute instances for model training. Intel offers Jupyter Notebook instances backed by 224-core 4th Generation Intel® Xeon® bare-metal nodes with 4x Intel® Data Center GPU Max 1100 GPUs. To access these resources, follow the instructions below:

1. Visit [cloud.intel.com](https://cloud.intel.com) and create a free account.
2. Navigate to the "Training" module under the "Software" section in the left panel.
3. Under the GenAI Essentials section, select the "LLM Fine-Tuning with QLoRA" notebook and click "Launch".
4. Follow the instructions in the notebook to train your model on the Intel® Data Center GPU Max 1100.
5. Upload your model to the Hugging Face Model Hub.
6. Go to the "Submit" tab and follow the instructions to create a leaderboard evaluation request.
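
Once the notebook from step 3 is running, a quick check like the one below can confirm that PyTorch sees the Max Series GPU, which PyTorch exposes as the `xpu` device. This is a minimal sketch and not part of the notebook itself: `pick_device` is a hypothetical helper, and it assumes a PyTorch build with XPU support. Without one, it falls back to the CPU.

```python
import importlib.util


def pick_device():
    # Return "xpu" when PyTorch reports an Intel GPU, otherwise "cpu".
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed in this environment
    import torch

    xpu = getattr(torch, "xpu", None)  # absent on builds without XPU support
    if xpu is not None and xpu.is_available():
        return "xpu"
    return "cpu"


print("Training device:", pick_device())
```

If this reports `cpu` on a GPU instance, the notebook's own environment setup steps are the first thing to re-check.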

## Additional Training Code Samples

Below you will find a list of additional resources for training models on different Intel hardware platforms:

- Intel® Gaudi® Accelerators
    - [Parameter Efficient Fine-Tuning of Llama-2 70B](https://github.com/HabanaAI/Gaudi-tutorials/blob/main/PyTorch/llama2_fine_tuning_inference/llama2_fine_tuning_inference.ipynb)
- Intel® Xeon® Processors
    - [Distributed Training of GPT2 LLMs on AWS](https://github.com/intel/intel-cloud-optimizations-aws/tree/main/distributed-training)
    - [Fine-tuning Falcon 7B on Xeon Processors](https://medium.com/@eduand-alvarez/fine-tune-falcon-7-billion-on-xeon-cpus-with-hugging-face-and-oneapi-a25e10803a53)
- Intel® Data Center GPU Max Series
    - [LLM Fine-tuning with QLoRA on Max Series GPUs](https://console.idcservice.net/training/detail/159c24e4-5598-3155-a790-2qv973tlm172)

## Submitting your Model to the Hub

Once you have trained your model, uploading and open-sourcing it on the Hugging Face Hub is straightforward.

```python
# Log in to Hugging Face
from huggingface_hub import notebook_login

notebook_login()

# Load the model and tokenizer
from transformers import AutoModelForCausalLM, AutoTokenizer

# Define the path to the checkpoint
checkpoint_path = ""  # Replace with your checkpoint folder

# Load the model. A causal LM class is assumed here because the leaderboard
# targets LLMs; use the Auto class that matches your architecture if it differs.
model = AutoModelForCausalLM.from_pretrained(checkpoint_path)

# Load the tokenizer: add the name of your model's tokenizer on Hugging Face
# or the path to a custom tokenizer
tokenizer = AutoTokenizer.from_pretrained("")

# Save the model and tokenizer locally
model_name_on_hub = "desired-model-name"
model.save_pretrained(model_name_on_hub)
tokenizer.save_pretrained(model_name_on_hub)

# Push to the Hub
model.push_to_hub(model_name_on_hub)
tokenizer.push_to_hub(model_name_on_hub)

# Congratulations! Your fine-tuned model is now uploaded to the Hugging Face Model Hub.
# You can view and share your model using its URL: https://huggingface.co/your-username/your-model-name
```
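
A quick sanity check on the saved folder can catch an incomplete checkpoint before the upload step above. The sketch below is illustrative only: `missing_checkpoint_files` is a hypothetical helper, and the file names are assumptions based on the usual `save_pretrained` layout for a full model (an adapter-only checkpoint would use different file names).

```python
import os
import tempfile

# Weight file suffixes assumed from the usual save_pretrained layout.
WEIGHT_SUFFIXES = (".safetensors", ".bin")


def missing_checkpoint_files(folder):
    # Return a list describing what a full-model checkpoint folder lacks.
    names = os.listdir(folder) if os.path.isdir(folder) else []
    missing = []
    if "config.json" not in names:
        missing.append("config.json")
    if not any(name.endswith(WEIGHT_SUFFIXES) for name in names):
        missing.append("model weights (*.safetensors or *.bin)")
    if "tokenizer_config.json" not in names:
        missing.append("tokenizer_config.json")
    return missing


# Usage on an empty temporary folder: every expected file is reported missing.
with tempfile.TemporaryDirectory() as empty_dir:
    print(missing_checkpoint_files(empty_dir))
```

An empty list means the folder has the files this check looks for. It does not verify that the weights themselves load correctly.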
"""

SUBMIT_TEXT = f"""
# Use the Resource Below to Start Training a Model Today
"""