{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "WE5GJ6s7y0Xo" }, "source": [ "## Fine-tune large models using 🤗 [`peft`](https://github.com/huggingface/peft) adapters, [`transformers`](https://github.com/huggingface/transformers) & [`bitsandbytes`](https://github.com/TimDettmers/bitsandbytes)\n", "\n", "In this tutorial we will cover how we can fine-tune large language models using the very recent `peft` library and `bitsandbytes` for loading large models in **8-bit**.\n", "The fine-tuning method will rely on a recent method called \"Low Rank Adapters\" ([LoRA](https://arxiv.org/pdf/2106.09685.pdf)), instead of fine-tuning the entire model you just have to fine-tune these adapters and load them properly inside the model. \n", "After fine-tuning the model you can also share your adapters on the 🤗 Hub and load them very easily. Let's get started!" ] }, { "cell_type": "markdown", "metadata": { "id": "TfBzP8gWzkpv" }, "source": [ "### Install requirements\n", "\n", "First, run the cells below to install the requirements:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "otj46qRbtpnd", "outputId": "cbc6a9e9-2263-4c9d-f36a-7d28a1486513" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m92.2/92.2 MB\u001b[0m \u001b[31m17.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m474.6/474.6 kB\u001b[0m \u001b[31m27.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m219.1/219.1 kB\u001b[0m \u001b[31m7.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m42.2/42.2 kB\u001b[0m \u001b[31m287.1 kB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[2K 
\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m110.5/110.5 kB\u001b[0m \u001b[31m16.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m212.5/212.5 kB\u001b[0m \u001b[31m31.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m134.3/134.3 kB\u001b[0m \u001b[31m20.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.0/1.0 MB\u001b[0m \u001b[31m70.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m224.5/224.5 kB\u001b[0m \u001b[31m31.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m114.5/114.5 kB\u001b[0m \u001b[31m17.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m268.8/268.8 kB\u001b[0m \u001b[31m4.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m149.6/149.6 kB\u001b[0m \u001b[31m21.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25h Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n", " Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n", " Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n", " Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n", " Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n", " Preparing metadata (pyproject.toml) ... 
\u001b[?25l\u001b[?25hdone\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m7.8/7.8 MB\u001b[0m \u001b[31m106.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.3/1.3 MB\u001b[0m \u001b[31m82.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25h Building wheel for transformers (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n", " Building wheel for peft (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n" ] } ], "source": [ "!pip install -q bitsandbytes datasets accelerate loralib einops\n", "!pip install -q git+https://github.com/huggingface/transformers.git@main git+https://github.com/huggingface/peft.git" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "-Q9oFzZdcD6T", "outputId": "046873a7-32aa-458b-94c3-2ab760033b56" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Mounted at /content/drive\n" ] } ], "source": [ "from google.colab import drive\n", "drive.mount('/content/drive')" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 145, "referenced_widgets": [ "4e0f42040d8e4c76bb20363a9e4046cf", "ae94034b488843abbe7ac4c98e3e5d4c", "12b608ed159145aca6da27c36ec4a9b1", "76ed1fdaf6104c5986de9c6d5f42880e", "808edf7de7f947409c9582e11d135644", "34703094b08548aab6e5fe18f150eae7", "24bf2aa87ac54d5f97227f17a23540af", "e09b9f4dabd14b819cd521c1c2ccd6eb", "7e10a8c02e894209af442c9be36deb1a", "e63c2a186a484d57976539b8e4921b54", "b4e9ac74df32415aa71d3d2d14a892c3", "39e1e81a2b9948aa91fd85ff9d7c0805", "ac226b9296364042acdd608dfd39948d", "a1bab69649494c4991ab120b3e2b53c4", "b396e65fd46e40abb24e4475bbcad684", "ccc2e887ddea4d9ebb84d530c0ab286b", "d93cd35a1bae417588ed29a9502bdd91", "c56f29f847fa4cb1a8d5ea366751690d", "d2f7ddbf5bb14d388d051298626232c8", "e096ca2e6bc54addab358dc4d7586e03", 
"e9cbde63bf2647ab8ef00d317bd6fdc1", "ac6bae317cc1409c8c5a9b5771065c71", "5b5a0ec5cb7846c39e86805c209df940", "ac5748211a4148e18964fc8fd9cb3d3b", "699c180ffc134b69879ef4fcc5433096", "facfba04f0e644978be531565f471d63", "5e7ad5db85ba4d0fa40f0f6506b943d4", "73ee75e82f5e491f86cefee05928be61", "8e2330ebd6b744b9aa884f625e135406", "0274218ecb194d9fab451781c32d78e6", "c651075f3e0c49558ea9d11d0ada548b", "3f8ee32d95144035a3f3c4d4ac6527b1" ] }, "id": "rTdXkcWecz3s", "outputId": "c28aec19-0116-4cde-8b84-1231747850dc" }, "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "VBox(children=(HTML(value='
Step | \n", "Training Loss | \n", "
---|---|
25 | \n", "1.479100 | \n", "
50 | \n", "1.327900 | \n", "
75 | \n", "1.280700 | \n", "
100 | \n", "1.320500 | \n", "
125 | \n", "1.317400 | \n", "
150 | \n", "1.264100 | \n", "
175 | \n", "1.282600 | \n", "
200 | \n", "1.322100 | \n", "
225 | \n", "1.276600 | \n", "
250 | \n", "1.316100 | \n", "
275 | \n", "1.266000 | \n", "
300 | \n", "1.282200 | \n", "
325 | \n", "1.250500 | \n", "
350 | \n", "1.241300 | \n", "
375 | \n", "1.228700 | \n", "
400 | \n", "1.291200 | \n", "
425 | \n", "1.259800 | \n", "
450 | \n", "1.294200 | \n", "
475 | \n", "1.266900 | \n", "
500 | \n", "1.286000 | \n", "
525 | \n", "1.252400 | \n", "
550 | \n", "1.281800 | \n", "
575 | \n", "1.299100 | \n", "
"
],
"text/plain": [
" "
]
},
"metadata": {}
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"TrainOutput(global_step=614, training_loss=1.290572554746746, metrics={'train_runtime': 22488.4324, 'train_samples_per_second': 0.437, 'train_steps_per_second': 0.027, 'total_flos': 8.003914219624858e+17, 'train_loss': 1.290572554746746, 'epoch': 1.0})"
]
},
"metadata": {},
"execution_count": 21
}
],
"source": [
"training_args = transformers.TrainingArguments(\n",
" auto_find_batch_size=True,\n",
" gradient_accumulation_steps=4,\n",
" num_train_epochs=1,\n",
" learning_rate=2e-4,\n",
" fp16=True,\n",
" save_total_limit=4,\n",
" logging_steps=25,\n",
" output_dir=\"/content/drive/MyDrive/Colab Files/WM/falcon-chat-7b\",\n",
" save_strategy='epoch',\n",
" optim=\"paged_adamw_8bit\",\n",
" lr_scheduler_type = 'cosine',\n",
" warmup_ratio = 0.05,\n",
")\n",
"\n",
"trainer = transformers.Trainer(\n",
" model=model,\n",
" train_dataset=data,\n",
" args=training_args,\n",
" data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),\n",
")\n",
"model.config.use_cache = False # silence the warnings. Please re-enable for inference!\n",
"trainer.train()"
]
},
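The arguments above pair a cosine learning-rate schedule (`lr_scheduler_type='cosine'`) with a 5% linear warmup (`warmup_ratio=0.05`). As a rough, self-contained sketch of what that schedule traces over this run's 614 steps — an illustration only, not the `transformers` implementation:

```python
import math

def lr_at_step(step, total_steps, peak_lr=2e-4, warmup_ratio=0.05):
    """Linear warmup to peak_lr, then cosine decay toward zero.

    Sketch of lr_scheduler_type='cosine' with warmup_ratio=0.05 from the
    TrainingArguments above; not the actual transformers scheduler code.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)  # linear ramp
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))  # cosine decay

total = 614  # global_step reported by trainer.train() above
for s in (0, 30, 307, 614):
    print(s, lr_at_step(s, total))
```

The warmup keeps early 8-bit/LoRA updates small while optimizer statistics stabilize; the cosine tail anneals the rate smoothly instead of cutting it off.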
{
"cell_type": "markdown",
"metadata": {
"id": "Duak7T_B3VpJ"
},
"source": [
"## Share adapters on the 🤗 Hub"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "VxB6UV5XAvvP",
"outputId": "c25bfe26-3e97-4983-f8a5-4544abd13f68"
},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"CommitInfo(commit_url='https://huggingface.co/dfurman/falcon-7b-chat-oasst1/commit/c1d659b12ba143921a39039c5c73de8d08c915c8', commit_message='Upload model', commit_description='', oid='c1d659b12ba143921a39039c5c73de8d08c915c8', pr_url=None, pr_revision=None, pr_num=None)"
]
},
"metadata": {},
"execution_count": 10
}
],
"source": [
"model.push_to_hub(\"dfurman/falcon-7b-chat-oasst1\", use_auth_token=True)"
]
},
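Note how small the pushed artifact is: only the adapter weights go to the Hub (the `adapter_model.bin` downloaded in the next section is about 18.9 MB), not the 7B-parameter base model. Assuming fp16 storage (2 bytes per weight) and ignoring file overhead, a back-of-the-envelope estimate of the trainable parameter count:

```python
# Rough estimate of trainable LoRA parameters from the checkpoint size.
# Assumes fp16 adapter weights (2 bytes each) and ignores serialization overhead.
adapter_bytes = 18.9e6      # adapter_model.bin size shown in the Hub download below
bytes_per_weight = 2        # fp16
approx_params = adapter_bytes / bytes_per_weight

base_params = 7e9           # Falcon-7B base model
print(f"~{approx_params / 1e6:.1f}M trainable parameters")
print(f"~{100 * approx_params / base_params:.2f}% of the base model")
```

That ratio — well under 1% of the base model — is why LoRA adapters are cheap to train, store, and share.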
{
"cell_type": "markdown",
"metadata": {
"id": "S65GcxNGA9kz"
},
"source": [
"## Load adapters from the Hub\n",
"\n",
"You can also directly load adapters from the Hub using the commands below:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 113,
"referenced_widgets": [
"8cbd584670c949b1ae0be34ba2624e9f",
"016794f341634054a0958e1db42c4243",
"6af46abb0e7d4b3198302bb2f4aa8bd7",
"02caecb6b1814a34a0e264451ce5cbfa",
"5894072560c2470193848542f31f2c36",
"070bea93b2554794b0acaab492791bf5",
"4353a30953044bd68db966fb11de3d31",
"6f7bd66e8dd840c881abbf616c2bb21a",
"97f250200a7446d787b3ac0d65808849",
"6a70ccb4dc91405c97d4e6342a601260",
"4f46f81d53ff4269b1c3f48b78ad3e96",
"c84f780648dd43b387bd825c2bf21ea6",
"22910dbe439f409ca9a776dd969e4920",
"1d4672981e4c4073a65f7cac9e7e63b7",
"5b1b971fee8d454caa6524f0ca0be091",
"f70d526253d54a679c9f5dd814de2123",
"a689ec73a49d42daaa2ecc9ad69b2faa",
"6849d267afd64263b447ffa36d988651",
"d54e3163a2b746599afdc44933b47af8",
"b26f404b48044cf7a4c77877a5c92b5c",
"fc428118dbca4df6b30d9def8ccc3e28",
"c6fc1039f6894698ae8418d83775c266",
"5d1e2e2f48d3429d85b9d3a80cc982fe",
"e950d075fc174617832e71ffee770838",
"d9e000681f074772972ed8961eae2207",
"8d942f61ac5142d2b595b20a010d6010",
"27aa9197bfeb475aa933facd7426354f",
"5747603d9e294fa788d197c403940b64",
"c0fbde154e6c4e599a6d9d0829816548",
"b23276da91514942b475c2e912152b98",
"50ed875cbf0649faba421a6a40fb9214",
"45b23a89b6bc45b3a0b1645d2c745db8",
"e4f69091e840474a8fd321404fdb460f"
]
},
"id": "hsD1VKqeA62Z",
"outputId": "ac11b44d-74e6-4281-980f-652a0d323a6a"
},
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading (…)/adapter_config.json: 0%| | 0.00/333 [00:00, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "8cbd584670c949b1ae0be34ba2624e9f"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Loading checkpoint shards: 0%| | 0/2 [00:00, ?it/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "c84f780648dd43b387bd825c2bf21ea6"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading adapter_model.bin: 0%| | 0.00/18.9M [00:00, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "5d1e2e2f48d3429d85b9d3a80cc982fe"
}
},
"metadata": {}
}
],
"source": [
"import torch\n",
"from peft import PeftModel, PeftConfig\n",
"from transformers import AutoModelForCausalLM, AutoTokenizer\n",
"\n",
"peft_model_id = \"dfurman/falcon-7b-chat-oasst1\"\n",
"config = PeftConfig.from_pretrained(peft_model_id)\n",
"model = AutoModelForCausalLM.from_pretrained(\n",
" config.base_model_name_or_path, \n",
" return_dict=True, \n",
" load_in_8bit=True, \n",
" device_map={\"\":0},\n",
" trust_remote_code=True,\n",
")\n",
"tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)\n",
"tokenizer.pad_token = tokenizer.eos_token\n",
"\n",
"# Load the Lora model\n",
"model = PeftModel.from_pretrained(model, peft_model_id)"
]
},
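What `PeftModel.from_pretrained` attaches is, for each targeted layer, a pair of small matrices A (r × d_in) and B (d_out × r); the effective weight becomes W + (alpha/r)·B·A, so only the low-rank pair is trained and shipped. A toy numeric sketch of that forward pass — dimensions and values are made up, not Falcon's:

```python
# Toy LoRA forward pass: y = W x + (alpha / r) * B (A x)
d_in, d_out, r, alpha = 4, 4, 2, 8  # illustrative sizes only

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

W = [[1.0 if i == j else 0.0 for j in range(d_in)] for i in range(d_out)]  # frozen base weight
A = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0]]   # r x d_in, trainable
B = [[1.0, 0.0],
     [0.0, 1.0],
     [0.0, 0.0],
     [0.0, 0.0]]             # d_out x r, trainable

x = [1.0, 2.0, 3.0, 4.0]
delta = matvec(B, matvec(A, x))  # cheap low-rank path: two small matvecs
y = [base + (alpha / r) * d for base, d in zip(matvec(W, x), delta)]
print(y)  # base output plus the scaled adapter contribution

# At realistic sizes the savings are dramatic (hypothetical 4096x4096 layer, r=8):
d, r_big = 4096, 8
print(d * d, "full weights vs", r_big * 2 * d, "LoRA weights")
```

Because the base weights stay frozen, the same 8-bit base model can serve many tasks by swapping in different adapter pairs.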
{
"cell_type": "markdown",
"metadata": {
"id": "MHYljmTjj5wX"
},
"source": [
"## Inference\n",
"\n",
"You can then directly use the trained model or the model that you have loaded from the 🤗 Hub for inference as you would do it usually in `transformers`."
]
},
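Under the hood, `model.generate` runs an autoregressive loop: predict the next token, append it, and repeat until an end-of-sequence token or a length limit. A toy version of that loop — the `next_token` lookup table below is entirely made up; the real model predicts the next token from logits:

```python
# Minimal greedy decoding loop, standing in for what model.generate() does.
# This lookup table is a hypothetical stand-in for the model's predictions.
next_token = {"Hello": "world", "world": "!", "!": "<eos>"}

def greedy_generate(start, max_new_tokens=10):
    out = [start]
    for _ in range(max_new_tokens):
        tok = next_token.get(out[-1], "<eos>")
        if tok == "<eos>":  # stop at the end-of-sequence token
            break
        out.append(tok)
    return " ".join(out)

print(greedy_generate("Hello"))  # -> Hello world !
```

Sampling strategies (temperature, top-p) only change how the next token is picked; the loop structure is the same.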
{
"cell_type": "code",
"source": [
"prompt = \"\"\"\n",
" \n",
"
\n",
" \n",
" \n",
" \n",
" Step \n",
" Training Loss \n",
" \n",
" \n",
" 25 \n",
" 1.479100 \n",
" \n",
" \n",
" 50 \n",
" 1.327900 \n",
" \n",
" \n",
" 75 \n",
" 1.280700 \n",
" \n",
" \n",
" 100 \n",
" 1.320500 \n",
" \n",
" \n",
" 125 \n",
" 1.317400 \n",
" \n",
" \n",
" 150 \n",
" 1.264100 \n",
" \n",
" \n",
" 175 \n",
" 1.282600 \n",
" \n",
" \n",
" 200 \n",
" 1.322100 \n",
" \n",
" \n",
" 225 \n",
" 1.276600 \n",
" \n",
" \n",
" 250 \n",
" 1.316100 \n",
" \n",
" \n",
" 275 \n",
" 1.266000 \n",
" \n",
" \n",
" 300 \n",
" 1.282200 \n",
" \n",
" \n",
" 325 \n",
" 1.250500 \n",
" \n",
" \n",
" 350 \n",
" 1.241300 \n",
" \n",
" \n",
" 375 \n",
" 1.228700 \n",
" \n",
" \n",
" 400 \n",
" 1.291200 \n",
" \n",
" \n",
" 425 \n",
" 1.259800 \n",
" \n",
" \n",
" 450 \n",
" 1.294200 \n",
" \n",
" \n",
" 475 \n",
" 1.266900 \n",
" \n",
" \n",
" 500 \n",
" 1.286000 \n",
" \n",
" \n",
" 525 \n",
" 1.252400 \n",
" \n",
" \n",
" 550 \n",
" 1.281800 \n",
" \n",
" \n",
" 575 \n",
" 1.299100 \n",
" \n",
" \n",
" \n",
"600 \n",
" 1.298200 \n",
"
Copy a token from your Hugging Face\ntokens page and paste it below.
Immediately click login after copying\nyour token or it might be stored in plain text in this notebook file.