Update LMFlow support
#10
by shizhediao2 - opened

README.md CHANGED
@@ -145,6 +145,40 @@ The prompt template used by Hymba-1.5B-Instruct is as follows, which has been in
```
## Finetuning Hymba

[LMFlow](https://github.com/OptimalScale/LMFlow) is a complete pipeline for fine-tuning large language models.
The following steps provide an example of how to fine-tune the `Hymba-1.5B-Base` model using LMFlow.
1. Pull and run the Docker image:

```
docker pull ghcr.io/tilmto/hymba:v1
docker run --gpus all -v /home/$USER:/home/$USER -it ghcr.io/tilmto/hymba:v1 bash
```

2. Install LMFlow:

```
git clone https://github.com/OptimalScale/LMFlow.git
cd LMFlow
conda create -n lmflow python=3.9 -y
conda activate lmflow
conda install mpi4py
pip install -e .
```

3. Fine-tune the model using the following command:

```
cd LMFlow
bash ./scripts/run_finetune_hymba.sh
```
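If `run_finetune_hymba.sh` follows the convention of LMFlow's generic `run_finetune.sh`, it should also accept command-line overrides. The flags, the dataset path, and the `nvidia/Hymba-1.5B-Base` model id below are assumptions for illustration; check the script itself before relying on them:

```
# Hypothetical invocation: these flags mirror LMFlow's generic
# run_finetune.sh and may not match run_finetune_hymba.sh exactly.
bash ./scripts/run_finetune_hymba.sh \
  --model_name_or_path nvidia/Hymba-1.5B-Base \
  --dataset_path data/alpaca/train_conversation \
  --output_model_path output_models/finetuned_hymba
```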
With LMFlow, you can also fine-tune the model on your own dataset; the only step needed is to convert it into the [LMFlow data format](https://optimalscale.github.io/LMFlow/examples/DATASETS.html).
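As a minimal sketch, a `text2text` dataset file could look like the following. The field names are taken from the linked data-format page, and `data/my_dataset` is a hypothetical path; verify both before training:

```
# Sketch of a text2text dataset file in the LMFlow format; double-check
# the schema against the data-format page linked above.
mkdir -p data/my_dataset
cat > data/my_dataset/train.json <<'EOF'
{
  "type": "text2text",
  "instances": [
    {
      "input": "What is Hymba?",
      "output": "Hymba is a family of small language models from NVIDIA."
    }
  ]
}
EOF
```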
In addition to full fine-tuning, you can also fine-tune Hymba efficiently with [DoRA](https://arxiv.org/html/2402.09353v4), [LoRA](https://github.com/OptimalScale/LMFlow?tab=readme-ov-file#lora), [LISA](https://github.com/OptimalScale/LMFlow?tab=readme-ov-file#lisa), [Flash Attention](https://github.com/OptimalScale/LMFlow/blob/main/readme/flash_attn2.md), and other acceleration techniques.
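For instance, LoRA fine-tuning in LMFlow normally goes through its LoRA script; the script name and flags below follow the generic LoRA example in the LMFlow README and are not Hymba-specific, so treat them as an assumption:

```
# Assumed from LMFlow's generic LoRA example (see the LoRA link above);
# adapt the script and flags for Hymba before use.
bash ./scripts/run_finetune_with_lora.sh \
  --model_name_or_path nvidia/Hymba-1.5B-Base \
  --dataset_path data/my_dataset \
  --output_lora_path output_models/finetuned_hymba_lora
```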
For more details, please refer to the [LMFlow for Hymba](https://github.com/OptimalScale/LMFlow/tree/main/experimental/Hymba) documentation.
## Limitations
The model was trained on data that contains toxic language, unsafe content, and societal biases originally crawled from the internet. Therefore, the model may amplify those biases and return toxic responses, especially when given toxic prompts. It may generate answers that are inaccurate, omit key information, or include irrelevant or redundant text, producing socially unacceptable or undesirable output even if the prompt itself contains nothing explicitly offensive.