mkthoma committed on
Commit
5f4d77b
·
1 Parent(s): 475af39

readme update

  1. README.md +82 -0
README.md CHANGED
@@ -11,3 +11,85 @@ license: mit
11
  ---
12
 
13
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
14
+
15
+
16
+ # Finetuning of Open Source LLM Models
17
+ This repo contains the code for finetuning open source LLMs, specifically the recently released [Microsoft Phi-2](https://huggingface.co/microsoft/phi-2) model, which has about 2.7 billion parameters.
18
+
19
+ Phi-2 is a Transformer with 2.7 billion parameters. It was trained using the same data sources as [Phi-1.5](https://huggingface.co/microsoft/phi-1_5), augmented with a new data source that consists of various NLP synthetic texts and filtered websites (for safety and educational value). When assessed against benchmarks testing common sense, language understanding, and logical reasoning, Phi-2 showed nearly state-of-the-art performance among models with fewer than 13 billion parameters.
20
+
21
+ The model hasn't been fine-tuned through reinforcement learning from human feedback. The intention behind crafting this open-source model is to provide the research community with a non-restricted small model to explore vital safety challenges, such as reducing toxicity, understanding societal biases, enhancing controllability, and more.
22
+
23
+ The finetuning code is taken from [this repo](https://github.com/mshumer/gpt-llm-trainer/blob/main/README.md) with a few minor changes for Phi-2. The dataset used for finetuning is the [OpenAssistant Conversations Dataset](https://huggingface.co/datasets/OpenAssistant/oasst1?row=0) (OASST1).
24
+
25
+ The GitHub code can be found [here](https://github.com/mkthoma/llm_finetuning).
26
+
27
+ ## Preparing finetuning data
28
+ If you take a look at the dataset used for finetuning, you can see that each initial prompt and the conversation that follows it (alternating prompts and responses) are stored in a tree-like structure.
29
+
30
+ ```
+ {
+     "message_tree_id": "14fbb664-a620-45ce-bee4-7c519b16a793",
+     "tree_state": "ready_for_export",
+     "prompt": {
+         "message_id": "14fbb664-a620-45ce-bee4-7c519b16a793",
+         "text": "Why can't we divide by 0? (..)",
+         "role": "prompter",
+         "lang": "en",
+         "replies": [
+             {
+                 "message_id": "894d30b6-56b4-4605-a504-89dd15d4d1c8",
+                 "text": "The reason we cannot divide by zero is because (..)",
+                 "role": "assistant",
+                 "lang": "en",
+                 "replies": [
+                     // ...
+                 ]
+             },
+             {
+                 "message_id": "84d0913b-0fd9-4508-8ef5-205626a7039d",
+                 "text": "The reason that the result of a division by zero is (..)",
+                 "role": "assistant",
+                 "lang": "en",
+                 "replies": [
+                     {
+                         "message_id": "3352725e-f424-4e3b-a627-b6db831bdbaa",
+                         "text": "Math is confusing. Like those weird Irrational (..)",
+                         "role": "prompter",
+                         "lang": "en",
+                         "replies": [
+                             {
+                                 "message_id": "f46207ca-3149-46e9-a466-9163d4ce499c",
+                                 "text": "Irrational numbers are simply numbers (..)",
+                                 "role": "assistant",
+                                 "lang": "en",
+                                 "replies": []
+                             },
+                             // ...
+                         ]
+                     }
+                 ]
+             }
+         ]
+     }
+ }
+ ```
77
+
78
+ To parse this tree into a flat dataset of prompt and response pairs, I used separate notebooks for the [train](https://github.com/mkthoma/llm_finetuning/blob/main/Data%20Prep/OpenAssistant%20Finetuning%20Training%20Data%20Prep.ipynb) and [validation](https://github.com/mkthoma/llm_finetuning/blob/main/Data%20Prep/OpenAssistant%20Finetuning%20Validation%20Data%20Prep.ipynb) splits. The parsed dataset is then used for finetuning the model.
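
As a rough illustration of that parsing step, the sketch below walks a message tree and collects every prompter message together with each of its direct assistant replies. It is not the code from the notebooks: the function name `iter_pairs` and the tiny example tree are hypothetical, and the notebooks may select or filter pairs differently.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_pairs(node: Dict[str, Any]) -> Iterator[Tuple[str, str]]:
    """Yield (prompt, response) for each prompter node and each direct assistant reply.

    Hypothetical helper: assumes every node has "text", "role", and "replies" keys,
    as in the exported tree format shown above.
    """
    for reply in node.get("replies", []):
        if node["role"] == "prompter" and reply["role"] == "assistant":
            yield node["text"], reply["text"]
        # Recurse so that follow-up turns deeper in the tree are also visited.
        yield from iter_pairs(reply)


# A shortened, made-up tree with the same shape as the example above.
tree = {
    "prompt": {
        "text": "Why can't we divide by 0?",
        "role": "prompter",
        "replies": [
            {"text": "Answer A", "role": "assistant", "replies": []},
            {
                "text": "Answer B",
                "role": "assistant",
                "replies": [
                    {
                        "text": "Math is confusing.",
                        "role": "prompter",
                        "replies": [
                            {"text": "Answer C", "role": "assistant", "replies": []}
                        ],
                    }
                ],
            },
        ],
    }
}

pairs = list(iter_pairs(tree["prompt"]))
for prompt, response in pairs:
    print(repr(prompt), "->", repr(response))
```

Each branch of the tree contributes its own pair, so a prompt with several assistant replies appears several times in the flattened output.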
79
+
80
+ ## Finetuning the model
81
+ The training parameters are defined in the hyperparameters section of the notebook. The dataset is read, each example is wrapped with the system prompt and the user instruction, and the resulting examples are passed to the trainer to finetune the model.
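
The prompt-wrapping step could look roughly like the sketch below. The template, the `SYSTEM_PROMPT` text, and the `build_training_text` helper are all assumptions made for illustration; the notebook may use a different system prompt and a different chat format.

```python
# Hypothetical system prompt; the one used in the notebook may differ.
SYSTEM_PROMPT = "You are a helpful assistant."


def build_training_text(prompt: str, response: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Combine system prompt, user prompt, and response into one training string.

    The "System/User/Assistant" layout is an illustrative template,
    not necessarily the format used for the finetuning run.
    """
    return f"System: {system_prompt}\nUser: {prompt}\nAssistant: {response}"


# Made-up rows standing in for the parsed prompt/response dataset.
rows = [
    {"prompt": "Why can't we divide by 0?", "response": "Because it is undefined."},
]
texts = [build_training_text(row["prompt"], row["response"]) for row in rows]
print(texts[0])
```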
82
+
83
+ The finetuned weights are then saved, merged with the Phi-2 base model, and the merged model is saved to Google Drive. After that, the merged model can be loaded to run inference for a given prompt and maximum token length.
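
A minimal sketch of the merge step, assuming the finetuning produced a LoRA-style adapter saved with the `peft` library (the text above does not say this explicitly). The function name `merge_and_save` and both directory arguments are hypothetical, and the heavy imports are placed inside the function so that defining it does not require a GPU or the libraries to be loaded.

```python
def merge_and_save(adapter_dir: str, output_dir: str, base_model_id: str = "microsoft/phi-2") -> None:
    """Merge a saved adapter into the base model and write the merged weights to disk.

    Assumes the adapter in `adapter_dir` was saved with peft; both directory
    arguments are placeholders for wherever the notebook stores its outputs.
    """
    # Imported lazily because these packages are large and may need a GPU runtime.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base model in half precision to keep memory use manageable.
    base = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype=torch.float16)
    # Attach the finetuned adapter, then fold its weights into the base model.
    model = PeftModel.from_pretrained(base, adapter_dir)
    merged = model.merge_and_unload()
    # Save the merged model and the tokenizer side by side for later inference.
    merged.save_pretrained(output_dir)
    AutoTokenizer.from_pretrained(base_model_id).save_pretrained(output_dir)
```

On Colab, `output_dir` would typically point at a folder inside the mounted Google Drive so the merged model survives the end of the session.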
84
+
85
+ I used an A100 Colab instance for 1 epoch of finetuning, which took about 4 hours to finish.
86
+
87
+ ## Training Logs
88
+
89
+ ![image](https://github.com/mkthoma/llm_finetuning/assets/135134412/d6cca038-62b6-412f-b42c-a118a0489184)
90
+
91
+ ## Sample Inferences
92
+
93
+ ![image](https://github.com/mkthoma/llm_finetuning/assets/135134412/64b0babf-aaf8-4999-a966-6ac02d8bb339)
94
+
95
+ ![image](https://github.com/mkthoma/llm_finetuning/assets/135134412/ede8846b-256d-410f-9274-5146729dd147)