  # Deployed Model
  AjayMukundS/Llama-2-7b-chat-finetune
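
As a quick usage sketch (illustrative only: the example prompt and generation settings are assumptions, and the prompt format follows the chat template described below):

```python
# Minimal, illustrative way to query the deployed checkpoint.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="AjayMukundS/Llama-2-7b-chat-finetune",
    device_map="auto",  # use a GPU if one is available
)

# The prompt follows the Llama 2 chat template described below.
prompt = "<s>[INST] What is a large language model? [/INST]"
result = generator(prompt, max_new_tokens=128)
print(result[0]["generated_text"])
```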

# Model Description
This is a Llama 2 model with 7 billion parameters, fine-tuned on the dataset from **mlabonne/guanaco-llama2**. The training data consists of chats between a human and an assistant, in which the human poses queries and the assistant responds to them in a suitable fashion.
  In the case of Llama 2, the following Chat Template is used for the chat models:
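
```
<s>[INST] <<SYS>>
System prompt
<</SYS>>

User prompt [/INST] Model answer </s>
```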
System Prompt (optional) --> to guide the model

User prompt (required) --> to give the instruction / User Query

Model Answer (required)

# Training Data
The Instruction Dataset is reformatted to follow the above Llama 2 template.

**Original Dataset** --> https://huggingface.co/datasets/timdettmers/openassistant-guanaco\
**Reformatted Dataset** --> https://huggingface.co/datasets/mlabonne/guanaco-llama2
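
For illustration, a minimal sketch of this reformatting step, assuming each original row stores the whole conversation in a single `text` field with `### Human:` / `### Assistant:` markers (the helper below is hypothetical, not the exact script used):

```python
# Hypothetical reformatting: rewrite openassistant-guanaco rows into the
# Llama 2 chat template shown above.
import re

from datasets import load_dataset

def to_llama2_format(example):
    # Split the row on the dataset's turn markers (first exchange only).
    match = re.search(r"### Human:(.*?)### Assistant:(.*)", example["text"], re.DOTALL)
    if match is None:
        return example  # leave rows that do not match untouched
    user = match.group(1).strip()
    answer = match.group(2).strip()
    return {"text": f"<s>[INST] {user} [/INST] {answer} </s>"}

dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")
dataset = dataset.map(to_llama2_format)
```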
To know how this dataset was created, you can check this notebook --> https://co…

To drastically reduce VRAM usage, we fine-tune the model in 4-bit precision, which is why QLoRA is used here. The GPU on which the model was fine-tuned was an **L4 (Google Colab Pro)**.
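
For reference, a typical QLoRA-style 4-bit setup with bitsandbytes looks like the sketch below; the exact values used for this model are not stated here, so treat them as assumptions:

```python
# Illustrative 4-bit quantization config (common QLoRA defaults, assumed).
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # load weights in 4-bit precision
    bnb_4bit_quant_type="nf4",             # NormalFloat4, as introduced by QLoRA
    bnb_4bit_compute_dtype=torch.float16,  # dtype used for computation
    bnb_4bit_use_double_quant=False,       # no nested quantization
)
```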
# Process
1) Load the dataset as defined.
2) Configure bitsandbytes for 4-bit quantization (the configuration sketch above shows a typical setup).
3) Load the Llama 2 model in 4-bit precision on a GPU (L4 - Google Colab Pro) with the corresponding tokenizer, as sketched below.
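
A minimal sketch of steps 2 and 3, reusing `bnb_config` from the sketch above; the base checkpoint name is an assumption, since it is not stated in this excerpt:

```python
# Hypothetical loading step (base checkpoint name is an assumption).
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "NousResearch/Llama-2-7b-chat-hf"  # assumed base weights

# Load the model in 4-bit precision and place layers on the available GPU.
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,  # 4-bit config sketched earlier
    device_map="auto",
)

# Matching tokenizer; Llama 2 ships without a pad token, so reuse EOS.
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"  # right padding avoids fp16 issues in training
```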