ajibawa-2023 committed
Commit b6a4440
1 Parent(s): ad623db

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -12,12 +12,14 @@ Large Language Models (LLMs) are good at code generation.
  This is what I have tried here. The base Llama-2 model was used for training. It is trained on around 74,000 sets of code, with each set containing 2 conversations.
  Along with Python, code in Java, JavaScript, Go, C++, Rust, etc., together with detailed explanations, was used for training. It builds upon my existing dataset [Python-Code-23k-ShareGPT](https://huggingface.co/datasets/ajibawa-2023/Python-Code-23k-ShareGPT).
  The conversations are in Vicuna/ShareGPT format, and each set pairs code with a detailed explanation (a sketch of the format follows the diff).
- I have released the new [data](https://huggingface.co/datasets/ajibawa-2023/Python-Code-23k-ShareGPT).
+
+ I have released the new data [Code-74k-ShareGPT](https://huggingface.co/datasets/ajibawa-2023/Code-74k-ShareGPT), on which this model is trained.
 
  **Training:**
 
  The entire dataset was trained on Azure 4 x A100 80GB GPUs. Training for 3 epochs took 42 hours, using the DeepSpeed codebase (an illustrative configuration follows the diff). The base model is Meta's Llama-2, as noted above.
 
+
  This is a fully fine-tuned model. Links to quantized models will be released soon.
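For context on the Vicuna/ShareGPT format mentioned in the diff, below is a minimal sketch of what one training record typically looks like. The field names ("conversations", "from", "value") follow the common ShareGPT convention, and the example content is invented; the exact schema of Code-74k-ShareGPT may differ.

```python
# A minimal sketch of one ShareGPT-style record, as commonly used for
# Vicuna-format training data. Field names and content are assumptions
# based on the usual ShareGPT convention, not taken from the
# Code-74k-ShareGPT dataset itself.
record = {
    "id": "example-0",
    "conversations": [
        {"from": "human",
         "value": "Write a Python function that reverses a string."},
        {"from": "gpt",
         "value": ("def reverse_string(s):\n"
                   "    return s[::-1]\n\n"
                   "Explanation: slicing with a step of -1 walks the "
                   "string backwards, returning a reversed copy.")},
    ],
}
```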
 
 
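The training notes name DeepSpeed and 4 x A100 80GB but publish no configuration, so the sketch below is only illustrative: a minimal ZeRO stage 2 style config of the kind commonly used for full fine-tunes at this scale. Every value, and the train.py script in the launch comment, is an assumption.

```python
import json

# Illustrative DeepSpeed config for a full fine-tune on 4 x A100 80GB.
# All values are assumptions; the actual hyperparameters used for this
# model were not published.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,   # 4 GPUs * 4 * 8 = 128 effective batch
    "bf16": {"enabled": True},          # A100s support bfloat16
    "zero_optimization": {"stage": 2},  # shard optimizer state and gradients
    "gradient_clipping": 1.0,
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)

# Hypothetical launch, where train.py is a placeholder training script:
#   deepspeed --num_gpus=4 train.py --deepspeed ds_config.json
```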
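Since this is a full (unquantized) fine-tune trained on Vicuna/ShareGPT-style conversations, it should load through the standard transformers API. The sketch below is hypothetical: the repo id is a placeholder (this section never names the model), and the Vicuna-style prompt template is assumed from the training format.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: this section does not name the model's repo id.
MODEL_ID = "ajibawa-2023/<this-model>"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# Vicuna-style prompt, assumed from the ShareGPT training format.
prompt = ("A chat between a curious user and an artificial intelligence "
          "assistant.\n"
          "USER: Write a Python function that checks whether a number is prime.\n"
          "ASSISTANT:")

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```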