pszemraj committed on
Commit
12e7206
1 Parent(s): 6a271a6

Update README.md

Files changed (1)
  1. README.md +13 -4
README.md CHANGED
@@ -9,18 +9,27 @@ inference: False
  license: apache-2.0
  ---
 
- # ethzanalytics/gpt-j-8bit-daily_dialogues_1E
+ # ethzanalytics/gpt-j-8bit-KILT_WoW_10k_steps
+
+
+ <a href="https://colab.research.google.com/gist/pszemraj/e49c60aafe04acc52fcfdd1baefe12e4/-ai-msgbot-gpt-j-6b-8bit-with-hub.ipynb">
+ <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
+ </a>
+
+ This is a version of `hivemind/gpt-j-6B-8bit` fine-tuned on the [Wizard of Wikipedia](https://arxiv.org/abs/1811.01241) dataset for 10k steps (_just under an epoch_) on an A100. it can be used as a chatbot. It is designed to be used with [ai-msgbot](https://github.com/pszemraj/ai-msgbot) to take advantage of the prompt engineering.
 
- This is a version of `hivemind/gpt-j-6B-8bit` fine-tuned on the Wizard of Wikipedia dataset for 10k steps on an A100. it can be used as a chatbot.
 
  _NOTE: this needs to be loaded via the special patching technique outlined in the hivemind model card (as with all 8bit models)_
 
+ ## Training
 
- TODO: rest of README
+ For details, please see [this wandb report](https://wandb.ai/pszemraj/conversational-6B-train-vanilla/reports/Training-6B-GPT-J-8bit-for-Dialogue--VmlldzoyNTg3MzE0) for both the daily-dialogues version and the WoW version.
 
 
  ---
 
 
- [original demo link](https://colab.research.google.com/gist/pszemraj/76c0a80c9eacfb2c31e21c4cceb344a0/ai-msgbot-gpt-j-6b-8bit-chatbot-demo.ipynb)
+ TODO: rest of README
+
 
+ ---