totally-not-an-llm committed
Commit f770975 · 1 Parent(s): 630953b

Update README.md

Files changed (1)
  1. README.md +4 -0
README.md CHANGED
@@ -29,10 +29,14 @@ ASSISTANT:
 
  Training took about 1 hour using QLoRa on 1xA100, so this model can be recreated for about $3. QLoRa model can be found here: https://huggingface.co/totally-not-an-llm/EverythingLM-13b-peft.
 
+ This is an early test, so here are some things to note on the model:
+
  ### Model quirks:
  - Due to the nature of the dataset, it does better with more detail. I've found it gives much better stories when I provide more requirements.
  - It really likes to use numbered lists. I don't necessarily have a problem with this, but it's something to note when training on the dataset.
  - I've had trouble with ggml k-quants.
+ - It likes to write fairy tales over anything else, which is strange. This can easily be fixed by prompting.
+ - Occasionally it will fall into repetition; this seems to be a common issue with llama-2 models.
 
  ### Future plans:
  - Native finetune
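
For context on the adapter linked in the diff above, here is a minimal sketch (not part of the commit) of how the QLoRa adapter from `totally-not-an-llm/EverythingLM-13b-peft` might be loaded on top of a Llama-2-13b base with `transformers` and `peft`. The base-model repo id and the USER/ASSISTANT prompt format are assumptions inferred from the README, not confirmed by the diff.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-13b-hf"  # assumed base model, not stated in the diff
adapter_id = "totally-not-an-llm/EverythingLM-13b-peft"  # adapter repo linked in the README

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
# Attach the LoRA adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base, adapter_id)

# Assumed USER/ASSISTANT prompt format, based on the "ASSISTANT:" context line in the hunk header.
prompt = "USER: Write a short story about a robot learning to paint.\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```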