Commit d2db490 by totally-not-an-llm
Parent: 825f454

Update README.md

Files changed (1):
  1. README.md +14 -14
README.md CHANGED
@@ -15,10 +15,10 @@ The model is completely uncensored.
  This model is an early test of the EverythingLM dataset and some new experimental principles, so don't consider it SOTA.

  ### Notable features:
- - Automatically triggered CoT reasoning
- - Verbose and detailed replies
- - Creative stories
- - Better prompt understanding
+ - Automatically triggered CoT reasoning.
+ - Verbose and detailed replies.
+ - Creative stories.
+ - Better prompt understanding.

  ### Prompt format:
  It is a modified Vicuna format, the same used in many of ehartford's models.
@@ -32,17 +32,17 @@ ASSISTANT:
  Training took about 1 hour using QLoRa on 1xA100, so this model can be recreated for about $3. QLoRa model can be found here: https://huggingface.co/totally-not-an-llm/EverythingLM-13b-peft.

  ### Model quirks:
- - Due to the nature of the dataset, it does better with more detail. I've found it gives much better stories when I provide more requirements
- - It really likes to use numbered lists. I don't necessarilly have a problem with this but it's something to note when training on the dataset
- - It likes to write fairy tales over anything else, which is strange. This can easily be fixed by prompting
- - Occasionally it will fall into repetition, this seems to be a commmon issue with llama-2 models
+ - Due to the nature of the dataset, it does better with more detail. I've found it gives much better stories when I provide more requirements.
+ - It really likes to use numbered lists. I don't necessarily have a problem with this, but it's something to note when training on the dataset.
+ - It likes to write fairy tales over anything else, which is strange. This can easily be fixed by prompting.
+ - Occasionally it will fall into repetition; this seems to be a common issue with llama-2 models.
  - Haven't tested pushing it all the way to 16k context.

  ### Future plans:
- - Native finetune
- - Other model sizes
+ - Native finetune.
+ - Other model sizes.
  - Improve dataset by:
-   - Regenerating using gpt-4
-   - A bit more data with more diversity
-   - Refactor dataset generation script
- - Test some model merges using this model
+   - Regenerating using gpt-4.
+   - A bit more data with more diversity.
+   - Refactor dataset generation script.
+ - Test some model merges using this model.
 
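A note on the "modified Vicuna format" mentioned in the diff: the template block itself sits between the two hunks, so this page only shows its tail (the `ASSISTANT:` context in the second hunk header). A typical modified-Vicuna template looks like the sketch below; the exact system preamble is an assumption here, not something this diff confirms.

```
You are a helpful AI assistant.

USER: <prompt>
ASSISTANT:
```

The model's reply is generated after `ASSISTANT:`, and multi-turn chats repeat the `USER:`/`ASSISTANT:` pair.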
 
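The "~1 hour on 1xA100 for about $3" figure in the diff is what QLoRA makes possible: the 13B base model is frozen and loaded in 4-bit, and only small low-rank adapter matrices are trained. Below is a minimal sketch using `transformers` and `peft`, assuming `meta-llama/Llama-2-13b-hf` as the base; the rank, alpha, and target modules are illustrative guesses, not the author's actual config.

```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load the frozen base model in 4-bit (the "Q" in QLoRA).
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-13b-hf",  # assumed base model
    quantization_config=bnb,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach low-rank adapters; only these weights receive gradients,
# which is why the run fits in about an hour on a single A100.
lora = LoraConfig(
    r=16,                                 # assumed rank
    lora_alpha=32,                        # assumed scaling
    target_modules=["q_proj", "v_proj"],  # assumed projection set
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of 13B

# From here, a standard transformers Trainer run over the EverythingLM
# dataset yields an adapter like the linked EverythingLM-13b-peft repo.
```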
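On the untested 16k context noted in the quirks list: Llama-2's native window is 4,096 tokens, so reaching 16k implies RoPE position scaling with roughly a 4x factor. A hypothetical loading sketch that pairs the linked PEFT adapter with an assumed base model and linear scaling (neither detail is confirmed by this diff):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Stretch RoPE positions 4x: 4,096 native tokens -> 16,384. Linear
# scaling is an assumption; the model may use a different scheme.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-13b-hf",  # assumed base model
    rope_scaling={"type": "linear", "factor": 4.0},
    torch_dtype=torch.float16,
    device_map="auto",
)

# Apply the adapter the diff links to.
model = PeftModel.from_pretrained(
    base, "totally-not-an-llm/EverythingLM-13b-peft"
)
```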