totally-not-an-llm committed
Commit: 44748e3
1 Parent(s): f770975
Update README.md
README.md
CHANGED
@@ -32,11 +32,11 @@ Training took about 1 hour using QLoRa on 1xA100, so this model can be recreated
 This is an early test, so here are some things to note on the model:
 
 ### Model quirks:
-- Due to the nature of the dataset, it does better with more detail. I've found it gives much better stories when I provide more requirements
-- It really likes to use numbered lists. I don't necessarily have a problem with this, but it's something to note when training on the dataset
--
--
--
+- Due to the nature of the dataset, it does better with more detail. I've found it gives much better stories when I provide more requirements
+- It really likes to use numbered lists. I don't necessarily have a problem with this, but it's something to note when training on the dataset
+- It likes to write fairy tales over anything else, which is strange. This can easily be fixed by prompting
+- Occasionally it will fall into repetition; this seems to be a common issue with llama-2 models
+- Haven't tested pushing it all the way to 16k context.
 
 ### Future plans:
 - Native finetune
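
For anyone trying out the notes added in this commit, here is a minimal usage sketch with the Hugging Face transformers library. The repo id is a placeholder (this page does not name the model), and the detailed prompt plus mild repetition_penalty are just one way to work with two of the quirks listed above (more requirements give better stories; llama-2 finetunes can fall into repetition), not settings recommended by the author.

```python
# Hypothetical usage sketch: the model repo id below is a placeholder, not taken
# from this page. It illustrates giving the model a detailed, requirement-heavy
# prompt and applying a mild repetition penalty during generation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/your-storywriting-finetune"  # placeholder, replace with the actual repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# A prompt with explicit requirements tends to give better stories than a
# one-line request, per the quirks list above.
prompt = (
    "Write a 500-word science-fiction story set on a generation ship. "
    "Requirements: first-person narrator, a malfunctioning AI, and an ending "
    "that avoids fairy-tale tropes."
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=800,
    do_sample=True,
    temperature=0.8,
    repetition_penalty=1.15,  # mild penalty to counter the repetition quirk
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```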