totally-not-an-llm committed
Commit f770975 · 1 Parent(s): 630953b

Update README.md

Files changed (1)
  1. README.md +4 -0
README.md CHANGED
@@ -29,10 +29,14 @@ ASSISTANT:
 
  Training took about 1 hour using QLoRa on 1xA100, so this model can be recreated for about $3. QLoRa model can be found here: https://huggingface.co/totally-not-an-llm/EverythingLM-13b-peft.
 
+ This is an early test, so here are some things to note on the model:
+
  ### Model quirks:
  - Due to the nature of the dataset, it does better with more detail. I've found it gives much better stories when I provide more requirements.
  - It really likes to use numbered lists. I don't necessarily have a problem with this, but it's something to note when training on the dataset.
  - I've had trouble with ggml k-quants.
+ - It likes to write fairy tales over anything else, which is strange. This can easily be fixed by prompting.
+ - Occasionally it will fall into repetition; this seems to be a common issue with llama-2 models.
 
  ### Future plans:
  - Native finetune
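
For context on the adapter linked in the diff above, here is a minimal sketch (not part of the commit) of how the QLoRa adapter from `totally-not-an-llm/EverythingLM-13b-peft` might be loaded on top of a Llama-2-13b base with `transformers` and `peft`. The base-model repo id and the USER/ASSISTANT prompt format are assumptions inferred from the README, not confirmed by the diff.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-13b-hf"  # assumed base model, not stated in the diff
adapter_id = "totally-not-an-llm/EverythingLM-13b-peft"  # adapter repo linked in the README

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
# Attach the LoRA adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base, adapter_id)

# Assumed USER/ASSISTANT prompt format, based on the "ASSISTANT:" context line in the hunk header.
prompt = "USER: Write a short story about a robot learning to paint.\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```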