totally-not-an-llm committed
Commit d2db490 · 1 parent: 825f454
Update README.md
README.md CHANGED
@@ -15,10 +15,10 @@ The model is completely uncensored.
 This model is an early test of the EverythingLM dataset and some new experimental principles, so don't consider it SOTA.
 
 ### Notable features:
-- Automatically triggered CoT reasoning
-- Verbose and detailed replies
-- Creative stories
-- Better prompt understanding
+- Automatically triggered CoT reasoning.
+- Verbose and detailed replies.
+- Creative stories.
+- Better prompt understanding.
 
 ### Prompt format:
 It is a modified Vicuna format, the same used in many of ehartford's models.
@@ -32,17 +32,17 @@ ASSISTANT:
 Training took about 1 hour using QLoRa on 1xA100, so this model can be recreated for about $3. QLoRa model can be found here: https://huggingface.co/totally-not-an-llm/EverythingLM-13b-peft.
 
 ### Model quirks:
-- Due to the nature of the dataset, it does better with more detail. I've found it gives much better stories when I provide more requirements
-- It really likes to use numbered lists. I don't necessarily have a problem with this, but it's something to note when training on the dataset
-- It likes to write fairy tales over anything else, which is strange. This can easily be fixed by prompting
-- Occasionally it will fall into repetition; this seems to be a common issue with llama-2 models
+- Due to the nature of the dataset, it does better with more detail. I've found it gives much better stories when I provide more requirements.
+- It really likes to use numbered lists. I don't necessarily have a problem with this, but it's something to note when training on the dataset.
+- It likes to write fairy tales over anything else, which is strange. This can easily be fixed by prompting.
+- Occasionally it will fall into repetition; this seems to be a common issue with llama-2 models.
 - Haven't tested pushing it all the way to 16k context.
 
 ### Future plans:
-- Native finetune
-- Other model sizes
+- Native finetune.
+- Other model sizes.
 - Improve dataset by:
-  - Regenerating using gpt-4
-  - A bit more data with more diversity
-  - Refactor dataset generation script
-- Test some model merges using this model
+  - Regenerating using gpt-4.
+  - A bit more data with more diversity.
+  - Refactor dataset generation script.
+- Test some model merges using this model.