Delta-Vector committed 9fd656b (parent: 855a719): Update README.md

README.md CHANGED
---
license: agpl-3.0
language:
- en
pipeline_tag: text-generation
base_model:
- nvidia/Mistral-NeMo-Minitron-8B-Base
tags:
- chat
datasets:
- anthracite-org/kalo-opus-instruct-22k-no-refusal
- Epiculous/SynthRP-Gens-v1.1-Filtered-n-Cleaned
- lodrick-the-lafted/kalo-opus-instruct-3k-filtered
- anthracite-org/nopm_claude_writing_fixed
- Epiculous/Synthstruct-Gens-v1.1-Filtered-n-Cleaned
- anthracite-org/kalo_opus_misc_240827
- anthracite-org/kalo_misc_part2
---

This is the fully cooked, 4-epoch version of [Tor-8B](). It is an experimental release: despite being trained for 4 epochs, the model feels fresh and is not overfit. It aims for generally good prose and writing without falling into Claude-isms, and it follows the *actions* "dialogue" format heavily.

# These are EXL2 quantizations of Darkens-8B. For the weights, go [here](https://huggingface.co/Delta-Vector/Darkens-8B). Check the revisions for quants; the main branch contains the measurement file.

# Quants

GGUF: https://huggingface.co/Delta-Vector/Darkens-8B-GGUF

EXL2: https://huggingface.co/Delta-Vector/Darkens-8B-EXL2
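
Since the quants live on separate revisions of the EXL2 repo, you can fetch a single one with `huggingface_hub`. A minimal sketch; `8bpw` is a hypothetical revision name, so list the repo's branches on the Hub to see what is actually published:

```py
from huggingface_hub import snapshot_download

# "8bpw" is a hypothetical revision name used for illustration;
# check the EXL2 repo's branches for the real quant names.
local_dir = snapshot_download(
    repo_id="Delta-Vector/Darkens-8B-EXL2",
    revision="8bpw",
)
print(local_dir)
```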

## Prompting

The model has been instruct-tuned with ChatML formatting. A typical input looks like this:

```py
"""<|im_start|>system
system prompt<|im_end|>
<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
"""
```
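
If you build prompts programmatically, something like the following should reproduce that format. This is a minimal sketch, assuming the tokenizer in the weights repo ships a ChatML chat template:

```py
from transformers import AutoTokenizer

# Assumption: the tokenizer for the full-weights repo carries a ChatML chat template.
tokenizer = AutoTokenizer.from_pretrained("Delta-Vector/Darkens-8B")

messages = [
    {"role": "system", "content": "system prompt"},
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
]

# add_generation_prompt=True appends the trailing "<|im_start|>assistant" turn
# so the model continues as the assistant, matching the example above.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```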

## System Prompting

I would highly recommend using Sao10k's Euryale system prompt, but the "Roleplay Simple" system prompt provided within SillyTavern will work as well.

```
Currently, your role is {{char}}, described in detail below. As {{char}}, continue the narrative exchange with {{user}}.

<Guidelines>
• Maintain the character persona but allow it to evolve with the story.
• Be creative and proactive. Drive the story forward, introducing plotlines and events when relevant.
• All types of outputs are encouraged; respond accordingly to the narrative.
• Include dialogues, actions, and thoughts in each response.
• Utilize all five senses to describe scenarios within {{char}}'s dialogue.
• Use emotional symbols such as "!" and "~" in appropriate contexts.
• Incorporate onomatopoeia when suitable.
• Allow time for {{user}} to respond with their own input, respecting their agency.
• Act as secondary characters and NPCs as needed, and remove them when appropriate.
• When prompted for an Out of Character [OOC:] reply, answer neutrally and in plaintext, not as {{char}}.
</Guidelines>

<Forbidden>
• Using excessive literary embellishments and purple prose unless dictated by {{char}}'s persona.
• Writing for, speaking, thinking, acting, or replying as {{user}} in your response.
• Repetitive and monotonous outputs.
• Positivity bias in your replies.
• Being overly extreme or NSFW when the narrative context is inappropriate.
</Forbidden>

Follow the instructions in <Guidelines></Guidelines>, avoiding the items listed in <Forbidden></Forbidden>.
```
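
Outside SillyTavern, you would fill the `{{char}}` and `{{user}}` macros yourself before passing the text as the ChatML system turn. A trivial sketch with hypothetical names:

```py
# The full Euryale prompt from above goes in this string; shortened here.
EURYALE_PROMPT = (
    "Currently, your role is {{char}}, described in detail below. "
    "As {{char}}, continue the narrative exchange with {{user}}."
)

# "Seraphina" and "Traveler" are hypothetical placeholder names.
system_prompt = (
    EURYALE_PROMPT
    .replace("{{char}}", "Seraphina")
    .replace("{{user}}", "Traveler")
)
print(system_prompt)
```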

## Axolotl config

<details><summary>See axolotl config</summary>

Axolotl version: `0.4.1`
```yaml
base_model: Dans-DiscountModels/Mistral-NeMo-Minitron-8B-Base-ChatML
model_type: AutoModelForCausalLM
# ... (lines elided in the diff view)
strict: false

datasets:
  - path: PRIVATE CLAUDE LOG FILTER
    type: sharegpt
    conversation: chatml
  - path: anthracite-org/kalo-opus-instruct-22k-no-refusal
# ... (lines elided in the diff view)
fsdp_config:
special_tokens:
  pad_token: <pad>
```

</details><br>

## Credits

Thank you to [Lucy Knada](https://huggingface.co/lucyknada), [Kalomaze](https://huggingface.co/kalomaze), [Kubernetes Bad](https://huggingface.co/kubernetes-bad), and the rest of [Anthracite](https://huggingface.co/anthracite-org) (but not Alpin).

## Training

The training was done for 4 epochs. I used 10 x [A40](https://www.nvidia.com/en-us/data-center/a40/) GPUs graciously provided by [Kalomaze](https://huggingface.co/kalomaze) for a full-parameter fine-tune of the model.

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)