TheBloke committed on
Commit 1d95f45
Parent: b1db90f

Upload README.md

Files changed (1): README.md (+24 −6)
README.md CHANGED
@@ -206,12 +206,12 @@ Windows Command Line users: You can set the environment variable by running `set
 Make sure you are using `llama.cpp` from commit [d0cee0d](https://github.com/ggerganov/llama.cpp/commit/d0cee0d36d5be95a0d9088b674dbb27354107221) or later.
 
 ```shell
-./main -ngl 35 -m chronomaid-storytelling-13b.Q4_K_M.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{prompt}\n\n### Response:"
 ```
 
 Change `-ngl 35` to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration.
 
-Change `-c 4096` to the desired sequence length. For extended sequence models - e.g. 8K, 16K, 32K - the necessary RoPE scaling parameters are read from the GGUF file and set by llama.cpp automatically. Note that longer sequence lengths require much more resources, so you may need to reduce this value.
 
 If you want to have a chat-style conversation, replace the `-p <PROMPT>` argument with `-i -ins`.
 
 
@@ -260,7 +260,7 @@ from llama_cpp import Llama
 # Set gpu_layers to the number of layers to offload to GPU. Set to 0 if no GPU acceleration is available on your system.
 llm = Llama(
   model_path="./chronomaid-storytelling-13b.Q4_K_M.gguf",  # Download the model file first
-  n_ctx=4096,  # The max sequence length to use - note that longer sequence lengths require much more resources
   n_threads=8,  # The number of CPU threads to use, tailor to your system and the resulting performance
   n_gpu_layers=35  # The number of layers to offload to GPU, if you have GPU acceleration available
 )
@@ -321,7 +321,7 @@ Donaters will get priority support on any and all AI/LLM/model questions and req
 
 **Special thanks to**: Aemon Algiz.
 
- **Patreon special mentions**: Brandon Frisco, LangChain4j, Spiking Neurons AB, transmissions 11, Joseph William Delisle, Nitin Borwankar, Willem Michiel, Michael Dempsey, vamX, Jeffrey Morgan, zynix, jjj, Omer Bin Jawed, Sean Connelly, jinyuan sun, Jeromy Smith, Shadi, Pawan Osman, Chadd, Elijah Stavena, Illia Dulskyi, Sebastain Graf, Stephen Murray, terasurfer, Edmond Seymore, Celu Ramasamy, Mandus, Alex, biorpg, Ajan Kanaga, Clay Pascal, Raven Klaugh, 阿明, K, ya boyyy, usrbinkat, Alicia Loh, John Villwock, ReadyPlayerEmma, Chris Smitley, Cap'n Zoog, fincy, GodLy, S_X, sidney chen, Cory Kujawski, OG, Mano Prime, AzureBlack, Pieter, Kalila, Spencer Kim, Tom X Nguyen, Stanislav Ovsiannikov, Michael Levine, Andrey, Trailburnt, Vadim, Enrico Ros, Talal Aujan, Brandon Phillips, Jack West, Eugene Pentland, Michael Davis, Will Dee, webtim, Jonathan Leane, Alps Aficionado, Rooh Singh, Tiffany J. Kim, theTransient, Luke @flexchar, Elle, Caitlyn Gatomon, Ari Malik, subjectnull, Johann-Peter Hartmann, Trenton Dambrowitz, Imad Khwaja, Asp the Wyvern, Emad Mostaque, Rainer Wilmers, Alexandros Triantafyllidis, Nicholas, Pedro Madruga, SuperWojo, Harry Royden McLaughlin, James Bentley, Olakabola, David Ziegler, Ai Maven, Jeff Scroggin, Nikolai Manek, Deo Leter, Matthew Berman, Fen Risland, Ken Nordquist, Manuel Alberto Morcote, Luke Pendergrass, TL, Fred von Graf, Randy H, Dan Guido, NimbleBox.ai, Vitor Caleffi, Gabriel Tamborski, knownsqashed, Lone Striker, Erik Bjäreholt, John Detwiler, Leonard Tan, Iucharbius
 
 
  Thank you to all my generous patrons and donaters!
@@ -333,13 +333,18 @@ And thank you again to a16z for their generous grant.
 <!-- original-model-card start -->
 # Original model card: Carsten Kragelund's Chronomaid Storytelling 13B
 
 # Chronomaid-Storytelling-13b
 
 Merge including [Noromaid-13b-v0.1.1](https://huggingface.co/NeverSleep/Noromaid-13b-v0.1.1), and [Chronos-13b-v2](https://huggingface.co/elinas/chronos-13b-v2) with the [Storytelling-v1-Lora](https://huggingface.co/Undi95/Storytelling-v1-13B-lora) applied afterwards
 
 ## Prompt Format
 
-Tested with Alpaca; the Noromaid presets will probably also work
 ```
 Below is an instruction that describes a task. Write a response that appropriately completes the request.
 
@@ -349,6 +354,19 @@ Below is an instruction that describes a task. Write a response that appropriate
 ### Response:
 ```
 
-In-depth model card coming later...
 
 <!-- original-model-card end -->
 
 Make sure you are using `llama.cpp` from commit [d0cee0d](https://github.com/ggerganov/llama.cpp/commit/d0cee0d36d5be95a0d9088b674dbb27354107221) or later.
 
 ```shell
+./main -ngl 35 -m chronomaid-storytelling-13b.Q4_K_M.gguf --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{prompt}\n\n### Response:"
 ```
 
 Change `-ngl 35` to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration.
 
+Change `-c 2048` to the desired sequence length. For extended sequence models - e.g. 8K, 16K, 32K - the necessary RoPE scaling parameters are read from the GGUF file and set by llama.cpp automatically. Note that longer sequence lengths require much more resources, so you may need to reduce this value.
 
 If you want to have a chat-style conversation, replace the `-p <PROMPT>` argument with `-i -ins`.
 
 
 # Set gpu_layers to the number of layers to offload to GPU. Set to 0 if no GPU acceleration is available on your system.
 llm = Llama(
   model_path="./chronomaid-storytelling-13b.Q4_K_M.gguf",  # Download the model file first
+  n_ctx=2048,  # The max sequence length to use - note that longer sequence lengths require much more resources
   n_threads=8,  # The number of CPU threads to use, tailor to your system and the resulting performance
   n_gpu_layers=35  # The number of layers to offload to GPU, if you have GPU acceleration available
 )
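The constructor above can be followed by a generation call. A minimal sketch, not part of the README's own code: the instruction text is a placeholder, the guard skips inference when the GGUF file has not been downloaded, and the sampling values mirror the CLI example earlier in the README.

```python
import os

# Alpaca-style prompt as used throughout this README; the instruction
# text here is only an example.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
    "\n\n### Instruction:\nWrite a two-sentence ghost story."
    "\n\n### Response:"
)

MODEL_PATH = "./chronomaid-storytelling-13b.Q4_K_M.gguf"

# Only attempt inference when the quantized model file is actually present.
if os.path.exists(MODEL_PATH):
    from llama_cpp import Llama

    llm = Llama(model_path=MODEL_PATH, n_ctx=2048, n_threads=8, n_gpu_layers=35)
    out = llm(
        prompt,
        max_tokens=512,
        temperature=0.7,
        repeat_penalty=1.1,
        stop=["### Instruction:"],  # stop before the model starts a new turn
    )
    print(out["choices"][0]["text"])
```

The `stop` sequence is a common precaution with Alpaca-format models, which sometimes begin a fresh `### Instruction:` block instead of ending their turn.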
 
 
 **Special thanks to**: Aemon Algiz.
 
+ **Patreon special mentions**: Michael Levine, 阿明, Trailburnt, Nikolai Manek, John Detwiler, Randy H, Will Dee, Sebastain Graf, NimbleBox.ai, Eugene Pentland, Emad Mostaque, Ai Maven, Jim Angel, Jeff Scroggin, Michael Davis, Manuel Alberto Morcote, Stephen Murray, Robert, Justin Joy, Luke @flexchar, Brandon Frisco, Elijah Stavena, S_X, Dan Guido, Undi ., Komninos Chatzipapas, Shadi, theTransient, Lone Striker, Raven Klaugh, jjj, Cap'n Zoog, Michel-Marie MAUDET (LINAGORA), Matthew Berman, David, Fen Risland, Omer Bin Jawed, Luke Pendergrass, Kalila, OG, Erik Bjäreholt, Rooh Singh, Joseph William Delisle, Dan Lewis, TL, John Villwock, AzureBlack, Brad, Pedro Madruga, Caitlyn Gatomon, K, jinyuan sun, Mano Prime, Alex, Jeffrey Morgan, Alicia Loh, Illia Dulskyi, Chadd, transmissions 11, fincy, Rainer Wilmers, ReadyPlayerEmma, knownsqashed, Mandus, biorpg, Deo Leter, Brandon Phillips, SuperWojo, Sean Connelly, Iucharbius, Jack West, Harry Royden McLaughlin, Nicholas, terasurfer, Vitor Caleffi, Duane Dunston, Johann-Peter Hartmann, David Ziegler, Olakabola, Ken Nordquist, Trenton Dambrowitz, Tom X Nguyen, Vadim, Ajan Kanaga, Leonard Tan, Clay Pascal, Alexandros Triantafyllidis, JM33133, Xule, vamX, ya boyyy, subjectnull, Talal Aujan, Alps Aficionado, wassieverse, Ari Malik, James Bentley, Woland, Spencer Kim, Michael Dempsey, Fred von Graf, Elle, zynix, William Richards, Stanislav Ovsiannikov, Edmond Seymore, Jonathan Leane, Martin Kemka, usrbinkat, Enrico Ros
 
 
  Thank you to all my generous patrons and donaters!
 
 <!-- original-model-card start -->
 # Original model card: Carsten Kragelund's Chronomaid Storytelling 13B
 
+
 # Chronomaid-Storytelling-13b
 
+<img src="https://cdn-uploads.huggingface.co/production/uploads/65221315578e7da0d74f73d8/v2fVXhCcOdvOdjTrd9dY0.webp" alt="image of a vibrant and whimsical scene with an anime-style character as the focal point. The character is a young girl with blue eyes and short brown hair, wearing a black and white maid outfit with ruffled apron and a red ribbon at her collar. She is lying amidst a fantastical backdrop filled with an assortment of floating, colorful clocks, gears, and hourglasses. The space around her is filled with sparkling stars, glowing nebulae, and swirling galaxies." height="75%" width="75%" />
+
 Merge including [Noromaid-13b-v0.1.1](https://huggingface.co/NeverSleep/Noromaid-13b-v0.1.1), and [Chronos-13b-v2](https://huggingface.co/elinas/chronos-13b-v2) with the [Storytelling-v1-Lora](https://huggingface.co/Undi95/Storytelling-v1-13B-lora) applied afterwards
 
+Intended primarily for RP; in my testing it also handles ERP, narrator-character setups, and group chats without much trouble.
+
 ## Prompt Format
 
+Tested with Alpaca; the Noromaid presets will probably also work (check the Noromaid model card for SillyTavern presets)
 ```
 Below is an instruction that describes a task. Write a response that appropriately completes the request.
 
 
 ### Response:
 ```
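The template above is mechanical to fill in, so a small helper keeps the `### Instruction:` / `### Response:` markers consistent. A sketch only; `format_alpaca` is a hypothetical name, not something the model card or its tooling provides.

```python
# Alpaca prompt template from the model card, with a slot for the instruction.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def format_alpaca(instruction: str) -> str:
    """Fill the Alpaca prompt template with a single instruction."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


prompt = format_alpaca("Describe a rainy night in a haunted library.")
```

The resulting string can be passed directly as the `-p` argument to llama.cpp or as the prompt to a llama-cpp-python `Llama` call.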
 
+## Sampler Settings
+
+Tested at:
+* `temp` 1.3, `min p` 0.05 and 0.15
+* `temp` 1.7, `min p` 0.08 and 0.15
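The tested combinations above expand to four presets, sketched here as keyword-argument dictionaries. The spellings `temperature` and `min_p` follow llama-cpp-python's sampling parameters, which is an assumption on my part since the card only names the samplers informally.

```python
# Sampler combinations reported as tested in the model card, expanded into
# one preset per (temp, min p) pair. Parameter names assume the
# llama-cpp-python keyword spellings.
TESTED_PRESETS = [
    {"temperature": 1.3, "min_p": 0.05},
    {"temperature": 1.3, "min_p": 0.15},
    {"temperature": 1.7, "min_p": 0.08},
    {"temperature": 1.7, "min_p": 0.15},
]

# A preset could then be splatted into a generation call,
# e.g. llm(prompt, **TESTED_PRESETS[0]).
for p in TESTED_PRESETS:
    assert 0.0 < p["min_p"] < 1.0 and p["temperature"] > 1.0
```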
+
+## Quantized Models
+
+The model has been kindly quantized to GGUF, AWQ, and GPTQ by TheBloke.
+Find them in the [Chronomaid-Storytelling-13b Collection](https://huggingface.co/collections/NyxKrage/chronomaid-storytelling-13b-656115dd7065690d7f17c7c8).
+
+## Thanks ❤️
+
+To [Undi](https://huggingface.co/Undi95) & [Ikari](https://huggingface.co/IkariDev) for Noromaid, and [Elinas](https://huggingface.co/elinas) for Chronos.
+Support [Undi](https://ko-fi.com/undiai) and [Elinas](https://ko-fi.com/elinas) on Ko-fi.
 
 <!-- original-model-card end -->