This is the best model I've found for roleplay

#13
by MikiePsyche - opened

So I realize this model was made as a merge, but is there any way to train the new Llama3 8B on the same datasets to get an EstopianMaid Llama3 version?

@MikiePsyche Sadly not. It could become possible with the release of more Llama3 8B finetunes, specifically the ones I used, but it's unlikely to be exactly the same :p

I just wanted to add that this model really blows everything else I've tried out of the water, including the ones everyone says are "sooo good" like Kunoichi or other 13Bs or even 20Bs. I'm learning to finetune my own models so I'm excited to see what I can do, possibly with this! Thank you so much. I don't know what you did, but you did it right :D

@je0923 I've been using cgato_L3-TheSpice-8b-v0.8.3 lately, which is based on Llama3. It's actually pretty amazing and has an 8k context window instead of 4k. Llama3, even though it's only an 8B model, was trained on roughly seven times more data than Llama2, which was the base model for EstopianMaid.

EstopianMaid was a merge of many different finetuned Llama2 models. If you know how to finetune, I wonder if you could track down some of the datasets used for the finetuned models that went into EstopianMaid and finetune them into cgato_L3-TheSpice-8b-v0.8.3 or the Llama3 base model.

EstopianMaid was merged from:

BlueNipples/TimeCrystal-l2-13B (merge)
SLERPs: Amethyst + Openchat Super (a rough sketch of what a SLERP merge does follows this list)
MythoMax + Chronos
ChronoMax + Amethyst

cgato/Thespis-13b-DPO-v0.7 (finetune datasets)
Intel/orca_dpo_pairs
NobodyExistsOnTheInternet/ToxicDPOqa

KoboldAI/LLaMA2-13B-Estopia (merge - each of these models will probably have their datasets listed)
Undi95/UtopiaXL-13B
Doctor-Shotgun/cat-v1.0-13b
PygmalionAI/mythalion-13b
CalderaAI/13B-Thorns-l2
KoboldAI/LLaMA2-13B-Tiefighter
chargoddard/rpguild-chatml-13b

NeverSleep/Noromaid-13B-0.4-DPO (finetune datasets)
Intel/orca_dpo_pairs
NobodyExistsOnTheInternet/ToxicDPOqa
Undi95/toxic-dpo-v0.1-NoWarning

Doctor-Shotgun/cat-v1.0-13b (finetuned to be unaligned and uncensored)
The dataset involves work from jondurbin's airoboros dataset and ChatDoctor.
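
For anyone wondering what those SLERP steps actually do: a spherical linear interpolation blends two donor models by interpolating each pair of weight tensors along an arc instead of a straight line. Here's a rough numpy sketch of the math only, not the actual mergekit recipe behind TimeCrystal; the tensor shapes and the t=0.5 blend factor are just illustrative assumptions.

```python
import numpy as np

def slerp(w_a: np.ndarray, w_b: np.ndarray, t: float, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors.

    t=0 returns w_a, t=1 returns w_b; values in between follow the arc
    between the two tensors rather than a straight line (plain lerp).
    """
    a, b = w_a.flatten(), w_b.flatten()
    a_unit = a / (np.linalg.norm(a) + eps)
    b_unit = b / (np.linalg.norm(b) + eps)
    omega = np.arccos(np.clip(np.dot(a_unit, b_unit), -1.0, 1.0))  # angle between the tensors
    if omega < eps:
        merged = (1.0 - t) * a + t * b  # nearly parallel: fall back to lerp
    else:
        merged = (np.sin((1.0 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)
    return merged.reshape(w_a.shape)

# Toy example: blend two made-up "layers" halfway between the donor models.
layer_a = np.random.randn(4, 4)
layer_b = np.random.randn(4, 4)
print(slerp(layer_a, layer_b, t=0.5))
```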

It seems that EstopianMaid was a merge of a merge of a merge of a merge. Most of the datasets used for these finetunes are available here on Hugging Face, and if you look up the models, I wonder if you could track down some of the datasets and finetune a Llama3 version similar to EstopianMaid.
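
If anyone does want to chase those datasets down, the DPO sets named above can be pulled with the datasets library to see what you'd be working with before wiring up any trainer. A minimal sketch, assuming Intel/orca_dpo_pairs is still public on the Hub; the splits and column names are printed rather than assumed:

```python
from datasets import load_dataset

# Intel/orca_dpo_pairs is one of the DPO sets listed above; the others can be
# loaded the same way if they are still public on the Hub.
ds = load_dataset("Intel/orca_dpo_pairs")

print(ds)  # shows the available splits, row counts, and column names

# Peek at one row without assuming a particular split name or schema.
first_split = next(iter(ds.values()))
print(first_split.column_names)
print(first_split[0])
```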

I recommend trying cgato_L3-TheSpice-8b-v0.8.3. It's not as horny as EstopianMaid 13B, but it's actually better at keeping track of the story and staying on task. Its responses are also more versatile than EstopianMaid's, plus it requires less VRAM and runs faster. Anyhoo, I don't know how to finetune myself; I've just spent hundreds of hours playing with the different models.

@MikiePsyche thank you for the awesome recommendation! I love the horniness of estopianmaid but I will check this out just to be safe. :) I really appreciate your info!!

@MikiePsyche I have no idea why, but for me cgato_L3-TheSpice-8b-v0.8.3, or at least the 4-bit EXL2 version I found and tried, is actually about 2x slower :(

But also, other than that, it seems to suffer from Llama-3 censorship. It will do NSFW but use euphemisms like "And so, their twisted acts continued long into the night..." and that's very common with Llama-3, no matter how dark or specific my prompt is. :(

I really appreciate your advice though and would love it if there's any other good models to try haha

Hi :-) I am new to this... I am using the Oobabooga WebUI and have an RTX 4090 with 24GB. "Working" with this model, it responds extremely slowly...
It seems to me that I'm either using the wrong settings or my graphics card is not suitable. Can anyone help me and maybe tell me the right settings? Or do I have to use a different WebUI to generate NSFW texts? I'm less interested in chatting. Thanks :-)

@Slartibart23 Please use a GGUF of the model, not these raw FP16 files; koboldcpp_cuda would be the easiest way to run them.
Ooba has a llama.cpp loader for GGUF files.
https://huggingface.co/TheBloke/EstopianMaid-13B-GGUF

Something like Q6_K would do well on your GPU.
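
If you'd rather script it than click through KoboldCPP, here's a minimal sketch using huggingface_hub and llama-cpp-python. The Q6_K filename below is an assumption based on TheBloke's usual naming scheme, so check the repo's "Files and versions" tab for the exact name, and the prompt is just a placeholder.

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama  # pip install llama-cpp-python (build with CUDA for GPU offload)

# The filename is an assumption based on TheBloke's usual naming scheme;
# check the repo's "Files and versions" tab for the exact name.
model_path = hf_hub_download(
    repo_id="TheBloke/EstopianMaid-13B-GGUF",
    filename="estopianmaid-13b.Q6_K.gguf",
)

llm = Llama(
    model_path=model_path,
    n_ctx=4096,       # EstopianMaid is Llama2-based, so 4k native context
    n_gpu_layers=-1,  # offload all layers; a 24GB 4090 fits a 13B Q6_K easily
)

# Placeholder Alpaca-style prompt; use whatever format your frontend expects.
out = llm(
    "### Instruction:\nWrite a short scene set in a tavern.\n\n### Response:\n",
    max_tokens=256,
)
print(out["choices"][0]["text"])
```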

The official repo for KoboldCPP is here:
https://github.com/LostRuins/koboldcpp
I've found KoboldCPP to be way faster than text-gen-webui with GGUF files.


A 13B should be really fast on a 4090. GGUF files allow much bigger models to run on a GPU than you would think; 24GB of VRAM would probably fit a 34B GGUF at around Q4 ~ Q5.
KoboldCPP can seem complicated at first, but if you hover your mouse over the text of an option it will give you an explanation, and there's a wiki page in the repo.

TheBloke explains the quants here
https://huggingface.co/TheBloke/EstopianMaid-13B-GGUF#provided-files


This space is awesome for checking whether you can run a model without downloading it:
NyxKrage/LLM-Model-VRAM-Calculator
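
If you just want a rough number without opening the calculator, the back-of-envelope math is roughly parameters × effective bits-per-weight ÷ 8, plus a couple of GB for the KV cache and runtime buffers. The bits-per-weight figures below are rough approximations for the k-quants, not exact values, so treat the output as a sanity check rather than gospel:

```python
# Back-of-envelope VRAM estimate for GGUF quants. The bits-per-weight values
# are rough averages for the k-quants (assumptions, not exact), and the fixed
# overhead stands in for the KV cache and runtime buffers.
BITS_PER_WEIGHT = {"Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q6_K": 6.6, "Q8_0": 8.5}

def estimate_vram_gb(params_billion: float, quant: str, overhead_gb: float = 2.0) -> float:
    weights_gb = params_billion * BITS_PER_WEIGHT[quant] / 8
    return weights_gb + overhead_gb

for size in (13, 34):
    for quant in BITS_PER_WEIGHT:
        print(f"{size}B @ {quant}: ~{estimate_vram_gb(size, quant):.1f} GB")
```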


You guys are wonderful... feel face hugged ;-)

hey KatyTheCutie, why don't you answer me?
