---
license: apache-2.0
language:
- en
- ja
tags:
- finetuned
library_name: transformers
pipeline_tag: text-generation
---

<img src="./wabisabi-logo.jpg" width="100%" height="20%" alt="">

## Model Card for Wabisabi-v1.0

This Mistral-7B-based Large Language Model (LLM) is a version of Mistral-7B-v0.1 fine-tuned on a novel dataset.

Wabisabi has the following changes compared to Mistral-7B-v0.1:

- 128k context window (8k context in v0.1)
- High-quality generation in both Japanese and English
- Can generate NSFW content
- Retains earlier context without forgetting, even during long-context generation

This model was created with the help of GPUs from the first LocalAI hackathon.

We would like to take this opportunity to thank everyone involved.

## List of Creation Methods

- Chat Vector applied to multiple models
- Simple linear merging of the resulting models (a sketch of these first two steps follows this list)
- Domain and sentence enhancement with LoRA
- Context expansion

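The first two steps can be pictured with a short, non-authoritative sketch: take the weight delta between an instruction-tuned model and its base (the "Chat Vector"), add it to another model, then linearly merge the resulting models. The target model name, the shape-matching rule, and the merge ratio below are illustrative assumptions, not the exact recipe used to build Wabisabi.

```python
# Minimal sketch of "Chat Vector" + simple linear merging (illustrative only).
import torch
from transformers import AutoModelForCausalLM

def load(name):
    return AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.bfloat16)

base   = load("mistralai/Mistral-7B-v0.1")           # pretrained base
chat   = load("mistralai/Mistral-7B-Instruct-v0.2")  # instruction-tuned sibling
target = load("path/to/some-japanese-mistral-7b")    # hypothetical target model

base_sd, chat_sd, target_sd = base.state_dict(), chat.state_dict(), target.state_dict()

# 1) Chat Vector: add the (instruct - base) weight delta onto the target model.
#    Tensors whose shapes differ (e.g. resized embeddings) are skipped.
with torch.no_grad():
    for name, w in target_sd.items():
        if name in base_sd and w.shape == base_sd[name].shape:
            w += chat_sd[name] - base_sd[name]

# 2) Simple linear merge of two resulting state dicts (equal weighting here).
def linear_merge(sd_a, sd_b, alpha=0.5):
    return {k: alpha * sd_a[k] + (1.0 - alpha) * sd_b[k] for k in sd_a}
```

In practice this kind of weight arithmetic is usually done with a merge toolkit such as mergekit rather than a hand-rolled loop.
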
## Instruction format

This model uses the Vicuna-v1.1 prompt format, as sketched below.

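This is a minimal, non-authoritative example: the Vicuna-v1.1 template prepends a system prompt and marks turns with `USER:` and `ASSISTANT:`. The repository id, system prompt wording, example message, and sampling settings are assumptions for illustration.

```python
# Hedged example: prompting the model with a Vicuna-v1.1 style template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Local-Novel-LLM-project/WabiSabi-V1"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

system = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)
user_message = "Summarize the story of Momotaro in three sentences."

prompt = f"{system} USER: {user_message} ASSISTANT:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
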
## Other points to keep in mind

- The training data may be biased. Be careful with the generated output.
- Memory usage may be high during long-context inference.
- If possible, we recommend running inference with llama.cpp rather than Transformers (see the sketch after this list).

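For the llama.cpp route, a GGUF conversion of the model can be run with the llama-cpp-python bindings. This is only a sketch under assumptions: the GGUF file name and quantization are placeholders, and the context size should be chosen to fit your memory.

```python
# Hedged sketch: running a GGUF conversion of this model with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="./wabisabi-v1.0.Q4_K_M.gguf",  # hypothetical GGUF file
    n_ctx=32768,        # long context window; lower this if memory is tight
    n_gpu_layers=-1,    # offload all layers to the GPU when one is available
)

prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions. "
    "USER: Write a short haiku about autumn rain. ASSISTANT:"
)

output = llm(prompt, max_tokens=128, temperature=0.7, stop=["USER:"])
print(output["choices"][0]["text"])
```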