metadata

base_model:
  - senseable/WestLake-7B-v2
library_name: transformers
tags:
  - mergekit
  - merge
license: apache-2.0
language:
  - en

WestLake-10.7B-v2: Role-Play & Text Generation Specialist Model

GGUF version available here
EXL2 versions available here: 3.3bpw / 4.0bpw / 5.0bpw / 6.0bpw / 8.0bpw

This is my first viable self-merge of the fantastic WestLake-7B-v2 model, obtained after more than 12 rounds of testing different merge configurations. In my LLM Creativity Benchmark, it greatly improves over the original 7B model, and ranks between miqu-1-120b and goliath-120b! I would describe the improvements as a better writing style, with more details. It has a bit more difficulties following instructions, but not by much.

It is also the first model I have tested to obtain a perfect score with the following test:

Write a sequence of nominal groups that flow into one another, using the following rules:
- each nominal group is made of exactly 3 words
- the first word of each nominal group must be the last word of the previous nominal group
- the first word of the first nominal group is: "ball"
- the last word of the last nominal group is: "stone"
- there must be a theme, of your choosing,  pertaining to all nominal groups
- there must be exactly 7 nominal groups, leading from the first word (ball) to the last word (stone)
- a word already used at the beginning and end of a nominal group cannot be reused
Present your solution as a list numbered with roman numerals.
Finally, explain why you chose your specific theme.

Usage

Base model: senseable/WestLake-7B-v2 based of Mistral-7B-v0.1
Context size: 8192 (even though Mistral-7B is 32k, WestLake was trained with 8k, and using a larger context is likely to cause problems)
Prompt format: in general, Mistral based models are able to understand many prompt formats, but the following produce the best results, and are recommended (in order of preference)
- Alpaca (reported by senseable as working better than ChatML, and confirmed by me)
- ChatML (used during WestLake training)
- Mistral Instruct (original format from Mistral-7B)
- Zephyr (variant of ChatML which I have found to sometimes produce better results)

Merge Details

This is a merge of pre-trained language models created using mergekit.
This model was merged using the passthrough merge method.
The following models were included in the merge:

senseable/WestLake-7B-v2

The following YAML configuration was used to produce this model:

dtype: float16
merge_method: passthrough
slices:
  - sources:
    - model: senseable/WestLake-7B-v2
      layer_range: [0,9]
  - sources:
    - model: senseable/WestLake-7B-v2
      layer_range: [5,14]
  - sources:
    - model: senseable/WestLake-7B-v2
      layer_range: [10,19]
  - sources:
    - model: senseable/WestLake-7B-v2
      layer_range: [15,24]
  - sources:
    - model: senseable/WestLake-7B-v2
      layer_range: [20,32]

Original model card: Westlake-7Bv2: Role-Play & Text Generation Specialist Model

Update Notes: Version 2 trained 1 additional epoch cycle for 3 total

Welcome to the documentation of Westlake-7B, a cutting-edge language model designed for exceptional role-play and text generation tasks. This README file aims to provide an overview of our capabilities, usage guidelines, and potential applications.

About Westlake-7Bv2

Westlake-7B is built upon a vast corpus of diverse texts, enabling it to generate contextually relevant responses in various scenarios. With its impressive size of 7 billion parameters, this model excels at understanding nuances in language and producing creative outputs.

Key Features

Role-Play: Westlake-7Bv2 can seamlessly adapt to different character personas and engage in dynamic conversations while maintaining consistency throughout the interaction. It can generate believable dialogues across various genres, including fiction, non-fiction, historical events, or even fantasy worlds.
Text Generation: This model is proficient at generating original content such as stories, poems, essays, news articles, and more. Its ability to capture the essence of different writing styles makes it an ideal tool for creative writers seeking inspiration or assistance in their projects.
Contextual Understanding: Westlake-7B's extensive training allows it to comprehend complex contexts and generate responses that align with given situations. It can handle multiple topics simultaneously, making it versatile across various applications.
Continuous Learning: As a language model, Westlake-7B continuously improves its performance through ongoing training on new data sets. This ensures its capabilities remain up-to-date and relevant in an ever-evolving world of communication.

Usage Guidelines

To utilize Westlake-7Bv2 for your projects or experiments, follow these steps:

Prompting: Provide clear and concise prompts that outline the desired role-play scenario or text generation task. The quality of output depends heavily on the clarity and relevance of input instructions.
Feedback Loop: For optimal results, consider incorporating a feedback loop into your application to refine generated outputs based on user preferences or additional contextual information. This iterative process can significantly enhance the model's performance in specific domains.
Ethical Considerations: As with any AI system, ensure responsible usage of Westlake-7B by avoiding harmful content generation or misuse of its capabilities.

Potential Applications

Westlake-7Bv2's versatility makes it suitable for various applications across different industries:

Creative Writing: Assist authors in generating new ideas, expanding storylines, or even completing drafts by providing creative suggestions and textual content.
Education: Enhance language learning platforms with interactive role-play scenarios to improve students' communication skills and cultural understanding.
Gaming: Integrate Westlake-7B into game engines for dynamic non-player character interactions or generating unique questlines based on player choices.
Customer Support: Leverage the model's conversational abilities to create chatbots capable of handling complex queries and providing personalized assistance.
Social Media: Develop applications that generate engaging content such as captions, status updates, or even entire posts tailored to users' preferences and interests.