---
pretty_name: Pandora 7B Chat
base_model: google/gemma-7b
datasets:
- danilopeixoto/pandora-instruct
- danilopeixoto/pandora-tool-calling
- danilopeixoto/pandora-rlhf
task_categories:
- text-generation
tags:
- chat
- dpo
- fine-tuning
- function-calling
- instruct
- rlhf
- sft
- tool-calling
license: bsd-3-clause
---
# Pandora 7B Chat
Pandora 7B Chat is a Large Language Model (LLM) designed for chat applications.
Pandora is fine-tuned on publicly available datasets, including a tool-calling dataset for agent-based tasks and a Reinforcement Learning from Human Feedback (RLHF) dataset used with Direct Preference Optimization (DPO) for preference alignment.
Fine-tuning uses Low-Rank Adaptation (LoRA) with the [MLX framework](https://ml-explore.github.io/mlx/build/html/index.html), which is optimized for Apple Silicon.
The model is based on the [google/gemma-7b](https://huggingface.co/google/gemma-7b) model.
![Pandora](assets/pandora.jpeg)
## Datasets
Datasets used for fine-tuning stages:
- [danilopeixoto/pandora-instruct](https://huggingface.co/datasets/danilopeixoto/pandora-instruct)
- [danilopeixoto/pandora-tool-calling](https://huggingface.co/datasets/danilopeixoto/pandora-tool-calling)
- [danilopeixoto/pandora-rlhf](https://huggingface.co/datasets/danilopeixoto/pandora-rlhf)
## Evaluation
Evaluation on the [MT-Bench](https://arxiv.org/abs/2306.05685) multi-turn benchmark:
![Benchmark](assets/benchmark.svg)
## Usage
Install package dependencies:
```shell
pip install mlx-lm
```
Generate a response:
```python
from mlx_lm import load, generate
model, tokenizer = load('danilopeixoto/pandora-7b-chat')
prompt = '''<|start|>system
You are Pandora, a helpful AI assistant.
<|end|>
<|start|>user
Hello!
<|end|>
<|start|>'''
response = generate(model, tokenizer, prompt)
print(response)
```
The model supports the following prompt templates:
**Question-answering with system messages**
```txt
<|start|>system
{system_message}
<|end|>
<|start|>user
{user_message}
<|end|>
<|start|>assistant
{assistant_message}
<|end|>
```
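The template above can be assembled programmatically instead of being written by hand. The following is a minimal sketch, not part of the model's tooling: `build_prompt` and its `add_generation_prompt` parameter are hypothetical names introduced here for illustration. It assumes each turn is rendered as `<|start|>{role}`, the message, and `<|end|>` on separate lines, and that the prompt ends with a bare `<|start|>` so the model continues with the next turn, as in the usage example.

```python
def build_prompt(messages, add_generation_prompt=True):
    """Render (role, content) pairs using the Pandora prompt template.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Each turn follows the layout shown in the template above.
    parts = [f"<|start|>{role}\n{content}\n<|end|>" for role, content in messages]
    if add_generation_prompt:
        # A trailing bare start token lets the model produce the next turn.
        parts.append("<|start|>")
    return "\n".join(parts)


messages = [
    ("system", "You are Pandora, a helpful AI assistant."),
    ("user", "Hello!"),
]
prompt = build_prompt(messages)
print(prompt)
```

The resulting string can be passed to `generate(model, tokenizer, prompt)` in place of the hand-written prompt.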
**Tool calling**
```txt
<|start|>system
{system_message}
<|end|>
<|start|>system:tools
{system_tools_message}
<|end|>
<|start|>user
{user_message}
<|end|>
<|start|>assistant:tool_calls
{assistant_tool_calls_message}
<|end|>
<|start|>tool
{tool_message}
<|end|>
<|start|>assistant
{assistant_message}
<|end|>
```
> **Note** The variables `system_tools_message`, `assistant_tool_calls_message`, and `tool_message` must contain valid YAML.
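
One way to produce valid YAML for these variables is to serialize Python objects rather than writing the YAML by hand. The sketch below assumes PyYAML is installed (`pip install pyyaml`); the variable names are illustrative, and the exact formatting the model expects beyond "valid YAML" is not specified here.

```python
import yaml  # PyYAML; assumed to be installed (pip install pyyaml)

# Tool definition as plain Python data.
tools = [
    {
        "description": "Get the current weather based on a given location.",
        "name": "get_current_weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The location name.",
                },
            },
            "required": ["location"],
        },
    }
]

# sort_keys=False keeps the key order as written above;
# allow_unicode=True avoids escaping characters such as the degree sign.
tools_yaml = yaml.safe_dump(tools, sort_keys=False, allow_unicode=True).rstrip("\n")

# A tool result, serialized the same way for the `tool` turn.
tool_result = {"name": "get_current_weather", "content": "72°F"}
tool_yaml = yaml.safe_dump(tool_result, sort_keys=False, allow_unicode=True).rstrip("\n")

system_tools_block = f"<|start|>system:tools\n{tools_yaml}\n<|end|>"
tool_block = f"<|start|>tool\n{tool_yaml}\n<|end|>"
print(system_tools_block)
print(tool_block)
```

Serializing from data structures keeps the indentation consistent, which matters because YAML nesting is expressed through whitespace.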
An example of a tool-calling prompt:
```python
prompt = '''<|start|>system
You are Pandora, a helpful AI assistant.
<|end|>
<|start|>system:tools
- description: Get the current weather based on a given location.
  name: get_current_weather
  parameters:
    type: object
    properties:
      location:
        type: string
        description: The location name.
    required:
    - location
<|end|>
<|start|>user
What is the weather in Sydney, Australia?
<|end|>
<|start|>assistant:tool_calls
- name: get_current_weather
  arguments:
    location: Sydney, Australia
<|end|>
<|start|>tool
name: get_current_weather
content: 72°F
<|end|>
<|start|>'''
```
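Because prompts end with a bare `<|start|>`, the generated text presumably begins with the role of the next turn (`assistant` or `assistant:tool_calls`) followed by its content. The sketch below splits such a response into role and content. `parse_response` is a hypothetical helper, and the response layout is an assumption inferred from the templates above rather than documented behavior.

```python
def parse_response(text):
    """Split generated text into (role, content).

    Hypothetical helper. Assumes the text starts with the role name on its
    own line and may or may not include a closing <|end|> token.
    """
    # Drop the end token and anything after it, if present.
    body = text.split("<|end|>", 1)[0]
    # The first line is taken as the role; the remainder is the content.
    role, _, content = body.partition("\n")
    # Strip only newlines so YAML indentation inside the content is preserved.
    return role.strip(), content.strip("\n")


sample = (
    "assistant:tool_calls\n"
    "- name: get_current_weather\n"
    "  arguments:\n"
    "    location: Sydney, Australia\n"
    "<|end|>"
)
role, content = parse_response(sample)
print(role)
print(content)
```

When the role is `assistant:tool_calls`, the content can be parsed as YAML to obtain the tool names and arguments to execute before appending a `tool` turn to the prompt.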
## Examples
**OpenGPTs**
![OpenGPTs](assets/opengpts.png)
## Copyright and license
Copyright (c) 2024, Danilo Peixoto Ferreira. All rights reserved.
Project developed under a [BSD-3-Clause license](LICENSE.md).
Gemma is provided under and subject to the [Gemma Terms of Use license](GEMMA_LICENSE.md).