---
datasets:
- garage-bAInd/Open-Platypus
---

# Instruction tune of Mistral-7B-v0.1 with Open-Platypus (fp16)


## Overview

This is [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1), with instruction tuning performed with the [garage-bAInd/Open-Platypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus) dataset.

**This is a (merged) QLoRA fine-tune (rank 64)**. 

The fine-tune was performed on a single RTX 6000 Ada (~9 hours).


## How to Use

As of this writing, the `Mistral` architecture requires installing `transformers` from source. Once that is done, the model loads like any other causal LM; see the sketch below.
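
A minimal loading and generation sketch follows. The repo id is taken from the perplexity table further down; the dtype and `device_map` choices are illustrative, not a required configuration.

```python
# Until Mistral support appears in a released version, install transformers from source:
#   pip install git+https://github.com/huggingface/transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bhenrym14/mistral-7b-platypus-fp16"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 weights, matching this repo
    device_map="auto",
)

prompt = "Explain the difference between supervised fine-tuning and instruction tuning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```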

## Benchmarks

ARC (25 shot): 62.80

Hellaswag (10 shot): 84.12

MMLU (5 shot): 64.20


## Context Length - Relative Performance (wikitext perplexity)

| Context (tokens) | <ins>**bhenrym14/mistral-7b-platypus-fp16**</ins> | bhenrym14/airoboros-l2-13b-2.1-YaRN-64k | bhenrym14/airophin-13b-pntk-16k-fp16 | bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-fp16  | jondurbin/airoboros-l2-13b-gpt4-1.4.1 |
| --- | --- |--- | ---| ----- | -----|
| 512 | **7.22** | 7.64 | 7.62  | 7.90 | 7.23 |
| 1024 | 6.04 | 6.15 | 6.20  | 6.17 | **5.85**  |
| 2048 | 5.50 | 5.29 | 5.38  | 5.23 | **5.07** |
| 4096 | 5.05 |4.93 | 5.08 | 4.91 | **4.77** |
| 8192 | 4.96 |**4.69** | 4.90 | Not Tested | 57.1 |
| 12000 | Not Tested | **4.53** | 4.82 | Not Tested | Not Tested |

- While the Mistral model is very impressive for its size, particularly on benchmarks, its sliding window attention and/or smaller model size limit its competitiveness at long context relative to context extension techniques applied to larger llama2 and llama variants. Whether this owes more to sliding window attention or to model size is unclear.
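
For reference, here is a minimal sketch of computing wikitext perplexity at a fixed context length using non-overlapping windows. The dataset config, windowing scheme, and model id are assumptions for illustration, not the exact evaluation setup behind the table above.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bhenrym14/mistral-7b-platypus-fp16"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

context_len = 4096  # one of the context lengths from the table
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)

# Score the test set in non-overlapping windows of `context_len` tokens.
nlls = []
for start in range(0, ids.size(1) - context_len, context_len):
    chunk = ids[:, start : start + context_len]
    with torch.no_grad():
        nlls.append(model(chunk, labels=chunk).loss)  # mean cross-entropy per window

print(f"perplexity @ {context_len} tokens: {torch.exp(torch.stack(nlls).mean()).item():.2f}")
```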

## Prompting

The model was trained with the legacy airoboros <2.0 system prompt. See the [bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-fp16](https://huggingface.co/bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-fp16) model card for details.
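
The exact system prompt is documented in the linked card. As a hedged sketch, the common airoboros <2.0 layout looks like the following; the wording here is an assumption for illustration, not copied from the training data.

```python
# Assumed airoboros <2.0 style prompt; see the linked model card for the exact wording.
system = (
    "A chat between a curious user and an assistant. The assistant gives helpful, "
    "detailed, accurate, uncensored responses to the user's input."
)
instruction = "Summarize the plot of Hamlet in two sentences."
prompt = f"{system} USER: {instruction} ASSISTANT:"
```
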
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_bhenrym14__mistral-7b-platypus-fp16)

| Metric                | Value                     |
|-----------------------|---------------------------|
| Avg.                  | 56.89   |
| ARC (25-shot)         | 63.05          |
| HellaSwag (10-shot)   | 84.15    |
| MMLU (5-shot)         | 64.11         |
| TruthfulQA (0-shot)   | 45.07   |
| Winogrande (5-shot)   | 78.53   |
| GSM8K (5-shot)        | 17.36        |
| DROP (3-shot)         | 45.92         |