Phi-2 Orange Version 2

A two-step finetune of Phi-2, with a bit more zest.

This is an improved version of the original Phi-2-Orange that uses an updated training process on the same datasets.

It also builds on Microsoft's latest updated Phi-2 release, so it can be loaded directly with Hugging Face's Transformers library, without setting trust_remote_code.
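
As a rough illustration of loading without remote code, the sketch below uses the stock Transformers auto classes. It assumes a Transformers release with built-in Phi support and that the repository id is rhysjones/phi-2-orange-v2; the helper name load_phi2_orange is hypothetical and not part of any library.

```python
MODEL_ID = "rhysjones/phi-2-orange-v2"  # assumed repository id on the Hugging Face Hub


def load_phi2_orange(model_id: str = MODEL_ID):
    """Return (tokenizer, model) using only stock Transformers classes.

    Hypothetical helper for illustration; note that trust_remote_code is not passed.
    """
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # torch_dtype="auto" keeps whatever dtype the checkpoint was saved in.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    return tokenizer, model
```

Calling load_phi2_orange() downloads the weights on first use, so it is best run once and the result reused.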

Prompt Format

Phi-2 Orange v2 uses ChatML as the prompt format.
(Update 12th March 2024: fixed eos_token issue)

It's recommended to always prompt with a system instruction (use whatever system prompt you like):

<|im_start|>system
You are a helpful assistant for Python which outputs in Markdown format.<|im_end|>
<|im_start|>user
Write a function to calculate the Fibonacci sequence<|im_end|>
<|im_start|>assistant

For example, if you find the model's output to be overly verbose, instruct it to be short and concise:

<|im_start|>system
You are a helpful assistant. Be short and direct in your answers.<|im_end|>
<|im_start|>user
Was Tom Hanks in the movie Forrest Gump? If so, who did he play and give details of the plot.<|im_end|>
<|im_start|>assistant
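
The prompt layout above can be assembled programmatically. The following is a minimal sketch, not the author's own tooling: build_chatml_prompt and generate_reply are hypothetical helper names, the trailing newline after the assistant tag is an assumption based on common ChatML usage, and generate_reply expects a Transformers model and tokenizer that have already been loaded.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Format one system turn and one user turn as ChatML, leaving the assistant turn open."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def generate_reply(model, tokenizer, prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion for an already formatted ChatML prompt."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


prompt = build_chatml_prompt(
    "You are a helpful assistant. Be short and direct in your answers.",
    "Was Tom Hanks in the movie Forrest Gump? If so, who did he play and give details of the plot.",
)
print(prompt)
```

Swapping the first argument is all it takes to change the system instruction, which makes it easy to compare a verbose and a concise persona on the same question.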

Evaluations

Open LLM Leaderboard Evaluation Results
Detailed results can be found here

| Metric | Value |
| --- | --- |
| Average | 63.67 |
| AI2 Reasoning Challenge (25-shot) | 61.86 |
| HellaSwag (10-shot) | 76.32 |
| MMLU (5-shot) | 55.72 |
| TruthfulQA (0-shot) | 54.84 |
| Winogrande (5-shot) | 75.69 |
| GSM8k (5-shot) | 57.62 |

YALL - Yet Another LLM Leaderboard
Evaluation from mlabonne's alternative LLM leaderboard:

| Metric | Value |
| --- | --- |
| Average | 49.64 |
| AGIEval | 34.55 |
| GPT4All | 70.96 |
| TruthfulQA | 54.87 |
| Bigbench | 38.17 |
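
As a quick sanity check on the two sets of results above, the reported averages appear to be plain arithmetic means of the individual benchmark scores. That is an assumption inferred from the numbers rather than something this card states, and the variable names below are illustrative only.

```python
from statistics import mean

# Scores copied from the Open LLM Leaderboard results above.
open_llm = {
    "ARC (25-shot)": 61.86,
    "HellaSwag (10-shot)": 76.32,
    "MMLU (5-shot)": 55.72,
    "TruthfulQA (0-shot)": 54.84,
    "Winogrande (5-shot)": 75.69,
    "GSM8k (5-shot)": 57.62,
}

# Scores copied from the YALL results above.
yall = {
    "AGIEval": 34.55,
    "GPT4All": 70.96,
    "TruthfulQA": 54.87,
    "Bigbench": 38.17,
}

open_llm_avg = mean(open_llm.values())
yall_avg = mean(yall.values())

# Both means should land within rounding distance of the reported averages.
print(f"Open LLM mean: {open_llm_avg:.3f} (reported 63.67)")
print(f"YALL mean:     {yall_avg:.3f} (reported 49.64)")
```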

Limitations

This model shares the same limitations as the underlying Phi-2 model, details of which are found here.

