File size: 838 Bytes
dd0cc25
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
# Turbcat 8b
# Release notes

This is a direct upgrade over cat 70B, with 2x the dataset size, added Chinese support.

# Data Generation
The data generation process is largely the same. Except additional Chinese data were added. 20 postdocs participated in the annotation process and standard model training for embedding scoring.

# Task coverage
In addition to standard assistant and roleplay data, the following tasks are targeted:
* GRE
* SAT
* MCAT
* Chinese Kaoyan

# Thirdparty dataset
Thanks to the following people for their tremendous support for dataset generation:
* steelskull for the medical COT dataset with gpt4o
* Gryphe for the wonderful action packed dataset
* Turbca for being turbca

# Prompt format for 8b:
llama3

# Prompt format for 72b:
chatml

# Support
Please join https://discord.gg/DwGz54Mz for model support