|
# Turbcat 8b |
|
# Release notes |
|
|
|
This is a direct upgrade over cat 70B, with 2x the dataset size, added Chinese support. |
|
|
|
# Data Generation |
|
The data generation process is largely the same. Except additional Chinese data were added. 20 postdocs participated in the annotation process and standard model training for embedding scoring. |
|
|
|
# Task coverage |
|
In addition to standard assistant and roleplay data, the following tasks are targeted: |
|
* GRE |
|
* SAT |
|
* MCAT |
|
* Chinese Kaoyan |
|
|
|
# Thirdparty dataset |
|
Thanks to the following people for their tremendous support for dataset generation: |
|
* steelskull for the medical COT dataset with gpt4o |
|
* Gryphe for the wonderful action packed dataset |
|
* Turbca for being turbca |
|
|
|
# Prompt format for 8b: |
|
llama3 |
|
|
|
# Prompt format for 72b: |
|
chatml |
|
|
|
# Support |
|
Please join https://discord.gg/DwGz54Mz for model support |
|
|