denial07
/

Qwen2-72B-Instruct-kor-dpo

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

This model is an improved version for Korean, based on the Qwen2-72B-Instruct model.

LogicKor Benchmark (24.07.31)

The following benchmark ranks are based on 1-shot evaluation.

Rank	Model	Reasoning	Math	Writing	Coding	Understanding	Grammar	Singleturn	Multiturn	Total	Parameters
1	openai/gpt-4o-2024-05-13	9.21	8.71	9.64	9.78	9.64	9.50	9.33	9.50	9.41	?
2	anthropic/claude-3-5-sonnet-20240620	8.64	8.42	9.85	9.78	9.92	9.21	9.26	9.35	9.30	?
4	mistralai/Mistral-Large-Instruct-2407	9.71	9.07	9.57	9.92	9.92	6.78	9.19	9.14	9.16	123B
8	meta-llama/Meta-Llama-3.1-405B-Instruct-FP8	8.78	7.14	9.28	9.64	9.64	8.57	8.97	8.71	8.84	405B
9	`denial07/Qwen2-72B-Instruct-kor-dpo`	8.85	8.21	9.14	9.71	9.64	7.21	8.88	8.71	8.79	72B
10	Qwen/Qwen2-72B-Instruct	8.00	8.14	9.07	9.85	9.78	7.28	8.61	8.76	8.69	72B
11	google/gemini-1.5-pro-001	7.00	8.00	9.57	8.85	9.35	8.64	8.61	8.52	8.57	?

KMMLU Benchmark

HAERAE-HUB/KMMLU benchmark accuracy score.

Category	Qwen2-72B-it-kor-dpo	Qwen2-72B-it	Mistral-Large-it-2407	Questions
HUMSS	0.63	0.63	0.62	5130
STEM	0.59	0.59	0.57	9900
Applied Science	0.56	0.56	0.54	11600
Other	0.58	0.58	0.54	8400
Overall Accuracy	0.58	0.58	0.56	35030

Downloads last month: 79

Safetensors

Model size

72.7B params

Tensor type

BF16

·

Inference Providers NEW

Text Generation

This model is not currently available via any of the supported Inference Providers.

Model tree for denial07/Qwen2-72B-Instruct-kor-dpo

Quantizations

1 model