---
license: other
license_name: tongyi-qianwen
license_link: https://huggingface.co/Qwen/Qwen2-72B-Instruct/resolve/main/LICENSE
---
|
This model is a version of [Qwen2-72B-Instruct](https://huggingface.co/Qwen/Qwen2-72B-Instruct) with improved Korean performance.
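As with the base model, inference is expected to follow Qwen2's ChatML conversation format (in practice, `tokenizer.apply_chat_template` handles this automatically). A minimal sketch of the prompt layout, assuming the fine-tune leaves the base template unchanged:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a single-turn ChatML prompt as used by Qwen2-Instruct models."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt("You are a helpful assistant.", "안녕하세요!")
print(prompt)
```

The trailing `<|im_start|>assistant\n` leaves the prompt open for the model to generate its reply.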
|
### LogicKor Benchmark (24.07.31)
|
* Ranks and scores are based on 1-shot evaluation on the [LogicKor leaderboard](https://lk.instruct.kr/).
|
| Rank | Model | Reasoning | Math | Writing | Coding | Understanding | Grammar | Singleturn | Multiturn | Total | Parameters |
|------|-------|-----------|------|---------|--------|---------------|---------|------------|-----------|-------|------------|
| 1 | openai/gpt-4o-2024-05-13 | 9.21 | 8.71 | 9.64 | 9.78 | 9.64 | 9.50 | 9.33 | 9.50 | 9.41 | ? |
| 2 | anthropic/claude-3-5-sonnet-20240620 | 8.64 | 8.42 | 9.85 | 9.78 | 9.92 | 9.21 | 9.26 | 9.35 | 9.30 | ? |
| 4 | mistralai/Mistral-Large-Instruct-2407 | 9.71 | 9.07 | 9.57 | 9.92 | 9.92 | 6.78 | 9.19 | 9.14 | 9.16 | 123B |
| 8 | meta-llama/Meta-Llama-3.1-405B-Instruct-FP8 | 8.78 | 7.14 | 9.28 | 9.64 | 9.64 | 8.57 | 8.97 | 8.71 | 8.84 | 405B |
| 9 | `denial07/Qwen2-72B-Instruct-kor-dpo` | 8.85 | 8.21 | 9.14 | 9.71 | 9.64 | 7.21 | 8.88 | 8.71 | 8.79 | 72B |
| 10 | Qwen/Qwen2-72B-Instruct | 8.00 | 8.14 | 9.07 | 9.85 | 9.78 | 7.28 | 8.61 | 8.76 | 8.69 | 72B |
| 11 | google/gemini-1.5-pro-001 | 7.00 | 8.00 | 9.57 | 8.85 | 9.35 | 8.64 | 8.61 | 8.52 | 8.57 | ? |
|
|
|
### KMMLU Benchmark
|
* Accuracy scores on the [HAERAE-HUB/KMMLU](https://huggingface.co/datasets/HAERAE-HUB/KMMLU) benchmark.
|
| Category | Qwen2-72B-it-kor-dpo | Qwen2-72B-it | Mistral-Large-it-2407 | Questions |
|------------------|----------------------|--------------|-----------------------|-----------|
| HUMSS | 0.63 | 0.63 | 0.62 | 5130 |
| STEM | 0.59 | 0.59 | 0.57 | 9900 |
| Applied Science | 0.56 | 0.56 | 0.54 | 11600 |
| Other | 0.58 | 0.58 | 0.54 | 8400 |
| Overall Accuracy | 0.58 | 0.58 | 0.56 | 35030 |
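
The "Overall Accuracy" row is consistent with a question-count-weighted average of the per-category scores (to the table's two-decimal precision). A quick sanity check:

```python
# Per-category accuracy and question counts for Qwen2-72B-it-kor-dpo,
# taken from the KMMLU table above.
categories = {
    "HUMSS":           (0.63, 5130),
    "STEM":            (0.59, 9900),
    "Applied Science": (0.56, 11600),
    "Other":           (0.58, 8400),
}

total_questions = sum(n for _, n in categories.values())
overall = sum(acc * n for acc, n in categories.values()) / total_questions

print(total_questions)    # 35030
print(round(overall, 2))  # 0.58
```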