Upload folder using huggingface_hub

e0f212a verified 13 days ago

5.44 kB

	---
	license: apache-2.0
	library_name: transformers
	base_model:
	- openchat/openchat-3.5-0106
	datasets:
	- Yukang/LongAlpaca-12k
	model-index:
	- name: OpenChat-3.5-0106_32K-PoSE
	results:
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: IFEval (0-Shot)
	type: HuggingFaceH4/ifeval
	args:
	num_few_shot: 0
	metrics:
	- type: inst_level_strict_acc and prompt_level_strict_acc
	value: 39.69
	name: strict accuracy
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Pretergeek/OpenChat-3.5-0106_32K-PoSE
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: BBH (3-Shot)
	type: BBH
	args:
	num_few_shot: 3
	metrics:
	- type: acc_norm
	value: 8.83
	name: normalized accuracy
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Pretergeek/OpenChat-3.5-0106_32K-PoSE
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MATH Lvl 5 (4-Shot)
	type: hendrycks/competition_math
	args:
	num_few_shot: 4
	metrics:
	- type: exact_match
	value: 1.44
	name: exact match
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Pretergeek/OpenChat-3.5-0106_32K-PoSE
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: GPQA (0-shot)
	type: Idavidrein/gpqa
	args:
	num_few_shot: 0
	metrics:
	- type: acc_norm
	value: 3.47
	name: acc_norm
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Pretergeek/OpenChat-3.5-0106_32K-PoSE
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MuSR (0-shot)
	type: TAUR-Lab/MuSR
	args:
	num_few_shot: 0
	metrics:
	- type: acc_norm
	value: 11.33
	name: acc_norm
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Pretergeek/OpenChat-3.5-0106_32K-PoSE
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MMLU-PRO (5-shot)
	type: TIGER-Lab/MMLU-Pro
	config: main
	split: test
	args:
	num_few_shot: 5
	metrics:
	- type: acc
	value: 11.46
	name: accuracy
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Pretergeek/OpenChat-3.5-0106_32K-PoSE
	name: Open LLM Leaderboard
	---
	<p align="center">
	<a href="https://ko-fi.com/pretergeek">Buy me a Ko-Fi</a> •
	<a href="https://patreon.com/Pretergeek">Support my work using Patreon</a>
	</p>

	# OpenChat-3.5-0106_32K-PoSE

	## Description

	This model is [Openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106) with the context length extended from 8192 tokens to 32768 tokens using [PoSE](https://huggingface.co/papers/2309.10400).

	The model was fine-tuned using [Rank-Stabilized LoRA](https://huggingface.co/blog/damjan-k/rslora) and the [LongAlpaca-12K](Yukang/LongAlpaca-12k) dataset. I hope to continue extending the context in future versions and then apply the same methods to my [upscaled versions of OpenChat-3.5](https://huggingface.co/collections/Pretergeek/openchat-35-0106-with-additional-layers-66a8d3262c7c3ebdd7783a29) that were created using Block Expansion instead of Depth UP Scaling.

	After fine-tuning, the model was tested using passkey retrieval and achieved a score of 100%. Below you can also find the results of the Open LLM Leaderboard evaluations and I am a bit disappointed with those. The model ended up with a significant reduction in performance compared to the original model in all but one test (MUSR). I expected it to do better than the original model on MUSR since that test benefits from long context understanding but I didn't expect such a negative impact on the other tasks. Anyway, I will be addressing this on a future version. I used the LongAlpaca-12K dataset because it is small and I have limited computational resources but I might have to try a larger dataset for the next attempt. If you would like to help me, there are links on the top of the model card for my Patreon and Ko-Fi.

	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Pretergeek__OpenChat-3.5-0106_32K-PoSE)

	\| Metric \|Value\|
	\|-------------------\|----:\|
	\|Avg. \|12.70\|
	\|IFEval (0-Shot) \|39.69\|
	\|BBH (3-Shot) \| 8.83\|
	\|MATH Lvl 5 (4-Shot)\| 1.44\|
	\|GPQA (0-shot) \| 3.47\|
	\|MuSR (0-shot) \|11.33\|
	\|MMLU-PRO (5-shot) \|11.46\|

	# Citation
	```
	@misc{zhu2024poseefficientcontextwindow,
	title={PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training},
	author={Dawei Zhu and Nan Yang and Liang Wang and Yifan Song and Wenhao Wu and Furu Wei and Sujian Li},
	year={2024},
	eprint={2309.10400},
	archivePrefix={arXiv},
	primaryClass={cs.CL},
	url={https://arxiv.org/abs/2309.10400},
	}
	```