beomi/KoRWKV-1.5B

	---
	license: mit
	language:
	- ko
	pipeline_tag: text-generation
	tags:
	- KoRWKV
	---

	> 🚧 Note: this repo is under construction. The currently uploaded version is a checkpoint at 1% of training (~1 billion tokens). 🚧
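
The two figures in the note above can be combined into a rough estimate of the full training budget. The sketch below assumes that "1% trained" refers to the share of tokens processed, which the card does not state explicitly; the function name `implied_total_tokens` is hypothetical.

```python
def implied_total_tokens(tokens_seen: int, percent_trained: int) -> int:
    """Scale the tokens seen so far up to 100% of training.

    Assumption: the percentage measures tokens processed, not steps or time.
    """
    return tokens_seen * 100 // percent_trained


checkpoint_tokens = 1_000_000_000  # "~1Billion tokens" from the note
checkpoint_percent = 1             # "1% trained" from the note

total = implied_total_tokens(checkpoint_tokens, checkpoint_percent)
print(f"{total:,}")  # 100,000,000,000
```

Under that assumption, a fully trained checkpoint would have seen roughly 100 billion tokens.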

	## Todo

	- ⏳ Train 1.5B
	- Train bigger models


	# KoRWKV Model Card

	KoRWKV (1.5B) is a language model trained on a Korean dataset with the RWKVv4 Neo architecture.
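
The card does not document how to load the checkpoint. As a minimal sketch for research use, the snippet below assumes that the repository ships weights and a tokenizer that the `transformers` library can load through `AutoTokenizer` and `AutoModelForCausalLM`. The helper names `generation_kwargs` and `generate` and the sampling values are illustrative assumptions, not settings recommended by the author.

```python
# Hypothetical usage sketch; the model card itself documents no loading API.
MODEL_ID = "beomi/KoRWKV-1.5B"  # repository id as it appears on the Hub


def generation_kwargs(max_new_tokens: int = 64) -> dict:
    """Illustrative sampling settings (assumed, not taken from the card)."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "top_p": 0.9,
        "temperature": 0.8,
    }


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Download the checkpoint and sample a continuation of `prompt`."""
    # Imported lazily so the heavy dependencies load only when text is generated.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repo is in a format these Auto classes can resolve.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Pass only the token ids; a recurrent model needs no other inputs here.
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, **generation_kwargs(max_new_tokens))
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("안녕하세요, 오늘은")` would download the weights on first use and return the prompt followed by a sampled continuation. Because the uploaded checkpoint is at an early stage of training, the output is likely to be of low quality.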

	## Model details

	Researcher developing the model

	Junbum Lee (aka Beomi)

	Model date

	Training of KoRWKV began in 2022.05 and is ongoing.

	Model version

	This is an alpha version of the model.

	Model type

	KoRWKV is an RWKV language model. Find out more about RWKV at https://github.com/BlinkDL/RWKV-LM

	License

	MIT

	## Intended use
	Primary intended uses

	The primary use of KoRWKV is research on Korean open-source large language models.

	Primary intended users

	The primary intended users of the model are researchers in natural language processing, machine learning and artificial intelligence.

	Out-of-scope use cases

	KoRWKV is a base, or foundational, model. As such, it should not be used in downstream applications without further risk evaluation and mitigation. In particular, our model has not been trained with human feedback, and can thus generate toxic or offensive content, incorrect information or generally unhelpful answers.

	## Ethical considerations

	Data

	The data used to train the model is collected from various sources, mostly from the Web. As such, it contains offensive, harmful and biased content. We thus expect the model to exhibit such biases from the training data.

	Human life

	The model is not intended to inform decisions about matters central to human life, and should not be used in such a way.

	Risks and harms

	Risks and harms of large language models include the generation of harmful, offensive or biased content. These models are often prone to generating incorrect information, sometimes referred to as hallucinations. We do not expect our model to be an exception in this regard.

	Use cases

	KoRWKV is a foundational model, and as such, it should not be used for downstream applications without further investigation and mitigation of risks. These risks and potentially fraught use cases include, but are not limited to, the generation of misinformation and the generation of harmful, biased or offensive content.