afrizalha
/

Bakpia-V1-1.5B-Javanese

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

Bakpia-V1-1.5B-Javanese / README.md

afrizalha's picture

Update README.md

f425bfb verified 5 months ago

|

history blame contribute delete

3.03 kB

	---
	language:
	- jv
	license: apache-2.0
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- qwen2
	- trl
	- sft
	datasets:
	- afrizalha/Gatra-2-Javanese
	---
	<!DOCTYPE html>
	<html lang="en">
	<head>
	<meta charset="UTF-8">
	<meta name="viewport" content="width=device-width, initial-scale=1.0">
	<title>Document Title</title>
	<style>
	h1 {
	font-size: 36px;
	color: navy;
	font-family: 'Tahoma';
	text-align: center;
	}
	</style>
	</head>
	<body>
	<h1> Open models for indigenous Indonesian languages</h1>
	</body>
	</html>

	<center>
	<img src="https://imgur.com/PutckEK.png" alt="Bakpia" width="500" height="250">
	<p><em>Bakpia is a family of open language models capable of responding in Javanese language. Version one of Bakpia is the first generative Javanese LLM gain functional instruction performance using solely synthetic data.</em></p>
	<p><em style="color: black; font-weight: bold;">Beta preview</em></p>
	</center>
	Bakpia V1 is a family of Javanese language models. It is fine-tuned from available open models using massive synthetic data for Krama Javanese, where the prompts are generated by GPT-4o and the responses are generated by Claude 3 Haiku.

	This repository contains the fp16 version of Bakpia V1 1.5B.

	\| Version \| Base Model \| URL \| Training \|
	\|---------\|------------\|-----\|----------\|
	\| V1 0.5B \| Qwen 2 0.5B Instruct \| [fp16](https://huggingface.co/afrizalha/Bakpia-V1-0.5B-Javanese/) \| Epoch = 1, Batch = 16\*8, lr = 5e-5, linear schedule\|
	\| V1 1.5B \| Qwen 2 1.5B Instruct \| [fp16](https://huggingface.co/afrizalha/Bakpia-V1-1.5B-Javanese) \| Epoch = 1, Batch = 16\*8, lr = 5e-5, linear schedule\|
	\| V1 9B \| Gemma 2 9B Instruct \| [fp16](https://huggingface.co/afrizalha/Bakpia-V1-9B-Javanese-fp16)/[4bit](https://huggingface.co/afrizalha/Bakpia-V1-9B-Javanese-4bit/) \|Batch size = 16\*8, lr = 4e-5, linear schedule\|

	Training data is accessible [here](https://huggingface.co/datasets/afrizalha/Gatra-2-Javanese).

	## Version 1.0

	This is the first version of Bakpia.

	✨ Training
	- 36K input-output pairs
	- 64/128 lora r/alpha
	- Rank-stabilized lora

	✨ Features
	- Single-turn QA across various domains.
	- Ngoko Javanese not currently supported.

	## Generate with template
	```
	from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer

	tokenizer = AutoTokenizer.from_pretrained("afrizalha/Bakpia-V1-1.5B-Javanese")
	model = AutoModelForCausalLM.from_pretrained("afrizalha/Bakpia-V1-1.5B-Javanese")
	model.to("cuda")

	template = """<\|im_start\|>system
	<\|im_end\|>
	<\|im_start\|>user
	{prompt}<\|im_end\|>
	<\|im_start\|>assistant
	"""

	input = template.format(prompt="Kados pundi kulo saged nyinaoni Basa Jawa kanthi sae?")
	input = tokenizer([input], return_tensors = "pt").to("cuda")
	outputs = model.generate(**input, max_new_tokens = 1024, streamer= TextStreamer(tokenizer), temperature=.5, use_cache=True, do_sample=True)
	```

	## Acknowledgments

	- Developed by: Afrizal Hasbi Azizy
	- License: Apache-2.0