Bakpia
Open models instruction-tuned to respond in Javanese! Available in 0.5B, 1.5B, and 9B parameter sizes.
Bakpia is a family of open language models capable of responding in Javanese. Version one of Bakpia is the first generative Javanese LLM to gain functional instruction-following performance using solely synthetic data.
Beta preview
This repository contains the fp16 version of Bakpia V1 1.5B.
Version | Base Model | URL | Training |
---|---|---|---|
V1 0.5B | Qwen 2 0.5B Instruct | fp16 | Epoch = 1, Batch = 16*8, lr = 5e-5, linear schedule |
V1 1.5B | Qwen 2 1.5B Instruct | fp16 | Epoch = 1, Batch = 16*8, lr = 5e-5, linear schedule |
V1 9B | Gemma 2 9B Instruct | fp16/4bit | Batch = 16*8, lr = 4e-5, linear schedule |
Training data is accessible here.
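
To make the "linear schedule" entries in the table more concrete, here is a minimal sketch of a linearly decaying learning rate. It assumes the rate starts at the listed peak (5e-5 for the Qwen-based models) and decays to zero over training, with optional warmup; the model card does not state the warmup length, the total number of steps, or how the 16*8 batch is split, so `linear_lr`, `warmup_steps`, and `total_steps` are hypothetical names and values chosen for illustration only.

```python
def linear_lr(step: int, total_steps: int, peak_lr: float = 5e-5,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` under a linear warmup + linear decay schedule.

    Assumption: decay goes all the way to zero at `total_steps`.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr during warmup.
        return peak_lr * step / max(1, warmup_steps)
    # Linear decay from peak_lr down to 0 after warmup.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# "Batch = 16*8" multiplies out to 128 examples per optimizer step.
# How the two factors are split (devices, accumulation) is not stated.
effective_batch = 16 * 8

if __name__ == "__main__":
    total = 1000  # hypothetical number of optimizer steps
    for s in (0, 250, 500, 750, 1000):
        print(s, linear_lr(s, total))
```

With no warmup, the rate is at its peak on the first step, half the peak at the midpoint, and zero at the end of the single epoch.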
This is the first version of Bakpia.
✨ Training
✨ Features
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer

# Load the tokenizer and model, then move the model to the GPU.
tokenizer = AutoTokenizer.from_pretrained("afrizalha/Bakpia-V1-1.5B-Javanese")
model = AutoModelForCausalLM.from_pretrained("afrizalha/Bakpia-V1-1.5B-Javanese")
model.to("cuda")

# ChatML-style prompt template with an empty system message.
template = """<|im_start|>system
<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""
# The Javanese prompt asks, roughly, "How can I learn Javanese well?"
prompt = template.format(prompt="Kados pundi kulo saged nyinaoni Basa Jawa kanthi sae?")
inputs = tokenizer([prompt], return_tensors="pt").to("cuda")

# Sample up to 1024 new tokens and stream them to stdout as they are generated.
outputs = model.generate(
    **inputs, max_new_tokens=1024, streamer=TextStreamer(tokenizer),
    temperature=0.5, use_cache=True, do_sample=True,
)
```
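
The prompt template used in the snippet above can be wrapped in a small helper so that the same format is reused for every request. This is only a sketch: `build_prompt`, `CHATML_TEMPLATE`, and the optional `system` argument are hypothetical names introduced here, not part of the model's API. With the default empty system message it produces exactly the string the template above yields.

```python
# ChatML-style layout copied from the template above; {system} is left
# empty by default to match the original example.
CHATML_TEMPLATE = (
    "<|im_start|>system\n"
    "{system}<|im_end|>\n"
    "<|im_start|>user\n"
    "{prompt}<|im_end|>\n"
    "<|im_start|>assistant\n"
)


def build_prompt(prompt: str, system: str = "") -> str:
    """Format a single-turn prompt that ends where the assistant should reply."""
    return CHATML_TEMPLATE.format(system=system, prompt=prompt)


if __name__ == "__main__":
    print(build_prompt("Kados pundi kulo saged nyinaoni Basa Jawa kanthi sae?"))
```

The returned string can be passed to the tokenizer in place of `template.format(...)`.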