---
license: mit
datasets:
- izumi-lab/llm-japanese-dataset
language:
- ja
tags:
- llama
- causal-lm
---

This repo contains a low-rank adapter (LoRA) for LLaMA-13b, fine-tuned on the [llm-japanese-dataset](https://github.com/masanorihirano/llm-japanese-dataset) dataset.

This version of the weights was trained with the following hyperparameters:

- Epochs: 1
- Batch size: 130
- Cutoff length: 256
- Learning rate: 3e-4
- LoRA _r_: 4
- LoRA target modules: q_proj, v_proj

The adapter can be loaded on top of the base model with [PEFT](https://github.com/huggingface/peft):

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

base_model = "decapoda-research/llama-13b-hf"

# Load the base LLaMA-13b model and its tokenizer in fp16
model = LlamaForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
tokenizer = LlamaTokenizer.from_pretrained(base_model)

# Apply the low-rank adapter weights on top of the base model
model = PeftModel.from_pretrained(
    model,
    "izumi-lab/llama-13b-japanese-lora-v0",
    torch_dtype=torch.float16,
)
```
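
A minimal generation sketch building on the loading code above. The prompt text, `max_new_tokens`, and sampling settings are illustrative assumptions and are not taken from the original card:

```python
# Hypothetical usage example: the prompt and generation settings below are
# illustrative assumptions, not part of the original model card.
prompt = "日本で一番高い山は何ですか?"  # "What is the highest mountain in Japan?"

model.eval()
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```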