Model Card for Lt-Llama-2-13b-hf

Lt-Llama2 is a family of pretrained and fine-tuned generative text models for Lithuanian. This is the repository for the foundational 13B model. Links to other models can be found at the bottom of this page.

Model Details

Model Description

Neurotechnology's Lt-Llama2 project is the first open-source initiative dedicated to developing a large language model (LLM) specialized in Lithuanian. The company has created and publicly released a collection of Lithuanian LLMs, available as both foundational models and instruction-tuned variants.

  • Developed by: Neurotechnology
  • Language(s): Lithuanian
  • License: Llama2 Community License Agreement
  • Continually pretrained from: Llama-2-13b

Model Sources

Intended Use

Intended Use Cases

Lt-Llama2 is designed for research purposes in Lithuanian. The base models can be tailored for various natural language tasks, while the instruction models are geared towards assistant-like conversational interactions.

Prohibited use

The model must not be used in ways that breach the license, violate applicable laws or regulations, or involve languages other than Lithuanian.

How to Get Started with the Model

Use the code below to get started with the model.

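from transformers import AutoTokenizer, AutoModelForCausalLM
# Load the tokenizer and the 13B base model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("neurotechnology/Lt-Llama-2-13b-hf")
model = AutoModelForCausalLM.from_pretrained("neurotechnology/Lt-Llama-2-13b-hf")
# Lithuanian prompt, roughly "Once upon a time there lived an old man and an old woman"
input_text = "Kartą gyveno senelis ir senelė "
# The tokenizer returns input IDs and an attention mask as PyTorch tensors
inputs = tokenizer(input_text, return_tensors="pt")
# Generate up to 100 new tokens and decode the result back to text
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))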
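
The snippet above loads the weights with default settings. The sketch below is one possible variant that loads the checkpoint in bfloat16 (the tensor type listed for this repository) and samples the continuation. The helper names `generation_kwargs` and `generate`, and the sampling values, are illustrative assumptions and are not recommended by the model authors. `device_map="auto"` assumes the `accelerate` package is installed.

```python
MODEL_ID = "neurotechnology/Lt-Llama-2-13b-hf"


def generation_kwargs(max_new_tokens=100, sample=True):
    """Build keyword arguments for model.generate() (hypothetical helper)."""
    kwargs = {"max_new_tokens": max_new_tokens, "do_sample": sample}
    if sample:
        # Illustrative sampling values, not tuned for this model
        kwargs.update(temperature=0.7, top_p=0.9)
    return kwargs


def generate(prompt, **overrides):
    """Load the model in bfloat16 and return the decoded continuation."""
    # Heavy imports are kept local so that importing this file stays cheap
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # newer transformers releases may prefer `dtype`
        device_map="auto",           # assumption: requires `accelerate`
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    settings = {**generation_kwargs(), **overrides}
    with torch.no_grad():
        output_ids = model.generate(**inputs, **settings)
    # skip_special_tokens drops markers such as the beginning-of-sequence token
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Kartą gyveno senelis ir senelė ")` would then return the prompt followed by the sampled continuation. Passing `do_sample=False` as an override falls back to greedy decoding.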

Benchmarks

Model            Average  ARC    MMLU   Winogrande  HellaSwag  GSM8k  TruthfulQA
Llama-2-13b      30.53    28.66  31.34  50.90       28.91      5.91   37.48
Llama2-13b-Base  36.42    54.50  26.01  61.72       40.61      0.45   35.23
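
The card does not say how the Average column is computed. A minimal check, assuming it is the unweighted mean of the six task scores, reproduces both reported values to two decimals. The names `SCORES`, `REPORTED_AVERAGE`, and `check_averages` are introduced here for illustration only.

```python
# Scores copied from the table, in column order:
# ARC, MMLU, Winogrande, HellaSwag, GSM8k, TruthfulQA
SCORES = {
    "Llama-2-13b": [28.66, 31.34, 50.90, 28.91, 5.91, 37.48],
    "Llama2-13b-Base": [54.50, 26.01, 61.72, 40.61, 0.45, 35.23],
}
REPORTED_AVERAGE = {"Llama-2-13b": 30.53, "Llama2-13b-Base": 36.42}


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


def check_averages(tolerance=0.005):
    """Return, per model, whether the mean matches the reported average
    within rounding to two decimals."""
    return {
        name: abs(mean(scores) - REPORTED_AVERAGE[name]) <= tolerance
        for name, scores in SCORES.items()
    }


for name, scores in SCORES.items():
    print(f"{name}: computed {mean(scores):.2f}, reported {REPORTED_AVERAGE[name]:.2f}")
```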

Lt-Llama2 Model Family

Model Link
Lt-Llama2-7b link
Lt-Llama2-7b-instruct link
Lt-Llama2-13b link
Lt-Llama2-13b-instruct link

Citation

@misc{nakvosas2024openllama2modellithuanian,
      title={Open Llama2 Model for the Lithuanian Language},
      author={Artūras Nakvosas and Povilas Daniušis and Vytas Mulevičius},
      year={2024},
      eprint={2408.12963},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.12963},
}

Model size: 13B params
Tensor type: BF16