---
license: mit
language:
- ru
library_name: transformers
---
# llama-600M-rus
A simple experimental model trained by a beginner in LLMs on approximately 60 MB of text from books. With limited time and resources to collect a larger dataset, the output is amateurish but more or less adequate given the number of training tokens. The model can be used as a checkpoint for further training or for experiments.
Simple usage example:
```python
from transformers import LlamaTokenizerFast, LlamaForCausalLM

# Load the model and tokenizer from the Hugging Face Hub
model = LlamaForCausalLM.from_pretrained('demetera/llama-600M-rus')
tokenizer = LlamaTokenizerFast.from_pretrained('demetera/llama-600M-rus')

prompt = "Я вышел на улицу и"  # "I went outside and"
inputs = tokenizer(prompt, return_tensors='pt')

# Sample up to 250 new tokens with top-k / nucleus sampling
outputs = model.generate(inputs.input_ids, attention_mask=inputs.attention_mask, max_new_tokens=250, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
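Since the model is intended as a checkpoint for further training, below is a minimal sketch of continued causal-LM training with the Hugging Face `Trainer`. The corpus file name, output directory, and hyperparameters are placeholders, not part of this model's actual training setup.

```python
# Minimal sketch of continued pre-training; dataset path and hyperparameters are hypothetical.
from transformers import (LlamaTokenizerFast, LlamaForCausalLM,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

model = LlamaForCausalLM.from_pretrained('demetera/llama-600M-rus')
tokenizer = LlamaTokenizerFast.from_pretrained('demetera/llama-600M-rus')
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers usually have no pad token

# Replace with your own Russian text corpus (placeholder path)
dataset = load_dataset('text', data_files={'train': 'my_russian_corpus.txt'})['train']

def tokenize(batch):
    return tokenizer(batch['text'], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=['text'])

# mlm=False makes the collator build labels for causal language modeling
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir='llama-600M-rus-continued',  # placeholder output directory
    per_device_train_batch_size=4,
    num_train_epochs=1,
    logging_steps=100,
)

trainer = Trainer(model=model, args=args, train_dataset=tokenized, data_collator=collator)
trainer.train()
```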