Yugo55A-GPT 4bit

  • Developed by: datatab
  • License: MIT

🏆 Results

Results were obtained with the Serbian LLM evaluation suite released by Aleksa Gordić: serbian-llm-eval

  • Evaluation was conducted on a 4-bit version of the model due to hardware resource constraints.

MODEL                ARC-E  ARC-C  Hellaswag  BoolQ  Winogrande  OpenbookQA  PiQA
*Yugo55-GPT-v4-4bit  51.41  36.00  57.51      80.92  65.75       34.70       70.54
Yugo55A-GPT          51.52  37.78  57.52      84.40  65.43       35.60       69.43
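
To make the two rows easier to compare, the following minimal sketch computes the per-benchmark difference between them and an unweighted mean over the seven benchmarks. The numbers are copied from the table; the mean is a derived convenience figure, not a score reported by the evaluation itself, and the variable names are illustrative.

```python
# Scores copied from the results table above.
benchmarks = ["ARC-E", "ARC-C", "Hellaswag", "BoolQ", "Winogrande", "OpenbookQA", "PiQA"]
baseline = [51.41, 36.00, 57.51, 80.92, 65.75, 34.70, 70.54]  # *Yugo55-GPT-v4-4bit
merged = [51.52, 37.78, 57.52, 84.40, 65.43, 35.60, 69.43]    # Yugo55A-GPT

# Positive delta: Yugo55A-GPT scored higher on that benchmark.
deltas = {name: round(m - b, 2) for name, b, m in zip(benchmarks, baseline, merged)}

# Unweighted mean across benchmarks (derived here, not part of the source table).
mean_baseline = sum(baseline) / len(baseline)
mean_merged = sum(merged) / len(merged)

for name, delta in deltas.items():
    print(f"{name:<11} {delta:+.2f}")
print(f"{'mean':<11} {mean_baseline:.2f} -> {mean_merged:.2f}")
```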

🔗 Merge Details

Merge Method

This is a merge of pre-trained language models created using mergekit. This model was merged using the linear merge method.

Models Merged

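The following models were included in the merge:

  • datatab/Yugo55-GPT-v4
  • datatab/Yugo55-GPT-DPO-v1-chkp-300
  • mlabonne/AlphaMonarch-7B
  • NousResearch/Nous-Hermes-2-Mistral-7B-DPO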

🧩 Configuration

The following YAML configuration was used to produce this model:

models:
  - model: datatab/Yugo55-GPT-v4
    parameters:
      weight: 1.0
  - model: datatab/Yugo55-GPT-DPO-v1-chkp-300
    parameters:
      weight: 1.0
  - model: mlabonne/AlphaMonarch-7B
    parameters:
      weight: 0.5
  - model: NousResearch/Nous-Hermes-2-Mistral-7B-DPO
    parameters:
      weight: 0.5
merge_method: linear
dtype: float16
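
As a rough illustration of what a linear merge computes, the sketch below takes a weighted average of matching parameters across models, using the four weights from the configuration above. It is a toy stand-in, not mergekit's implementation: parameters are plain Python lists rather than tensors, the `linear_merge` helper and the toy values are hypothetical, and dividing by the sum of the weights is an assumption (mergekit's linear method is understood to normalize weights by default).

```python
from typing import Dict, List, Sequence

# Toy stand-in for a model state dict: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def linear_merge(
    models: Sequence[Params], weights: Sequence[float], normalize: bool = True
) -> Params:
    """Weighted average of same-named parameters (illustrative helper)."""
    if len(models) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if normalize:
        if total == 0:
            raise ValueError("weights must not sum to zero when normalizing")
        # Assumption: weights are rescaled so that they sum to 1.
        scale = [w / total for w in weights]
    else:
        scale = list(weights)

    merged: Params = {}
    for name in models[0]:
        # Walk the same parameter in every model, element by element.
        columns = zip(*(m[name] for m in models))
        merged[name] = [sum(s * v for s, v in zip(scale, col)) for col in columns]
    return merged


# Weights in the same order as the models in the configuration above.
weights = [1.0, 1.0, 0.5, 0.5]

# Made-up two-element "layers", one per model, purely for demonstration.
toy_models = [
    {"layer.weight": [3.0, 0.0]},
    {"layer.weight": [3.0, 6.0]},
    {"layer.weight": [6.0, 0.0]},
    {"layer.weight": [0.0, 12.0]},
]

toy_merged = linear_merge(toy_models, weights)
print(toy_merged)
```

With normalization, the weights 1.0, 1.0, 0.5, 0.5 become effective shares of 1/3, 1/3, 1/6 and 1/6, so the two Yugo55 checkpoints together contribute two thirds of each merged parameter.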

💻 Usage

!pip -q install git+https://github.com/huggingface/transformers # need to install from github
!pip install -q datasets loralib sentencepiece
!pip -q install bitsandbytes accelerate
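
# Notebook-only helper: wrap long lines in cell output so streamed text stays readable.
from IPython.display import HTML, display

def set_css():
  display(HTML('''
  <style>
    pre {
        white-space: pre-wrap;
    }
  </style>
  '''))
# Re-apply the style before every cell runs.
get_ipython().events.register('pre_run_cell', set_css)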
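
import torch
import transformers
from transformers import AutoTokenizer, AutoModelForCausalLM

# device_map="auto" (provided by accelerate) places the weights on the GPU when one is available.
model = AutoModelForCausalLM.from_pretrained(
    "datatab/Yugo55A-GPT", torch_dtype="auto", device_map="auto"
)

# The tokenizer holds no weights, so it does not take a torch_dtype argument.
tokenizer = AutoTokenizer.from_pretrained("datatab/Yugo55A-GPT")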

from typing import Optional
from transformers import TextStreamer


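def generate(
    user_content: str, system_content: Optional[str] = None
) -> str:
    # Default system prompt (Serbian). In English: "Below is an instruction that
    # describes a task, paired with an input that provides further context.
    # Write a response that appropriately completes the request."
    if not system_content:
        system_content = "Ispod je uputstvo koje opisuje zadatak, upareno sa unosom koji pruža dodatni kontekst. Napišite odgovor koji na odgovarajući način kompletira zahtev."

    messages = [
        {
            "role": "system",
            "content": system_content,
        },
        {"role": "user", "content": user_content},
    ]

    # Build the prompt with the model's chat template and move it to the model's device.
    tokenized_chat = tokenizer.apply_chat_template(
        messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    # Stream tokens to stdout as they are generated, without echoing the prompt.
    text_streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    output = model.generate(
        tokenized_chat,
        streamer=text_streamer,
        max_new_tokens=2048,
        temperature=0.1,
        repetition_penalty=1.11,
        top_p=0.92,
        top_k=1000,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
        do_sample=True,
    )

    # Decode only the newly generated tokens and return them to the caller.
    new_tokens = output[0][tokenized_chat.shape[-1]:]
    generated_text = tokenizer.decode(new_tokens, skip_special_tokens=True)
    return generated_text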

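# "List all the planets of the solar system and tell me which planet is the largest"
generate("Nabroj mi sve planete suncevog sistema i reci mi koja je najveca planeta")
# "What is the difference between a llama, a vicuña and an alpaca?"
generate("Koja je razlika između lame, vikune i alpake?")
# "Write a short email to Sam Altman giving reasons for open-sourcing GPT-4"
generate("Napišite kratku e-poruku Semu Altmanu dajući razloge za GPT-4 otvorenog koda")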