
Gugugo-koen-7B-V1.1-GPTQ

Detailed repo: https://github.com/jwj7140/Gugugo

This is a GPTQ-quantized version of squarelike/Gugugo-koen-7B-V1.1.

Base Model: Llama-2-ko-7b

Training Dataset: sharegpt_deepl_ko_translation.

I trained it on a single A6000 GPU for 90 hours.

Prompt Template

KO->EN

### 한국어: {sentence}</끝>
### 영어:

EN->KO

### 영어: {sentence}</끝>
### 한국어:
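
The two templates differ only in which language header comes first, so one helper can build both. The following is a minimal sketch; the names `build_prompt`, `HEADERS`, and `END_MARKER` are hypothetical and do not come from the Gugugo repository.

```python
# Hypothetical helper for illustration; not part of the Gugugo repository.
END_MARKER = "</끝>"  # end-of-sentence marker used by the prompt template
HEADERS = {"ko": "### 한국어:", "en": "### 영어:"}

def build_prompt(source_lang: str, sentence: str) -> str:
    """Build the prompt for translating `sentence` out of `source_lang` ("ko" or "en")."""
    if source_lang not in HEADERS:
        raise ValueError(f"unsupported language: {source_lang!r}")
    # The target language is simply the other one of the pair.
    target_lang = "en" if source_lang == "ko" else "ko"
    return f"{HEADERS[source_lang]} {sentence}{END_MARKER}\n{HEADERS[target_lang]}"
```

Calling `build_prompt("en", "Hello, world!")` yields the EN->KO prompt, and `build_prompt("ko", ...)` yields the KO->EN prompt.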

Implementation Code

from transformers import AutoModelForCausalLM, AutoTokenizer, StoppingCriteria, StoppingCriteriaList
import torch
repo = "squarelike/Gugugo-koen-7B-V1.1-GPTQ"
model = AutoModelForCausalLM.from_pretrained(
        repo,
        device_map='auto'
)
tokenizer = AutoTokenizer.from_pretrained(repo)

model.eval()
model.config.use_cache = True

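class StoppingCriteriaSub(StoppingCriteria):
    # Stops generation once the sequence ends with any of the given token-id sequences.
    def __init__(self, stops=None):
        super().__init__()
        # Avoid a mutable default argument; copy the stop sequences into a list.
        self.stops = list(stops) if stops is not None else []

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs):
        for stop in self.stops:
            # Only the first sequence in the batch (input_ids[0]) is checked.
            if torch.all((stop == input_ids[0][-len(stop):])).item():
                return True

        return False

# Token-id sequences that end generation, presumably tokenizations of the "</끝>" marker
# (the duplicated [829, 45107, 29958] entry is listed once).
stop_words_ids = torch.tensor([[829, 45107, 29958], [1533, 45107, 29958], [21106, 45107, 29958]]).to("cuda")
stopping_criteria = StoppingCriteriaList([StoppingCriteriaSub(stops=stop_words_ids)])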

def gen(lan="en", x=""):
    # `lan` is the source language: "ko" translates Korean to English,
    # any other value translates English to Korean.
    if lan == "ko":
        prompt = f"### 한국어: {x}</끝>\n### 영어:"
    else:
        prompt = f"### 영어: {x}</끝>\n### 한국어:"
    gened = model.generate(
        **tokenizer(
            prompt,
            return_tensors='pt',
            return_token_type_ids=False
        ).to("cuda"),
        max_new_tokens=2000,
        temperature=0.3,
        # no_repeat_ngram_size=5,
        num_beams=5,
        stopping_criteria=stopping_criteria
    )
    return tokenizer.decode(gened[0][1:]).replace(prompt+" ", "").replace("</끝>", "")


print(gen(lan="en", x="Hello, world!"))
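
The `replace` calls in `gen` only work when the decoded text reproduces the prompt exactly. A more defensive option is to keep whatever follows the last target-language header and cut it at the end marker. The sketch below shows one way to do that, assuming a hypothetical `extract_translation` helper; it is not the author's method.

```python
# Hypothetical post-processing helper for illustration; not part of the Gugugo repository.
END_MARKER = "</끝>"
HEADERS = {"ko": "### 한국어:", "en": "### 영어:"}

def extract_translation(decoded: str, target_lang: str) -> str:
    """Return the text after the last target-language header, up to the end marker."""
    header = HEADERS[target_lang]
    # Keep only what follows the last occurrence of the target header.
    # If the header is missing, rpartition leaves the whole string in `tail`.
    _, _, tail = decoded.rpartition(header)
    # Drop the end marker and anything generated after it.
    tail = tail.split(END_MARKER, 1)[0]
    return tail.strip()
```

Inside `gen`, this could be applied to the decoded output, for example `extract_translation(tokenizer.decode(gened[0], skip_special_tokens=True), "ko")` for an EN->KO request.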