Model card

This is the GPTQ 4-bit quantized version of C3TR-Adapter, a model for English-Japanese and Japanese-English translation.

A quick way to try it

You can run it on the free tier of Colab. (Translation quality is higher on the paid tier with an L4 or A100 GPU.)
C3TR-Adapter_gptq_v2_Free_Colab_sample

Install

See the official AutoGPTQ page for installation instructions.

I could only get it to work by installing from source.

git clone https://github.com/PanQiWei/AutoGPTQ.git && cd AutoGPTQ
pip install -vvv --no-build-isolation -e .
pip install optimum
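
After installing, a quick import check can confirm that the packages are visible to the Python interpreter you plan to use. This is a minimal sketch, assuming AutoGPTQ installs under the import name `auto_gptq` and Optimum under `optimum`; the helper name `missing_packages` is made up for illustration.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for the packages used by the sample code.
    required = ["auto_gptq", "optimum", "transformers", "torch"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only looks the packages up without importing them, so it does not prove that the CUDA kernels were built correctly; it only catches an install that landed in a different environment.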

Sample code

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
model_name = "webbigdata/C3TR-Adapter_gptq"

# thanks to tk-master
# https://github.com/AutoGPTQ/AutoGPTQ/issues/406
config = AutoConfig.from_pretrained(model_name)
config.quantization_config["use_exllama"] = False
config.quantization_config["exllama_config"] = {"version":2}

# Adjust to your hardware: key 0 is the first GPU, "cpu" is the CPU offload budget.
max_memory = {0: "12GiB", "cpu": "10GiB"}

quantized_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    # Use torch.float16 on free Colab or any GPU without bfloat16 support.
    torch_dtype=torch.bfloat16,
    device_map="auto",
    max_memory=max_memory,
    config=config,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Reuse the unknown token as the padding token.
tokenizer.pad_token = tokenizer.unk_token

prompt_text = """You are a highly skilled professional Japanese-English and English-Japanese translator. Translate the given text accurately, taking into account the context and specific instructions provided. Steps may include hints enclosed in square brackets [] with the key and value separated by a colon:. Only when the subject is specified in the Japanese sentence, the subject will be added when translating into English. If no additional instructions or context are provided, use your expertise to consider what the most appropriate context is and provide a natural translation that aligns with that context. When translating, strive to faithfully reflect the meaning and tone of the original text, pay attention to cultural nuances and differences in language usage, and ensure that the translation is grammatically correct and easy to read. After completing the translation, review it once more to check for errors or unnatural expressions. For technical terms and proper nouns, either leave them in the original language or use appropriate translations as necessary. Take a deep breath, calm down, and start translating.

### Instruction:
Translate English to Japanese.
When translating, please use the following hints:
[writing_style: web-fiction]
[Madoka: まどか]
[Madoka_first_person_and_ending: だね, よね]
[Mami: マミ]
[Mami_first_person_and_ending: 私, わね]
[Sayaka: さやか]
[Sayaka_first_person_and_ending: 私, かな]
[Kyubey: キュゥべぇ]
[Kyubey_first_person_and_ending: 僕, てよ]

### Input:
Madoka: "Thank you all for watching! You might've seen a bit of my dark side, but... don't mind that, okay?"
Sayaka: "Well, thanks! Did my cuteness come across 100%?"
Mami: "I'm glad you watched, but it's a bit embarrassing..."
Kyubey: "Make a contract with me, and become a magical girl."
### Response:
"""

tokens = tokenizer(prompt_text, return_tensors="pt",
        padding=True, max_length=1600, truncation=True).to("cuda:0").input_ids

output = quantized_model.generate(
        input_ids=tokens,
        max_new_tokens=800,
        do_sample=True,
        num_beams=3, temperature=0.5, top_p=0.3,
        repetition_penalty=1.0)
print(tokenizer.decode(output[0]))
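
The prompt in the sample follows a fixed layout: a system paragraph, an `### Instruction:` block with optional `[key: value]` hints, an `### Input:` block, and a trailing `### Response:` marker. The sketch below assembles that layout from Python values and trims decoded output down to the translation. The helper names `build_prompt` and `extract_response` are hypothetical and not part of the model's tooling; `system_text` is expected to hold the full system paragraph from the sample. Leaving out the hints line when no hints are given is an assumption, so check the C3TR-Adapter card for the exact wording it expects.

```python
def build_prompt(system_text, source_text, hints=None,
                 instruction="Translate English to Japanese."):
    """Assemble a prompt in the same layout as the sample code."""
    lines = [system_text, "", "### Instruction:", instruction]
    if hints:
        # Each hint becomes one "[key: value]" line, as in the sample.
        lines.append("When translating, please use the following hints:")
        lines.extend(f"[{key}: {value}]" for key, value in hints.items())
    # The prompt ends with the response marker followed by a newline.
    lines += ["", "### Input:", source_text, "### Response:", ""]
    return "\n".join(lines)


def extract_response(decoded_text, marker="### Response:"):
    """Return the text after the last marker, or the whole text if it is absent."""
    _, found, tail = decoded_text.rpartition(marker)
    return tail.strip() if found else decoded_text.strip()


if __name__ == "__main__":
    prompt = build_prompt(
        system_text="<system paragraph from the sample>",
        source_text='Mami: "I\'m glad you watched, but it\'s a bit embarrassing..."',
        hints={"writing_style": "web-fiction", "Mami": "マミ"},
    )
    print(prompt)
```

To use it with the sample, pass the result of `build_prompt` as `prompt_text` and wrap the decoded output in `extract_response`. Passing `skip_special_tokens=True` to `tokenizer.decode` keeps end-of-sequence tokens out of the extracted translation.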

See also

For details, see C3TR-Adapter.

Model size: 1.83B params (Safetensors)
Tensor type: I32, BF16

Model tree for webbigdata/C3TR-Adapter_gptq

Base model: google/gemma-7b
Quantized (22): this model