Text Generation
Transformers
Safetensors
Japanese
English
qwen
custom_code
File size: 5,292 Bytes
8322cb1
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
---
thumbnail: https://github.com/rinnakk/japanese-pretrained-models/blob/master/rinna.png
datasets:
- databricks/databricks-dolly-15k
- kunishou/databricks-dolly-15k-ja
- izumi-lab/llm-japanese-dataset
language:
- ja
- en
tags:
- qwen
inference: false
---

# `rinna/nekomata-14b-instruction`

![rinna-icon](./rinna.png)

# Overview
The model is the instruction-tuned version of [`rinna/nekomata-14b`](https://huggingface.co/rinna/nekomata-14b). It adopts the Alpaca input format.

* **Model architecture**

    A 40-layer, 5120-hidden-size transformer-based language model. Please refer to the [Qwen paper](https://arxiv.org/abs/2309.16609) for architecture details.

* **Fine-tuning**
    
    The fine-tuning data is the subset of the following datasets.
    * [Databricks Dolly data](https://huggingface.co/datasets/databricks/databricks-dolly-15k)
    * [Japanese Databricks Dolly data](https://huggingface.co/datasets/kunishou/databricks-dolly-15k-ja)
    * [FLAN Instruction Tuning data](https://github.com/google-research/FLAN) and its Japanese translation
    * [Izumi lab LLM Japanese dataset](https://github.com/masanorihirano/llm-japanese-dataset/tree/main)
      * The following sections are used
        * alt
        * aozora-txt
        * CourseraParallel
        * ParaNatCom
        * Tab-delimited_Bilingual_Sentence_Pairs
        * tanaka-corpus
        * wikinews
        * wordnet
        * yasashi-japanese
      * The [remaining sections](https://github.com/masanorihirano/llm-japanese-dataset/tree/main/datasets-cc-by-sa) contain commonly used evaluation corpora so they are skipped to prevent data leak.

* **Authors**

    - [Tianyu Zhao](https://huggingface.co/tianyuz)
    - [Kei Sawada](https://huggingface.co/keisawada)
    
---

# Benchmarking
Please refer to [rinna's LM benchmark page](https://rinnakk.github.io/research/benchmarks/lm/index.html).

---

# How to use the model

~~~~python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("rinna/nekomata-14b-instruction", trust_remote_code=True)

# Use GPU with bf16
# model = AutoModelForCausalLM.from_pretrained("rinna/nekomata-14b-instruction", device_map="auto", trust_remote_code=True, bf16=True)

# Use GPU with fp16
# model = AutoModelForCausalLM.from_pretrained("rinna/nekomata-14b-instruction", device_map="auto", trust_remote_code=True, fp16=True)

# Use CPU
# model = AutoModelForCausalLM.from_pretrained("rinna/nekomata-14b-instruction", device_map="cpu", trust_remote_code=True)

# Automatically select device and precision
model = AutoModelForCausalLM.from_pretrained("rinna/nekomata-14b-instruction", device_map="auto", trust_remote_code=True)

instruction = "次の日本語を英語に翻訳してください。"
input = "大規模言語モデル(だいきぼげんごモデル、英: large language model、LLM)は、多数のパラメータ(数千万から数十億)を持つ人工ニューラルネットワークで構成されるコンピュータ言語モデルで、膨大なラベルなしテキストを使用して自己教師あり学習または半教師あり学習によって訓練が行われる。"
prompt = f"""
以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。

### 指示:
{instruction}

### 入力:
{input}

### 応答:
"""
token_ids = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        token_ids.to(model.device),
        max_new_tokens=200,
        do_sample=True,
        temperature=0.5,
        pad_token_id=tokenizer.pad_token_id,
        bos_token_id=tokenizer.bos_token_id,
        eos_token_id=tokenizer.eos_token_id
    )

output = tokenizer.decode(output_ids.tolist()[0])
print(output)
"""
以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。

### 指示:
次の日本語を英語に翻訳してください。

### 入力:
大規模言語モデル(だいきぼげんごモデル、英: large language model、LLM)は、多数のパラメータ(数千万から数十億)を持つ人工ニューラルネットワークで構成されるコンピュータ言語モデルで、膨大なラベルなしテキストを使 用して自己教師あり学習または半教師あり学習によって訓練が行われる。

### 応答:
 A large language model (LLM) is a computer language model composed of artificial neural networks with many parameters (from tens of millions to billions) trained by self-supervised learning or semi-supervised learning using a large amount of unlabeled text.<|endoftext|>
"""
~~~~

---

# Tokenization
Please refer to [`rinna/nekomata-14b`](https://huggingface.co/rinna/nekomata-14b) for tokenization details.

---

# How to cite
~~~
@misc{RinnaNekomataInstruction14b, 
    url={https://huggingface.co/rinna/nekomata-14b-instruction}, 
    title={rinna/nekomata-14b-instruction}, 
    author={Zhao, Tianyu and Sawada, Kei}
}
~~~
---

# License
[Tongyi Qianwen LICENSE AGREEMENT](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT)