Text Generation
Transformers
PyTorch
Japanese
English
qwen
custom_code
File size: 5,962 Bytes
50fd65a
 
 
 
 
 
 
 
 
 
 
 
d0b7a9c
 
 
50fd65a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
d0b7a9c
50fd65a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
d0b7a9c
 
0a96f58
d0b7a9c
 
 
 
 
 
 
 
 
 
50fd65a
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
---
thumbnail: https://github.com/rinnakk/japanese-pretrained-models/blob/master/rinna.png
datasets:
- databricks/databricks-dolly-15k
- kunishou/databricks-dolly-15k-ja
- izumi-lab/llm-japanese-dataset
language:
- ja
- en
tags:
- qwen
inference: false
license: other
license_name: tongyi-qianwen-license-agreement
license_link: https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT
---

# `rinna/nekomata-14b-instruction`

![rinna-icon](./rinna.png)

# Overview
The model is the instruction-tuned version of [`rinna/nekomata-14b`](https://huggingface.co/rinna/nekomata-14b). It adopts the Alpaca input format.

* **Model architecture**

    A 40-layer, 5120-hidden-size transformer-based language model. Please refer to the [Qwen paper](https://arxiv.org/abs/2309.16609) for architecture details.

* **Fine-tuning**
    
    The fine-tuning data is the subset of the following datasets.
    * [Databricks Dolly data](https://huggingface.co/datasets/databricks/databricks-dolly-15k)
    * [Japanese Databricks Dolly data](https://huggingface.co/datasets/kunishou/databricks-dolly-15k-ja)
    * [FLAN Instruction Tuning data](https://github.com/google-research/FLAN) and its Japanese translation
    * [Izumi lab LLM Japanese dataset](https://github.com/masanorihirano/llm-japanese-dataset/tree/main)
      * The following sections are used
        * alt
        * aozora-txt
        * CourseraParallel
        * ParaNatCom
        * Tab-delimited_Bilingual_Sentence_Pairs
        * tanaka-corpus
        * wikinews
        * wordnet
        * yasashi-japanese
      * The [remaining sections](https://github.com/masanorihirano/llm-japanese-dataset/tree/main/datasets-cc-by-sa) contain commonly used evaluation corpora so they are skipped to prevent data leak.

* **Contributors**

    - [Tianyu Zhao](https://huggingface.co/tianyuz)
    - [Kei Sawada](https://huggingface.co/keisawada)
    
---

# Benchmarking
Please refer to [rinna's LM benchmark page](https://rinnakk.github.io/research/benchmarks/lm/index.html).

---

# How to use the model

~~~~python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("rinna/nekomata-14b-instruction", trust_remote_code=True)

# Use GPU with bf16
# model = AutoModelForCausalLM.from_pretrained("rinna/nekomata-14b-instruction", device_map="auto", trust_remote_code=True, bf16=True)

# Use GPU with fp16
# model = AutoModelForCausalLM.from_pretrained("rinna/nekomata-14b-instruction", device_map="auto", trust_remote_code=True, fp16=True)

# Use CPU
# model = AutoModelForCausalLM.from_pretrained("rinna/nekomata-14b-instruction", device_map="cpu", trust_remote_code=True)

# Automatically select device and precision
model = AutoModelForCausalLM.from_pretrained("rinna/nekomata-14b-instruction", device_map="auto", trust_remote_code=True)

instruction = "次の日本語を英語に翻訳してください。"
input = "大規模言語モデル(だいきぼげんごモデル、英: large language model、LLM)は、多数のパラメータ(数千万から数十億)を持つ人工ニューラルネットワークで構成されるコンピュータ言語モデルで、膨大なラベルなしテキストを使用して自己教師あり学習または半教師あり学習によって訓練が行われる。"
prompt = f"""
以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。

### 指示:
{instruction}

### 入力:
{input}

### 応答:
"""
token_ids = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        token_ids.to(model.device),
        max_new_tokens=200,
        do_sample=True,
        temperature=0.5,
        pad_token_id=tokenizer.pad_token_id,
        bos_token_id=tokenizer.bos_token_id,
        eos_token_id=tokenizer.eos_token_id
    )

output = tokenizer.decode(output_ids.tolist()[0])
print(output)
"""
以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。

### 指示:
次の日本語を英語に翻訳してください。

### 入力:
大規模言語モデル(だいきぼげんごモデル、英: large language model、LLM)は、多数のパラメータ(数千万から数十億)を持つ人工ニューラルネットワークで構成されるコンピュータ言語モデルで、膨大なラベルなしテキストを使 用して自己教師あり学習または半教師あり学習によって訓練が行われる。

### 応答:
 A large language model (LLM) is a computer language model composed of artificial neural networks with many parameters (from tens of millions to billions) trained by self-supervised learning or semi-supervised learning using a large amount of unlabeled text.<|endoftext|>
"""
~~~~

---

# Tokenization
Please refer to [`rinna/nekomata-14b`](https://huggingface.co/rinna/nekomata-14b) for tokenization details.

---

# How to cite
~~~
@misc{rinna-nekomata-14b-instruction,
    title = {rinna/nekomata-14b-instruction},
    author={Zhao, Tianyu and Sawada, Kei}
    url = {https://huggingface.co/rinna/nekomata-14b-instruction},
}

@inproceedings{sawada2024release,
    title = {Release of Pre-Trained Models for the {J}apanese Language},
    author = {Sawada, Kei and Zhao, Tianyu and Shing, Makoto and Mitsui, Kentaro and Kaga, Akio and Hono, Yukiya and Wakatsuki, Toshiaki and Mitsuda, Koh},
    booktitle = {Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
    month = {5},
    year = {2024},
    url = {https://arxiv.org/abs/2404.01657},
}
~~~
---

# License
[Tongyi Qianwen LICENSE AGREEMENT](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT)