Mizuiro-sakura committed
Commit 6424a51 (parent: a7f2cd0)
Update README.md

README.md CHANGED
@@ -1,8 +1,31 @@
 ---
 license: mit
 language: ja
-
 ---
 
 ```python
 
@@ -55,8 +78,7 @@ def generate_prompt(data_point):
 def generate(instruction, input=None, maxTokens=256):
     # inference
     prompt = generate_prompt({'instruction': instruction, 'input': input})
-    input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids
-    outputs = model.generate(
         input_ids=input_ids,
         max_new_tokens=maxTokens,
         do_sample=True,
@@ -84,3 +106,46 @@ def generate(instruction, input=None, maxTokens=256):
 
 generate("自然言語処理とは?")
 ```
 ---
 license: mit
 language: ja
+datasets:
+- kunishou/databricks-dolly-15k-ja
+- wikipedia
+- cc100
+- mc4
+tags:
+- japanese
+- causal-lm
+- open-calm
+inference: false
 ---
+# OpenCALM-LARGE
+
+## Model Description
+
+OpenCALM is a suite of decoder-only language models pre-trained on Japanese datasets, developed by CyberAgent, Inc.
+
+This model is open-calm-large fine-tuned with LoRA using peft.
+
+## Usage
+Install pytorch, transformers, and peft, then run the code below.
+
+(pip install torch transformers peft)
 
 ```python
 
 def generate(instruction, input=None, maxTokens=256):
     # inference
     prompt = generate_prompt({'instruction': instruction, 'input': input})
+    input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids
         input_ids=input_ids,
         max_new_tokens=maxTokens,
         do_sample=True,
 
 generate("自然言語処理とは?")
 ```
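The diff only shows the changed lines of the usage script; the `generate_prompt(data_point)` helper named in the hunk header is defined in a collapsed part of the README. As a rough sketch, an Alpaca/Dolly-style formatter over the same `{'instruction': ..., 'input': ...}` dict used above might look like this (the template wording here is an assumption, not the README's actual text):

```python
def generate_prompt(data_point):
    # Hypothetical Alpaca/Dolly-style prompt template; the README's actual
    # wording may differ. Uses the same 'instruction'/'input' keys as the
    # generate() call above; 'input' may be None for instruction-only prompts.
    if data_point.get("input"):
        return (
            "### 指示:\n" + data_point["instruction"] + "\n\n"
            "### 入力:\n" + data_point["input"] + "\n\n"
            "### 応答:\n"
        )
    return (
        "### 指示:\n" + data_point["instruction"] + "\n\n"
        "### 応答:\n"
    )


print(generate_prompt({"instruction": "自然言語処理とは?", "input": None}))
```

The branch on `input` mirrors how instruction-tuning datasets such as databricks-dolly-15k-ja distinguish context-bearing examples from plain instructions.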
+
+## Model Details
+
+|Model|Params|Layers|Dim|Heads|Dev ppl|
+|:---:|:---:|:---:|:---:|:---:|:---:|
+|[cyberagent/open-calm-small](https://huggingface.co/cyberagent/open-calm-small)|160M|12|768|12|19.7|
+|[cyberagent/open-calm-medium](https://huggingface.co/cyberagent/open-calm-medium)|400M|24|1024|16|13.8|
+|[cyberagent/open-calm-large](https://huggingface.co/cyberagent/open-calm-large)|830M|24|1536|16|11.3|
+|[cyberagent/open-calm-1b](https://huggingface.co/cyberagent/open-calm-1b)|1.4B|24|2048|16|10.3|
+|[cyberagent/open-calm-3b](https://huggingface.co/cyberagent/open-calm-3b)|2.7B|32|2560|32|9.7|
+|[cyberagent/open-calm-7b](https://huggingface.co/cyberagent/open-calm-7b)|6.8B|32|4096|32|8.2|
+
+* **Developed by**: [CyberAgent, Inc.](https://www.cyberagent.co.jp/)
+* **Model type**: Transformer-based Language Model
+* **Language**: Japanese
+* **Library**: [GPT-NeoX](https://github.com/EleutherAI/gpt-neox)
+* **License**: OpenCALM is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License ([CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/)). When using this model, please provide appropriate credit to CyberAgent, Inc.
+  * Example (en): This model is a fine-tuned version of OpenCALM-XX developed by CyberAgent, Inc. The original model is released under the CC BY-SA 4.0 license, and this model is also released under the same CC BY-SA 4.0 license. For more information, please visit: https://creativecommons.org/licenses/by-sa/4.0/
+  * Example (ja): 本モデルは、株式会社サイバーエージェントによるOpenCALM-XXをファインチューニングしたものです。元のモデルはCC BY-SA 4.0ライセンスのもとで公開されており、本モデルも同じくCC BY-SA 4.0ライセンスで公開します。詳しくはこちらをご覧ください: https://creativecommons.org/licenses/by-sa/4.0/
+
+## Training Dataset
+
+* Wikipedia (ja)
+* Common Crawl (ja)
+
+## Author
+
+[Ryosuke Ishigami](https://huggingface.co/rishigami)
+
+## Citations
+
+```bibtex
+@software{gpt-neox-library,
+  title = {{GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch}},
+  author = {Andonian, Alex and Anthony, Quentin and Biderman, Stella and Black, Sid and Gali, Preetham and Gao, Leo and Hallahan, Eric and Levy-Kramer, Josh and Leahy, Connor and Nestler, Lucas and Parker, Kip and Pieler, Michael and Purohit, Shivanshu and Songz, Tri and Phil, Wang and Weinbach, Samuel},
+  url = {https://www.github.com/eleutherai/gpt-neox},
+  doi = {10.5281/zenodo.5879544},
+  month = {8},
+  year = {2021},
+  version = {0.0.1},
+}
+```