---
license: apache-2.0
language:
- ja
- en
tags:
- japanese
- causal-lm
inference: false
---
This is a conversion of [cyberagent/calm2-7b](https://huggingface.co/cyberagent/calm2-7b) to safetensors, so you can load the weights without the arbitrary-code-execution risk that comes with pickled checkpoint files.
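For context on why this matters: pickled checkpoints are loaded via Python's `pickle`, and unpickling untrusted data can run arbitrary code. A minimal stdlib-only sketch of the mechanism (the `Evil` class is a hypothetical illustration; a real attack would invoke something like `os.system` instead of returning a string):

```python
import pickle

class Evil:
    def __reduce__(self):
        # pickle calls __reduce__ to decide how to rebuild the object;
        # a malicious file can return any callable here (e.g. os.system).
        return (str, ("code ran during unpickling",))

payload = pickle.dumps(Evil())
result = pickle.loads(payload)  # executes the recipe above
print(result)
```

By contrast, a safetensors file is raw tensor data plus a JSON header, so loading it involves no code execution.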
# Original model card:

# CyberAgentLM2-7B (CALM2-7B)

## Model Description

CyberAgentLM2 is a decoder-only language model pre-trained on 1.3T tokens of publicly available Japanese and English datasets.

Variant: [CyberAgentLM2-Chat](https://huggingface.co/cyberagent/calm2-7b-chat)

## Requirements
- transformers >= 4.34.1
- accelerate

## Usage

```python
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

assert transformers.__version__ >= "4.34.1"

model = AutoModelForCausalLM.from_pretrained("cyberagent/calm2-7b", device_map="auto", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("cyberagent/calm2-7b")
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt = "AIによって私達の暮らしは、"  # "Thanks to AI, our daily lives are..."

token_ids = tokenizer.encode(prompt, return_tensors="pt")
output_ids = model.generate(
    input_ids=token_ids.to(model.device),
    max_new_tokens=100,
    do_sample=True,
    temperature=0.9,
    streamer=streamer,
)
```

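In the example above, `do_sample=True` with `temperature=0.9` draws each token from a temperature-scaled distribution instead of always picking the most likely token. A stdlib-only sketch of what temperature scaling does (the logit values are illustrative, not taken from the model):

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide logits by the temperature, then apply a numerically stable softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                        # toy next-token scores
cool = softmax_with_temperature(logits, 0.9)    # slightly sharper than T=1
hot = softmax_with_temperature(logits, 2.0)     # flatter: more diverse samples
print(cool, hot)
```

Lower temperatures concentrate probability on the top tokens (approaching greedy decoding as T → 0), while higher temperatures flatten the distribution and make sampled output more varied.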
## Model Details

* **Model size**: 7B
* **Trained tokens**: 1.3T tokens
* **Context length**: 4096
* **Model type**: Transformer-based Language Model
* **Language(s)**: Japanese, English
* **Developed by**: [CyberAgent, Inc.](https://www.cyberagent.co.jp/)
* **License**: Apache-2.0

## Author

[Ryosuke Ishigami](https://huggingface.co/rishigami)

## Citations
```tex
@article{touvron2023llama,
  title={LLaMA: Open and Efficient Foundation Language Models},
  author={Touvron, Hugo and Lavril, Thibaut and Izacard, Gautier and Martinet, Xavier and Lachaux, Marie-Anne and Lacroix, Timoth{\'e}e and Rozi{\`e}re, Baptiste and Goyal, Naman and Hambro, Eric and Azhar, Faisal and Rodriguez, Aurelien and Joulin, Armand and Grave, Edouard and Lample, Guillaume},
  journal={arXiv preprint arXiv:2302.13971},
  year={2023}
}
```