---
library_name: transformers
tags: []
---
<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/64b63f8ad57e02621dc93c8b/e2VLH4eBlq3678PsI_itw.png" alt="drawing" width="512"/>
</p>
# How to use
We recommend running this model on at least four A100 GPUs.
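As a rough sanity check on the hardware recommendation, the weight memory can be estimated from the parameter count. This is a back-of-the-envelope sketch, not a measurement: it assumes 16-bit weights and the 80 GB A100 variant, and it ignores the KV cache, activations, and framework overhead.

```python
# Back-of-the-envelope weight-memory estimate for a 72B-parameter model.
# Assumptions (not stated in the model card): 16-bit weights, 80 GB per GPU.
PARAMS = 72e9          # parameter count implied by the "72B" in the model name
BYTES_PER_PARAM = 2    # assumed fp16/bf16 weights
GPU_MEMORY_GB = 80     # assumed A100 80GB variant
NUM_GPUS = 4           # the recommended minimum

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9   # total memory for weights alone
per_gpu_gb = weights_gb / NUM_GPUS            # evenly sharded across GPUs
headroom_gb = GPU_MEMORY_GB - per_gpu_gb      # left for KV cache, activations

print(f"weights: {weights_gb:.0f} GB total, {per_gpu_gb:.0f} GB per GPU")
print(f"headroom per GPU: {headroom_gb:.0f} GB")
```

Under these assumptions the weights alone exceed the memory of a single 80 GB card, which is consistent with sharding the model across several GPUs.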
### Hugging Face
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch
tokenizer = AutoTokenizer.from_pretrained("lightblue/ao-karasu-72B")
model = AutoModelForCausalLM.from_pretrained("lightblue/ao-karasu-72B", device_map="auto")
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
messages = [{"role": "system", "content": "あなたはAIアシスタントです。"}]
messages.append({"role": "user", "content": "イギリスの首相は誰ですか?"})
prompt = tokenizer.apply_chat_template(conversation=messages, add_generation_prompt=True, tokenize=False)
# Greedy decoding: with do_sample=False the temperature setting is unused, so it is omitted.
pipe(prompt, max_new_tokens=100, do_sample=False, return_full_text=False)
```
### vLLM
```python
from vllm import LLM, SamplingParams
sampling_params = SamplingParams(temperature=0.0, max_tokens=100)
llm = LLM(model="lightblue/ao-karasu-72B", tensor_parallel_size=4)
messages = [{"role": "system", "content": "あなたはAIアシスタントです。"}]
messages.append({"role": "user", "content": "イギリスの首相は誰ですか?"})
# get_tokenizer() avoids reaching into engine internals, whose layout varies between vLLM versions.
prompt = llm.get_tokenizer().apply_chat_template(conversation=messages, add_generation_prompt=True, tokenize=False)
prompts = [prompt]
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
```
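Both examples above build the same two-turn conversation by hand. When several questions need to be asked, a small helper can keep the message structure consistent. The sketch below is illustrative only: `build_messages` is a hypothetical helper, not part of `transformers` or vLLM, and the second question is an invented example.

```python
def build_messages(user_text, system_text="あなたはAIアシスタントです。"):
    """Return a chat-format message list: one system turn, then one user turn.

    Hypothetical helper for illustration; the default system prompt is the
    one used in the examples above ("You are an AI assistant.").
    """
    return [
        {"role": "system", "content": system_text},
        {"role": "user", "content": user_text},
    ]

# First question is from the examples above; the second is made up.
questions = ["イギリスの首相は誰ですか?", "日本の首都はどこですか?"]
conversations = [build_messages(q) for q in questions]

for conversation in conversations:
    print(conversation)
```

Each conversation can then be rendered to a prompt string with the tokenizer's `apply_chat_template(conversation=..., add_generation_prompt=True, tokenize=False)`, exactly as in the examples above, and the resulting list of prompts passed to `llm.generate` in one call.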
# Training details
[English dev blog](https://note.com/peter_lightblue/n/n483d194d3614?sub_rt=share_pw)
[Japanese dev blog](https://note.com/lightblue_tech/n/nfda12435b262?sub_rt=share_pw)
# Training data
Roughly 20 million characters sampled from a dataset of more than 1.1 billion characters, made up of:
- ~450 million characters from Wikipedia-based QA (same as Qarasu)
- ~200 million characters from technical blogs (new)
- ~200 million characters from Japanese QA site answers (new)
- ~100 million characters from LLM-generated prompts and responses (same as Qarasu)
- ~70 million characters from news articles (new)
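To get a feel for the mix, the approximate sizes listed above can be turned into shares of the pool. This is only an illustrative calculation on the rounded figures: they sum to roughly 1.02 billion characters, so the percentages are indicative, and the card does not say whether the ~20 million character sample was drawn proportionally from each source.

```python
# Approximate source sizes in millions of characters, copied from the list above.
sources = {
    "Wikipedia-based QA": 450,
    "technical blogs": 200,
    "Japanese QA site answers": 200,
    "LLM-generated prompts and responses": 100,
    "news articles": 70,
}

total = sum(sources.values())  # sum of the rounded figures, in millions
shares = {name: size / total for name, size in sources.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")

# Fraction of the listed pool covered by a ~20 million character sample.
sample_fraction = 20 / total
print(f"sampled fraction: {sample_fraction:.1%}")
```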
# Training schedule
Trained for ~1 day on an A100 (80GB) GPU.