metadata
license: apache-2.0
language:
- ja
- en
tags:
- japanese
- causal-lm
inference: false
CyberAgentLM3-22B-Chat (CALM3-22B-Chat)
Model Description
CyberAgentLM3 is a decoder-only language model pre-trained from scratch on 2.0 trillion tokens.
CyberAgentLM3-Chat is a fine-tuned model specialized for dialogue use cases.
Usage
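from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Load the model and tokenizer; device_map="auto" requires the accelerate package.
model = AutoModelForCausalLM.from_pretrained("cyberagent/calm3-22b-chat", device_map="auto", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("cyberagent/calm3-22b-chat")
# Print decoded text as it is generated, omitting the prompt and special tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# System prompt: "You are a kind AI assistant." User: "How will AI change our lives?"
messages = [
    {"role": "system", "content": "あなたは親切なAIアシスタントです。"},
    {"role": "user", "content": "AIによって私たちの暮らしはどのように変わりますか?"}
]

# Apply the chat template and append the assistant generation prompt.
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
# temperature only takes effect when sampling is enabled (do_sample=True here or in the generation config).
output_ids = model.generate(input_ids,
                            max_new_tokens=1024,
                            temperature=0.5,
                            streamer=streamer)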
Prompt Format
CALM3-Chat uses ChatML as the prompt format.
<|im_start|>system
あなたは親切なAIアシスタントです。<|im_end|>
<|im_start|>user
AIによって私たちの暮らしはどのように変わりますか?<|im_end|>
<|im_start|>assistant
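The layout above can also be reproduced by hand, which may help when inspecting or debugging prompts. The following is a minimal sketch, not the tokenizer's actual chat template: `build_chatml_prompt` is a hypothetical helper, and the newline after each `<|im_end|>` and after the final `assistant` header is an assumption based on the example shown. For real inference, `tokenizer.apply_chat_template` remains the safer choice.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML-style string.

    Hypothetical helper for illustration; whitespace details are assumed
    from the example prompt, not taken from the model's chat template.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    messages = [
        {"role": "system", "content": "あなたは親切なAIアシスタントです。"},
        {"role": "user", "content": "AIによって私たちの暮らしはどのように変わりますか?"},
    ]
    print(build_chatml_prompt(messages))
```

Comparing this string with `tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)` is a quick way to check whether the assumed whitespace matches the real template.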
Model Details
- Model size: 22B
- Context length: 16384
- Model type: Transformer-based Language Model
- Language(s): Japanese, English
- Developed by: CyberAgent, Inc.
- License: Apache-2.0
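Because the prompt and the generated tokens share the 16384-token context, it can be useful to cap `max_new_tokens` by the space the prompt leaves. The sketch below is one way to do that; `clamp_max_new_tokens` and `CONTEXT_LENGTH` are hypothetical names introduced for illustration, and the policy of raising an error on an over-long prompt is an assumption rather than documented behavior.

```python
CONTEXT_LENGTH = 16384  # context length listed in the model details


def clamp_max_new_tokens(prompt_tokens, requested_new_tokens,
                         context_length=CONTEXT_LENGTH):
    """Return how many new tokens fit in the context after the prompt.

    Hypothetical helper: raises ValueError when the prompt alone already
    fills (or exceeds) the context window.
    """
    if prompt_tokens >= context_length:
        raise ValueError(
            f"prompt uses {prompt_tokens} tokens, "
            f"which leaves no room in a {context_length}-token context"
        )
    # Never ask for more tokens than the remaining context can hold.
    return min(requested_new_tokens, context_length - prompt_tokens)


if __name__ == "__main__":
    print(clamp_max_new_tokens(prompt_tokens=100, requested_new_tokens=1024))
    print(clamp_max_new_tokens(prompt_tokens=16000, requested_new_tokens=1024))
```

With the usage snippet above, the prompt length would be `input_ids.shape[-1]`, and the returned value could be passed as `max_new_tokens` to `model.generate`.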
Author
Ryosuke Ishigami
How to cite
@misc{cyberagent-calm3-22b-chat,
  title={cyberagent/calm3-22b-chat},
  url={https://huggingface.co/cyberagent/calm3-22b-chat},
  author={Ryosuke Ishigami},
  year={2024},
}