	---
	license: apache-2.0
	base_model:
	- Rakuten/RakutenAI-7B
	---
	# RakutenAI-2.0-8x7B
	## Model Description
	RakutenAI-2.0-8x7B is a Mixture of Experts (MoE) foundation model derived from [RakutenAI-7B](https://huggingface.co/Rakuten/RakutenAI-7B), which was first introduced in March 2024. Developed as part of a broader initiative to advance Japanese LLM technology, it activates two experts per token, resulting in 13B active parameters. Because experts are selected dynamically for each input token, only part of the model's weights is used per token, which improves computational efficiency while maintaining high performance. In our evaluation, RakutenAI-2.0-8x7B achieves the highest average score on Japanese language understanding benchmarks among the models compared (72.29) and is competitive on English evaluation tasks with similar models, including Swallow-MX-8x7B-NVE-0.1, Llama-3-Swallow-70B-v0.1, Sarashina2-70B, and PLaMo 100B.

	If you are looking for an instruction-tuned model, check [RakutenAI-2.0-8x7B-instruct](https://huggingface.co/Rakuten/RakutenAI-2.0-8x7B-instruct).
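The two-active-expert routing mentioned in the description can be illustrated with a small, dependency-free sketch. This is not Rakuten's implementation: the expert count of 8 is inferred from the "8x7B" in the model name, the softmax over only the selected logits is an assumed (commonly used) weighting scheme, and `route_token`, `moe_forward`, and the toy experts are hypothetical names made up for illustration.

```python
import math
import random

NUM_EXPERTS = 8  # assumption: inferred from the "8x" in the model name
TOP_K = 2        # two active experts per token, as stated in the description


def route_token(router_logits, top_k=TOP_K):
    """Return [(expert_index, weight), ...] for the top_k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    # Assumption: softmax over the selected logits only, so the weights sum to 1.
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def moe_forward(token, router_logits, experts):
    """Mix the outputs of the selected experts; the other experts are never called."""
    output = [0.0] * len(token)
    for idx, weight in route_token(router_logits):
        expert_out = experts[idx](token)
        output = [o + weight * v for o, v in zip(output, expert_out)]
    return output


# Toy experts: expert k scales the token by (k + 1).
# In a real MoE layer each expert is a feed-forward network.
experts = [lambda t, k=k: [(k + 1) * v for v in t] for k in range(NUM_EXPERTS)]

random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
routing = route_token(logits)
mixed = moe_forward([1.0, 2.0], logits, experts)
print("selected experts and weights:", routing)
print("mixed output:", mixed)
```

Only two of the eight toy experts run for each token, which is the source of the efficiency gain: compute scales with the active experts rather than with the full expert count.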

	## Model Evaluation Results

	| Foundation Model Name | Japanese Score | English Score | Average |
	|-----------------------------------------------|---------------|--------------|---------|
	| Rakuten/RakutenAI-7B | 62.93 | 34.86 | 48.90 |
	| Rakuten/RakutenAI-2.0-8x7B | 72.29 | 41.32 | 56.80 |
	| Tokyotech/Swallow-MX-8x7B-NVE-0.1 | 66.17 | 44.33 | 55.25 |
	| Tokyotech/Llama-3-Swallow-70B-v0.1 | 68.15 | 51.52 | 59.84 |
	| SBIntuitions/Sarashina2-70B | 71.09 | 39.22 | 55.16 |
	| PreferredNetworks/PLaMo 100B | 71.45 | 36.48 | 53.96 |

	<div style="text-align: center;">Table 1: RakutenAI-2.0-8x7B foundation model average performance scores on LM Evaluation Harness in comparison with other Japanese open models.</div>

	Detailed scores are as follows:

	| Metric | jcommonsense_qa | jnli | marc_ja | jsquad | jaqket_v2 | xlsum_ja | xwinograd | mgsm | arc_challenge | hellaswag | mmlu | truthfulqa_mc2 | gsm8k | winogrande | musr | math_hard | gpqa | bbh | ifeval | mmlu_pro |
	|----------------------|-----------------|-------|---------|--------|-----------|----------|-----------|-------|---------------|-----------|-------|----------------|-------|------------|-------|-----------|-------|-------|--------|----------|
	| Model Name | accuracy-3shot | accuracy-3shot | accuracy-3shot | exact_match-2shot | exact_match-1shot | rouge2-1shot | accuracy-0shot | accuracy-5shot | accuracy_norm-25shot | accuracy_norm-10shot | accuracy-5shot | accuracy-0shot | exact_match-5shot | accuracy-5shot | accuracy_norm-0shot | exact_match-4shot | accuracy_norm-0shot | accuracy_norm-3shot | avg_inst_prompt_strict_acc-0shot | accuracy-5shot |
	| RakutenAI-7B | 85.88 | 56.61 | 96.52 | 69.56 | 81.44 | 15.69 | 74.14 | 23.60 | 60.75 | 82.26 | 59.83 | 38.33 | 32.60 | 77.43 | 4.93 | 2.16 | 5.02 | 20.34 | 14.04 | 20.57 |
	| RakutenAI-2.0-8x7B | 93.12 | 87.43 | 97.72 | 74.49 | 86.00 | 15.70 | 78.62 | 45.20 | 66.38 | 85.84 | 65.50 | 48.19 | 51.40 | 80.51 | 13.88 | 3.30 | 5.71 | 27.02 | 22.90 | 25.22 |
	| Swallow-MX-8x7B-NVE-0.1 | 89.28 | 43.06 | 97.15 | 76.29 | 87.37 | 17.09 | 82.69 | 40.40 | 65.87 | 85.13 | 69.48 | 50.38 | 58.45 | 82.87 | 8.78 | 7.50 | 13.33 | 29.41 | 28.38 | 32.32 |
	| Llama-3-Swallow-70B-v0.1 | 92.58 | 66.15 | 93.46 | 70.94 | 71.74 | 12.58 | 83.32 | 54.40 | 67.58 | 87.53 | 77.47 | 55.29 | 81.50 | 85.16 | 22.05 | 13.92 | 16.60 | 49.53 | 20.91 | 40.70 |
	| Sarashina2-70B | 95.35 | 60.44 | 94.50 | 76.90 | 88.49 | 18.24 | 80.81 | 54.00 | 62.63 | 83.23 | 63.10 | 48.68 | 24.49 | 79.95 | 13.52 | 5.29 | 5.54 | 29.73 | 30.32 | 24.13 |
	| PLaMo 100B | 92.05 | 68.82 | 97.49 | 78.01 | 89.43 | 20.38 | 81.02 | 44.40 | 49.91 | 80.98 | 55.17 | 44.91 | 56.10 | 71.35 | 6.67 | 0.00 | 4.00 | 23.99 | 23.39 | 21.31 |

	<div style="text-align: center;">Table 2: RakutenAI-2.0-8x7B foundation model performance on LM Evaluation Harness metrics in comparison with other Japanese open models.</div>
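The model card does not state how the summary scores relate to the detailed ones. For the two Rakuten rows, the numbers are consistent with a simple unweighted mean: the first 8 columns (jcommonsense_qa through mgsm) for the Japanese score and the remaining 12 (arc_challenge through mmlu_pro) for the English score. The sketch below checks that reading. The column split and the averaging rule are assumptions inferred from the numbers, and only these two rows are checked here.

```python
# Detailed scores copied from the table above, in column order.
detailed = {
    "RakutenAI-7B": [
        85.88, 56.61, 96.52, 69.56, 81.44, 15.69, 74.14, 23.60,   # Japanese tasks
        60.75, 82.26, 59.83, 38.33, 32.60, 77.43,                 # English tasks
        4.93, 2.16, 5.02, 20.34, 14.04, 20.57,
    ],
    "RakutenAI-2.0-8x7B": [
        93.12, 87.43, 97.72, 74.49, 86.00, 15.70, 78.62, 45.20,   # Japanese tasks
        66.38, 85.84, 65.50, 48.19, 51.40, 80.51,                 # English tasks
        13.88, 3.30, 5.71, 27.02, 22.90, 25.22,
    ],
}

# Summary scores as reported in the first table: (Japanese, English).
reported = {
    "RakutenAI-7B": (62.93, 34.86),
    "RakutenAI-2.0-8x7B": (72.29, 41.32),
}

N_JAPANESE_TASKS = 8  # assumption: the first 8 columns are the Japanese tasks


def summarize(scores):
    """Return the unweighted (Japanese mean, English mean) of one table row."""
    japanese = scores[:N_JAPANESE_TASKS]
    english = scores[N_JAPANESE_TASKS:]
    return sum(japanese) / len(japanese), sum(english) / len(english)


summary = {name: summarize(scores) for name, scores in detailed.items()}
for name, (ja, en) in summary.items():
    rep_ja, rep_en = reported[name]
    print(f"{name}: Japanese {ja:.3f} (reported {rep_ja}), "
          f"English {en:.3f} (reported {rep_en})")
```

In both rows the recomputed means fall within 0.01 of the reported scores, the kind of gap rounding would explain.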

	## Usage
	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_path = "Rakuten/RakutenAI-2.0-8x7B"
	tokenizer = AutoTokenizer.from_pretrained(model_path)
	# torch_dtype="auto" uses the checkpoint's dtype; device_map="auto" places the weights on the available devices.
	model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="auto", device_map="auto")
	model.eval()

	# Plain-text prompts: this is a foundation model, so it continues the given text.
	requests = [
	    "南硫黄島原生自然環境保全地域は、自然",
	    "The capybara is a giant cavy rodent",
	]

	for req in requests:
	    input_text = tokenizer(req, return_tensors="pt").to(device=model.device)
	    tokens = model.generate(
	        **input_text,
	        max_new_tokens=512,
	        do_sample=True,
	        pad_token_id=tokenizer.eos_token_id,
	    )
	    # The decoded sequence contains the prompt followed by the generated continuation.
	    out = tokenizer.decode(tokens[0], skip_special_tokens=True)
	    print("INPUT:\n" + req)
	    print("OUTPUT:\n" + out)
	```
	Note on Evaluation Scores:
	- Evaluation tests were carried out with LM Evaluation Harness between October and December 2024. We use the default task definitions from the following commit: https://github.com/EleutherAI/lm-evaluation-harness/commit/26f607f5432e1d09c55b25488c43523e7ecde657
	- The tasks considered for Japanese evaluations are listed here: https://github.com/EleutherAI/lm-evaluation-harness/blob/26f607f5432e1d09c55b25488c43523e7ecde657/lm_eval/tasks/japanese_leaderboard/README.md
	- The tasks considered for English evaluations are listed here: https://huggingface.co/docs/leaderboards/en/open_llm_leaderboard/archive and https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/tasks/leaderboard/README.md

	## Model Details

	* Developed by: [Rakuten Group, Inc.](https://ai.rakuten.com/)
	* Language(s): Japanese, English
	* License: This model is licensed under [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).
	* Model Architecture: Mixture of Experts (2 active experts)

	### Limitations and Bias

	The suite of RakutenAI-2.0 models is capable of generating human-like text on a wide range of topics. However, like all LLMs, they have limitations and can produce biased, inaccurate, or unsafe outputs. Please exercise caution and judgement while interacting with them.

	## Citation
	For citing our work on the suite of RakutenAI-2.0 models, please use:

	```
	@misc{rakutengroup2025rakutenai2.0,
	author = {Rakuten Group, Inc.},
	title = {RakutenAI-2.0},
	year = {2025},
	publisher = {Hugging Face},
	url = {https://huggingface.co/Rakuten},
	}

	```