|
--- |
|
language: |
|
- en |
|
- ja |
|
library_name: transformers |
|
pipeline_tag: text-generation |
|
license: llama3 |
|
model_type: llama |
|
--- |
|
|
|
# Llama3-Preferred-MedSwallow-70B |
|
|
|
## Model Description |
|
|
|
Llama3-Preferred-MedSwallow-70B is a fine-tuned model obtained by continued pretraining of [tokyotech-llm/Llama-3-Swallow-70B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3-Swallow-70B-v0.1) on an original corpus of medical text.
|
For more details, please refer to [our blog post](https://tech.preferred.jp/ja/blog/llama3-preferred-medswallow-70b/).
|
The model is released under the [META LLAMA 3 COMMUNITY LICENSE](https://llama.meta.com/llama3/license/). |
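## Usage

A minimal sketch of loading the model with `transformers`. The repository id `pfnet/Llama3-Preferred-MedSwallow-70B` and the example prompt are assumptions; check this model page for the exact id, and note that the 70B weights require multiple GPUs or CPU offloading.

```python
# Sketch: load the model and generate text with transformers.
# The repository id below is an assumption; verify it on the model page.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pfnet/Llama3-Preferred-MedSwallow-70B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # reduce memory; the 70B model still needs multiple GPUs
    device_map="auto",           # spread layers across available devices
)

prompt = "Medical question text goes here."  # hypothetical example input
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since this is a base (not instruction-tuned) model, plain text completion prompts are the expected input format.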
|
|
|
## Model Performance |
|
|
|
The table below shows scores on the Japanese national medical licensing examinations from 2018 to 2022 ([IgakuQA](https://github.com/jungokasai/IgakuQA)). Bold indicates the best score in each column.
|
| Model ID | Average | 2018 | 2019 | 2020 | 2021 | 2022 |
|:--------------------------------------------------------------------------------------------------------------------|-------------------:|-------:|-------:|-------:|-------:|-------:|
| **Llama3-Preferred-MedSwallow-70B** | **395.2** | **407** | **390** | **391** | 393 | **395** |
| GPT-4 | 388.8 | 382 | 385 | 387 | **398** | 392 |
| [Llama-3-Swallow-70B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3-Swallow-70B-v0.1) | 348.6 | 353 | 347 | 353 | 345 | 345 |
| [Meta-Llama-3-70B](https://huggingface.co/meta-llama/Meta-Llama-3-70B) | 334.6 | 353 | 340 | 348 | 314 | 318 |
| [Qwen2-72B](https://huggingface.co/Qwen/Qwen2-72B) | 331.2 | 320 | 325 | 325 | 326 | 360 |
| [gemma-2-27b](https://huggingface.co/google/gemma-2-27b) | 316.0 | 337 | 298 | 327 | 296 | 322 |
| [Swallow-70b-NVE-hf](https://huggingface.co/tokyotech-llm/Swallow-70b-NVE-hf) | 291.6 | 283 | 280 | 300 | 295 | 300 |
| [Swallow-MX-8x7b-NVE-v0.1](https://huggingface.co/tokyotech-llm/Swallow-MX-8x7b-NVE-v0.1) | 280.8 | 262 | 273 | 291 | 284 | 294 |
| ChatGPT | 273.2 | 266 | 250 | 266 | 297 | 287 |
|
|
|
## Limitations |
|
|
|
The model was developed for research purposes and is not intended for clinical diagnosis.

It is the user's responsibility to ensure compliance with applicable rules and regulations.
|
|
|
## Contributors |
|
|
|
Preferred Networks, Inc. |
|
- Junichiro Iwasawa |
|
- Keita Suzuki |
|
- Wataru Kawakami |
|
|
|
## License |
|
|
|
[META LLAMA 3 COMMUNITY LICENSE](https://llama.meta.com/llama3/license/) |
|
|