Merge branch 'main' of https://huggingface.co/rinna/nekomata-14b into main
README.md CHANGED

@@ -44,6 +44,10 @@ The name `nekomata` comes from the Japanese word [`猫又/ねこまた/Nekomata`
 - [Wikipedia](https://dumps.wikimedia.org/other/cirrussearch)
 - rinna curated Japanese dataset
 
+* **Training Infrastructure**
+
+    `nekomata-14B` was trained on 16 nodes of Amazon EC2 trn1.32xlarge instances, powered by AWS Trainium, a purpose-built ML accelerator chip. The pre-training job was completed in approximately 7 days.
+
 * **Authors**
 
 - [Tianyu Zhao](https://huggingface.co/tianyuz)

@@ -117,7 +121,7 @@ We compared the `Qwen` tokenizer (as used in `nekomata`) and the `llama-2` token
 @misc{RinnaNekomata14b,
     url={https://huggingface.co/rinna/nekomata-14b},
     title={rinna/nekomata-14b},
-    author={Zhao, Tianyu and Kaga, Akio and
+    author={Zhao, Tianyu and Kaga, Akio and Sawada, Kei}
 }
 ~~~
 ---
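The second hunk's context line references the model card's comparison of the `Qwen` tokenizer (as used in `nekomata`) against the `llama-2` tokenizer. A minimal sketch of how such a token-count comparison can be reproduced with `transformers`; the sample sentence is an illustrative assumption, and the official `meta-llama/Llama-2-7b-hf` repo is gated, so substitute any accessible Llama-2 tokenizer:

```python
from transformers import AutoTokenizer

# Illustrative Japanese sentence (not taken from the model card).
text = "猫又は日本の妖怪の一種である。"

# Qwen tokenizer shipped with nekomata; the checkpoint requires trust_remote_code.
qwen_tok = AutoTokenizer.from_pretrained("rinna/nekomata-14b", trust_remote_code=True)

# llama-2 tokenizer (gated repo; any accessible Llama-2 tokenizer also works).
llama_tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Fewer tokens for the same text indicates better compression of Japanese.
print("Qwen/nekomata tokens:", len(qwen_tok.encode(text)))
print("llama-2 tokens:", len(llama_tok.encode(text)))
```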