Tags: Safetensors, Russian, mistral

Commit 2ae4ce5 by IlyaGusev
1 Parent(s): 1c13507
README.md CHANGED
@@ -70,38 +70,42 @@ for query in inputs:

## Output examples
```
- User: Почему трава зеленая?
- Saiga: Трава зеленая по нескольким основным причинам:
- 1. **Хлорофилл**: Основной причиной зеленого цвета травы является наличие хлорофилла, пигмента в клетках растений, который играет ключевую роль в процессе фотосинтеза. Хлорофилл поглощает световые волны синего и красного спектра, а отражение зелёных волн обуславливает именно этот цвет.
- 2. **Процесс фотосинтеза**: Хлорофилл не только придаёт растениям зелёный окрас, но и обеспечивает процесс фотосинтеза, благодаря которому растение превращает солнечный свет, углекислый газ и воду в питательные вещества (в основном глюкозу), кислород и энергию. Этот процесс необходим для жизни растений и их роста.
- 3. **Адаптация к окружающей среде**: Зеленый цвет оптимален для максимального поглощения света в условиях на Земле. Наша атмосфера рассеивает больше синего и красного света, чем зелёного, что делает его наиболее доступным источником света для растений на поверхности планеты.
- 4. **Взаимодействие с окружающими условиями**: При различных условиях (например, недостатке воды или изменении уровня освещённости) трава может менять интенсивность своего зелёного цвета, иногда приобретая желтизну из-за уменьшения содержания хлорофилла или коричневый оттенок из-за образования других пигментов, таких как каротиноиды и антоцианы.
- Таким образом, зеленый цвет травы — это результат эволюционной адаптации растений к земным условиям, определяемый прежде всего присутствием хлорофилла и необходимостью эффективно использовать солнечный свет в процессе фотосинтеза.
```

```
User: Сочини длинный рассказ, обязательно упоминая следующие объекты. Дано: Таня, мяч
- Saiga: ### Сказка о Тане и чудесной встрече
- В далеком пригороде, где улицы были пронизаны зеленью деревьев, а дома казались игрушками в огромной коробке, жила девочка по имени Таня. Её волосы цвета осенней листвы развевались на ветру, а глаза сверкали яркостью летнего неба. И была у неё одна особенность — любовь к мячу. Этот мяч был не просто предметом для игры; он был её верным спутником, другом и даже союзником во всех приключениях.
- Мяч этот, с замысловатым узором из разноцветных полос, словно обладал собственной жизнью. Он мог подсказать, когда пора отправиться на поиски новых друзей или рассказать истории о далеких странах, которые отражались в его шелковистой поверхности.
- Однажды утром, когда первые лучи солнца только начинали пробиваться сквозь листву, Таня решила провести день необычно. Она взяла свой волшебный мяч и направилась в сторону старого парка, что находился на краю города, там, где дороги становились менее знакомыми, а воздух наполнялся загадочными запахами. Мяч пружинил в руках, будто предвкушая что-то особенное.
- Парк встретил её тишиной и полусонными тенистыми аллеями. Таня побежала, играя со своим мячом, который словно летел впереди, маня все дальше. Вдруг, посреди поляны, мяч вырвался из рук, взмыл вверх и... замер, зависнув над землей! Таня остолбенела. Мелькнуло воспоминание о детских сказках, где обыденные вещи вдруг обретали магические свойства.
- — Это сон? — шептала она, делая шаг навстречу парителю.
- В то же мгновение, когда мяч начал медленно снижаться, из-за кустов показалось существо, которое можно было бы принять за маленького человека-животного. Его одежда переливалась всеми цветами радуги, как и узоры на мяче Тани. Существо оказалось Лесным Хранителем — добрым духом парка, хранителем всех тайн и чудес.
- — Здравствуй, Таня, — сказал Хранитель, голос его звучал как мелодичное журчание ручья. — Я ждал тебя. Твой мяч не просто игрушка, это ключ, открывающий сердца. Ты обладаешь даром видеть мир иначе, и я здесь, чтобы научить тебя этим силам.
- Таня, хоть и испугалась сначала, но любопытство и искреннее желание понять новое быстро преодолели страх. Вместе они начали путешествие по волшебному миру парка. Мяч теперь светился, указывая дорогу, и каждый раз, когда он касался земли, вокруг оживали старые легенды: танцевали светлячки, разговаривали деревья, а птицы пели песни забытых эпох.
- Вскоре Таня узнала, что каждый цвет на её мяче символизировал силу: синий — мудрость, красный — смелость, зелёный — связь с природой, а белый — чистоту намерений. Используя эти силы, она помогала обитателям парка: спасла заблудившихся детей, исцелила больную птицу и даже примирила два враждующих духа-деревьев.
- Когда сумерки опустились на парк, Лесной Хранитель вернулся к своей вековой задаче, но перед этим подарил Тане маленькое семечко:
- — Посади его рядом с твоим домом, и пусть растет вместе с тобой. Каждый раз, глядя на него, ты будешь помнить о мире, где всё возможно.
- На следующий день Таня вернулась домой, держа в руках не только любимый мяч, но и новый источник магии. Семя посадили, и вскоре оно превратилось в прекрасное дерево, расцветающее разноцветными цветами — словно продолжение самого мяча. Теперь каждая встреча с ним напоминала о тех приключениях и тайнах, что живут среди нас, если знать, куда и как смотреть.
- И так, благодаря своему верному мячу, Таня стала хранительницей маленьких чудес в своём городе, доказывая всем, что волшебство может быть самым обычным предметом, лишь бы сердце было открыто для приключений и добра.
```


## Versions

v1:
- - [87a83ce252ff0142cd4cc918fb3e6a9875ca4638](https://huggingface.co/IlyaGusev/saiga_nemo_12b/commit/87a83ce252ff0142cd4cc918fb3e6a9875ca4638)
- - Other name: saiga_nemo_12b_sft_m9_d14_simpo_m19_d31
- SFT dataset config: [sft_d14.json](https://github.com/IlyaGusev/saiga/blob/main/configs/datasets/sft_d14.json)
- SFT model config: [saiga_nemo_12b_sft_m9.json](https://github.com/IlyaGusev/saiga/blob/main/configs/models/saiga_nemo_12b_sft_m9.json)
- SimPO dataset config: [pref_d31.json](https://github.com/IlyaGusev/saiga/blob/main/configs/datasets/pref_d31.json)
@@ -112,6 +116,11 @@ v1:

## Evaluation

RuArenaHard:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/5fc2346dea82dd667bb0ffbc/-uG--3Wu9oUi9_bC_ZFP4.png)
 
## Output examples
```
+ User: Почему трава зеленая?
+ Saiga: Трава зеленая из-за хлорофилла, который содержится в клетках растений. Хлорофилл — это пигмент, отвечающий за процесс фотосинтеза, при котором растения преобразуют световую энергию солнца в химическую энергию для роста и развития. Зеленый цвет хлорофилла маскирует все другие пигменты клеток, такие как каротиноиды (которые обычно дают оранжевый или красный оттенок), делая листья и траву преимущественно зелеными.
```

```
User: Сочини длинный рассказ, обязательно упоминая следующие объекты. Дано: Таня, мяч
+ Saiga: Таня и Мяч
+ Таня стояла на берегу широкой речки, вдыхая свежий утренний воздух. В руках она держала небольшой, но такой яркий мяч, что он казался маленьким солнцем, вырвавшимся из ладоней. Этот мяч был не просто игрушкой; для Тани он хранил воспоминания о самых счастливых моментах её жизни.
+ Мяч был найден ею в прошлом году в старом заброшенном саду, который по слухам принадлежал давно ушедшему волшебнику. Внешне он выглядел как обычный резиновый шар, только покрытый узорами, напоминающими созвездия. Именно после того дня, когда Таня впервые коснулась этого мяча, в её жизнь вошли чудеса. Сначала незначительные — мяч всегда возвращался к ней после броска, даже если его отправляли за пределы видимости. А однажды, когда Таня грустно сидела под дождём, мяч вдруг засветился мягким тёплым светом, защитив девочку от мокрых капель. С тех пор между ними установилась особая связь.
+ Речка, плещущая перед Таней, была границей двух миров: серого повседневного и той тайны, которая, как ей казалось, начиналась здесь. Сегодня утром, проснувшись от первых лучей солнца, Таня почувствовала зов приключения. Её рука сама потянулась к мячу, и тот, будто живой, тепло отозвался на прикосновение.
+ "Пора", — шепнула Таня, крепче обхватив мяч. Она осторожно подняла его над водной гладью, закрыв глаза. "Покажи мне, куда идти". Мяч, словно послушный друг, взмыл вверх, ослепительно вспыхнул и, плавно описывая дугу, полетел над рекой, остановившись лишь там, где вода темнела, обещая скрывать нечто таинственное. Таня без колебаний прыгнула вслед, ощутив, как мяч окутал её мягким сиянием, облегчая падение в холодную воду.
+ Подводный мир встретил её неожиданным великолепием: сверкающие рыбы, подобные драгоценностям, плавали вокруг, а камни, устилавшие дно, складывались в странные узоры, похожие на письмена древних языков. Мяч светил, указывая дорогу сквозь водоросли, пока Таня не увидела вход в пещеру, украшенный морскими раковинами и блестящими кристаллами. Там, где кончались воды реки, начинался новый путь.
+ В глубинах пещеры Таня обнаружила комнату, где стены были сплошь покрыты картинами. На них оживала история мира, полная загадок и приключений, таких, о которых она читала в книгах, но которые теперь становились реальностью. Среди героев одной из сцен Таня узнала себя — девочку с мячом, освещающим путь. Рядом с ней стоял мальчик, которого она никогда раньше не видела, но почему-то сразу поняла, что это её верный спутник, ждавший встречи в этом подземелье.
+ Когда они встретились, мяч закружился вокруг обоих, создавая танец света, который сливал их судьбы воедино. Мальчик представился — его звали Миша. Оказалось, что мяч тоже принадлежит ему, но однажды улетел далеко-далеко и вернулся к Тане, чтобы вернуть их вместе. Так началось их совместное путешествие через миры, созданные фантазией и магией, где каждый шаг раскрывал новые страницы их общей истории.
+ С годами Таня и Миша оставались верными друзьями, хранителями секретов и защитниками чудес, которые прятались в самых обычных предметах, как в случае с мячом. А когда кто-то спрашивал, как же всё это началось, они просто улыбались и показывали мяч, уже немного потрепанный, но все так же сияющий звёздным светом. Их взгляд говорил о приключениях, о смелости и вере в то, что каждое детство способно стать дверью в настоящую магию.
+
```


## Versions
+
+ v2:
+ - [TBD](https://huggingface.co/IlyaGusev/saiga_nemo_12b/commit/TBD)
+ - Other names: saiga_nemo_12b_sft_m9_d16_slerp, saiga_nemo_12b_sft_m9_d16_simpo_m21_d36_doestoevsky_orpo_m1_slerp
+ - SFT dataset config: [sft_d16.json](https://github.com/IlyaGusev/saiga/blob/main/configs/datasets/sft_d16.json)
+ - SFT model config: [saiga_nemo_12b_sft_m9.json](https://github.com/IlyaGusev/saiga/blob/main/configs/models/saiga_nemo_12b_sft_m9.json)
+ - SimPO dataset config: [pref_d36.json](https://github.com/IlyaGusev/saiga/blob/main/configs/datasets/pref_d36.json)
+ - SimPO model config: [saiga_nemo_12b_simpo_m21.json](https://github.com/IlyaGusev/saiga/blob/main/configs/models/saiga_nemo_12b_simpo_m21.json)
+ - SFT wandb: [link](https://wandb.ai/ilyagusev/rulm_self_instruct/runs/yod78hiq)
+ - SimPO wandb: [link](https://wandb.ai/ilyagusev/rulm_self_instruct/runs/bv5w0lxl)
+ - Dostoevsky wandb: [link](https://wandb.ai/ilyagusev/rulm_self_instruct/runs/senqj9t1)
+ - Merge config: [link](https://huggingface.co/IlyaGusev/saiga_nemo_12b_sft_m9_d16_slerp/blob/main/mergekit_config.yml)
+
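Both versions are trained with an SFT stage followed by SimPO preference tuning (see the linked configs). As background, here is a minimal sketch of the length-normalized SimPO objective; the `beta` and `gamma` values below are illustrative placeholders, not the ones used in the actual configs:

```python
import math

def simpo_loss(logp_chosen, len_chosen, logp_rejected, len_rejected,
               beta=2.0, gamma=1.0):
    """Length-normalized preference loss in the style of SimPO.

    logp_* are summed token log-probabilities of each response under the
    policy model; beta scales the implicit reward, gamma is the target
    reward margin. A standalone sketch, not the training code.
    """
    # Implicit reward = average token log-probability, scaled by beta.
    reward_chosen = beta * logp_chosen / len_chosen
    reward_rejected = beta * logp_rejected / len_rejected
    margin = reward_chosen - reward_rejected - gamma
    # -log(sigmoid(margin)): small when chosen clearly beats rejected.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

The loss shrinks as the (length-normalized) gap between the chosen and rejected responses grows past the margin `gamma`.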
v1:
+ - [1c13507be4b5b5edd5586e48c902ef61c0343299](https://huggingface.co/IlyaGusev/saiga_nemo_12b/commit/1c13507be4b5b5edd5586e48c902ef61c0343299)
+ - Other name: saiga_nemo_12b_sft_m9_d14_simpo_m19_d31
- SFT dataset config: [sft_d14.json](https://github.com/IlyaGusev/saiga/blob/main/configs/datasets/sft_d14.json)
- SFT model config: [saiga_nemo_12b_sft_m9.json](https://github.com/IlyaGusev/saiga/blob/main/configs/models/saiga_nemo_12b_sft_m9.json)
- SimPO dataset config: [pref_d31.json](https://github.com/IlyaGusev/saiga/blob/main/configs/datasets/pref_d31.json)

## Evaluation

+ ### v2
+ TBD
+
+ ### v1
+
RuArenaHard:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/5fc2346dea82dd667bb0ffbc/-uG--3Wu9oUi9_bC_ZFP4.png)
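The v2 checkpoint is produced by a mergekit slerp merge (see the merge config above). As a sketch of what spherical linear interpolation does to a pair of flattened weight vectors — roughly what such a merge applies tensor-by-tensor — here is a simplified standalone version (the near-parallel fallback to plain lerp is a simplification of mergekit's actual handling):

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two parameter vectors.

    Interpolates along the arc between v0 and v1 (t=0 gives v0, t=1
    gives v1), preserving angular structure rather than averaging
    coordinates directly. Illustrative sketch only.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    dot = sum(a * b for a, b in zip(v0, v1)) / max(norm0 * norm1, eps)
    dot = max(-1.0, min(1.0, dot))  # clamp for acos numerical safety
    omega = math.acos(dot)
    if omega < eps:  # nearly parallel vectors: fall back to plain lerp
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * omega) / math.sin(omega)
    s1 = math.sin(t * omega) / math.sin(omega)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]
```

Unlike a plain weighted average, slerp keeps the interpolated point on the arc between the two weight directions, which is why it is a popular choice for checkpoint merging.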
model-00001-of-00005.safetensors CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
- oid sha256:3b8db4c6d81aec7bc3c678ecb1fe57428c13e3a9b0d10b1dc26e6b4da76f2adf
- size 4865522496
+ oid sha256:96f966cf65a1577f7f5008a938d626ddbb8c97fda655feed91d7fe74669b71ab
+ size 4865489336
model-00002-of-00005.safetensors CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
- oid sha256:64f7e6f6235841e0a193e9d9dca3e95d1021264464fca8f0eb9a036cce10e869
- size 4907529424
+ oid sha256:3d34ccde7d156204c266c2dccf106df139de68966a0b55df5aa14995631b87f4
+ size 4907529456
model-00003-of-00005.safetensors CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
- oid sha256:10f82bd4235f80e017ff461d001f18e2b478f5479cf1509683a5a067f1db5e5d
- size 4907529456
+ oid sha256:83e1809bb069217c4b17c324dbe3980256fda159d968665a5a39e15cfad2c014
+ size 4907529464
model-00004-of-00005.safetensors CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
- oid sha256:68812995b1a257b0b4cc9fed0accce93f675ff74e19eb5d3763fe0e565b88241
+ oid sha256:be2fc1df89c89dbb605e92360d8af8742f9a92e7774a1935a75d48e421d57b8e
size 4907529456
model-00005-of-00005.safetensors CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
- oid sha256:f8846c3f4a3c5f4853c48f663d00319d16cf588387c5c6acf8796dd1955350bf
- size 4907496272
+ oid sha256:bb3fc2d105934c3489b283b54c8b64b9bdd7bcd00644e444c2deb8cf7e66bef0
+ size 4907529392
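The shard diffs above are git-lfs pointer files: a `version` line, an `oid sha256:<digest>` line, and a `size` line in bytes. A small sketch of parsing such a pointer and checking a downloaded blob against it (the helper names are ours for illustration, not part of any library):

```python
import hashlib

def parse_lfs_pointer(text):
    """Parse a git-lfs pointer file into its three fields."""
    # Each pointer line is "<key> <value>".
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}

def verify_blob(blob, pointer):
    """Check a downloaded blob against the pointer's size and digest."""
    return (len(blob) == pointer["size"]
            and hashlib.sha256(blob).hexdigest() == pointer["digest"])
```

This is the same check git-lfs performs on download: byte length must match `size` and the sha256 of the content must match the `oid` digest.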
model.safetensors.index.json CHANGED
@@ -1,370 +1 @@
- {
- "metadata": {
- "total_size": 24495564800
- },
- "weight_map": {
- "lm_head.weight": "model-00005-of-00005.safetensors",
- "model.embed_tokens.weight": "model-00001-of-00005.safetensors",
- "model.layers.0.input_layernorm.weight": "model-00001-of-00005.safetensors",
- "model.layers.0.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.0.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
- "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.1.input_layernorm.weight": "model-00001-of-00005.safetensors",
- "model.layers.1.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.1.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
- "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.10.input_layernorm.weight": "model-00002-of-00005.safetensors",
- "model.layers.10.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.10.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.10.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.10.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
- "model.layers.10.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.10.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.10.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.10.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.11.input_layernorm.weight": "model-00002-of-00005.safetensors",
- "model.layers.11.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.11.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.11.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
- "model.layers.11.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.11.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.11.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.11.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.12.input_layernorm.weight": "model-00002-of-00005.safetensors",
- "model.layers.12.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.12.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
- "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.13.input_layernorm.weight": "model-00002-of-00005.safetensors",
- "model.layers.13.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.13.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
- "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.14.input_layernorm.weight": "model-00002-of-00005.safetensors",
- "model.layers.14.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.14.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
- "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.15.input_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.15.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.15.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.15.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.15.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
- "model.layers.16.input_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.16.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.16.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.16.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.16.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.16.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.16.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.16.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.16.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.17.input_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.17.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.17.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.17.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.17.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.17.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.17.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.17.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.17.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.18.input_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.18.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.18.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.18.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.18.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.18.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.18.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.18.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.18.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.19.input_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.19.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.19.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.19.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.19.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.19.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.19.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.19.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.19.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.2.input_layernorm.weight": "model-00001-of-00005.safetensors",
- "model.layers.2.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.2.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
- "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.20.input_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.20.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.20.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.20.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.20.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.20.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.20.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.20.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.20.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.21.input_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.21.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.21.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.21.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.21.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.21.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.21.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.21.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.21.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.22.input_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.22.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.22.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.22.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.22.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.22.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.22.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.22.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.22.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.23.input_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.23.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.23.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.23.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.23.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
- "model.layers.23.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.23.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.23.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.23.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.24.input_layernorm.weight": "model-00004-of-00005.safetensors",
- "model.layers.24.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.24.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.24.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.24.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
- "model.layers.24.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.24.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.24.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.24.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
- "model.layers.25.input_layernorm.weight": "model-00004-of-00005.safetensors",
- "model.layers.25.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.25.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.25.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.25.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
- "model.layers.25.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.25.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.25.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.25.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.26.input_layernorm.weight": "model-00004-of-00005.safetensors",
- "model.layers.26.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.26.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.26.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.26.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
- "model.layers.26.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.26.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.26.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.26.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.27.input_layernorm.weight": "model-00004-of-00005.safetensors",
- "model.layers.27.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.27.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.27.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.27.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
- "model.layers.27.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.27.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.27.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.27.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.28.input_layernorm.weight": "model-00004-of-00005.safetensors",
- "model.layers.28.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.28.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.28.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.28.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
- "model.layers.28.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.28.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.28.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.28.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.29.input_layernorm.weight": "model-00004-of-00005.safetensors",
- "model.layers.29.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.29.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.29.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.29.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
- "model.layers.29.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.29.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.29.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.29.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.3.input_layernorm.weight": "model-00001-of-00005.safetensors",
- "model.layers.3.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.3.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
- "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
- "model.layers.30.input_layernorm.weight": "model-00004-of-00005.safetensors",
- "model.layers.30.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.30.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.30.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.30.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
- "model.layers.30.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.30.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.30.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.30.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.31.input_layernorm.weight": "model-00004-of-00005.safetensors",
- "model.layers.31.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.31.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.31.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.31.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
- "model.layers.31.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.31.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.31.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.31.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
- "model.layers.32.input_layernorm.weight": "model-00004-of-00005.safetensors",
243
- "model.layers.32.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
244
- "model.layers.32.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
245
- "model.layers.32.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
246
- "model.layers.32.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
247
- "model.layers.32.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
248
- "model.layers.32.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
249
- "model.layers.32.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
250
- "model.layers.32.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
251
- "model.layers.33.input_layernorm.weight": "model-00005-of-00005.safetensors",
252
- "model.layers.33.mlp.down_proj.weight": "model-00005-of-00005.safetensors",
253
- "model.layers.33.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
254
- "model.layers.33.mlp.up_proj.weight": "model-00005-of-00005.safetensors",
255
- "model.layers.33.post_attention_layernorm.weight": "model-00005-of-00005.safetensors",
256
- "model.layers.33.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
257
- "model.layers.33.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
258
- "model.layers.33.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
259
- "model.layers.33.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
260
- "model.layers.34.input_layernorm.weight": "model-00005-of-00005.safetensors",
261
- "model.layers.34.mlp.down_proj.weight": "model-00005-of-00005.safetensors",
262
- "model.layers.34.mlp.gate_proj.weight": "model-00005-of-00005.safetensors",
263
- "model.layers.34.mlp.up_proj.weight": "model-00005-of-00005.safetensors",
264
- "model.layers.34.post_attention_layernorm.weight": "model-00005-of-00005.safetensors",
265
- "model.layers.34.self_attn.k_proj.weight": "model-00005-of-00005.safetensors",
266
- "model.layers.34.self_attn.o_proj.weight": "model-00005-of-00005.safetensors",
267
- "model.layers.34.self_attn.q_proj.weight": "model-00005-of-00005.safetensors",
268
- "model.layers.34.self_attn.v_proj.weight": "model-00005-of-00005.safetensors",
269
- "model.layers.35.input_layernorm.weight": "model-00005-of-00005.safetensors",
270
- "model.layers.35.mlp.down_proj.weight": "model-00005-of-00005.safetensors",
271
- "model.layers.35.mlp.gate_proj.weight": "model-00005-of-00005.safetensors",
272
- "model.layers.35.mlp.up_proj.weight": "model-00005-of-00005.safetensors",
273
- "model.layers.35.post_attention_layernorm.weight": "model-00005-of-00005.safetensors",
274
- "model.layers.35.self_attn.k_proj.weight": "model-00005-of-00005.safetensors",
275
- "model.layers.35.self_attn.o_proj.weight": "model-00005-of-00005.safetensors",
276
- "model.layers.35.self_attn.q_proj.weight": "model-00005-of-00005.safetensors",
277
- "model.layers.35.self_attn.v_proj.weight": "model-00005-of-00005.safetensors",
278
- "model.layers.36.input_layernorm.weight": "model-00005-of-00005.safetensors",
279
- "model.layers.36.mlp.down_proj.weight": "model-00005-of-00005.safetensors",
280
- "model.layers.36.mlp.gate_proj.weight": "model-00005-of-00005.safetensors",
281
- "model.layers.36.mlp.up_proj.weight": "model-00005-of-00005.safetensors",
282
- "model.layers.36.post_attention_layernorm.weight": "model-00005-of-00005.safetensors",
283
- "model.layers.36.self_attn.k_proj.weight": "model-00005-of-00005.safetensors",
284
- "model.layers.36.self_attn.o_proj.weight": "model-00005-of-00005.safetensors",
285
- "model.layers.36.self_attn.q_proj.weight": "model-00005-of-00005.safetensors",
286
- "model.layers.36.self_attn.v_proj.weight": "model-00005-of-00005.safetensors",
287
- "model.layers.37.input_layernorm.weight": "model-00005-of-00005.safetensors",
288
- "model.layers.37.mlp.down_proj.weight": "model-00005-of-00005.safetensors",
289
- "model.layers.37.mlp.gate_proj.weight": "model-00005-of-00005.safetensors",
290
- "model.layers.37.mlp.up_proj.weight": "model-00005-of-00005.safetensors",
291
- "model.layers.37.post_attention_layernorm.weight": "model-00005-of-00005.safetensors",
292
- "model.layers.37.self_attn.k_proj.weight": "model-00005-of-00005.safetensors",
293
- "model.layers.37.self_attn.o_proj.weight": "model-00005-of-00005.safetensors",
294
- "model.layers.37.self_attn.q_proj.weight": "model-00005-of-00005.safetensors",
295
- "model.layers.37.self_attn.v_proj.weight": "model-00005-of-00005.safetensors",
296
- "model.layers.38.input_layernorm.weight": "model-00005-of-00005.safetensors",
297
- "model.layers.38.mlp.down_proj.weight": "model-00005-of-00005.safetensors",
298
- "model.layers.38.mlp.gate_proj.weight": "model-00005-of-00005.safetensors",
299
- "model.layers.38.mlp.up_proj.weight": "model-00005-of-00005.safetensors",
300
- "model.layers.38.post_attention_layernorm.weight": "model-00005-of-00005.safetensors",
301
- "model.layers.38.self_attn.k_proj.weight": "model-00005-of-00005.safetensors",
302
- "model.layers.38.self_attn.o_proj.weight": "model-00005-of-00005.safetensors",
303
- "model.layers.38.self_attn.q_proj.weight": "model-00005-of-00005.safetensors",
304
- "model.layers.38.self_attn.v_proj.weight": "model-00005-of-00005.safetensors",
305
- "model.layers.39.input_layernorm.weight": "model-00005-of-00005.safetensors",
306
- "model.layers.39.mlp.down_proj.weight": "model-00005-of-00005.safetensors",
307
- "model.layers.39.mlp.gate_proj.weight": "model-00005-of-00005.safetensors",
308
- "model.layers.39.mlp.up_proj.weight": "model-00005-of-00005.safetensors",
309
- "model.layers.39.post_attention_layernorm.weight": "model-00005-of-00005.safetensors",
310
- "model.layers.39.self_attn.k_proj.weight": "model-00005-of-00005.safetensors",
311
- "model.layers.39.self_attn.o_proj.weight": "model-00005-of-00005.safetensors",
312
- "model.layers.39.self_attn.q_proj.weight": "model-00005-of-00005.safetensors",
313
- "model.layers.39.self_attn.v_proj.weight": "model-00005-of-00005.safetensors",
314
- "model.layers.4.input_layernorm.weight": "model-00001-of-00005.safetensors",
315
- "model.layers.4.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
316
- "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
317
- "model.layers.4.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
318
- "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
319
- "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
320
- "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
321
- "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
322
- "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
323
- "model.layers.5.input_layernorm.weight": "model-00001-of-00005.safetensors",
324
- "model.layers.5.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
325
- "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
326
- "model.layers.5.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
327
- "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
328
- "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
329
- "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
330
- "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
331
- "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
332
- "model.layers.6.input_layernorm.weight": "model-00002-of-00005.safetensors",
333
- "model.layers.6.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
334
- "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
335
- "model.layers.6.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
336
- "model.layers.6.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
337
- "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
338
- "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
339
- "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
340
- "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
341
- "model.layers.7.input_layernorm.weight": "model-00002-of-00005.safetensors",
342
- "model.layers.7.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
343
- "model.layers.7.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
344
- "model.layers.7.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
345
- "model.layers.7.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
346
- "model.layers.7.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
347
- "model.layers.7.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
348
- "model.layers.7.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
349
- "model.layers.7.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
350
- "model.layers.8.input_layernorm.weight": "model-00002-of-00005.safetensors",
351
- "model.layers.8.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
352
- "model.layers.8.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
353
- "model.layers.8.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
354
- "model.layers.8.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
355
- "model.layers.8.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
356
- "model.layers.8.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
357
- "model.layers.8.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
358
- "model.layers.8.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
359
- "model.layers.9.input_layernorm.weight": "model-00002-of-00005.safetensors",
360
- "model.layers.9.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
361
- "model.layers.9.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
362
- "model.layers.9.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
363
- "model.layers.9.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
364
- "model.layers.9.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
365
- "model.layers.9.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
366
- "model.layers.9.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
367
- "model.layers.9.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
368
- "model.norm.weight": "model-00005-of-00005.safetensors"
369
- }
370
- }
 
+ {"metadata": {"mergekit_version": "0.0.4.4", "total_size": 24495564800}, "weight_map": {"lm_head.weight": "model-00001-of-00005.safetensors", "model.embed_tokens.weight": "model-00001-of-00005.safetensors", "model.layers.0.input_layernorm.weight": "model-00001-of-00005.safetensors", "model.layers.0.mlp.down_proj.weight": "model-00001-of-00005.safetensors", "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00005.safetensors", "model.layers.0.mlp.up_proj.weight": "model-00001-of-00005.safetensors", "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00005.safetensors", "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00005.safetensors", "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00005.safetensors", "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00005.safetensors", "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00005.safetensors", "model.layers.1.input_layernorm.weight": "model-00001-of-00005.safetensors", "model.layers.1.mlp.down_proj.weight": "model-00001-of-00005.safetensors", "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00005.safetensors", "model.layers.1.mlp.up_proj.weight": "model-00001-of-00005.safetensors", "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00005.safetensors", "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00005.safetensors", "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00005.safetensors", "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00005.safetensors", "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00005.safetensors", "model.layers.10.input_layernorm.weight": "model-00001-of-00005.safetensors", "model.layers.10.mlp.down_proj.weight": "model-00001-of-00005.safetensors", "model.layers.10.mlp.gate_proj.weight": "model-00001-of-00005.safetensors", "model.layers.10.mlp.up_proj.weight": "model-00001-of-00005.safetensors", "model.layers.10.post_attention_layernorm.weight": "model-00001-of-00005.safetensors", 
"model.layers.10.self_attn.k_proj.weight": "model-00001-of-00005.safetensors", "model.layers.10.self_attn.o_proj.weight": "model-00001-of-00005.safetensors", "model.layers.10.self_attn.q_proj.weight": "model-00001-of-00005.safetensors", "model.layers.10.self_attn.v_proj.weight": "model-00001-of-00005.safetensors", "model.layers.11.input_layernorm.weight": "model-00001-of-00005.safetensors", "model.layers.11.mlp.down_proj.weight": "model-00001-of-00005.safetensors", "model.layers.11.mlp.gate_proj.weight": "model-00001-of-00005.safetensors", "model.layers.11.mlp.up_proj.weight": "model-00001-of-00005.safetensors", "model.layers.11.post_attention_layernorm.weight": "model-00001-of-00005.safetensors", "model.layers.11.self_attn.k_proj.weight": "model-00001-of-00005.safetensors", "model.layers.11.self_attn.o_proj.weight": "model-00001-of-00005.safetensors", "model.layers.11.self_attn.q_proj.weight": "model-00001-of-00005.safetensors", "model.layers.11.self_attn.v_proj.weight": "model-00001-of-00005.safetensors", "model.layers.12.input_layernorm.weight": "model-00001-of-00005.safetensors", "model.layers.12.mlp.down_proj.weight": "model-00002-of-00005.safetensors", "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00005.safetensors", "model.layers.12.mlp.up_proj.weight": "model-00002-of-00005.safetensors", "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00005.safetensors", "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00005.safetensors", "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00005.safetensors", "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00005.safetensors", "model.layers.13.input_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.13.mlp.down_proj.weight": "model-00002-of-00005.safetensors", "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00005.safetensors", "model.layers.13.mlp.up_proj.weight": 
"model-00002-of-00005.safetensors", "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00005.safetensors", "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00005.safetensors", "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00005.safetensors", "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00005.safetensors", "model.layers.14.input_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.14.mlp.down_proj.weight": "model-00002-of-00005.safetensors", "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00005.safetensors", "model.layers.14.mlp.up_proj.weight": "model-00002-of-00005.safetensors", "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00005.safetensors", "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00005.safetensors", "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00005.safetensors", "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00005.safetensors", "model.layers.15.input_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.15.mlp.down_proj.weight": "model-00002-of-00005.safetensors", "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00005.safetensors", "model.layers.15.mlp.up_proj.weight": "model-00002-of-00005.safetensors", "model.layers.15.post_attention_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00005.safetensors", "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00005.safetensors", "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00005.safetensors", "model.layers.15.self_attn.v_proj.weight": "model-00002-of-00005.safetensors", "model.layers.16.input_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.16.mlp.down_proj.weight": 
"model-00002-of-00005.safetensors", "model.layers.16.mlp.gate_proj.weight": "model-00002-of-00005.safetensors", "model.layers.16.mlp.up_proj.weight": "model-00002-of-00005.safetensors", "model.layers.16.post_attention_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.16.self_attn.k_proj.weight": "model-00002-of-00005.safetensors", "model.layers.16.self_attn.o_proj.weight": "model-00002-of-00005.safetensors", "model.layers.16.self_attn.q_proj.weight": "model-00002-of-00005.safetensors", "model.layers.16.self_attn.v_proj.weight": "model-00002-of-00005.safetensors", "model.layers.17.input_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.17.mlp.down_proj.weight": "model-00002-of-00005.safetensors", "model.layers.17.mlp.gate_proj.weight": "model-00002-of-00005.safetensors", "model.layers.17.mlp.up_proj.weight": "model-00002-of-00005.safetensors", "model.layers.17.post_attention_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.17.self_attn.k_proj.weight": "model-00002-of-00005.safetensors", "model.layers.17.self_attn.o_proj.weight": "model-00002-of-00005.safetensors", "model.layers.17.self_attn.q_proj.weight": "model-00002-of-00005.safetensors", "model.layers.17.self_attn.v_proj.weight": "model-00002-of-00005.safetensors", "model.layers.18.input_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.18.mlp.down_proj.weight": "model-00002-of-00005.safetensors", "model.layers.18.mlp.gate_proj.weight": "model-00002-of-00005.safetensors", "model.layers.18.mlp.up_proj.weight": "model-00002-of-00005.safetensors", "model.layers.18.post_attention_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.18.self_attn.k_proj.weight": "model-00002-of-00005.safetensors", "model.layers.18.self_attn.o_proj.weight": "model-00002-of-00005.safetensors", "model.layers.18.self_attn.q_proj.weight": "model-00002-of-00005.safetensors", "model.layers.18.self_attn.v_proj.weight": 
"model-00002-of-00005.safetensors", "model.layers.19.input_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.19.mlp.down_proj.weight": "model-00002-of-00005.safetensors", "model.layers.19.mlp.gate_proj.weight": "model-00002-of-00005.safetensors", "model.layers.19.mlp.up_proj.weight": "model-00002-of-00005.safetensors", "model.layers.19.post_attention_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.19.self_attn.k_proj.weight": "model-00002-of-00005.safetensors", "model.layers.19.self_attn.o_proj.weight": "model-00002-of-00005.safetensors", "model.layers.19.self_attn.q_proj.weight": "model-00002-of-00005.safetensors", "model.layers.19.self_attn.v_proj.weight": "model-00002-of-00005.safetensors", "model.layers.2.input_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.2.mlp.down_proj.weight": "model-00002-of-00005.safetensors", "model.layers.2.mlp.gate_proj.weight": "model-00002-of-00005.safetensors", "model.layers.2.mlp.up_proj.weight": "model-00002-of-00005.safetensors", "model.layers.2.post_attention_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.2.self_attn.k_proj.weight": "model-00002-of-00005.safetensors", "model.layers.2.self_attn.o_proj.weight": "model-00002-of-00005.safetensors", "model.layers.2.self_attn.q_proj.weight": "model-00002-of-00005.safetensors", "model.layers.2.self_attn.v_proj.weight": "model-00002-of-00005.safetensors", "model.layers.20.input_layernorm.weight": "model-00002-of-00005.safetensors", "model.layers.20.mlp.down_proj.weight": "model-00003-of-00005.safetensors", "model.layers.20.mlp.gate_proj.weight": "model-00003-of-00005.safetensors", "model.layers.20.mlp.up_proj.weight": "model-00003-of-00005.safetensors", "model.layers.20.post_attention_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.20.self_attn.k_proj.weight": "model-00003-of-00005.safetensors", "model.layers.20.self_attn.o_proj.weight": "model-00003-of-00005.safetensors", 
"model.layers.20.self_attn.q_proj.weight": "model-00003-of-00005.safetensors", "model.layers.20.self_attn.v_proj.weight": "model-00003-of-00005.safetensors", "model.layers.21.input_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.21.mlp.down_proj.weight": "model-00003-of-00005.safetensors", "model.layers.21.mlp.gate_proj.weight": "model-00003-of-00005.safetensors", "model.layers.21.mlp.up_proj.weight": "model-00003-of-00005.safetensors", "model.layers.21.post_attention_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.21.self_attn.k_proj.weight": "model-00003-of-00005.safetensors", "model.layers.21.self_attn.o_proj.weight": "model-00003-of-00005.safetensors", "model.layers.21.self_attn.q_proj.weight": "model-00003-of-00005.safetensors", "model.layers.21.self_attn.v_proj.weight": "model-00003-of-00005.safetensors", "model.layers.22.input_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.22.mlp.down_proj.weight": "model-00003-of-00005.safetensors", "model.layers.22.mlp.gate_proj.weight": "model-00003-of-00005.safetensors", "model.layers.22.mlp.up_proj.weight": "model-00003-of-00005.safetensors", "model.layers.22.post_attention_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.22.self_attn.k_proj.weight": "model-00003-of-00005.safetensors", "model.layers.22.self_attn.o_proj.weight": "model-00003-of-00005.safetensors", "model.layers.22.self_attn.q_proj.weight": "model-00003-of-00005.safetensors", "model.layers.22.self_attn.v_proj.weight": "model-00003-of-00005.safetensors", "model.layers.23.input_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.23.mlp.down_proj.weight": "model-00003-of-00005.safetensors", "model.layers.23.mlp.gate_proj.weight": "model-00003-of-00005.safetensors", "model.layers.23.mlp.up_proj.weight": "model-00003-of-00005.safetensors", "model.layers.23.post_attention_layernorm.weight": "model-00003-of-00005.safetensors", 
"model.layers.23.self_attn.k_proj.weight": "model-00003-of-00005.safetensors", "model.layers.23.self_attn.o_proj.weight": "model-00003-of-00005.safetensors", "model.layers.23.self_attn.q_proj.weight": "model-00003-of-00005.safetensors", "model.layers.23.self_attn.v_proj.weight": "model-00003-of-00005.safetensors", "model.layers.24.input_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.24.mlp.down_proj.weight": "model-00003-of-00005.safetensors", "model.layers.24.mlp.gate_proj.weight": "model-00003-of-00005.safetensors", "model.layers.24.mlp.up_proj.weight": "model-00003-of-00005.safetensors", "model.layers.24.post_attention_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.24.self_attn.k_proj.weight": "model-00003-of-00005.safetensors", "model.layers.24.self_attn.o_proj.weight": "model-00003-of-00005.safetensors", "model.layers.24.self_attn.q_proj.weight": "model-00003-of-00005.safetensors", "model.layers.24.self_attn.v_proj.weight": "model-00003-of-00005.safetensors", "model.layers.25.input_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.25.mlp.down_proj.weight": "model-00003-of-00005.safetensors", "model.layers.25.mlp.gate_proj.weight": "model-00003-of-00005.safetensors", "model.layers.25.mlp.up_proj.weight": "model-00003-of-00005.safetensors", "model.layers.25.post_attention_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.25.self_attn.k_proj.weight": "model-00003-of-00005.safetensors", "model.layers.25.self_attn.o_proj.weight": "model-00003-of-00005.safetensors", "model.layers.25.self_attn.q_proj.weight": "model-00003-of-00005.safetensors", "model.layers.25.self_attn.v_proj.weight": "model-00003-of-00005.safetensors", "model.layers.26.input_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.26.mlp.down_proj.weight": "model-00003-of-00005.safetensors", "model.layers.26.mlp.gate_proj.weight": "model-00003-of-00005.safetensors", "model.layers.26.mlp.up_proj.weight": 
"model-00003-of-00005.safetensors", "model.layers.26.post_attention_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.26.self_attn.k_proj.weight": "model-00003-of-00005.safetensors", "model.layers.26.self_attn.o_proj.weight": "model-00003-of-00005.safetensors", "model.layers.26.self_attn.q_proj.weight": "model-00003-of-00005.safetensors", "model.layers.26.self_attn.v_proj.weight": "model-00003-of-00005.safetensors", "model.layers.27.input_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.27.mlp.down_proj.weight": "model-00003-of-00005.safetensors", "model.layers.27.mlp.gate_proj.weight": "model-00003-of-00005.safetensors", "model.layers.27.mlp.up_proj.weight": "model-00003-of-00005.safetensors", "model.layers.27.post_attention_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.27.self_attn.k_proj.weight": "model-00003-of-00005.safetensors", "model.layers.27.self_attn.o_proj.weight": "model-00003-of-00005.safetensors", "model.layers.27.self_attn.q_proj.weight": "model-00003-of-00005.safetensors", "model.layers.27.self_attn.v_proj.weight": "model-00003-of-00005.safetensors", "model.layers.28.input_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.28.mlp.down_proj.weight": "model-00003-of-00005.safetensors", "model.layers.28.mlp.gate_proj.weight": "model-00003-of-00005.safetensors", "model.layers.28.mlp.up_proj.weight": "model-00003-of-00005.safetensors", "model.layers.28.post_attention_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.28.self_attn.k_proj.weight": "model-00003-of-00005.safetensors", "model.layers.28.self_attn.o_proj.weight": "model-00003-of-00005.safetensors", "model.layers.28.self_attn.q_proj.weight": "model-00003-of-00005.safetensors", "model.layers.28.self_attn.v_proj.weight": "model-00003-of-00005.safetensors", "model.layers.29.input_layernorm.weight": "model-00003-of-00005.safetensors", "model.layers.29.mlp.down_proj.weight": 
"model-00004-of-00005.safetensors", "model.layers.29.mlp.gate_proj.weight": "model-00004-of-00005.safetensors", "model.layers.29.mlp.up_proj.weight": "model-00004-of-00005.safetensors", "model.layers.29.post_attention_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.29.self_attn.k_proj.weight": "model-00004-of-00005.safetensors", "model.layers.29.self_attn.o_proj.weight": "model-00004-of-00005.safetensors", "model.layers.29.self_attn.q_proj.weight": "model-00004-of-00005.safetensors", "model.layers.29.self_attn.v_proj.weight": "model-00004-of-00005.safetensors", "model.layers.3.input_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.3.mlp.down_proj.weight": "model-00004-of-00005.safetensors", "model.layers.3.mlp.gate_proj.weight": "model-00004-of-00005.safetensors", "model.layers.3.mlp.up_proj.weight": "model-00004-of-00005.safetensors", "model.layers.3.post_attention_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.3.self_attn.k_proj.weight": "model-00004-of-00005.safetensors", "model.layers.3.self_attn.o_proj.weight": "model-00004-of-00005.safetensors", "model.layers.3.self_attn.q_proj.weight": "model-00004-of-00005.safetensors", "model.layers.3.self_attn.v_proj.weight": "model-00004-of-00005.safetensors", "model.layers.30.input_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.30.mlp.down_proj.weight": "model-00004-of-00005.safetensors", "model.layers.30.mlp.gate_proj.weight": "model-00004-of-00005.safetensors", "model.layers.30.mlp.up_proj.weight": "model-00004-of-00005.safetensors", "model.layers.30.post_attention_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.30.self_attn.k_proj.weight": "model-00004-of-00005.safetensors", "model.layers.30.self_attn.o_proj.weight": "model-00004-of-00005.safetensors", "model.layers.30.self_attn.q_proj.weight": "model-00004-of-00005.safetensors", "model.layers.30.self_attn.v_proj.weight": "model-00004-of-00005.safetensors", 
"model.layers.31.input_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.31.mlp.down_proj.weight": "model-00004-of-00005.safetensors", "model.layers.31.mlp.gate_proj.weight": "model-00004-of-00005.safetensors", "model.layers.31.mlp.up_proj.weight": "model-00004-of-00005.safetensors", "model.layers.31.post_attention_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.31.self_attn.k_proj.weight": "model-00004-of-00005.safetensors", "model.layers.31.self_attn.o_proj.weight": "model-00004-of-00005.safetensors", "model.layers.31.self_attn.q_proj.weight": "model-00004-of-00005.safetensors", "model.layers.31.self_attn.v_proj.weight": "model-00004-of-00005.safetensors", "model.layers.32.input_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.32.mlp.down_proj.weight": "model-00004-of-00005.safetensors", "model.layers.32.mlp.gate_proj.weight": "model-00004-of-00005.safetensors", "model.layers.32.mlp.up_proj.weight": "model-00004-of-00005.safetensors", "model.layers.32.post_attention_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.32.self_attn.k_proj.weight": "model-00004-of-00005.safetensors", "model.layers.32.self_attn.o_proj.weight": "model-00004-of-00005.safetensors", "model.layers.32.self_attn.q_proj.weight": "model-00004-of-00005.safetensors", "model.layers.32.self_attn.v_proj.weight": "model-00004-of-00005.safetensors", "model.layers.33.input_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.33.mlp.down_proj.weight": "model-00004-of-00005.safetensors", "model.layers.33.mlp.gate_proj.weight": "model-00004-of-00005.safetensors", "model.layers.33.mlp.up_proj.weight": "model-00004-of-00005.safetensors", "model.layers.33.post_attention_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.33.self_attn.k_proj.weight": "model-00004-of-00005.safetensors", "model.layers.33.self_attn.o_proj.weight": "model-00004-of-00005.safetensors", 
"model.layers.33.self_attn.q_proj.weight": "model-00004-of-00005.safetensors", "model.layers.33.self_attn.v_proj.weight": "model-00004-of-00005.safetensors", "model.layers.34.input_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.34.mlp.down_proj.weight": "model-00004-of-00005.safetensors", "model.layers.34.mlp.gate_proj.weight": "model-00004-of-00005.safetensors", "model.layers.34.mlp.up_proj.weight": "model-00004-of-00005.safetensors", "model.layers.34.post_attention_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.34.self_attn.k_proj.weight": "model-00004-of-00005.safetensors", "model.layers.34.self_attn.o_proj.weight": "model-00004-of-00005.safetensors", "model.layers.34.self_attn.q_proj.weight": "model-00004-of-00005.safetensors", "model.layers.34.self_attn.v_proj.weight": "model-00004-of-00005.safetensors", "model.layers.35.input_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.35.mlp.down_proj.weight": "model-00004-of-00005.safetensors", "model.layers.35.mlp.gate_proj.weight": "model-00004-of-00005.safetensors", "model.layers.35.mlp.up_proj.weight": "model-00004-of-00005.safetensors", "model.layers.35.post_attention_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.35.self_attn.k_proj.weight": "model-00004-of-00005.safetensors", "model.layers.35.self_attn.o_proj.weight": "model-00004-of-00005.safetensors", "model.layers.35.self_attn.q_proj.weight": "model-00004-of-00005.safetensors", "model.layers.35.self_attn.v_proj.weight": "model-00004-of-00005.safetensors", "model.layers.36.input_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.36.mlp.down_proj.weight": "model-00004-of-00005.safetensors", "model.layers.36.mlp.gate_proj.weight": "model-00004-of-00005.safetensors", "model.layers.36.mlp.up_proj.weight": "model-00004-of-00005.safetensors", "model.layers.36.post_attention_layernorm.weight": "model-00004-of-00005.safetensors", 
"model.layers.36.self_attn.k_proj.weight": "model-00004-of-00005.safetensors", "model.layers.36.self_attn.o_proj.weight": "model-00004-of-00005.safetensors", "model.layers.36.self_attn.q_proj.weight": "model-00004-of-00005.safetensors", "model.layers.36.self_attn.v_proj.weight": "model-00004-of-00005.safetensors", "model.layers.37.input_layernorm.weight": "model-00004-of-00005.safetensors", "model.layers.37.mlp.down_proj.weight": "model-00005-of-00005.safetensors", "model.layers.37.mlp.gate_proj.weight": "model-00005-of-00005.safetensors", "model.layers.37.mlp.up_proj.weight": "model-00005-of-00005.safetensors", "model.layers.37.post_attention_layernorm.weight": "model-00005-of-00005.safetensors", "model.layers.37.self_attn.k_proj.weight": "model-00005-of-00005.safetensors", "model.layers.37.self_attn.o_proj.weight": "model-00005-of-00005.safetensors", "model.layers.37.self_attn.q_proj.weight": "model-00005-of-00005.safetensors", "model.layers.37.self_attn.v_proj.weight": "model-00005-of-00005.safetensors", "model.layers.38.input_layernorm.weight": "model-00005-of-00005.safetensors", "model.layers.38.mlp.down_proj.weight": "model-00005-of-00005.safetensors", "model.layers.38.mlp.gate_proj.weight": "model-00005-of-00005.safetensors", "model.layers.38.mlp.up_proj.weight": "model-00005-of-00005.safetensors", "model.layers.38.post_attention_layernorm.weight": "model-00005-of-00005.safetensors", "model.layers.38.self_attn.k_proj.weight": "model-00005-of-00005.safetensors", "model.layers.38.self_attn.o_proj.weight": "model-00005-of-00005.safetensors", "model.layers.38.self_attn.q_proj.weight": "model-00005-of-00005.safetensors", "model.layers.38.self_attn.v_proj.weight": "model-00005-of-00005.safetensors", "model.layers.39.input_layernorm.weight": "model-00005-of-00005.safetensors", "model.layers.39.mlp.down_proj.weight": "model-00005-of-00005.safetensors", "model.layers.39.mlp.gate_proj.weight": "model-00005-of-00005.safetensors", "model.layers.39.mlp.up_proj.weight": 
"model-00005-of-00005.safetensors", "model.layers.39.post_attention_layernorm.weight": "model-00005-of-00005.safetensors", "model.layers.39.self_attn.k_proj.weight": "model-00005-of-00005.safetensors", "model.layers.39.self_attn.o_proj.weight": "model-00005-of-00005.safetensors", "model.layers.39.self_attn.q_proj.weight": "model-00005-of-00005.safetensors", "model.layers.39.self_attn.v_proj.weight": "model-00005-of-00005.safetensors", "model.layers.4.input_layernorm.weight": "model-00005-of-00005.safetensors", "model.layers.4.mlp.down_proj.weight": "model-00005-of-00005.safetensors", "model.layers.4.mlp.gate_proj.weight": "model-00005-of-00005.safetensors", "model.layers.4.mlp.up_proj.weight": "model-00005-of-00005.safetensors", "model.layers.4.post_attention_layernorm.weight": "model-00005-of-00005.safetensors", "model.layers.4.self_attn.k_proj.weight": "model-00005-of-00005.safetensors", "model.layers.4.self_attn.o_proj.weight": "model-00005-of-00005.safetensors", "model.layers.4.self_attn.q_proj.weight": "model-00005-of-00005.safetensors", "model.layers.4.self_attn.v_proj.weight": "model-00005-of-00005.safetensors", "model.layers.5.input_layernorm.weight": "model-00005-of-00005.safetensors", "model.layers.5.mlp.down_proj.weight": "model-00005-of-00005.safetensors", "model.layers.5.mlp.gate_proj.weight": "model-00005-of-00005.safetensors", "model.layers.5.mlp.up_proj.weight": "model-00005-of-00005.safetensors", "model.layers.5.post_attention_layernorm.weight": "model-00005-of-00005.safetensors", "model.layers.5.self_attn.k_proj.weight": "model-00005-of-00005.safetensors", "model.layers.5.self_attn.o_proj.weight": "model-00005-of-00005.safetensors", "model.layers.5.self_attn.q_proj.weight": "model-00005-of-00005.safetensors", "model.layers.5.self_attn.v_proj.weight": "model-00005-of-00005.safetensors", "model.layers.6.input_layernorm.weight": "model-00005-of-00005.safetensors", "model.layers.6.mlp.down_proj.weight": "model-00005-of-00005.safetensors", 
"model.layers.6.mlp.gate_proj.weight": "model-00005-of-00005.safetensors", "model.layers.6.mlp.up_proj.weight": "model-00005-of-00005.safetensors", "model.layers.6.post_attention_layernorm.weight": "model-00005-of-00005.safetensors", "model.layers.6.self_attn.k_proj.weight": "model-00005-of-00005.safetensors", "model.layers.6.self_attn.o_proj.weight": "model-00005-of-00005.safetensors", "model.layers.6.self_attn.q_proj.weight": "model-00005-of-00005.safetensors", "model.layers.6.self_attn.v_proj.weight": "model-00005-of-00005.safetensors", "model.layers.7.input_layernorm.weight": "model-00005-of-00005.safetensors", "model.layers.7.mlp.down_proj.weight": "model-00005-of-00005.safetensors", "model.layers.7.mlp.gate_proj.weight": "model-00005-of-00005.safetensors", "model.layers.7.mlp.up_proj.weight": "model-00005-of-00005.safetensors", "model.layers.7.post_attention_layernorm.weight": "model-00005-of-00005.safetensors", "model.layers.7.self_attn.k_proj.weight": "model-00005-of-00005.safetensors", "model.layers.7.self_attn.o_proj.weight": "model-00005-of-00005.safetensors", "model.layers.7.self_attn.q_proj.weight": "model-00005-of-00005.safetensors", "model.layers.7.self_attn.v_proj.weight": "model-00005-of-00005.safetensors", "model.layers.8.input_layernorm.weight": "model-00005-of-00005.safetensors", "model.layers.8.mlp.down_proj.weight": "model-00005-of-00005.safetensors", "model.layers.8.mlp.gate_proj.weight": "model-00005-of-00005.safetensors", "model.layers.8.mlp.up_proj.weight": "model-00005-of-00005.safetensors", "model.layers.8.post_attention_layernorm.weight": "model-00005-of-00005.safetensors", "model.layers.8.self_attn.k_proj.weight": "model-00005-of-00005.safetensors", "model.layers.8.self_attn.o_proj.weight": "model-00005-of-00005.safetensors", "model.layers.8.self_attn.q_proj.weight": "model-00005-of-00005.safetensors", "model.layers.8.self_attn.v_proj.weight": "model-00005-of-00005.safetensors", "model.layers.9.input_layernorm.weight": 
"model-00005-of-00005.safetensors", "model.layers.9.mlp.down_proj.weight": "model-00005-of-00005.safetensors", "model.layers.9.mlp.gate_proj.weight": "model-00005-of-00005.safetensors", "model.layers.9.mlp.up_proj.weight": "model-00005-of-00005.safetensors", "model.layers.9.post_attention_layernorm.weight": "model-00005-of-00005.safetensors", "model.layers.9.self_attn.k_proj.weight": "model-00005-of-00005.safetensors", "model.layers.9.self_attn.o_proj.weight": "model-00005-of-00005.safetensors", "model.layers.9.self_attn.q_proj.weight": "model-00005-of-00005.safetensors", "model.layers.9.self_attn.v_proj.weight": "model-00005-of-00005.safetensors", "model.norm.weight": "model-00005-of-00005.safetensors"}}