Update README.md

README.md +32 -0

@@ -71,6 +71,38 @@ Step	Training Loss
 37	0.686400
 38	0.724200
+The merged model has the following structure (PyTorch module readout):
+```
+LlamaForCausalLM(
+  (model): LlamaModel(
+    (embed_tokens): Embedding(64000, 4096)
+    (layers): ModuleList(
+      (0-31): 32 x LlamaDecoderLayer(
+        (self_attn): LlamaSdpaAttention(
+          (q_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
+          (k_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
+          (v_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
+          (o_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
+          (rotary_emb): LlamaRotaryEmbedding()
+        )
+        (mlp): LlamaMLP(
+          (gate_proj): Linear4bit(in_features=4096, out_features=11008, bias=False)
+          (up_proj): Linear4bit(in_features=4096, out_features=11008, bias=False)
+          (down_proj): Linear4bit(in_features=11008, out_features=4096, bias=False)
+          (act_fn): SiLU()
+        )
+        (input_layernorm): LlamaRMSNorm((4096,), eps=1e-05)
+        (post_attention_layernorm): LlamaRMSNorm((4096,), eps=1e-05)
+      )
+    )
+    (norm): LlamaRMSNorm((4096,), eps=1e-05)
+    (rotary_emb): LlamaRotaryEmbedding()
+  )
+  (lm_head): Linear(in_features=4096, out_features=64000, bias=False)
+)
+```
 This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
 # **JAIS Adapted 7B Chat Merged with V4 LoRA adapters on Google Colab via:**
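
As a rough cross-check of the module readout added in this change, the layer shapes it lists can be turned into a logical parameter count. This is only a sketch: it assumes `lm_head` is not weight-tied to `embed_tokens` (the readout alone cannot confirm that), it counts each `Linear4bit` layer by its logical `in_features x out_features` shape rather than by its packed 4-bit storage, and the function and key names are made up for illustration.

```python
# Logical (unquantized) parameter count implied by the shapes in the readout.
# All four constants are copied from the printed module structure.
VOCAB, HIDDEN, FFN, LAYERS = 64000, 4096, 11008, 32


def count_parameters(vocab=VOCAB, hidden=HIDDEN, ffn=FFN, layers=LAYERS):
    """Return a breakdown of weights implied by the printed layer shapes."""
    attention = 4 * hidden * hidden   # q_proj, k_proj, v_proj, o_proj (no bias)
    mlp = 3 * hidden * ffn            # gate_proj, up_proj, down_proj (no bias)
    norms = 2 * hidden                # input and post-attention RMSNorm weights
    per_layer = attention + mlp + norms

    embeddings = vocab * hidden       # embed_tokens
    lm_head = hidden * vocab          # assumed NOT tied to embed_tokens
    final_norm = hidden               # model.norm

    total = layers * per_layer + embeddings + lm_head + final_norm
    return {
        "per_layer": per_layer,
        "embeddings": embeddings,
        "lm_head": lm_head,
        "total": total,
    }


counts = count_parameters()
for name, value in counts.items():
    print(f"{name:>10}: {value:,}")
```

Under those assumptions the total comes to 7,000,559,616, which is consistent with the "7B" in the model name. The rotary embedding modules are left out because they hold no learned weights.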