twhoool02/Llama-2-7b-hf-AutoGPTQ

Text Generation

text-generation-inference

Inference Endpoints

4-bit precision

twhoool02 committed on Mar 26

Commit bdaea36 • 1 Parent(s): d6c079b

Upload README.md with huggingface_hub

Files changed (1)

README.md +2 -52

@@ -17,32 +17,7 @@ pipeline_tag: text-generation
 qunatized_by: twhoool02
 ---
-# Model Card for LlamaForCausalLM(
-  (model): LlamaModel(
-    (embed_tokens): Embedding(32000, 4096)
-    (layers): ModuleList(
-      (0-31): 32 x LlamaDecoderLayer(
-        (self_attn): LlamaSdpaAttention(
-          (rotary_emb): LlamaRotaryEmbedding()
-          (k_proj): QuantLinear()
-          (o_proj): QuantLinear()
-          (q_proj): QuantLinear()
-          (v_proj): QuantLinear()
-        )
-        (mlp): LlamaMLP(
-          (act_fn): SiLU()
-          (down_proj): QuantLinear()
-          (gate_proj): QuantLinear()
-          (up_proj): QuantLinear()
-        )
-        (input_layernorm): LlamaRMSNorm()
-        (post_attention_layernorm): LlamaRMSNorm()
-      )
-    )
-    (norm): LlamaRMSNorm()
-  )
-  (lm_head): Linear(in_features=4096, out_features=32000, bias=False)
-)
+# Model Card for twhoool02/Llama-2-7b-hf-AutoGPTQ
 ## Model Details
@@ -51,32 +26,7 @@ This model is a GPTQ quantized version of the meta-llama/Llama-2-7b-hf model.
 - **Developed by:** Ted Whooley
 - **Library:** Transformers, GPTQ
 - **Model type:** llama
-- **Model name:** LlamaForCausalLM(
-  (model): LlamaModel(
-    (embed_tokens): Embedding(32000, 4096)
-    (layers): ModuleList(
-      (0-31): 32 x LlamaDecoderLayer(
-        (self_attn): LlamaSdpaAttention(
-          (rotary_emb): LlamaRotaryEmbedding()
-          (k_proj): QuantLinear()
-          (o_proj): QuantLinear()
-          (q_proj): QuantLinear()
-          (v_proj): QuantLinear()
-        )
-        (mlp): LlamaMLP(
-          (act_fn): SiLU()
-          (down_proj): QuantLinear()
-          (gate_proj): QuantLinear()
-          (up_proj): QuantLinear()
-        )
-        (input_layernorm): LlamaRMSNorm()
-        (post_attention_layernorm): LlamaRMSNorm()
-      )
-    )
-    (norm): LlamaRMSNorm()
-  )
-  (lm_head): Linear(in_features=4096, out_features=32000, bias=False)
-)
+- **Model name:** Llama-2-7b-hf-AutoGPTQ
 - **Pipeline tag:** text-generation
 - **Qunatized by:** twhoool02
 - **Language(s) (NLP):** en
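
The commit does not show the script that generated README.md, so the cause of the removed text is not confirmed. One plausible explanation is that a model object, rather than its name, was formatted into the card template, which would print the whole module tree. The sketch below illustrates that failure mode and the fix under that assumption; `FakeModel`, `card_heading`, and `repo_model_name` are hypothetical names for illustration, not part of Transformers or huggingface_hub.

```python
class FakeModel:
    """Hypothetical stand-in for a loaded model whose repr is a module tree."""

    def __repr__(self) -> str:
        return "LlamaForCausalLM(\n  (model): LlamaModel(...)\n)"


def card_heading(name: object) -> str:
    # An f-string calls str() on its argument; for an object without
    # __str__, that falls back to __repr__, so a model object expands
    # into its multi-line structure.
    return f"# Model Card for {name}"


def repo_model_name(repo_id: str) -> str:
    # "owner/name" -> "name"
    return repo_id.split("/", 1)[-1]


repo_id = "twhoool02/Llama-2-7b-hf-AutoGPTQ"

bad_heading = card_heading(FakeModel())   # spills the repr into the heading
good_heading = card_heading(repo_id)      # single-line heading
model_name = repo_model_name(repo_id)
```

Passing the repository id string keeps the heading on one line, and splitting it on `/` gives the short name used in the "Model name" bullet.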