---
language: en
license: other
tags:
- meta
- GPTQ
- facebook
- llama
- llama2
base_model: meta-llama/Llama-2-7b-hf
model_name: Llama-2-7b-hf-AutoGPTQ
library:
- Transformers
- GPTQ
model_type: llama
pipeline_tag: text-generation
quantized_by: twhoool02
---

# Model Card for Llama-2-7b-hf-AutoGPTQ

## Model Details

This model is a GPTQ quantized version of the meta-llama/Llama-2-7b-hf model.

- **Developed by:** Ted Whooley
- **Library:** Transformers, GPTQ
- **Model type:** llama
- **Model name:** Llama-2-7b-hf-AutoGPTQ
- **Pipeline tag:** text-generation
- **Quantized by:** twhoool02
- **Language(s) (NLP):** en
- **License:** other

The quantized architecture, with every attention and MLP projection replaced by a GPTQ `QuantLinear` layer:

```
LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(32000, 4096)
    (layers): ModuleList(
      (0-31): 32 x LlamaDecoderLayer(
        (self_attn): LlamaSdpaAttention(
          (rotary_emb): LlamaRotaryEmbedding()
          (k_proj): QuantLinear()
          (o_proj): QuantLinear()
          (q_proj): QuantLinear()
          (v_proj): QuantLinear()
        )
        (mlp): LlamaMLP(
          (act_fn): SiLU()
          (down_proj): QuantLinear()
          (gate_proj): QuantLinear()
          (up_proj): QuantLinear()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
  )
  (lm_head): Linear(in_features=4096, out_features=32000, bias=False)
)
```
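## Usage

As a minimal quick-start sketch: the snippet below loads the quantized checkpoint through the standard Transformers API, which detects the GPTQ quantization config in the repo and instantiates the `QuantLinear` layers shown above. The repository id `twhoool02/Llama-2-7b-hf-AutoGPTQ` is inferred from this card's metadata and is an assumption; substitute the actual repo id if it differs.

```python
# Minimal usage sketch (assumed repo id, inferred from this card).
# Requires: pip install transformers optimum auto-gptq
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "twhoool02/Llama-2-7b-hf-AutoGPTQ"  # assumption: actual repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Transformers reads the GPTQ quantization config from the checkpoint and
# loads the QuantLinear layers; device_map="auto" places weights on GPU.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that GPTQ inference kernels generally require a CUDA-capable GPU; CPU-only inference is not supported by most AutoGPTQ kernel backends.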