---
language: en
license: other
tags:
- meta
- GPTQ
- facebook
- llama
- llama2
base_model: meta-llama/Llama-2-7b-hf
model_name: Llama-2-7b-hf-AutoGPTQ
library:
- Transformers
- GPTQ
model_type: llama
pipeline_tag: text-generation
quantized_by: twhoool02
---
Model Card for Llama-2-7b-hf-AutoGPTQ

```text
LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(32000, 4096)
    (layers): ModuleList(
      (0-31): 32 x LlamaDecoderLayer(
        (self_attn): LlamaSdpaAttention(
          (rotary_emb): LlamaRotaryEmbedding()
          (k_proj): QuantLinear()
          (o_proj): QuantLinear()
          (q_proj): QuantLinear()
          (v_proj): QuantLinear()
        )
        (mlp): LlamaMLP(
          (act_fn): SiLU()
          (down_proj): QuantLinear()
          (gate_proj): QuantLinear()
          (up_proj): QuantLinear()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
  )
  (lm_head): Linear(in_features=4096, out_features=32000, bias=False)
)
```
Model Details
This model is a GPTQ-quantized version of the meta-llama/Llama-2-7b-hf model.
- Developed by: Ted Whooley
- Library: Transformers, GPTQ
- Model type: llama
- Model name: Llama-2-7b-hf-AutoGPTQ
- Pipeline tag: text-generation
- Quantized by: twhoool02
- Language(s) (NLP): en
- License: other
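The quantized checkpoint can be loaded through the Transformers GPTQ integration (the auto-gptq package must be installed). The sketch below is a minimal, non-authoritative example; the Hub repo id `twhoool02/Llama-2-7b-hf-AutoGPTQ` is an assumption inferred from the model name and quantizer above, and the helper functions are illustrative, not part of any library.

```python
# Assumed repo id, inferred from the model name and quantizer username.
MODEL_ID = "twhoool02/Llama-2-7b-hf-AutoGPTQ"


def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and the GPTQ-quantized model.

    Transformers dispatches the quantized weights into the QuantLinear
    layers listed in the architecture; auto-gptq provides the kernels.
    """
    # Imported lazily so the sketch can be read without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Generate a short continuation of the prompt with the quantized model."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Hello, my name is"))
```

Because the base model is gated (license: other), access to meta-llama/Llama-2-7b-hf derivatives may require accepting the Llama 2 license on the Hub first.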