---
language: en
license: other
tags:
  - meta
  - GPTQ
  - facebook
  - llama
  - llama2
base_model: meta-llama/Llama-2-7b-hf
model_name: Llama-2-7b-hf-AutoGPTQ
library:
  - Transformers
  - GPTQ
model_type: llama
pipeline_tag: text-generation
quantized_by: twhoool02
---

# Model Card for Llama-2-7b-hf-AutoGPTQ

## Model Architecture

```
LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(32000, 4096)
    (layers): ModuleList(
      (0-31): 32 x LlamaDecoderLayer(
        (self_attn): LlamaSdpaAttention(
          (rotary_emb): LlamaRotaryEmbedding()
          (k_proj): QuantLinear()
          (o_proj): QuantLinear()
          (q_proj): QuantLinear()
          (v_proj): QuantLinear()
        )
        (mlp): LlamaMLP(
          (act_fn): SiLU()
          (down_proj): QuantLinear()
          (gate_proj): QuantLinear()
          (up_proj): QuantLinear()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
  )
  (lm_head): Linear(in_features=4096, out_features=32000, bias=False)
)
```

## Model Details

This model is a GPTQ quantized version of the meta-llama/Llama-2-7b-hf model.

- **Developed by:** Ted Whooley
- **Library:** Transformers, GPTQ
- **Model type:** llama
- **Model name:** Llama-2-7b-hf-AutoGPTQ
- **Pipeline tag:** text-generation
- **Quantized by:** twhoool02
- **Language(s) (NLP):** en
- **License:** other
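## Quantization

The card does not state the quantization settings used, so the sketch below shows a typical way such a checkpoint could be produced with `transformers` and `auto-gptq`. The 4-bit width and the `"c4"` calibration dataset are assumptions (common AutoGPTQ defaults), not details confirmed by this card.

```python
def quantize_llama(model_id: str, bits: int = 4):
    """Sketch: GPTQ-quantize a Llama checkpoint with transformers + auto-gptq.

    Heavy operation: downloads the fp16 base model and runs calibration;
    requires a GPU and the `auto-gptq` package installed.
    Bit width and calibration dataset here are assumed, not from the card.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    gptq_config = GPTQConfig(bits=bits, dataset="c4", tokenizer=tokenizer)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, device_map="auto", quantization_config=gptq_config
    )
    return model, tokenizer


BASE_MODEL = "meta-llama/Llama-2-7b-hf"  # base model named in the card metadata

if __name__ == "__main__":
    model, tokenizer = quantize_llama(BASE_MODEL)
    model.save_pretrained("Llama-2-7b-hf-AutoGPTQ")
    tokenizer.save_pretrained("Llama-2-7b-hf-AutoGPTQ")
```

Quantizing the linear projections (the `QuantLinear` modules in the architecture dump above) shrinks the 7B model's memory footprint substantially while keeping the embeddings, layer norms, and `lm_head` in higher precision.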
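## Usage

A minimal sketch for loading the quantized model with `transformers`. The repository id below is inferred from the uploader and model name in the metadata and may need adjusting; inference requires a GPU with the `auto-gptq` (or `optimum`) package installed.

```python
def load_quantized(repo_id: str):
    """Load a GPTQ-quantized checkpoint for inference.

    Heavy operation: downloads the quantized weights; requires a GPU
    and GPTQ kernel support (`auto-gptq` or `optimum`) installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")
    return model, tokenizer


# Assumed repo id (uploader + model_name from the metadata); adjust if needed.
REPO_ID = "twhoool02/Llama-2-7b-hf-AutoGPTQ"

if __name__ == "__main__":
    model, tokenizer = load_quantized(REPO_ID)
    inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```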