---
language: en
license: other
tags:
- meta
- GPTQ
- facebook
- llama
- llama2
base_model: meta-llama/Llama-2-7b-hf
model_name: Llama-2-7b-hf-AutoGPTQ
library:
- Transformers
- GPTQ
model_type: llama
pipeline_tag: text-generation
quantized_by: twhoool02
---
# Model Card for Llama-2-7b-hf-AutoGPTQ

Architecture of the quantized model (the attention and MLP projections are replaced by GPTQ `QuantLinear` modules):

    LlamaForCausalLM(
      (model): LlamaModel(
        (embed_tokens): Embedding(32000, 4096)
        (layers): ModuleList(
          (0-31): 32 x LlamaDecoderLayer(
            (self_attn): LlamaSdpaAttention(
              (rotary_emb): LlamaRotaryEmbedding()
              (k_proj): QuantLinear()
              (o_proj): QuantLinear()
              (q_proj): QuantLinear()
              (v_proj): QuantLinear()
            )
            (mlp): LlamaMLP(
              (act_fn): SiLU()
              (down_proj): QuantLinear()
              (gate_proj): QuantLinear()
              (up_proj): QuantLinear()
            )
            (input_layernorm): LlamaRMSNorm()
            (post_attention_layernorm): LlamaRMSNorm()
          )
        )
        (norm): LlamaRMSNorm()
      )
      (lm_head): Linear(in_features=4096, out_features=32000, bias=False)
    )
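The listing above can be reproduced by loading the checkpoint with AutoGPTQ and printing the module tree. A minimal sketch, assuming the Hub repository id `twhoool02/Llama-2-7b-hf-AutoGPTQ` and a single CUDA device:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_id = "twhoool02/Llama-2-7b-hf-AutoGPTQ"  # assumed Hub repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load the already-quantized weights onto the GPU.
model = AutoGPTQForCausalLM.from_quantized(model_id, device="cuda:0")

# Prints the module tree shown above, with QuantLinear projection layers.
print(model)
```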
## Model Details
This model is a GPTQ-quantized version of the [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) base model, produced with AutoGPTQ; a usage example is given below.
- **Developed by:** Ted Whooley
- **Library:** Transformers, GPTQ
- **Model type:** llama
- **Model name:** Llama-2-7b-hf-AutoGPTQ
- **Pipeline tag:** text-generation
- **Quantized by:** twhoool02
- **Language(s) (NLP):** en
- **License:** other
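
## Usage

The quantized checkpoint can also be used through the Hugging Face Transformers GPTQ integration (with `optimum` and `auto-gptq` installed). A minimal sketch, assuming the Hub repository id `twhoool02/Llama-2-7b-hf-AutoGPTQ` and an illustrative prompt:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "twhoool02/Llama-2-7b-hf-AutoGPTQ"  # assumed Hub repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" places the quantized weights on the available GPU(s).
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain GPTQ quantization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```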