Edit model card

Model Card for Llama2-7b-chat-HF-NF4

Model Details

This model is a NF4 quantized version of the meta-llama/Llama-2-7b-chat-hf model.

  • Developed by: Ted Whooley
  • Library: Transformers, NF4
  • Model type: llama
  • Model name: Llama2-7b-chat-HF-NF4
  • Pipeline tag: text-generation
  • Qunatized by: twhoool02
  • Language(s) (NLP): en
  • License: other
Downloads last month
0
Safetensors
Model size
3.6B params
Tensor type
F32
FP16
U8
Inference API
Input a message to start chatting with twhoool02/Llama2-7b-chat-HF-NF4.
This model can be loaded on Inference API (serverless).

Quantized from