---
library_name: transformers
base_model:
- HuggingFaceM4/Idefics3-8B-Llama3
pipeline_tag: image-text-to-text
---

# Idefics3-8B-Llama3-bnb_nf4

BitsAndBytes NF4 quantization of `HuggingFaceM4/Idefics3-8B-Llama3`.

### Quantization

The quantization was created with:

```python
import torch
from transformers import AutoModelForVision2Seq, BitsAndBytesConfig

model_id = "HuggingFaceM4/Idefics3-8B-Llama3"

nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    llm_int8_enable_fp32_cpu_offload=True,
    # Keep the language-model head, vision tower, and connector in full precision
    llm_int8_skip_modules=["lm_head", "model.vision_model", "model.connector"],
)

model_nf4 = AutoModelForVision2Seq.from_pretrained(model_id, quantization_config=nf4_config)
```
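
For reference, a minimal inference sketch with the quantized model, assuming the `model_nf4` object from the snippet above and a local image file named `example.jpg` (both the filename and the prompt are placeholders, not part of the original recipe):

```python
from PIL import Image
from transformers import AutoProcessor

# Processor handles both the image preprocessing and the chat template.
processor = AutoProcessor.from_pretrained(model_id)
image = Image.open("example.jpg")  # placeholder image path

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]

prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to(model_nf4.device)

generated_ids = model_nf4.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```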