
metadata
title: PaliGemma Demo
emoji: 🤲
colorFrom: green
colorTo: yellow
sdk: gradio
sdk_version: 4.22.0
app_file: app.py
pinned: false
license: apache-2.0

PaliGemma Demo

See Blogpost and big_vision README.md for details about the model.

Development

Local testing (CPU, Python 3.12):

python -m venv env
. env/bin/activate
pip install -qr requirements-cpu.txt
python app.py

Environment variables:

  • MOCK_MODEL=yes: For quick UI testing.
  • RAM_CACHE_GB=18: Enables caching of 3 bf16 models in memory; a single bf16 model is about 5860 MB. Use with care on Spaces with little RAM. For example, on an A10G large Space you can cache five models in RAM, so you would set RAM_CACHE_GB=30.
  • HOST_COLOCATION=4: Set this if host RAM/disk is shared between 4 processes (e.g. on the Hugging Face A10 large Spaces).
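The RAM_CACHE_GB values above follow from simple arithmetic on the approximate model size. The sketch below reproduces that arithmetic; the helper names are hypothetical and not part of app.py. It assumes 1 GB = 1000 MB and ignores any other memory overhead, and the app's own accounting (including the effect of HOST_COLOCATION) may differ.

```python
import math

# Approximate size of a single bf16 model, as stated in this README.
MODEL_SIZE_MB = 5860


def ram_cache_gb(num_models: int, model_size_mb: int = MODEL_SIZE_MB) -> int:
    """Smallest whole-GB cache size that holds `num_models` models.

    Assumption: 1 GB is treated as 1000 MB; no extra overhead is modeled.
    """
    return math.ceil(num_models * model_size_mb / 1000)


def models_that_fit(cache_gb: int, model_size_mb: int = MODEL_SIZE_MB) -> int:
    """Number of whole models that fit into a cache of `cache_gb` GB."""
    return (cache_gb * 1000) // model_size_mb


if __name__ == "__main__":
    for n in (3, 5):
        print(f"{n} models -> RAM_CACHE_GB={ram_cache_gb(n)}")
```

Under these assumptions, three models need a setting of 18 and five models need 30, matching the values quoted above.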

Loading models:

  • The set of models loaded is defined in ./models.py.
  • You must first acknowledge usage conditions to access models.
  • When testing locally, you'll have to run huggingface-cli login.
  • When running in a Hugging Face Space, you'll have to set a HF_TOKEN secret.
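Before starting the app, it can help to check which of the two authentication paths applies. The following is a minimal sketch using only the standard library; `find_hf_token` and `describe_auth` are hypothetical helpers, not functions from app.py or from any Hugging Face library. It only inspects the HF_TOKEN environment variable and does not detect credentials stored by a previous CLI login.

```python
import os
from typing import Mapping, Optional


def find_hf_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the HF_TOKEN value if it is set and non-empty, else None."""
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def describe_auth(env: Mapping[str, str] = os.environ) -> str:
    """Describe which authentication path this environment would rely on."""
    if find_hf_token(env) is not None:
        return "HF_TOKEN is set (expected when running in a Space)."
    return "HF_TOKEN is not set; log in with the Hugging Face CLI for local testing."


if __name__ == "__main__":
    print(describe_auth())
```

Passing the environment as a mapping keeps the check easy to exercise without modifying the real process environment.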