Quantization took 42 hours on 4x A40 GPUs at a batch size of 128. In hindsight, I could have gone higher: at that batch size, each GPU used about 25-30 GiB of memory, and utilization stayed at 100%.