Welcome to the official Hugging Face organization for xMADified models from xMAD.ai!
The repositories below contain popular Llama models xMADified with our NeurIPS 2024 methods: quantized from 16-bit floats to 4-bit integers using xMAD.ai proprietary technology. These models are fine-tunable on the same reduced hardware (4x less memory) in just 3 clicks. Watch our product demo here
For additional models, please join our beta here, and we'll get back to you promptly!
Current Public xMADified Models:
The GPU memory required to run and fine-tune each model is listed in the table below:
| Model | GPU Memory Requirement (Before → After) |
|---|---|
| Llama-3.2-3B-Instruct-xMADai-4bit | 6.5 GB → 3.5 GB (any laptop GPU) |
| Llama-3.2-1B-Instruct-xMADai-4bit | 2.5 GB → 2 GB (any laptop GPU) |
| Llama-3.1-405B-Instruct-xMADai-4bit | 800 GB (16 H100s) → 250 GB (8 V100s) |
| Llama-3.1-8B-Instruct-xMADai-4bit | 16 GB → 7 GB (any laptop GPU) |
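Since the xMADified checkpoints are hosted as ordinary Hugging Face repositories, they can be loaded with the standard `transformers` API. The sketch below is an assumption, not official xMAD.ai usage instructions: the `xmadai/` organization prefix and the loading arguments are guesses based on common Hugging Face conventions, so check the individual model cards for the exact repo ids and recommended settings.

```python
# Hypothetical sketch: loading an xMADified 4-bit model with transformers.
# The org prefix "xmadai/" and loading arguments are assumptions; see the
# model card of each repository for the authoritative instructions.

def load_xmadified(repo_id: str = "xmadai/Llama-3.2-1B-Instruct-xMADai-4bit"):
    """Download a quantized model and its tokenizer from the Hub."""
    # Imported lazily so the function can be inspected without
    # transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # device_map="auto" spreads the layers over the available GPU(s).
    model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")
    return model, tokenizer


if __name__ == "__main__":
    model, tokenizer = load_xmadified()
    inputs = tokenizer("Hello!", return_tensors="pt").to(model.device)
    print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```

For the 405B model, multi-GPU placement (e.g. the 8 V100s listed above) is handled by the same `device_map="auto"` mechanism.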