wenhuach (wenhua cheng)

replied to their post about 1 month ago

While that may be one reason, it doesn't fully explain why many quantized models are still available for Llama 3.1 and Llama 3.3.

reacted to their post with 🚀 about 2 months ago

Post

2331

Are we the only providers of INT4 quantized models for Llama 3.2 VL?
OPEA/Llama-3.2-90B-Vision-Instruct-int4-sym-inc
OPEA/Llama-3.2-11B-Vision-Instruct-int4-sym-inc

3 replies


posted an update about 2 months ago


replied to their post about 2 months ago

You can try auto-round-fast xxx, which runs faster at the cost of a slight accuracy drop, or auto-round-fast xxx --nsamples 1 --iters 1 for very fast execution without algorithm tuning.

replied to their post about 2 months ago

Thank you for your suggestion. As our focus is on algorithm development and our computational resources are limited, we currently lack the bandwidth to support a large number of models. If you come across any models that would benefit from quantization, feel free to comment on any models under OPEA. We will make an effort to prioritize and quantize them if resources allow.

reacted to their post with 🔥👀 about 2 months ago

Post

1818

AutoRound has demonstrated strong results even at 2-bit precision for VLMs such as Qwen2-VL-72B. Check it out here: OPEA/Qwen2-VL-72B-Instruct-int2-sym-inc.

4 replies


posted an update about 2 months ago


upvoted 3 collections about 2 months ago

reacted to their post with ❤️ 2 months ago

Post

341

This week, OPEA Space released several new INT4 models, including:
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
allenai/OLMo-2-1124-13B-Instruct
THUDM/glm-4v-9b
AIDC-AI/Marco-o1
and several others.
Let us know which models you'd like prioritized for quantization, and we'll do our best to make it happen!

https://huggingface.co/OPEA

3 replies


replied to their post 2 months ago

Sure, we will give it a try.

posted an update 2 months ago


liked a model 2 months ago

OPEA/Meta-Llama-3.1-70B-Instruct-int4-sym-inc

Updated Dec 23, 2024 • 12 • 1

New activity in OPEA/glm-4-9b-chat-int4-sym-inc 2 months ago

Update README.md

#1 opened 2 months ago by

wenhuach

updated a model 2 months ago

OPEA/glm-4-9b-chat-int4-sym-inc

Updated Dec 23, 2024 • 10

reacted to their post with 🚀 2 months ago

Post

985

The OPEA space just released nearly 20 INT4 models, including QwQ-32B-Preview,
Llama-3.2-11B-Vision-Instruct, Qwen2.5, and Llama 3.1. Check them out at https://huggingface.co/OPEA

posted an update 2 months ago


updated a model 2 months ago

Intel/Meta-Llama-3.1-8B-Instruct-int4-inc

Updated Nov 28, 2024 • 2


QWEN-AutoRound

VLMs-AutoRound

Llama-AutoRound
