wenhua cheng

wenhuach

AI & ML interests

Model Compression, CV

Organizations

Intel, Need4Speed, Qwen

wenhuach's activity

replied to their post 3 days ago

You can try auto-round-fast xxx, which runs faster at the cost of a slight accuracy drop, or auto-round-fast xxx --nsamples 1 --iters 1 for very fast execution without algorithm tuning.

replied to their post 3 days ago

Thank you for your suggestion. As our focus is on algorithm development and our computational resources are limited, we currently lack the bandwidth to support a large number of models. If you come across any models that would benefit from quantization, feel free to comment on any models under OPEA. We will make an effort to prioritize and quantize them if resources allow.

reacted to their post with 🔥👀 4 days ago
posted an update 4 days ago
reacted to their post with ❤️ 12 days ago
This week, OPEA Space released several new INT4 models, including:
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
allenai/OLMo-2-1124-13B-Instruct
THUDM/glm-4v-9b
AIDC-AI/Marco-o1
and several others.
Let us know which models you'd like prioritized for quantization, and we'll do our best to make it happen!

https://huggingface.co/OPEA
replied to their post 12 days ago
posted an update 15 days ago
New activity in OPEA/glm-4-9b-chat-int4-sym-inc 22 days ago

Update README.md

#1 opened 22 days ago by wenhuach
reacted to their post with 🚀 22 days ago
The OPEA space has just released nearly 20 INT4 models, including QwQ-32B-Preview,
Llama-3.2-11B-Vision-Instruct, Qwen2.5, and Llama3.1. Check them out at https://huggingface.co/OPEA
posted an update 22 days ago