OpenSourceRonin committed · Commit 108ad69 · verified · 1 Parent(s): e1ccadd

Update README.md

Files changed (1):
  1. README.md (+1 −1)
README.md CHANGED
@@ -61,7 +61,7 @@ Read tech report at [**Tech Report**](https://github.com/microsoft/VPTQ/blob/mai
 
 | Model Series | Collections | (Estimated) Bit per weight |
 | :---: | :---: | --- |
-| DeeSseek R1 | [HF 🤗](https://huggingface.co/collections/VPTQ-community/vptq-deepseek-r1-without-finetune-67d0832c203afd208bb8449e) | [2.x bits, reshard for 4 GPUs](https://huggingface.co/VPTQ-community/deepseek-r1_v_8_k_65536_mixed_mp4) [2 bits, reshard for 4 GPUs](https://huggingface.co/VPTQ-community/deepseek-r1_v8_k_65536_mp4) [3 bits, resahrd for 4 GPUs](https://huggingface.co/VPTQ-community/deepseek-r1_v_8_k_65536_256_mp4), [3 bits](https://huggingface.co/VPTQ-community/deepseek-r1_v_8_k_65536_256) [2 bits](https://huggingface.co/VPTQ-community/deepseek-r1_v_8_k_65536) |
+| DeepSeek R1 | [HF 🤗](https://huggingface.co/collections/VPTQ-community/vptq-deepseek-r1-without-finetune-67d0832c203afd208bb8449e) | [mixed 2-3 bits (reshard for 4 GPUs)](https://huggingface.co/VPTQ-community/deepseek-r1_v_8_k_65536_mixed_mp4), [2 bits (reshard for 4 GPUs)](https://huggingface.co/VPTQ-community/deepseek-r1_v8_k_65536_mp4), [3 bits (reshard for 4 GPUs)](https://huggingface.co/VPTQ-community/deepseek-r1_v_8_k_65536_256_mp4), [3 bits](https://huggingface.co/VPTQ-community/deepseek-r1_v_8_k_65536_256), [2 bits](https://huggingface.co/VPTQ-community/deepseek-r1_v_8_k_65536) |
 | Llama 3.3 70B Instruct | [HF 🤗](https://huggingface.co/collections/VPTQ-community/vptq-llama-33-70b-instruct-without-finetune-675ef82388de8c1c1bef75ab) | [4 bits](https://huggingface.co/VPTQ-community/Meta-Llama-3.3-70B-Instruct-v8-k65536-65536-woft) [3 bits](https://huggingface.co/VPTQ-community/Meta-Llama-3.3-70B-Instruct-v8-k65536-256-woft) [2 bits (1)](https://huggingface.co/VPTQ-community/Meta-Llama-3.3-70B-Instruct-v8-k65536-0-woft) [2 bits (2)](https://huggingface.co/VPTQ-community/Meta-Llama-3.3-70B-Instruct-v16-k65536-65536-woft) [1.875 bits](https://huggingface.co/VPTQ-community/Meta-Llama-3.3-70B-Instruct-v16-k65536-16384-woft) [1.625 bits](https://huggingface.co/VPTQ-community/Meta-Llama-3.3-70B-Instruct-v16-k65536-1024-woft) |
 | Llama 3.1 Nemotron 70B Instruct HF | [HF 🤗](https://huggingface.co/collections/VPTQ-community/vptq-llama-31-nemotron-70b-instruct-hf-without-finetune-671730b96f16208d0b3fe942) | [4 bits](https://huggingface.co/VPTQ-community/Llama-3.1-Nemotron-70B-Instruct-HF-v8-k65536-65536-woft) [3 bits](https://huggingface.co/VPTQ-community/Llama-3.1-Nemotron-70B-Instruct-HF-v8-k65536-256-woft) [2 bits (1)](https://huggingface.co/VPTQ-community/Llama-3.1-Nemotron-70B-Instruct-HF-v16-k65536-65536-woft) [2 bits (2)](https://huggingface.co/VPTQ-community/Llama-3.1-Nemotron-70B-Instruct-HF-v8-k65536-0-woft) [1.875 bits](https://huggingface.co/VPTQ-community/Llama-3.1-Nemotron-70B-Instruct-HF-v16-k65536-16384-woft) [1.625 bits](https://huggingface.co/VPTQ-community/Llama-3.1-Nemotron-70B-Instruct-HF-v16-k65536-1024-woft) [1.5 bits](https://huggingface.co/VPTQ-community/Llama-3.1-Nemotron-70B-Instruct-HF-v16-k65536-256-woft) |
 | Llama 3.1 8B Instruct | [HF 🤗](https://huggingface.co/collections/VPTQ-community/vptq-llama-31-8b-instruct-without-finetune-66f2b70b1d002ceedef02d2e) | [4 bits](https://huggingface.co/VPTQ-community/Meta-Llama-3.1-8B-Instruct-v8-k65536-65536-woft) [3.5 bits](https://huggingface.co/VPTQ-community/Meta-Llama-3.1-8B-Instruct-v8-k65536-4096-woft) [3 bits](https://huggingface.co/VPTQ-community/Meta-Llama-3.1-8B-Instruct-v8-k65536-256-woft) [2.3 bits](https://huggingface.co/VPTQ-community/Meta-Llama-3.1-8B-Instruct-v12-k65536-4096-woft) |