IST Austria Distributed Algorithms and Systems Lab

university

https://ist.ac.at/en/research/alistarh-group/

https://github.com/IST-DASLab

ISTA-DASLab's activity

BlackSamorez

updated a collection 10 days ago

HIGGS

Collection

Models prequantized with [HIGGS](https://arxiv.org/abs/2411.17525) zero-shot quantization. Requires the latest `transformers` to run.

17 items • Updated 10 days ago • 4

BlackSamorez

updated a model 22 days ago

ISTA-DASLab/Llama-3.3-70B-Instruct-HIGGS-GPTQ-4bit

Updated 22 days ago • 12 • 1

BlackSamorez

updated a collection 24 days ago

HIGGS

BlackSamorez

updated a model 24 days ago

ISTA-DASLab/Llama-3.1-8B-HIGGS-GPTQ-4bit

Text Generation • Updated 24 days ago • 12

BlackSamorez

updated 4 models 24 days ago

d-alistarh

authored 9 papers 25 days ago

Model compression via distillation and quantization

Paper • 1802.05668 • Published Feb 15, 2018 • 1

Sparse Finetuning for Inference Acceleration of Large Language Models

Paper • 2310.06927 • Published Oct 10, 2023 • 14

Towards End-to-end 4-Bit Inference on Generative Large Language Models

Paper • 2310.09259 • Published Oct 13, 2023 • 1

SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression

Paper • 2306.03078 • Published Jun 5, 2023 • 3

Error Feedback Can Accurately Compress Preconditioners

Paper • 2306.06098 • Published Jun 9, 2023

RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation

Paper • 2401.04679 • Published Jan 9, 2024 • 2

Extreme Compression of Large Language Models via Additive Quantization

Paper • 2401.06118 • Published Jan 11, 2024 • 12

Accurate Neural Network Pruning Requires Rethinking Sparse Optimization

Paper • 2308.02060 • Published Aug 3, 2023 • 1

How Well Do Sparse Imagenet Models Transfer?

Paper • 2111.13445 • Published Nov 26, 2021 • 1

Team members 12
