Fast Inference on Large Language Models: BLOOMZ on Habana Gaudi2 Accelerator • Mar 28, 2023