An int4-quantized version of the GLM-130B model that can run inference on four RTX 3090 Ti GPUs.

---
license: apache-2.0
---

iannobug@gmail.com
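As a rough sanity check on the four-GPU claim, the weight memory can be estimated with simple arithmetic. The sketch below is illustrative only and rests on assumptions not stated in this card: a parameter count of about 130 billion (taken from the model name), exactly 4 bits per weight, and 24 GiB of memory per 3090 Ti. It ignores activations, the attention cache, quantization scales, and any layers that may be kept at higher precision, so the real footprint will be larger.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Assumed values (not taken from this model card).
NUM_PARAMS = 130e9       # approximate parameter count implied by "130B"
GPU_COUNT = 4            # number of GPUs named in the description
GPU_MEMORY_GIB = 24      # memory per 3090 Ti

int4_gib = weight_memory_gib(NUM_PARAMS, 4)
fp16_gib = weight_memory_gib(NUM_PARAMS, 16)
total_gib = GPU_COUNT * GPU_MEMORY_GIB

print(f"int4 weights: {int4_gib:.1f} GiB ({int4_gib / GPU_COUNT:.1f} GiB per GPU)")
print(f"fp16 weights: {fp16_gib:.1f} GiB")
print(f"available:    {total_gib} GiB across {GPU_COUNT} GPUs")
print("int4 fits:", int4_gib < total_gib, "| fp16 fits:", fp16_gib < total_gib)
```

Under these assumptions the int4 weights come to about 60.5 GiB, which leaves headroom within the 96 GiB total, whereas fp16 weights alone would not fit.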