
Quantization made by Richard Erkhov.

  • Github
  • Discord
  • Request more models

d-Qwen2-0.5B - GGUF

Original model description:

license: apache-2.0
library_name: transformers
tags:
  - qwen2
  - distillation
datasets:
  - EleutherAI/the_pile_deduplicated

  • This is a distillation experiment with Qwen2-1.5B as the teacher and Qwen2-0.5B as the student model.
  • Training samples were taken from the Pile (deduplicated) dataset.
  • Optimizer: SM3; scheduler: cosine with warmup; learning rate: 2e-5 (a training sketch is shown below).
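A minimal sketch of the distillation setup described above, assuming a plain forward-KL objective between teacher and student logits. The card only states the teacher/student models, the Pile data source, SM3, cosine-with-warmup, and lr=2e-5; the batch handling, sequence length, temperature, and step counts below are illustrative placeholders, and AdamW stands in for SM3 (which is not part of core PyTorch).

```python
import torch
import torch.nn.functional as F
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    get_cosine_schedule_with_warmup,
)

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B")
teacher = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-1.5B").to(device).eval()
student = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B").to(device).train()

# Stream samples from the dataset named in the card's metadata.
pile = load_dataset("EleutherAI/the_pile_deduplicated", split="train", streaming=True)

optimizer = torch.optim.AdamW(student.parameters(), lr=2e-5)  # card specifies SM3
scheduler = get_cosine_schedule_with_warmup(
    optimizer, num_warmup_steps=100, num_training_steps=10_000  # placeholder counts
)

temperature = 2.0  # assumed distillation temperature
for step, sample in enumerate(pile):
    batch = tokenizer(
        sample["text"], truncation=True, max_length=512, return_tensors="pt"
    ).to(device)
    with torch.no_grad():
        teacher_logits = teacher(**batch).logits
    student_logits = student(**batch).logits

    # Forward KL between temperature-softened teacher and student distributions.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2

    loss.backward()
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
    if step >= 10_000:
        break
```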

Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repo contains GGUF quantizations of the distilled 0.5B Qwen2 language model.
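The GGUF files can be run with any llama.cpp-compatible runtime. Below is a minimal usage sketch with llama-cpp-python; the model_path is an assumed local filename for one of the quantizations in this repo, not a guaranteed name.

```python
from llama_cpp import Llama

# model_path is an illustrative placeholder; point it at a downloaded GGUF file.
llm = Llama(model_path="d-Qwen2-0.5B.Q4_K_M.gguf", n_ctx=2048)
out = llm("The Pile is a dataset of", max_tokens=64)
print(out["choices"][0]["text"])
```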
