I'm now working on finetuning coding models. If you're GPU-hungry like me, you'll find quantized models very helpful. But the quantization formats for finetuning and for inference are different and incompatible, so I made two collections here.
Quantized inference models are far more popular on HF than quantized finetuning models. I use https://huggingface.co/QuantFactory to generate inference models (GGUF), and there are a few other choices.
But there hasn't been such a service for finetuning models. DIY isn't too hard, though: I made a few myself, and you can find the script in the model cards. If the original model is small enough, you can even do it on a free T4 (available via Google Colab).
If you know a (small) coding model worthy of quantization, please let me know and I'd love to add it to the collections.
Ablation experiments prove that an education-filtered dataset significantly enhances model capabilities (independent of model size or architecture) 🤩
Yesterday, FineWeb’s technical report was published. FYI, FineWeb (by 🤗) is currently the best open-source text dataset for scaling model performance up to GPT-3 level.
While the proprietary datasets used to train models like GPT-4/Claude/Llama are crawled internally and never released, FineWeb builds on CommonCrawl (an open repository of crawled web data). The team preprocessed the data with datatrove, their custom-built (and also open-sourced) data-processing library, then evaluated data quality on lighteval by training small “ablation models” with nanotron (a library for pretraining transformer models).
Of all the FineWeb versions, FineWeb-Edu outperforms every other subset. This is thanks to a new filtering technique: they used synthetic data to train classifiers that identify educational content.
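The filtering step itself is simple once you have the classifier. FineWeb-Edu scores pages on a 0–5 educational-quality scale and keeps the high scorers; the toy keyword scorer below is a hypothetical stand-in for the real classifier (which is a fine-tuned model trained on synthetic labels):

```python
# Sketch of educational-content filtering. The real pipeline uses a trained
# classifier; edu_score here is a fake placeholder for illustration only.
def edu_score(text: str) -> int:
    """Return a toy 0-5 educational-quality score (placeholder classifier)."""
    signals = ("theorem", "lesson", "exercise", "definition", "chapter")
    hits = sum(word in text.lower() for word in signals)
    return min(hits, 5)

def filter_edu(pages: list[str], threshold: int = 3) -> list[str]:
    """Keep only pages scoring at or above the threshold."""
    return [p for p in pages if edu_score(p) >= threshold]

pages = [
    "Chapter 1: a lesson on limits, with a definition and an exercise.",
    "Buy cheap watches online now!!!",
]
kept = filter_edu(pages)  # keeps only the educational page
```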
🔥 What's New:
- Polars integration 🐻‍❄️
- fsspec support for conversion to JSON, CSV, and Parquet
- Mode parameter for Image feature
- CLI function to convert script-datasets to Parquet
- Dataset.take and Dataset.skip
Plus, a bunch of general improvements & bug fixes!