Article: SauerkrautLM's Multi-Phase Spectrum Training: A Technical Deep Dive • by DavidGF
🇫🇷 Calme-3 Collection • Here you can find all the new Calme-3 models • 26 items
VAGO solutions quants Collection • Quantized versions of the excellent German-language models created by VAGO solutions • 6 items
Qwen2 Collection • Qwen2 language models, with pretrained and instruction-tuned models in five sizes (0.5B, 1.5B, 7B, 57B-A14B, and 72B) • 39 items
Dataset comparison models Collection • 1.8B models trained on 350BT to compare different pretraining datasets • 8 items
Article: LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) • by wolfram
Meta Llama 3 Collection • This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items
🇩🇪 German SFT and DPO datasets Collection • Datasets that can be used for LLM training with axolotl, trl or llama_factory • 32 items
Paper: Arcee's MergeKit: A Toolkit for Merging Large Language Models • arXiv:2403.13257 • published Mar 20, 2024
Paper: The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits • arXiv:2402.17764 • published Feb 27, 2024
Paper: Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens • arXiv:2401.17377 • published Jan 30, 2024
Paper: SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling • arXiv:2312.15166 • published Dec 23, 2023