Article: BM25 for Python: Achieving high performance while simplifying dependencies with *BM25S*⚡ • By xhluca • 3 days ago • 22 upvotes
Paper: Instruction Pre-Training: Language Models are Supervised Multitask Learners • arXiv:2406.14491 • Published 9 days ago • 74 upvotes
Article: Introducing the Ultimate SEC LLM: Revolutionizing Financial Insights with Llama-3-70B • By Crystalcareai • 10 days ago • 6 upvotes
Article: Building a Vision Mixture-of-Expert Model from several fine-tuned Phi-3-Vision Models • By mjbuehler • 17 days ago • 4 upvotes
Article: Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval • Mar 22 • 43 upvotes
Collection: Unmixtraled experts • All 8 experts of Mixtral 8x22B converted to single dense 22B models, intended as a basis for merges or fine-tuning. • 9 items • Updated Apr 11 • 1 upvote
Collection: 💥 Laser vs DoRA vs Daser vs LoRA • A comparison of different PEFT techniques applied to NeuralMonarch. • 4 items • Updated Mar 22 • 5 upvotes
Collection: Model Merging • Model merging is a popular technique in the LLM space; here is a chronological list of papers on the topic to help you get started. • 30 items • Updated 16 days ago • 188 upvotes
Paper: DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models • arXiv:2402.03300 • Published Feb 5 • 66 upvotes
Collection: 🐶 Beagle • Merges done using mergekit and LazyMergekit: https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb#scrollTo=d5mYzDo1q96y • 8 items • Updated May 27 • 6 upvotes
Collection: DRAGON Models • Production-grade, RAG-optimized 6-7B parameter models: "Delivering RAG on ..." the leading foundation base models. • 11 items • Updated Feb 3 • 42 upvotes