160 68 220

Philipp Schmid

philschmid

https://www.philschmid.de

AI & ML interests

None yet

Recent Activity

updated a model 1 day ago

philschmid/modernbert-llm-router

updated a collection 3 days ago

LLM Reasoning Papers

upvoted a paper 5 days ago

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

View all activity

Articles

Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints

May 1

• 69

Welcome Llama 3 - Meta's new open LLM

Apr 18

• 281

Making thousands of open LLMs bloom in the Vertex AI Model Garden

Apr 10

• 18

CodeGemma - an official Google release for code LLMs

Apr 9

• 99

Bringing serverless GPU inference to Hugging Face users

Apr 2

• 11

Easily Train Models with H100 GPUs on NVIDIA DGX Cloud

Mar 18

• 7

Welcome Gemma - Google's new open LLM

Feb 21

• 21

From OpenAI to Open LLMs with Messages API

Feb 8

• 12

Hugging Face Text Generation Inference available for AWS Inferentia2

Feb 1

• 5

Hugging Face and Google partner for open AI collaboration

Jan 25

• 4

Welcome Mixtral - a SOTA Mixture of Experts on Hugging Face

Dec 11, 2023

• 11

Deploy Embedding Models with Hugging Face Inference Endpoints

Oct 24, 2023

• 2

Llama 2 on Amazon SageMaker a Benchmark

Sep 26, 2023

Fine-tuning Llama 2 70B using PyTorch FSDP

Sep 13, 2023

• 16

Spread Your Wings: Falcon 180B is here

Sep 6, 2023

• 4

Code Llama: Llama 2 learns to code

Aug 25, 2023

• 9

Hugging Face Platform on the AWS Marketplace: Pay with your AWS Account

Aug 10, 2023

Llama 2 is here - get it on Hugging Face

Jul 18, 2023

• 22

Deploy LLMs with Hugging Face Inference Endpoints

Jul 4, 2023

• 11

The Falcon has landed in the Hugging Face ecosystem

Jun 5, 2023

• 10

Introducing the Hugging Face LLM Inference Container for Amazon SageMaker

May 31, 2023

• 2

Hugging Face Collaborates with Microsoft to Launch Hugging Face Model Catalog on Azure

May 24, 2023

Creating a Coding Assistant with StarCoder

May 9, 2023

• 1

Accelerating Hugging Face Transformers with AWS Inferentia2

Apr 17, 2023

Hugging Face and AWS partner to make AI more accessible

Feb 21, 2023

• 2

Pre-Train BERT with Hugging Face Transformers and Habana Gaudi

Aug 22, 2022

• 5

Convert Transformers to ONNX with Hugging Face Optimum

Jun 22, 2022

• 3

Accelerated Inference with Optimum and Transformers Pipelines

May 10, 2022

• 2

Accelerate BERT inference with Hugging Face Transformers and AWS inferentia

Mar 16, 2022

Case Study: Millisecond Latency using Hugging Face Infinity and modern CPUs

Jan 13, 2022

• 2

Deploy GPT-J 6B for inference using Hugging Face Transformers and Amazon SageMaker

Jan 11, 2022

Few-shot learning in practice: GPT-NEO and the 🤗 Accelerated Inference API

Jun 3, 2021

• 3

Distributed Training: Train BART/T5 for Summarization using 🤗 Transformers and Amazon SageMaker

Apr 8, 2021

Organizations

Posts 2

Post

6898

New state-of-the-art open LLM! 🚀 Databricks just released DBRX, a 132B MoE trained on 12T tokens. Claiming to surpass OpenAI GPT-3.5 and is competitive with Google Gemini 1.0 Pro. 🤯

TL;DR
🧮 132B MoE with 16 experts with 4 active in generation
🪟 32 000 context window
📈 Outperforms open LLMs on common benchmarks, including MMLU
🚀 Up to 2x faster inference than Llama 2 70B
💻 Trained on 12T tokens
🔡 Uses the GPT-4 tokenizer
📜 Custom License, commercially useable

Collection: databricks/dbrx-6601c0852a0cdd3c59f71962
Demo: databricks/dbrx-instruct

Kudos to the Team at Databricks and MosaicML for this strong release in the open community! 🤗

Post

What's the best way to fine-tune open LLMs in 2024? Look no further! 👀 I am excited to share “How to Fine-Tune LLMs in 2024 with Hugging Face” using the latest research techniques, including Flash Attention, Q-LoRA, OpenAI dataset formats (messages), ChatML, Packing, all built with Hugging Face TRL. 🚀

It is created for consumer-size GPUs (24GB) covering the full end-to-end lifecycle with:
💡Define and understand use cases for fine-tuning
🧑🏻‍💻 Setup of the development environment
🧮 Create and prepare dataset (OpenAI format)
🏋️‍♀️ Fine-tune LLM using TRL and the SFTTrainer
🥇 Test and evaluate the LLM
🚀 Deploy for production with TGI

👉 https://www.philschmid.de/fine-tune-llms-in-2024-with-trl

Coming soon: Advanced Guides for multi-GPU/multi-Node full fine-tuning and alignment using DPO & KTO. 🔜