Amazon SageMaker

company

https://aws.amazon.com/sagemaker

https://github.com/aws/amazon-sagemaker-examples

Activity Feed

AI & ML interests

Assets for Amazon SageMaker

Recent Activity

philschmid updated a dataset 17 days ago

amazon-sagemaker/repository-metadata

philschmid authored a paper over 1 year ago

Datasets: A Community Library for Natural Language Processing

View all activity

amazon-sagemaker's activity

philschmid

updated a dataset 17 days ago

amazon-sagemaker/repository-metadata

Preview • Updated 17 days ago • 94 • 1

philschmid

posted an update 9 months ago

Post

6906

New state-of-the-art open LLM! 🚀 Databricks just released DBRX, a 132B MoE trained on 12T tokens. Claiming to surpass OpenAI GPT-3.5 and is competitive with Google Gemini 1.0 Pro. 🤯

TL;DR
🧮 132B MoE with 16 experts with 4 active in generation
🪟 32 000 context window
📈 Outperforms open LLMs on common benchmarks, including MMLU
🚀 Up to 2x faster inference than Llama 2 70B
💻 Trained on 12T tokens
🔡 Uses the GPT-4 tokenizer
📜 Custom License, commercially useable

Collection: databricks/dbrx-6601c0852a0cdd3c59f71962
Demo: databricks/dbrx-instruct

Kudos to the Team at Databricks and MosaicML for this strong release in the open community! 🤗

4 replies

philschmid

posted an update 11 months ago

Post

What's the best way to fine-tune open LLMs in 2024? Look no further! 👀 I am excited to share “How to Fine-Tune LLMs in 2024 with Hugging Face” using the latest research techniques, including Flash Attention, Q-LoRA, OpenAI dataset formats (messages), ChatML, Packing, all built with Hugging Face TRL. 🚀

It is created for consumer-size GPUs (24GB) covering the full end-to-end lifecycle with:
💡Define and understand use cases for fine-tuning
🧑🏻‍💻 Setup of the development environment
🧮 Create and prepare dataset (OpenAI format)
🏋️‍♀️ Fine-tune LLM using TRL and the SFTTrainer
🥇 Test and evaluate the LLM
🚀 Deploy for production with TGI

👉 https://www.philschmid.de/fine-tune-llms-in-2024-with-trl

Coming soon: Advanced Guides for multi-GPU/multi-Node full fine-tuning and alignment using DPO & KTO. 🔜

4 replies

philschmid

authored a paper over 1 year ago

Datasets: A Community Library for Natural Language Processing

Paper • 2109.02846 • Published Sep 7, 2021 • 10

AI & ML interests

Recent Activity

Team members 8

amazon-sagemaker's activity