Derek Thomas

derek-thomas

https://datavistics.github.io

datavistics

AI & ML interests

https://www.linkedin.com/in/dthomas/

Recent Activity

updated a dataset about 22 hours ago

reddit-tools-HF/dataset-creator-reddit-bestofredditorupdates

updated a dataset 1 day ago

derek-thomas/labeled-multiple-choice-explained-falcon-reasoning

upvoted a paper 7 days ago

AgentInstruct: Toward Generative Teaching with Agentic Flows

Articles

Deploying Speech-to-Speech on Hugging Face

The 5 Most Under-Rated Tools on Hugging Face

TGI Multi-LoRA: Deploy Once, Serve 30 Models

Benchmarking Text Generation Inference

AI Watermarking 101: Tools and Techniques

derek-thomas's activity

upvoted a paper 7 days ago

AgentInstruct: Toward Generative Teaching with Agentic Flows

Paper • 2407.03502 • Published Jul 3, 2024 • 51

upvoted a collection about 1 month ago

Prompt Order Experiment

Prompt Order Experiment shows how to run a simple experiment on the Hub and leverage tools like AutoTrain and Inference Endpoints. • 16 items • Updated 21 days ago • 2

upvoted an article about 1 month ago

Low Code Large Language Model Alignment

Article • Nov 19, 2024 • 13

upvoted a paper 2 months ago

WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines

Paper • 2410.12705 • Published Oct 16, 2024 • 30

upvoted an article 2 months ago

Deploying Speech-to-Speech on Hugging Face

Article • Oct 22, 2024 • 35

upvoted a paper 2 months ago

AutoTrain: No-code training for state-of-the-art models

Paper • 2410.15735 • Published Oct 21, 2024 • 59

upvoted an article 2 months ago

AI Watermarking 101: Tools and Techniques

Article • Feb 26, 2024 • 15

upvoted 3 papers 3 months ago

ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation

Paper • 2410.01731 • Published Oct 2, 2024 • 16

BordIRlines: A Dataset for Evaluating Cross-lingual Retrieval-Augmented Generation

Paper • 2410.01171 • Published Oct 2, 2024 • 5

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Paper • 2409.12183 • Published Sep 18, 2024 • 36

upvoted 3 papers 4 months ago

Challenges and Responses in the Practice of Large Language Models

Paper • 2408.09416 • Published Aug 18, 2024 • 1

Characterizing Prompt Compression Methods for Long Context Inference

Paper • 2407.08892 • Published Jul 11, 2024 • 9

GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering

Paper • 2409.06595 • Published Sep 10, 2024 • 37

upvoted 2 articles 4 months ago

Introducing Community Tools on HuggingChat

Article • Sep 16, 2024 • 34

Accelerate 1.0.0

Article • Sep 13, 2024 • 51

upvoted 5 papers 4 months ago

LiveBench: A Challenging, Contamination-Free LLM Benchmark

Paper • 2406.19314 • Published Jun 27, 2024 • 20

Efficient Detection of Toxic Prompts in Large Language Models

Paper • 2408.11727 • Published Aug 21, 2024 • 12

BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval

Paper • 2407.12883 • Published Jul 16, 2024 • 8

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published Aug 27, 2024 • 138

APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets

Paper • 2406.18518 • Published Jun 26, 2024 • 24