georgewritescode (George Cameron)

Posts 2

Visualization of GPT-4o breaking away from the quality & speed trade-off curve that LLMs have followed thus far ✂️

Key GPT-4o takeaways
‣ GPT-4o not only offers the highest quality, it also sits amongst the fastest LLMs
‣ For speed/latency-sensitive use cases, where Claude 3 Haiku or Mixtral 8x7B were previously the leaders, GPT-4o is now a compelling option (though significantly more expensive)
‣ Previously, Groq was the only provider to break from the curve, using its own LPU chips. OpenAI has now done it on Nvidia hardware (one can imagine the potential for GPT-4o on Groq)

👉 How did they do it? We will follow up with more analysis, but potential approaches include a very large but sparse MoE model (similar to Snowflake's Arctic) and improvements in data quality (which likely drove much of Llama 3's impressive quality relative to its parameter count)

Notes: Throughput represents the median across providers over the last 14 days of measurements (8x per day)
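
As a rough illustration of the note above, here is a minimal Python sketch of one way such a figure could be computed. The post does not say whether the median is taken over pooled measurements or over per-provider summaries, so this assumes the latter: each provider's samples are reduced to a median, and the reported value is the median of those. The function name, provider names, and numbers are all hypothetical.

```python
from statistics import median

# Per the note: 14 days of measurements, 8 per day, for each provider.
SAMPLES_PER_PROVIDER = 14 * 8  # 112

def median_throughput(samples_by_provider):
    """Median across providers of each provider's median throughput.

    Assumption: the aggregation is "median of per-provider medians";
    the post only says "median across providers".
    """
    per_provider = [
        median(samples)
        for samples in samples_by_provider.values()
        if samples  # skip providers with no measurements
    ]
    return median(per_provider)

# Hypothetical tokens-per-second samples (shortened for readability).
samples = {
    "provider_a": [80.0, 82.0, 85.0],
    "provider_b": [60.0, 61.0, 65.0],
    "provider_c": [100.0, 110.0, 120.0],
}

print(median_throughput(samples))
```

Using a median at both levels keeps a single slow measurement, or a single outlier provider, from dominating the figure.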

The data is available on our HF leaderboard (ArtificialAnalysis/LLM-Performance-Leaderboard), and the graphs are on our website

Articles 3

Evaluating Audio Reasoning with Big Bench Audio

models

None public yet

datasets

None public yet
