SambaNova Systems

company

Verified

https://sambanova.ai/

SambaNovaAI

sambanova

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

kz919 updated a Space 9 days ago

sambanovasystems/QwQ-32B-preview

kz919 updated a Space 16 days ago

sambanovasystems/SambaNova-Qwen2.5-Coder-Artifacts

kz919 authored a paper 26 days ago

Cautious Optimizers: Improving Training with One Line of Code

sambanovasystems's activity

kz919

updated a Space 9 days ago

Running

🔍

QwQ-32B-Preview

kz919

updated a Space 16 days ago

Running

187

🐢

Cautious Optimizers: Improving Training with One Line of Code

Paper • 2411.16085 • Published 28 days ago • 15

Amitabhab

updated a Space 26 days ago

Running

🌍

Planner

Plan your itinerary with the help of AI

kz919

updated 2 Spaces about 2 months ago

Running

🏆

Pictionary

Play Pictionary With Llama3.2 instruct

Running

🐨

Pix2Latex

Real-time pixel 2 latex code

kz919

updated a Space 2 months ago

Runtime error

📉

Sambanova Gradio

zolicsaki

posted an update 3 months ago

Post

1252

We’ve open-sourced an app, powered by SambaNova Cloud and Llama 405B, that intelligently detects when a web search is needed—then answers directly or with RAG.

sambanovasystems/auto-web-search

🥚 A hidden Easter egg is that Auto Search detection is already trained into Llama 3.1 checkpoints. Simply use the tool usage system prompt below, and the model will either respond with a web search query if it deems one necessary or answer the query directly.🥚

Environment: IPython
Tools: Brave Search
Knowledge Cutoff Date: December 2023
Today's Date: September 2024
You are a helpful assistant. Reminder:
Search function calls MUST follow the specified format: "brave_search.call(query)"

You can see the documentation here
https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_1#built-in-tooling
and read about how the tool usage was trained into Llama3.1 models in section 4.3.5 here https://arxiv.org/pdf/2407.21783
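
As a rough illustration of how an app might act on this prompt, here is a minimal sketch that stores the system prompt above and checks whether a model reply is a search call or a direct answer. The helper names (`extract_search_query`, `route`) and the sample replies are hypothetical, and the regex assumes the reply contains a call shaped like `brave_search.call(query="...")`; it does not contact any model or search API.

```python
import re
from typing import Optional

# System prompt copied from the post above.
SYSTEM_PROMPT = """Environment: IPython
Tools: Brave Search
Knowledge Cutoff Date: December 2023
Today's Date: September 2024
You are a helpful assistant. Reminder:
Search function calls MUST follow the specified format: "brave_search.call(query)"
"""

# Assumption: a search reply contains brave_search.call("...") or
# brave_search.call(query="..."), possibly preceded by a special tag.
_SEARCH_CALL = re.compile(
    r"brave_search\.call\(\s*(?:query\s*=\s*)?(['\"])(.*?)\1\s*\)",
    re.DOTALL,
)


def extract_search_query(model_output: str) -> Optional[str]:
    """Return the search query if the reply is a search call, else None."""
    match = _SEARCH_CALL.search(model_output)
    return match.group(2) if match else None


def route(model_output: str) -> str:
    """Label a reply as a web search request or a direct answer."""
    query = extract_search_query(model_output)
    if query is None:
        return "direct: " + model_output.strip()
    return "search: " + query


if __name__ == "__main__":
    # Hypothetical replies, written by hand for illustration.
    print(route('<|python_tag|>brave_search.call(query="weather in Paris today")'))
    print(route("Paris is the capital of France."))
```

In a real app, the "search" branch would run the query against a search backend and feed the results back to the model (the RAG path), while the "direct" branch would return the reply unchanged.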

kz919

posted an update 3 months ago

Post

1283

Just for the meme.

But the clear lesson I learned from building these demos is that the more powerful the underlying base model is, the closer you will get to GPT4o1. CoT does nothing more than induce the latent reasoning capability of the model.

kz919/GPT4-O1-Proximas

kz919

posted an update 3 months ago

Post

1865

https://huggingface.co/spaces/kz919/Llama3.1-Instruct-O1

zolicsaki

posted an update 3 months ago

Post

1293

Fast inference is no longer a nice-to-have demo; it will be the driving force behind future frontier models. Time to switch over to custom AI hardware and short Nvidia.

Try out SambaNova's lightning fast API for free at https://sambanova.ai/fast-api?api_ref=444868

kz919

posted an update 3 months ago

Post

2447

"It's Sunday night, fancy a game?"
https://kz919-can-you-beat-405b-in-chess.hf.space/
built with the one and only SN fast API:
https://sambanova.ai/fast-api?api_ref=907266

7 replies

kz919

posted an update 4 months ago

Post

636

Good lord... Spent almost a day debugging this, and it turns out the problem was a Gradio update that was incompatible with the new FastAPI.
https://discuss.huggingface.co/t/huggingface-space-failed-after-working-initially/105514/8

Finally got it back online! Come chat with your favorite anime characters here:
kz919/Persona-AI

kz919

posted an update 4 months ago

Post

1586

Spent a few minutes building an alternative to Character AI on top of Llama 3.1 405B through SambaNova's super fast inference API

Space: kz919/Persona-AI
API referral link: https://sambanova.ai/fast-api?api_ref=907266

3 replies

kz919

posted an update 4 months ago

Post

1688

The only 405B spaces still freely accessible are powered by SN fast api.

xianbao/SambaNova-fast

https://sambanova.ai/fast-api?api_ref=907266

zolicsaki

posted an update 4 months ago

Post

1811

You can run Llama405B at over 100 tokens per second for free using SambaNova's API! https://sambanova.ai/fast-api?api_ref=444868

I have been able to generate some high-quality synthetic data and use the model as an LLM-as-a-judge instead of slower and more expensive alternatives like OpenAI or Anthropic.

2 replies

yongningsheng

authored a paper 7 months ago

SambaNova SN40L: Scaling the AI Memory Wall with Dataflow and Composition of Experts

Paper • 2405.07518 • Published May 13 • 24

zolicsaki

posted an update 7 months ago

Post

890

SambaNova just released a revolutionary paper about how the SN40L AI chip can host many LLMs on a single node and run inference so efficiently that it enables running a "composition of experts." These experts can be interconnected via a router, resulting in remarkable accuracy. This method allows you to take open source expert models from HuggingFace and continuously build and integrate them into a composition of experts.

I am also super excited about the possibilities that SN40Ls unlock for LLM agent workflows and pipelined calls. With the release of GPT4o, it seems that monolithic LLMs are starting to reach a plateau, and I believe that the next wave of AI will be driven by pipelined LLM calls and agent workflows. Most pipelined LLM workflows are bottlenecked by prohibitively expensive compute and high latency, but the SN40L provides a one-stop solution for this. We need to get the word out to the community that this hardware exists, because it will open up a realm of possibilities that developers working with Nvidia hardware did not know existed.

SambaNova SN40L: Scaling the AI Memory Wall with Dataflow and Composition of Experts (2405.07518)
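
To make the "experts interconnected via a router" idea concrete, here is a toy sketch of routing a prompt to one of several experts. It is not the method from the paper: the keyword-overlap scoring, the `Expert` class, and the expert names are all assumptions made up for illustration, and a real router would typically be a learned model.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Expert:
    """Hypothetical expert: a name, trigger keywords, and a handler."""
    name: str
    keywords: FrozenSet[str]
    run: Callable[[str], str]


def pick_expert(prompt: str, experts: List[Expert], default: Expert) -> Expert:
    """Pick the expert whose keywords overlap the prompt most.

    Falls back to `default` when no expert keyword appears in the prompt.
    """
    words = set(prompt.lower().split())
    best = max(experts, key=lambda e: len(words & e.keywords))
    return best if words & best.keywords else default


# Illustrative experts; the handlers only tag the prompt with the expert name.
CODE = Expert("code", frozenset({"python", "function", "bug"}), lambda p: "[code] " + p)
MATH = Expert("math", frozenset({"integral", "prove", "equation"}), lambda p: "[math] " + p)
GENERAL = Expert("general", frozenset(), lambda p: "[general] " + p)

if __name__ == "__main__":
    for prompt in ["write a python function", "solve this equation", "hello there"]:
        expert = pick_expert(prompt, [CODE, MATH], GENERAL)
        print(expert.run(prompt))
```

The point of the sketch is only the control flow: one cheap routing decision per prompt, followed by a single call to the selected expert.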

daweih

authored a paper 7 months ago

SambaNova SN40L: Scaling the AI Memory Wall with Dataflow and Composition of Experts

Paper • 2405.07518 • Published May 13 • 24

kz919

authored a paper 7 months ago

SambaNova SN40L: Scaling the AI Memory Wall with Dataflow and Composition of Experts

Paper • 2405.07518 • Published May 13 • 24

Team members 138
