Arcee AI

Enterprise

company

Verified

https://arcee.ai

arcee-ai

AI & ML interests

None defined yet.

Recent Activity

Crystalcareai new activity 12 days ago

arcee-ai/Virtuoso-Small:Chat Template

juliensimon updated a Space 13 days ago

arcee-ai/Benchmarks

chargoddard updated a dataset 17 days ago

arcee-ai/EvolKit-75K

arcee-ai's activity

freddyaboulton

posted an update 4 days ago

Post

1035

Just created a Gradio space for playing with the new OAI realtime voice API!

freddyaboulton/openai-realtime-voice

freddyaboulton

posted an update 5 days ago

Post

414

Gemini can talk 🗣️

Check out the new multimodal API from Google on @akhaliq's anychat or my space. It's very fast and smart 🍓

https://huggingface.co/spaces/freddyaboulton/gemini-voice

https://huggingface.co/spaces/akhaliq/anychat

1 reply

freddyaboulton

posted an update 10 days ago

Post

1786

Version 0.0.21 of gradio-pdf now properly loads Chinese characters!

freddyaboulton

posted an update 10 days ago

Post

1501

Hello Llama 3.2! 🗣️🦙

Build a Siri-like coding assistant that responds to "Hello Llama" in 100 lines of Python! All with Gradio and WebRTC 😎

freddyaboulton/hey-llama-code-editor

freddyaboulton

posted an update 12 days ago

Post

1044

Just created a cookbook of real-time audio/video spaces built with Gradio and WebRTC ⚡️

Use this and the [docs](https://freddyaboulton.github.io/gradio-webrtc/) to get started building the next gen of AI apps!

freddyaboulton/gradio-webrtc-cookbook-6758ba7745aeca7b1be7de0f

2 replies

bartowski

posted an update 12 days ago

Post

6333

Looks like Q4_0_N_M file types are going away

Before you panic, there's a new "preferred" method: online repacking (I prefer the term "on-the-fly"). If you download Q4_0 and your setup can benefit from repacking the weights into interleaved rows (what Q4_0_4_4 was doing), it will do that automatically and give you similar performance. I think there are minor losses from using intrinsics instead of assembly, but intrinsics are more maintainable.

You can see the reference PR here:

https://github.com/ggerganov/llama.cpp/pull/10446

So if you update your llama.cpp past that point, you won't be able to run Q4_0_4_4 (unless they add backwards compatibility back), but Q4_0 should run at the same speed (though it may currently be bugged on some platforms).

As such, I'll stop making those newer model formats soon, probably by the end of this week unless something changes, but you should be safe to download Q4_0 quants and use those!

IQ4_NL also supports repacking, though not in as many shapes yet, and it should get a respectable speedup on ARM chips. The PR for that can be found here: https://github.com/ggerganov/llama.cpp/pull/10541

Remember, these are not meant for Apple silicon, since those chips use the GPU and don't benefit from the repacking of weights.
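
For anyone wondering what "repacking the weights into interleaved rows" means in practice, here is a toy sketch. It is only an illustration of the general idea, not llama.cpp's actual memory layout or code: the function name `interleave_rows` and the `group` and `block` parameters are hypothetical, and plain integers stand in for quantized blocks.

```python
def interleave_rows(rows, group=4, block=4):
    """Toy repacking: merge each `group` of consecutive rows into one row
    whose blocks alternate between the source rows.

    Hypothetical helper for illustration only; it does not reproduce
    llama.cpp's real Q4_0 layout.
    """
    if len(rows) % group != 0:
        raise ValueError("number of rows must be a multiple of group")
    packed = []
    for start in range(0, len(rows), group):
        chunk = rows[start:start + group]
        width = len(chunk[0])
        if any(len(row) != width for row in chunk):
            raise ValueError("rows in a group must have equal length")
        if width % block != 0:
            raise ValueError("row length must be a multiple of block")
        merged = []
        # Walk the rows block by block, so that block k of every row in
        # the group ends up adjacent in memory.
        for offset in range(0, width, block):
            for row in chunk:
                merged.extend(row[offset:offset + block])
        packed.append(merged)
    return packed


if __name__ == "__main__":
    rows = [[0, 1, 2, 3], [10, 11, 12, 13]]
    print(interleave_rows(rows, group=2, block=2))
```

The point of a layout like this is that a kernel working on several output rows at once can read their matching blocks from one contiguous stretch of memory. Doing the repack at load time is what lets a single Q4_0 file serve every platform.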

7 replies

Crystalcareai

in arcee-ai/Virtuoso-Small 12 days ago

Chat Template

#5 opened 17 days ago by

isr431

juliensimon

updated a Space 13 days ago

Running

🔥

Benchmarks

chargoddard

updated a dataset 17 days ago

arcee-ai/EvolKit-75K

Viewer • Updated 17 days ago • 74.2k • 1.06k • 22

Crystalcareai

updated a model 18 days ago

arcee-ai/Virtuoso-Small

Updated 18 days ago • 2.44k • 39

Crystalcareai

in arcee-ai/Virtuoso-Small 18 days ago

Question about model's origin

#2 opened 18 days ago by

sometimesanotion

Fix tokenizer.json with file from Qwen/Qwen2.5-14B

#3 opened 18 days ago by

MariusNocturnum

use the original Qwen2.5-14B-Instruct tokenizer

#4 opened 18 days ago by

MaziyarPanahi

updated a model 18 days ago

arcee-ai/Virtuoso-Small

Updated 18 days ago • 2.44k • 39

MaziyarPanahi

in arcee-ai/Virtuoso-Small 18 days ago

use the original Qwen2.5-14B-Instruct tokenizer

#4 opened 18 days ago by

MaziyarPanahi

Crystalcareai

in arcee-ai/Virtuoso-Small 18 days ago

Adding Evaluation Results

#1 opened 18 days ago by

leaderboard-pr-bot

Crystalcareai

updated a model 19 days ago

arcee-ai/Virtuoso-Small-GGUF

Updated 19 days ago • 5.05k • 4

bartowski

posted an update 20 days ago

Post

9422

Old Mixtral model quants may be broken!

Recently Slaren over on llama.cpp refactored the model loader in a way that's super awesome and very powerful, but it broke support for "split tensor MoE models", which applies to older Mixtral models.

You may have seen my upload of one such older Mixtral model, ondurbin/bagel-dpo-8x7b-v0.2; with the newest changes it seems to run without issue.

If you happen to run into issues with any other old Mixtral models, drop a link here and I'll try to remake them with the new changes so that we can continue enjoying them :)

1 reply

Crystalcareai

updated 2 datasets 23 days ago

arcee-ai/LLama-405B-Logits

Viewer • Updated 23 days ago • 10k • 159 • 8

arcee-ai/EvolKit-75K

Viewer • Updated 17 days ago • 74.2k • 1.06k • 22

Team members 34
