Lewdiculous (AetherArchitectural)


AI & ML interests

🔹https://arch.datasets.fyi | [Personal Profile] General tech and LLM stuff! https://beacons.ai/Lewdiculous | https://rentry.co/Lewdiculous | Mancer LLM Inference (ref): https://link.datasets.fyi/lwdmncr

Recent Activity

liked a model 2 days ago

PygmalionAI/Pygmalion-3-12B

reacted to retronic's post with 🔥 7 days ago

Colox, a reasoning AI model. I am currently working on a model smarter than GPT o1 that thinks before it speaks. It is coming tomorrow in the afternoon.

updated a model 9 days ago

LWDCLS/NightWing3_Virtuoso-10B-v0.2-GGUF-IQ-Imatrix


Posts 3

Post


Hello fellow LLMers, just a quick notice that some of my activity will be moved to the AetherArchitectural Community and split with @Aetherarchio.

[here] https://huggingface.co/AetherArchitectural

All activity should be visible on the left side of my profile.

Post


More context for your Pascal GPU or older!

Update: Now available in the official releases of KoboldCpp!
[releases] https://github.com/LostRuins/koboldcpp/releases/latest

This is great news for all users with GTX 10XX, P40...

A Flash Attention implementation for older NVIDIA GPUs that does not require Tensor Cores has come to llama.cpp in the last few days and should be merged in the next version of KoboldCpp. You can already try it with another fork or by building it yourself.

[Mentioned KCPP fork] https://github.com/Nexesenex/kobold.cpp/releases/latest

[PR] https://github.com/ggerganov/llama.cpp/pull/7188

You should expect less VRAM usage for the same context, allowing you to run larger contexts on your current GPU.

There are also reports of improved final tokens/second speeds for inference, so that's also grand!
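For a rough feel of where the savings come from, here is a back-of-the-envelope sketch in Python. It is not how llama.cpp or KoboldCpp actually account for memory. It only compares holding a full attention score matrix against holding one small tile of it, which is the general idea behind Flash Attention. The function names, the batch size, the head count, and the tile sizes are all made-up illustration values.

```python
# Illustrative arithmetic only: hypothetical sizes, not llama.cpp's real buffers.

def naive_score_bytes(n_ctx: int, n_batch: int, n_heads: int, bytes_per_el: int = 4) -> int:
    """Bytes to hold a full (n_batch x n_ctx) score matrix for every head."""
    return n_ctx * n_batch * n_heads * bytes_per_el


def tiled_score_bytes(tile_q: int, tile_k: int, n_heads: int, bytes_per_el: int = 4) -> int:
    """Bytes to hold one (tile_q x tile_k) block of scores per head.

    Note that this does not depend on the context length at all.
    """
    return tile_q * tile_k * n_heads * bytes_per_el


if __name__ == "__main__":
    # Assumed values: batch of 512 tokens, 32 heads, 64x64 tiles, 4-byte floats.
    for n_ctx in (4096, 8192, 16384):
        naive = naive_score_bytes(n_ctx, n_batch=512, n_heads=32)
        tiled = tiled_score_bytes(64, 64, n_heads=32)
        print(f"{n_ctx:>6} ctx: full scores {naive / 2**20:.0f} MiB, "
              f"one tile {tiled / 2**20:.2f} MiB")
```

With these assumed numbers the full score matrix grows linearly with context while the tile stays the same size. Real savings depend on the model, the backend, and the settings, so treat this as intuition rather than a prediction.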

If you have tried it, I'd like to hear your experiences with --flashattention so far, especially for this implementation and for the large number of Pascal (GTX 10XX, P40...) cards.

Discussion linked below, with more links to relevant information:

https://huggingface.co/LWDCLS/LLM-Discussions/discussions/11

Cheers!
