Maykeye

AI & ML interests

Image Gen, TextGen, training silliness from scratch

Recent Activity

new activity 2 days ago

Maykeye/TinyLLama-v0:Adding ONNX file of this model

liked a model 5 months ago

Zyphra/Zamba2-7B-Instruct

commented on a paper 5 months ago

Differential Transformer

Organizations

None yet

Maykeye's activity

New activity in Maykeye/TinyLLama-v0 2 days ago

Adding ONNX file of this model

#4 opened 2 days ago by

mrm8848

liked a model 5 months ago

Zyphra/Zamba2-7B-Instruct

Text Generation • Updated 21 days ago • 2.5k • 89

commented on a paper 5 months ago

Differential Transformer

Paper • 2410.05258 • Published Oct 7, 2024 • 171

liked a model 5 months ago

Zyphra/Zamba2-2.7B-instruct

Text Generation • Updated 21 days ago • 568 • 82

reacted to MonsterMMORPG's post with 👀 6 months ago

Post

2451

I ran extensive multi-GPU FLUX full fine-tuning / DreamBooth training experiments on RunPod using 2x A100 80 GB GPUs (PCIe), since I was frequently asked about this.

Full article here: https://medium.com/@furkangozukara/multi-gpu-flux-fu

Image 1
Image 1 shows that just the first part of the Kohya GUI installation took 30 minutes on such a powerful machine, on a very expensive Secure Cloud pod (3.28 USD per hour)
There was also a part 2, so installation alone took a very long time
On Massed Compute, it would take about 2–3 minutes
This is why I suggest using Massed Compute over RunPod: RunPod machines have terrible hard disk speeds, and getting a good one is a lottery

Image 2, 3 and 4
Image 2 shows the speed of our very best FLUX fine-tuning config (linked below) when doing 2x multi-GPU training
https://www.patreon.com/posts/kohya-flux-fine-112099700
Config name used: Quality_1_27500MB_6_26_Second_IT.json
Image 3 shows the VRAM usage of this config when doing 2x multi-GPU training
Image 4 shows the GPUs of the pod

Image 5 and 6
Image 5 shows the speed of our very best FLUX fine-tuning config (linked below) when doing single-GPU training
https://www.patreon.com/posts/kohya-flux-fine-112099700
Config name used: Quality_1_27500MB_6_26_Second_IT.json
Image 6 shows the amount of VRAM this setup used

Image 7 and 8
Image 7 shows the speed of our very best FLUX fine-tuning config (linked below) when doing single-GPU training with Gradient Checkpointing disabled
https://www.patreon.com/posts/kohya-flux-fine-112099700
Config name used: Quality_1_27500MB_6_26_Second_IT.json
Image 8 shows the amount of VRAM this setup used

....

reacted to kz919's post with 👀 6 months ago

Post

1913

https://huggingface.co/spaces/kz919/Llama3.1-Instruct-O1

upvoted a paper 6 months ago

Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection

Paper • 2409.08513 • Published Sep 13, 2024 • 14

reacted to TuringsSolutions's post with 👀 6 months ago

Post

1443

ChatGPT does better at math if you prompt it to think like Captain Picard from Star Trek. Scientifically proven fact lol. This got me thinking: LLMs probably 'think' about the world in weird ways, far different from how we would. That sent me down a rabbit hole of thinking about different concepts, but for LLMs. Somewhere along the way, Python Chemistry was born. To an LLM, there is a strong connection between Python and chemistry. To an LLM, it is easier to understand exactly how Python works if you frame it in terms of chemistry.

Don't believe me? Ask Python-Chemistry-GPT yourself: https://chatgpt.com/g/g-dzjYhJp4U-python-chemistry-gpt

Want to train your own Python-GPT and prove this concept actually works? Here is the dataset: https://huggingface.co/.../TuringsSolu.../PythonChemistry400

replied to enzostvs's post 6 months ago

Being called a king and being told I can be more is not exactly a hurtful roast. Feels more like a pep talk. 🤪

reacted to enzostvs's post with 🔥 6 months ago

Post

3605

What if we asked the AI what it thought of our Hugging Face profile? 👹
I've released a new Space that does just that... watch out, it hits hard! 🥊

Try it now ➡️ enzostvs/hugger-roaster

Share your roast below 👇

6 replies

liked a model 6 months ago

ai-forever/ruBert-base

Fill-Mask • Updated Nov 3, 2023 • 617k • 31

New activity in Maykeye/TinyLLama-v0 6 months ago

Interview request: genAI evaluation & documentation

#3 opened 6 months ago by

meggymuggy

liked a dataset 9 months ago

huggan/anime-faces

Preview • Updated Mar 22, 2022 • 566 • 23

upvoted a paper 9 months ago

In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss

Paper • 2402.10790 • Published Feb 16, 2024 • 42

liked a dataset 9 months ago

RMT-team/babilong

Viewer • Updated Jun 17, 2024 • 25k • 15k • 14

liked a model 9 months ago

Zyphra/Zamba-7B-v1

Text Generation • Updated Oct 3, 2024 • 620 • 28

New activity in TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T 10 months ago

Pickling error - cannot load on transformers==4.37.0.dev0

#3 opened about 1 year ago by

danielhanchen

reacted to Fizzarolli's post with 👍 10 months ago

Post

2676

Is anyone looking into some sort of decentralized/federated dataset generation or classification by humans instead of synthetically?

From my experience trying models, a *lot* of modern finetunes are trained on what amounts to, in essence, GPT-4-generated slop that makes everything sound like a rip-off of GPT-4 (see, e.g., the Dolphin finetunes). I have a feeling this is a large part of the reason people haven't been quite as successful as Meta was with its instruct tunes of Llama 3.

liked 2 datasets 10 months ago

hayden-donnelly/db-sfw-128px-filtered-and-cropped

Viewer • Updated Mar 10, 2024 • 3.4M • 105 • 1

animelover/danbooru2022

Updated Dec 4, 2023 • 1.34k • 150