Dokyoon

leeloolee

Eruly

Recent Activity

upvoted a paper 9 days ago

GUI Agents: A Survey

reacted to m-ric's post with 👍 15 days ago

Hugging Face releases Picotron, a microscopic lib that solves LLM training 4D parallelization 🥳

🕰️ Llama-3.1-405B took 39 million GPU-hours to train, i.e. about 4.5 thousand years of single-GPU compute.

👴🏻 If they had needed all this time, we would have GPU stories from the time of Pharaoh 𓂀: "Alas, Lord of Two Lands, the shipment of counting-stones arriving from Cathay was lost to pirates, this shall delay the building of your computing temple by many moons"

🛠️ But instead, they just parallelized the training on 24k H100s, which made it take just a few months. This required parallelizing across 4 dimensions: data, tensor, context, pipeline. And it is infamously hard to do, making for bloated code repos that hold together only by magic.

🤏 But now we don't need huge repos anymore! Instead of building mega-training codebases, Hugging Face colleagues cooked in the other direction, towards tiny 4D parallelism libs. One team has built Nanotron, already widely used in industry. And now a team has released Picotron, a radical approach that codes 4D parallelism in just a few hundred lines, a real feat of engineering that makes it much easier to understand what's actually happening!

⚡ It's tiny, yet powerful: measured in MFU (Model FLOPs Utilization, the fraction of the hardware's peak compute that the model actually uses), this lib reaches ~50% on the SmolLM-1.7B model with 8 H100 GPUs, which is really close to what huge libs would reach. (Caution: the team is running further benchmarks to verify this.)

Go take a look 👉 https://github.com/huggingface/picotron/tree/main/picotron
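The GPU-hour arithmetic in the post can be checked with a small sketch. It takes the post's figures (39 million GPU-hours, 24k GPUs) at face value and assumes perfect scaling with no restarts or idle time, which real training runs do not achieve; the function names are illustrative only.

```python
# Rough check of the post's figures; assumes ideal, loss-free parallel scaling.
HOURS_PER_YEAR = 24 * 365  # ignoring leap years


def gpu_hours_to_years(gpu_hours: float) -> float:
    """Time a single GPU would need to deliver this many GPU-hours."""
    return gpu_hours / HOURS_PER_YEAR


def wall_clock_days(gpu_hours: float, num_gpus: int) -> float:
    """Idealized wall-clock time when the work is spread evenly over num_gpus."""
    return gpu_hours / num_gpus / 24


total_gpu_hours = 39e6  # figure quoted in the post
num_gpus = 24_000       # "24k H100s", as quoted in the post

print(f"Single GPU: {gpu_hours_to_years(total_gpu_hours):,.0f} years")
print(f"{num_gpus:,} GPUs: {wall_clock_days(total_gpu_hours, num_gpus):.0f} days")
```

Under these assumptions the single-GPU figure comes out near 4.5 thousand years and the parallel run at a bit over two months, consistent with the "few months" in the post.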

reacted to alimotahharynia's post with 🔥 15 days ago

Here's the space for our new article that leverages LLMs with reinforcement learning to design high-quality small molecules. Check it out at https://huggingface.co/spaces/alimotahharynia/GPT-2-Drug-Generator. You can also access the article here: https://arxiv.org/abs/2411.14157. I would be happy to receive your feedback.

leeloolee's activity

liked a dataset 16 days ago

echo840/OCRBench

Viewer • Updated 16 days ago • 1k • 5.2k • 11

liked a model 17 days ago

U4R/StructTable-InternVL2-1B

Image-to-Text • Updated 22 days ago • 1.12k • 28

liked a model 18 days ago

google/Gemma-Embeddings-v1.0

Updated 18 days ago • 654 • 111

liked a model 23 days ago

TIGER-Lab/VLM2Vec-Full

Text Generation • Updated 14 days ago • 22.6k • 21

liked a Space 24 days ago

Running

💻

Vision Papers

All paper summaries read by Merve

liked a dataset about 1 month ago

NCSOFT/K-SEED

Viewer • Updated 28 days ago • 2.97k • 225 • 14

liked 2 Spaces about 1 month ago

Running

🥇

Vidore Leaderboard

Running

640

👁

PR Puppet Sora

liked a model about 1 month ago

zjunlp/HalDet-llava-7b

Text Generation • Updated Apr 24, 2024 • 27 • 2

liked 5 datasets 2 months ago

liked 3 datasets 3 months ago

YiyangAiLab/POVID_preference_data_for_VLLMs

Viewer • Updated Apr 1, 2024 • 17.2k • 44 • 7

Salesforce/blip3-grounding-50m

Viewer • Updated Sep 19, 2024 • 52.4M • 902 • 20

nvidia/HelpSteer2

Viewer • Updated 16 days ago • 21.4k • 15.2k • 392

liked a model 3 months ago

google/gemma-2-2b-jpn-it

Text Generation • Updated Oct 2, 2024 • 24.6k • 149

liked 2 datasets 3 months ago

HuggingFaceH4/10k_prompts_ranked

Viewer • Updated Sep 30, 2024 • 10.3k • 46 • 3

HuggingFaceH4/llava-instruct-mix-vsft

Viewer • Updated Apr 11, 2024 • 273k • 1.06k • 36