Charles

tuanlda78202

AI & ML interests

LLMs

Recent Activity

Organizations

ZeroGPU Explorers · Jan · Cortex · Homebrew Research

tuanlda78202's activity

Reacted to AdinaY's post with 👀 10 days ago
Reacted to clem's post with 🚀 about 2 months ago
Very few people realize that most of the successful AI startups became successful because they were focused on open science and open source for at least their first few years. To name but a few: OpenAI (GPT and GPT-2 were open-source), Runway & Stability (Stable Diffusion), Cohere, Mistral, and of course Hugging Face!

The reasons are not just altruistic: sharing your science and your models pushes you to build AI faster (which is key in a fast-moving domain like AI), attracts the best scientists and engineers, and generates far more visibility, usage, and community contributions than staying 100% closed-source. The same applies to big tech companies, as we're seeing with Meta and Google!

More startups and companies should release research and open-source AI: it's not just good for the world, it also increases their probability of success!
Reacted to fdaudens's post with 👍 about 2 months ago
upvoted 3 articles 2 months ago
PaliGemma – Google's Cutting-Edge Open Vision Language Model • 212

Vision Language Models Explained • 216

Illustrated LLM OS: An Implementational Perspective, by shivance • 15
Reacted to elinas's post with 👀 3 months ago
We conducted an experiment to revive LLaMA 1 33B, as it had unique prose and a lack of "GPT-isms" and "slop" in its pretraining data, and was one of the community favorites at the time. Over multiple finetune runs, we extended the model from its pretrained context of 2,048 tokens to ~12,000, adding approximately 500M training tokens in the process. The effective length is 16,384, but it's better to stay toward the lower end of that range. It writes well and in multiple formats. In the future we have some ideas, such as implementing GQA. Please take a look; we would love to hear your feedback!

ZeusLabs/Chronos-Divergence-33B
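
A minimal sketch of loading and prompting the model with the transformers library, assuming a standard causal-LM interface; the prompt and generation settings are illustrative, and per the post it is sensible to keep prompt plus output well below the ~12,000-token extended context:

```python
# Hedged sketch: load ZeusLabs/Chronos-Divergence-33B as a plain causal LM.
# The dtype, device placement, prompt, and sampling settings are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ZeusLabs/Chronos-Divergence-33B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 33B weights in half precision still need a large GPU or a multi-GPU setup
    device_map="auto",
)

prompt = "Write a short scene set in a rain-soaked harbor town."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Keep prompt + generated tokens comfortably inside the ~12k-token range the authors recommend.
output = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
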
Reacted to davidberenstein1957's post with 🔥 3 months ago
Interested in learning about everything image-related?

With the rise of recent interest in Vision Language Models (VLMs), we decided to make a push to include an ImageField within Argilla! This means any open-source developer can now work on better models for vision ML tasks too, and we would like to show you how.

We would love to introduce this new feature to you, so we've prepared a set of notebooks covering some common image scenarios (a small retrieval sketch follows the list below):
Fine-tune a CLIP retrieval model with sentence transformers
Use ColPali + Qwen VL for RAG and log the results to Argilla
Image-generation preference: creating multi-modal preference datasets for free using Hugging Face inference endpoints
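
As a rough illustration of the first scenario, here is a minimal sketch of CLIP-based image-text retrieval with sentence-transformers; the checkpoint name, image paths, and query string are assumptions for illustration, not taken from the notebooks:

```python
# Hedged sketch: CLIP-style image-text retrieval with sentence-transformers.
# The checkpoint name, file paths, and query string are illustrative assumptions.
from PIL import Image
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("clip-ViT-B-32")  # a CLIP checkpoint packaged for sentence-transformers

# Embed a small gallery of images and a text query into the shared embedding space.
image_paths = ["cat.jpg", "dog.jpg", "car.jpg"]
image_embeddings = model.encode([Image.open(p) for p in image_paths], convert_to_tensor=True)
query_embedding = model.encode("a photo of a cat", convert_to_tensor=True)

# Rank the images by cosine similarity to the query.
scores = util.cos_sim(query_embedding, image_embeddings)[0]
for path, score in sorted(zip(image_paths, scores.tolist()), key=lambda x: -x[1]):
    print(f"{path}: {score:.3f}")
```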

​See you on Thursday!

https://lu.ma/x7id1jqu
Reacted to AlexBodner's post with 👀 3 months ago
💾🧠 How much VRAM will you need for training your AI model? 💾🧠
Check out this app where you can convert:
PyTorch/TensorFlow summary -> needed VRAM
or
Parameter count -> needed VRAM

Use it at: http://howmuchvram.com

And everything is open source! Request new features or contribute at:
https://github.com/AlexBodner/How_Much_VRAM
If it's useful to you, leave a star 🌟 and share it with someone who will find the tool useful!
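
As a rough, back-of-the-envelope version of the parameter-count path, memory can be estimated from the parameter count times bytes per value, plus gradients and optimizer states when training. The per-parameter byte counts below are common rules of thumb, not the exact formula the tool uses:

```python
# Hedged sketch: rough VRAM estimate from a parameter count.
# Byte counts are common rules of thumb (fp16/bf16 weights and gradients,
# fp32 Adam moments); they are NOT the exact formula used by howmuchvram.com,
# and activation memory / framework overhead are ignored.
def estimate_vram_gib(num_params: int, training: bool = True) -> float:
    bytes_per_param = 2                  # fp16/bf16 weights
    if training:
        bytes_per_param += 2             # gradients in fp16/bf16
        bytes_per_param += 8             # Adam: two fp32 moment tensors
    return num_params * bytes_per_param / 1024**3

# Example: a 7B-parameter model.
print(f"inference ~{estimate_vram_gib(7_000_000_000, training=False):.1f} GiB")
print(f"training  ~{estimate_vram_gib(7_000_000_000, training=True):.1f} GiB")
```
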
upvoted an article 3 months ago
Welcome Gemma 2 - Google's new open LLM • 124