Peter Vankman

Venkman42

AI & ML interests

None yet

Recent Activity

reacted to burtenshaw's post with 🔥 1 day ago
We’re launching a FREE and CERTIFIED course on Agents!
liked a model 4 months ago
allenai/Molmo-7B-D-0924
liked a model 4 months ago
allenai/MolmoE-1B-0924

Organizations

None yet

Venkman42's activity

reacted to burtenshaw's post with 🔥 1 day ago
We’re launching a FREE and CERTIFIED course on Agents!

We're thrilled to announce the launch of the Hugging Face Agents course on Learn! This interactive, certified course will guide you through building and deploying your own AI agents.

Here's what you'll learn:

- Understanding Agents: We'll break down the fundamentals of AI agents, showing you how they use LLMs to perceive their environment (observations), reason about it (thoughts), and take actions. Think of a smart assistant that can book appointments, answer emails, or even write code based on your instructions (see the sketch after this list).
- Building with Frameworks: You'll dive into popular agent frameworks like LangChain, LlamaIndex and smolagents. These tools provide the building blocks for creating complex agent behaviors.
- Real-World Applications: See how agents are used in practice, from automating SQL queries to generating code and summarizing complex documents.
- Certification: Earn a certification by completing the course modules, implementing a use case, and passing a benchmark assessment. This proves your skills in building and deploying AI agents.
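
As a concrete illustration of that observe-think-act loop (a minimal sketch, not code from the course; the `call_llm` stub, the tool names, and the "tool: argument" reply format below are made up):

```python
# A minimal observe-think-act agent loop. `call_llm` is a stub: a real
# agent would send the prompt to a model API, which the frameworks above wrap.

def call_llm(prompt: str) -> str:
    # Hypothetical placeholder; this sketch always finishes immediately.
    return "FINISH: stub reply - plug in a real model here"

# Toy tools; real agents register search, email, or code-execution tools.
TOOLS = {
    "search": lambda query: f"results for {query!r}",
    "write_email": lambda body: f"drafted email: {body!r}",
}

def run_agent(task: str, max_steps: int = 5) -> str:
    observation = task
    for _ in range(max_steps):
        # Thought: the LLM reasons about the task and the latest observation.
        thought = call_llm(f"Task: {task}\nObservation: {observation}\nNext action?")
        if thought.startswith("FINISH:"):
            return thought.removeprefix("FINISH:").strip()
        # Action: parse "tool_name: argument" (format is illustrative only).
        tool_name, _, arg = thought.partition(":")
        tool = TOOLS.get(tool_name.strip())
        # Observation: the tool's output feeds the next reasoning step.
        observation = tool(arg.strip()) if tool else f"unknown tool {tool_name!r}"
    return observation

print(run_agent("book a dentist appointment for Tuesday"))
```
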
Audience

This course is designed for anyone interested in the future of AI. Whether you're a developer, data scientist, or simply curious about AI, this course will equip you with the knowledge and skills to build your own intelligent agents.

Enroll today and start building the next generation of AI agent applications!

https://bit.ly/hf-learn-agents
New activity in mlabonne/Yet_Another_LLM_Leaderboard 8 months ago

Nice Leaderboard :)

#1 opened about 1 year ago by Venkman42
reacted to hrishbhdalal's post with 👍 8 months ago
I just saw that OpenAI is using an updated tokenizer, and it greatly increases the speed of the model and maybe even its performance: if we increase the vocabulary size, a single predicted token could be equivalent to two or three tokens under current tokenizers with 50-60k (or even 100k) vocabularies. I was thinking of scaling this to a one-million-token vocabulary and then pre-training Llama 3 8B using LoRA. I know the model might degrade badly, but we could greatly increase token-generation speed, imo. And as one of Meta's papers showed, predicting multiple tokens at the same time can actually improve a model's performance, so increasing the vocabulary this way is, in a sense, multi-token generation. Yann LeCun also says we don't think in tokens but in representations or abstractions of the situation or problem to be solved. Can scaling to a one-million or even ten-million vocab size lead to better and more robust models? Please give me your thoughts on what can go wrong, what can go right, etc.
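
One quick way to see the effect being described, assuming the tiktoken package is installed (the encoding names are OpenAI's published ones; this shows token counts only, not a speed benchmark):

```python
# Compare how many tokens the same text costs under two vocabulary sizes.
# Requires `pip install tiktoken`.
import tiktoken

text = "Tokenizers with larger vocabularies pack the same text into fewer tokens."

small = tiktoken.get_encoding("cl100k_base")  # ~100k vocab (GPT-4 era)
large = tiktoken.get_encoding("o200k_base")   # ~200k vocab (GPT-4o era)

print(len(small.encode(text)), "tokens at ~100k vocab")
print(len(large.encode(text)), "tokens at ~200k vocab")
# Fewer tokens per text means fewer decoding steps, hence faster generation.
```
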
reacted to thomwolf's post with ❤️ 10 months ago
A Little guide to building Large Language Models in 2024

This is a recording of a 75-minute lecture I gave two weeks ago on how to train an LLM from scratch in 2024. I tried to keep it short and comprehensive, focusing on concepts that are crucial for training a good LLM but often hidden in tech reports.

In the lecture, I introduce the students to all the important concepts/tools/techniques for training a high-performance LLM:
* finding, preparing and evaluating web-scale data (a toy filtering sketch follows this list)
* understanding model parallelism and efficient training
* fine-tuning/aligning models
* fast inference
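
(The lecture's own tooling isn't reproduced here; as a rough illustration of the first point above, below is a toy heuristic quality filter in the spirit of Gopher/C4-style rules, with made-up thresholds.)

```python
# A toy quality filter for web documents. Thresholds are illustrative,
# not values from the lecture.

def keep_document(text: str) -> bool:
    words = text.split()
    if not (50 <= len(words) <= 100_000):             # too short or absurdly long
        return False
    if sum(len(w) for w in words) / len(words) > 12:  # gibberish-like "words"
        return False
    alpha = sum(ch.isalpha() for ch in text)
    if alpha / max(len(text), 1) < 0.6:               # mostly symbols/markup
        return False
    return True

docs = ["just a few words", "A" * 5000, "This paragraph has enough ordinary words " * 10]
print([keep_document(d) for d in docs])  # -> [False, False, True]
```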

There are of course many things and details missing that I should have added; don't hesitate to tell me your most frustrating omission and I'll add it in a future part. In particular, I think I'll focus more on how to filter topics well and extensively, and maybe add more practical anecdotes and details.

Now that I've recorded it, I've been thinking this could be part 1 of a two-part series, with a second, fully hands-on video on how to run all these steps with some libraries and recipes we've recently released at HF around LLM training (which could easily be adapted to other frameworks anyway):
* datatrove for all things web-scale data preparation: https://github.com/huggingface/datatrove
* nanotron for lightweight 4D parallelism LLM training: https://github.com/huggingface/nanotron
* lighteval for in-training fast parallel LLM evaluations: https://github.com/huggingface/lighteval

Here is the link to watch the lecture on YouTube: https://www.youtube.com/watch?v=2-SPH9hIKT8
And here is the link to the Google slides: https://docs.google.com/presentation/d/1IkzESdOwdmwvPxIELYJi8--K3EZ98_cL6c5ZcLKSyVg/edit#slide=id.p

Enjoy, and I'm happy to hear feedback on it and what to add, correct, or extend in a second part.
New activity in merve/llava-next 10 months ago