Peter Vankman

Venkman42

AI & ML interests

None yet

Recent Activity

reacted to burtenshaw's post with 🔥 1 day ago
We’re launching a FREE and CERTIFIED course on Agents!
liked a model 4 months ago
allenai/Molmo-7B-D-0924
liked a model 4 months ago
allenai/MolmoE-1B-0924

Organizations

None yet

Venkman42's activity

reacted to burtenshaw's post with 🔥 1 day ago
We’re launching a FREE and CERTIFIED course on Agents!

We're thrilled to announce the launch of the Hugging Face Agents course on Learn! This interactive, certified course will guide you through building and deploying your own AI agents.

Here's what you'll learn:

- Understanding Agents: We'll break down the fundamentals of AI agents, showing you how they use LLMs to perceive their environment (observations), reason about it (thoughts), and take actions. Think of a smart assistant that can book appointments, answer emails, or even write code based on your instructions (see the sketch after this list).
- Building with Frameworks: You'll dive into popular agent frameworks like LangChain, LlamaIndex and smolagents. These tools provide the building blocks for creating complex agent behaviors.
- Real-World Applications: See how agents are used in practice, from automating SQL queries to generating code and summarizing complex documents.
- Certification: Earn a certification by completing the course modules, implementing a use case, and passing a benchmark assessment. This proves your skills in building and deploying AI agents.
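
As a concrete illustration of that observe-think-act loop (a minimal sketch, not code from the course; the `call_llm` stub, the tool names, and the "tool: argument" reply format below are made up):

```python
# A minimal observe-think-act agent loop. `call_llm` is a stub: a real
# agent would send the prompt to a model API, which the frameworks above wrap.

def call_llm(prompt: str) -> str:
    # Hypothetical placeholder; this sketch always finishes immediately.
    return "FINISH: stub reply - plug in a real model here"

# Toy tools; real agents register search, email, or code-execution tools.
TOOLS = {
    "search": lambda query: f"results for {query!r}",
    "write_email": lambda body: f"drafted email: {body!r}",
}

def run_agent(task: str, max_steps: int = 5) -> str:
    observation = task
    for _ in range(max_steps):
        # Thought: the LLM reasons about the task and the latest observation.
        thought = call_llm(f"Task: {task}\nObservation: {observation}\nNext action?")
        if thought.startswith("FINISH:"):
            return thought.removeprefix("FINISH:").strip()
        # Action: parse "tool_name: argument" (format is illustrative only).
        tool_name, _, arg = thought.partition(":")
        tool = TOOLS.get(tool_name.strip())
        # Observation: the tool's output feeds the next reasoning step.
        observation = tool(arg.strip()) if tool else f"unknown tool {tool_name!r}"
    return observation

print(run_agent("book a dentist appointment for Tuesday"))
```
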
Audience

This course is designed for anyone interested in the future of AI. Whether you're a developer, data scientist, or simply curious about AI, this course will equip you with the knowledge and skills to build your own intelligent agents.

Enroll today and start building the next generation of AI agent applications!

https://bit.ly/hf-learn-agents
New activity in mlabonne/Yet_Another_LLM_Leaderboard 8 months ago

Nice Leaderboard :)

#1 opened about 1 year ago by Venkman42
reacted to hrishbhdalal's post with 👍 8 months ago
I just saw that OpenAI is using an updated tokenizer, and it greatly increases the speed of the model and maybe even its performance: if we increase the vocabulary size, a single predicted token could be equivalent to two or three tokens under current tokenizers with 50-60k (or even 100k) vocabularies. I was thinking of scaling this to a one-million-token vocabulary and then pre-training Llama 3 8B using LoRA. I know the model might degrade badly, but we could greatly increase token-generation speed, imo. And as one of Meta's papers showed, predicting multiple tokens at the same time can actually improve a model's performance, so increasing the vocabulary this way is, in a sense, multi-token generation. Yann LeCun also says we don't think in tokens but in representations or abstractions of the situation or problem to be solved. Can scaling to a one-million or even ten-million vocab size lead to better and more robust models? Please give me your thoughts on what can go wrong, what can go right, etc.
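
One quick way to see the effect being described, assuming the tiktoken package is installed (the encoding names are OpenAI's published ones; this shows token counts only, not a speed benchmark):

```python
# Compare how many tokens the same text costs under two vocabulary sizes.
# Requires `pip install tiktoken`.
import tiktoken

text = "Tokenizers with larger vocabularies pack the same text into fewer tokens."

small = tiktoken.get_encoding("cl100k_base")  # ~100k vocab (GPT-4 era)
large = tiktoken.get_encoding("o200k_base")   # ~200k vocab (GPT-4o era)

print(len(small.encode(text)), "tokens at ~100k vocab")
print(len(large.encode(text)), "tokens at ~200k vocab")
# Fewer tokens per text means fewer decoding steps, hence faster generation.
```
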
reacted to thomwolf's post with ❤️ 10 months ago
A Little guide to building Large Language Models in 2024

This is a recording of a 75-minute lecture I gave two weeks ago on how to train an LLM from scratch in 2024. I tried to keep it short and comprehensive, focusing on concepts that are crucial for training a good LLM but often hidden in tech reports.

In the lecture, I introduce the students to all the important concepts/tools/techniques for training a high-performance LLM:
* finding, preparing and evaluating web-scale data (a toy filtering sketch follows this list)
* understanding model parallelism and efficient training
* fine-tuning/aligning models
* fast inference
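
(The lecture's own tooling isn't reproduced here; as a rough illustration of the first point above, below is a toy heuristic quality filter in the spirit of Gopher/C4-style rules, with made-up thresholds.)

```python
# A toy quality filter for web documents. Thresholds are illustrative,
# not values from the lecture.

def keep_document(text: str) -> bool:
    words = text.split()
    if not (50 <= len(words) <= 100_000):             # too short or absurdly long
        return False
    if sum(len(w) for w in words) / len(words) > 12:  # gibberish-like "words"
        return False
    alpha = sum(ch.isalpha() for ch in text)
    if alpha / max(len(text), 1) < 0.6:               # mostly symbols/markup
        return False
    return True

docs = ["just a few words", "A" * 5000, "This paragraph has enough ordinary words " * 10]
print([keep_document(d) for d in docs])  # -> [False, False, True]
```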

There are of course many things and details missing that I should have added; don't hesitate to tell me your most frustrating omission and I'll add it in a future part. In particular, I think I'll focus more on how to filter topics well and extensively, and maybe add more practical anecdotes and details.

Now that I've recorded it, I've been thinking this could be part 1 of a two-part series, with a second, fully hands-on video on how to run all these steps with some libraries and recipes we've recently released at HF around LLM training (which could easily be adapted to other frameworks anyway):
* datatrove for all things web-scale data preparation: https://github.com/huggingface/datatrove
* nanotron for lightweight 4D parallelism LLM training: https://github.com/huggingface/nanotron
* lighteval for in-training fast parallel LLM evaluations: https://github.com/huggingface/lighteval

Here is the link to watch the lecture on YouTube: https://www.youtube.com/watch?v=2-SPH9hIKT8
And here is the link to the Google slides: https://docs.google.com/presentation/d/1IkzESdOwdmwvPxIELYJi8--K3EZ98_cL6c5ZcLKSyVg/edit#slide=id.p

Enjoy, and I'm happy to hear feedback on it and what to add, correct, or extend in a second part.
New activity in merve/llava-next 10 months ago