Ricardo Malagon Jerez

rjmalagon

AI & ML interests

None yet

Recent Activity

Organizations

None yet

rjmalagon's activity

reacted to mkurman's post with 🔥 27 days ago
We built a new small language model, SmolLM2-MedIT-Upscale-2B, based on SmolLM2-1.7B-Instruct from Hugging Face. The premise was simple: increasing the vector in attention layers would positively impact the model's capabilities.

What did we prove?
In total, not much really, since we don't have the original trained under the same conditions as our upscale. However...

1. We scaled up the model without losing its quality
2. We confirmed that the method we devised works
3. After extremely short fine-tuning, the model scored much higher than the original on IFEval (64.29 vs. 53.68) and achieved a higher overall average on the Open LLM Leaderboard (15.17 vs. 14.75)

I consider this a big success 😇, since surpassing the original in metrics is often very time-consuming, generates high costs, and doesn't always work out.

Meanwhile, we're moving forward, training SmolLM2 400M Instruct as an upscale of 136M.

We're curious about how increasing the base and intermediate vectors will affect the model's quality. We'll compare it to the original and the 360M Instruct version released by Hugging Face.

License: Apache 2.0

meditsolutions/SmolLM2-MedIT-Upscale-2B
reacted to clem's post with 👍 29 days ago
Hugging Face is becoming the best place to share the most viral AI apps with spaces.

Kolors Virtual Try-on just crossed 6,000,000 unique visitors & is now the #5 most popular space. Congrats to the Kwai Kolors team!

Kwai-Kolors/Kolors-Virtual-Try-On
New activity in Alibaba-NLP/gte-Qwen2-1.5B-instruct 3 months ago

Qwen 2.5 1.5B retrain?

4
#12 opened 3 months ago by tomaarsen
New activity in Danielbrdz/Barcenas-8b-Juridico-Mexicano 4 months ago

Available for Ollama

#1 opened 4 months ago by rjmalagon
New activity in Danielbrdz/Barcenas-14b-Juridico-Mexicano 4 months ago

Access via Ollama

#1 opened 4 months ago by rjmalagon