Ricardo Malagon Jerez

rjmalagon

AI & ML interests

None yet

Recent Activity

reacted to mkurman's post with 🔥 3 months ago

We built a new small language model SmolLM2-MedIT-Upscale-2B, based on SmolLM2-1.7B-Instruct from Hugging Face. The premise was simple - increasing the vector in attention layers would positively impact the model's capabilities.

What did we prove? In total, not much really, since we don't have the original trained under the same conditions as our upscale. However...

1. We scaled up the model without losing its quality
2. We confirmed that the method we devised works
3. After extremely short fine-tuning, the model achieved much better results in IFEval compared to the original (53.68 vs 64.29) and a higher overall average score in Open LLM Leaderboard (14.75 vs 15.17)

I consider this a big success 😇, since surpassing the original in metrics is often very time-consuming, generates high costs, and doesn't always work out.

Meanwhile, we're moving forward, training SmolLM2 400M Instruct as an upscale of 136M. We're curious about how increasing the base and intermediate vectors will affect the model's quality. We'll compare it to the original and the 360M Instruct version released by Hugging Face.

License: Apache 2.0
https://huggingface.co/meditsolutions/SmolLM2-MedIT-Upscale-2B

Organizations

None yet

rjmalagon's activity

liked a model 10 days ago

wanlige/li-14b-v0.4

Text Generation • Updated 10 days ago • 1.17k • 14

liked a model 2 months ago

Danielbrdz/Barcenas-10b

Text Generation • Updated Jan 4 • 21 • 1

liked 2 models 6 months ago

Danielbrdz/Barcenas-14b-Juridico-Mexicano

Text Generation • Updated Aug 7, 2024 • 11 • 1

Danielbrdz/Barcenas-Llama3-8b-ORPO

Text Generation • Updated Apr 29, 2024 • 11.2k • 7

liked a model 8 months ago

cognitivecomputations/dolphin-2.9.2-qwen2-7b

Text Generation • Updated Jun 18, 2024 • 2.5k • 67

liked 2 models 9 months ago

mlabonne/Beyonder-4x7B-v3

Text Generation • Updated Mar 28, 2024 • 2.35k • 58

mlabonne/NeuralDaredevil-8B-abliterated

Text Generation • Updated Aug 27, 2024 • 15.7k • 192

liked a model about 1 year ago

yunconglong/Truthful_DPO_TomGrc_FusionNet_7Bx2_MoE_13B

Text Generation • Updated Feb 28, 2024 • 6.25k • 53