Code: https://github.com/Jaykef/ai-algorithms/blob/main/grpo_multimodal_reasoner.ipynb
AtAndDev's activity


I didn't mean you lul
I meant him
@MonsterMMORPG
...

brother, dunking on some great models to defend your "product" is not a great (hate to say it but) human value...

ma guys suffered ik :)
It's expensive for everyone, just go with o3-mini. They just figured out that they are not the only LLM provider and just doubled the cost of r1 for o3-mini.

Several GPUs are fine-tuning it at the same time, each using a different dataset and QLoRA, and the successful ones are merged later. Compared to LoRA this allows faster training and also reduces overfitting, because the merge operation heals overfitting. The problem with this could be that the 4-bit quantization may make models dumber. But I am not looking for sheer IQ. Too much mind is a problem anyway :)
Has anyone tried parallel QLoRA and merge before?
I also automated the dataset selection, the benchmarking and the convergence towards objectives (the fit function, the reward). It is basically trying to get a higher score on the AHA Leaderboard as fast as possible with a diverse set of organisms that "evolve by training".
I want to release some cool stuff when I have the time:
- how an answer to a single question changes over time, with each training round or day
- a chart to show AHA alignment over training rounds
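
For anyone wondering what the "train in parallel, keep the successful ones, merge" step could look like, here is a minimal sketch. It is not the actual pipeline from the post: the post does not say how the merge is done, so plain averaging of adapter weight deltas is an assumption, and `Adapter`, `merge_adapters`, `select_and_merge`, the `score` callback and `baseline` are all hypothetical names used only for illustration. Adapters are modelled as plain dicts of floats so the example runs without any ML library.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for a trained adapter: parameter name -> flat weight delta.
Adapter = Dict[str, List[float]]


def merge_adapters(adapters: List[Adapter]) -> Adapter:
    """Average the weight deltas of several adapters, parameter by parameter.

    Assumption: every adapter has the same parameter names and shapes.
    """
    if not adapters:
        raise ValueError("need at least one adapter to merge")
    merged: Adapter = {}
    for name in adapters[0]:
        # zip(*...) walks the same position across all adapters at once.
        columns = zip(*(adapter[name] for adapter in adapters))
        merged[name] = [sum(column) / len(adapters) for column in columns]
    return merged


def select_and_merge(
    candidates: List[Adapter],
    score: Callable[[Adapter], float],
    baseline: float,
) -> Optional[Adapter]:
    """Keep only candidates that beat the baseline score, then merge them.

    Returns None when no candidate improved on the baseline.
    """
    winners = [adapter for adapter in candidates if score(adapter) > baseline]
    return merge_adapters(winners) if winners else None


if __name__ == "__main__":
    # Toy "benchmark": the sum of the weights stands in for a leaderboard score.
    toy_score = lambda adapter: sum(adapter["w"])
    pool = [{"w": [1.0, 3.0]}, {"w": [3.0, 5.0]}, {"w": [5.0, 7.0]}]
    print(select_and_merge(pool, toy_score, baseline=5.0))
```

In a real run the `score` callback would be the benchmark evaluation and each candidate would come from a separate GPU trained on a different dataset; the merged result would then seed the next round.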


julien-c/follow-history
As you can see, I still have more followers than @julien-c even if he's trying to change this by building such cool spaces 😝😝😝



Also, as far as I know the links are just wrong. Open source just means it's accessible to everyone to download... The license differs, as was said, but the worst it can be is that it can't be used to make money, that's just it.
Please correct me if I'm wrong.

Well, the models are research and there is some real work going into them, but I checked some of the products that are promoted here and they are clones of spaces you can find here with some name added...
Plus, all models here are OSS but licensed differently (cc-by-nc or custom licenses), and either way they bring competition, contributions and ideas here, which is always a plus for everyone.