Ali El Filali

alielfilali01

AI & ML interests

"AI Psychometrician" | NLP (mainly for Arabic) | Other interests include Reinforcement Learning and Cognitive Sciences

Posts 24

I feel like this incredible resource hasn't gotten the attention it deserves in the community!

@clefourrier and the Hugging Face evaluation team as a whole put together a fantastic guidebook covering a lot about 𝗘𝗩𝗔𝗟𝗨𝗔𝗧𝗜𝗢𝗡, from the basics to advanced tips.

Link: https://github.com/huggingface/evaluation-guidebook

I haven’t finished it yet, but I'm enjoying every piece of it so far. Huge thanks to @clefourrier and the team for this invaluable resource!

Why is nobody talking about the new training corpus released by MBZUAI today?

TxT360 is a 15+ trillion token corpus that outperforms FineWeb on several metrics. Ablation studies were run on up to 1T tokens.

Read the blog here: LLM360/TxT360
Dataset: LLM360/TxT360