Jordan Legg

takarajordan

AI & ML interests

Chief AI Officer @takara.ai. Diffusion, Inference optimisation and all things MultiModal.

takarajordan's activity

Reacted to luigi12345's post with 👀 about 3 hours ago
MinimalScrap
Only free dependencies. Save it; it's quite useful.


!pip install googlesearch-python requests
from googlesearch import search
import requests

query = "Glaucoma"
# Grab the first 20 Google hits for NIH-hosted PDFs on the query
for url in search(f"{query} site:nih.gov filetype:pdf", num_results=20):
    if url.endswith(".pdf"):
        filename = url.split("/")[-1]
        # Download and save the PDF under its original filename
        with open(filename, "wb") as f:
            f.write(requests.get(url).content)
        print("✅ " + filename)
print("Done!")

replied to averoo's post about 4 hours ago

Wow, this is a really great idea man! The UI needs some work, but I really prefer having the summaries on the front page!

If you need some help on it let me know.

Reacted to merve's post with 🚀 3 days ago
your Hugging Face profile now has your recent activities 🤗
New activity in donut-earth/proof 4 days ago

Another fake image

#4 opened 4 days ago by qkasriel

Fake image in the dataset.

#1 opened 4 days ago by qkasriel
Reacted to m-ric's post with 🚀 4 days ago
๐—ก๐—ฒ๐˜„ ๐—น๐—ฒ๐—ฎ๐—ฑ๐—ฒ๐—ฟ๐—ฏ๐—ผ๐—ฎ๐—ฟ๐—ฑ ๐—ฟ๐—ฎ๐—ป๐—ธ๐˜€ ๐—Ÿ๐—Ÿ๐— ๐˜€ ๐—ณ๐—ผ๐—ฟ ๐—Ÿ๐—Ÿ๐— -๐—ฎ๐˜€-๐—ฎ-๐—ท๐˜‚๐—ฑ๐—ด๐—ฒ: ๐—Ÿ๐—น๐—ฎ๐—บ๐—ฎ-๐Ÿฏ.๐Ÿญ-๐Ÿณ๐Ÿฌ๐—• ๐˜๐—ผ๐—ฝ๐˜€ ๐˜๐—ต๐—ฒ ๐—ฟ๐—ฎ๐—ป๐—ธ๐—ถ๐—ป๐—ด๐˜€! ๐Ÿง‘โ€โš–๏ธ

Evaluating systems is critical during prototyping and in production, and LLM-as-a-judge has become a standard technique to do it.

First, what is "LLM-as-a-judge"?
👉 It's a very useful technique for evaluating LLM outputs. If what you're evaluating can't be judged with deterministic criteria, like the "politeness" of an LLM output, or how faithful it is to an original source, you can use an LLM judge instead: prompt another LLM with "Here's an LLM output, please rate this on criterion {criterion} on a scale of 1 to 5", then parse the number from its output, and voilà, you get your score.
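
A minimal sketch of that prompt-and-parse loop (the generate callable is a placeholder for whatever LLM client you use, not anything from the post):

import re

# LLM-as-a-judge sketch: `generate` stands in for your LLM client
# (OpenAI, HF Inference, a local pipeline, ...) - a placeholder.
def judge(output: str, criterion: str, generate) -> int | None:
    prompt = (
        "Here's an LLM output, please rate this on criterion "
        f"{criterion} on a scale of 1 to 5.\n\nOutput:\n{output}\n\nScore:"
    )
    reply = generate(prompt)
    match = re.search(r"[1-5]", reply)  # parse the first digit 1-5
    return int(match.group()) if match else None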

๐Ÿง But who judges the judge?
How can you make sure your LLM judge is reliable? You can take a dataset annotated with scores from human judges and measure how well the LLM judge's scores correlate with the human ones.
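
A rough sketch of that correlation check, assuming you already have paired scores on the same examples (Spearman is one common choice for ordinal 1-5 ratings; the numbers below are toy data, not a real benchmark):

from scipy.stats import spearmanr

# Human and LLM-judge ratings for the same annotated examples (toy data)
human_scores = [5, 4, 2, 3, 1, 4]
judge_scores = [5, 5, 2, 3, 2, 4]

# Rank correlation: closer to 1.0 means the judge tracks human ratings
rho, p_value = spearmanr(human_scores, judge_scores)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")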

📊 Before even running that benchmark yourself, there's a new option to get you started: a leaderboard that measures how well different models perform as judges!

And the outcome is surprising: models come out in quite a different order from what we're used to in general rankings; probably some have much better bias mitigation than others!

Take a deeper look here 👉 https://huggingface.co/blog/arena-atla
posted an update 4 days ago
First post here goes!

takarajordan/CineDiffusion

Super excited to announce CineDiffusion 🎥! It creates images at up to 4.2 megapixels in cinematic ultrawide aspect ratios like the following (quick sizing sketch after the list):
- 2.39:1 (Modern Widescreen)
- 2.76:1 (Ultra Panavision 70)
- 3.00:1 (Experimental Ultra-wide)
- 4.00:1 (Polyvision)
- 2.55:1 (CinemaScope)
- 2.20:1 (Todd-AO)
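
A back-of-the-envelope sizing sketch for those ratios (the 4.2 MP cap is from the post; the snap-to-multiples-of-64 step is my assumption about typical diffusion-model size constraints, not something CineDiffusion documents):

import math

# Largest width x height for a given aspect ratio under a pixel budget,
# snapped down to multiples of 64 (assumed diffusion-friendly sizes)
def ultrawide_size(ratio, max_pixels=4.2e6, multiple=64):
    height = math.sqrt(max_pixels / ratio)
    width = height * ratio
    return int(width // multiple) * multiple, int(height // multiple) * multiple

for name, ratio in [("CinemaScope", 2.55), ("Polyvision", 4.00)]:
    w, h = ultrawide_size(ratio)
    print(f"{name} ({ratio}:1): {w}x{h} = {w * h / 1e6:.2f} MP")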

More to come soon!!

Thanks to @John6666 and @Resoldjew for your early support <3

And thanks to the team at ShuttleAI for their brand-new Shuttle-3 model; what an amazing job.

shuttleai/shuttle-3-diffusion
New activity in CohereForAI/aya_expanse 26 days ago