Weights and Biases

company

Verified

https://wandb.com

wandb

wandb's activity

param-bharat

updated a dataset 3 months ago

wandb/ragbench-test-sample

Viewer • Updated Dec 16, 2024 • 957 • 69

geekyrakshit

updated 2 Spaces 3 months ago

Guardrails-Genie

Explore apps with guided authentication

README

param-bharat

updated a dataset 3 months ago

wandb/ragbench-sentence-relevance-balanced

Viewer • Updated Dec 9, 2024 • 624k • 64

morgan

updated 3 datasets 3 months ago

wandb/finqa-data-processed-hallucination

Viewer • Updated Dec 1, 2024 • 16.6k • 69

wandb/finqa-data-processed

Viewer • Updated Dec 1, 2024 • 8.28k • 135

wandb/fava-data-processed

Viewer • Updated Dec 1, 2024 • 460 • 36

tcapelle

updated a model 3 months ago

wandb/bias_scorer

Updated Nov 29, 2024 • 11

morgan

updated a dataset 3 months ago

wandb/RAGTruth-processed

Viewer • Updated Nov 28, 2024 • 17.8k • 78 • 4

geekyrakshit

updated a dataset 7 months ago

wandb/weave_cookbook_datasets

Updated Aug 1, 2024 • 6

ayut

updated a Space 7 months ago

Paper Reader

morgan

posted an update 8 months ago

Post

1304

Llama 3.1 405B Instruct beats GPT-4o on MixEval-Hard

Just ran MixEval for 405B, Sonnet-3.5 and 4o, with 405B landing right between the other two at 66.19.

The GPT-4o result of 64.7 replicated locally, but Sonnet-3.5 actually scored 70.25/69.45 in my replications 🤔 Still well ahead of the other two though.

Sample of one of the eval calls here: https://wandb.ai/morgan/MixEval/weave/calls/07b05ae2-2ef5-4525-98a6-c59963b76fe1

Quick auto-logging and tracing for OpenAI-compatible clients and many more here: https://wandb.github.io/weave/quickstart/

tcapelle

updated 2 models 11 months ago

wandb/zephyr-orpo-7b-v0.2

Text Generation • Updated Apr 21, 2024 • 20 • 4

wandb/Mistral-7B-v0.2

Text Generation • Updated Apr 11, 2024 • 18 • 2

tcapelle

updated 2 models 12 months ago

wandb/gemma-2b-zephyr-sft

Text Generation • Updated Mar 23, 2024 • 113 • 4

wandb/pruned_mistral

Text Generation • Updated Mar 18, 2024 • 57 • 4

geekyrakshit

updated a Space 12 months ago

Reproducible Stable Diffusion XL

tcapelle

updated 2 models 12 months ago

wandb/mistral-7b-zephyr-dpo

Text Generation • Updated Mar 12, 2024 • 36 • 4

wandb/mistral-7b-zephyr-sft

Text Generation • Updated Mar 11, 2024 • 25 • 3

tcapelle

updated a dataset 12 months ago

wandb/deita-10k-v0-sft-latin

Viewer • Updated Mar 10, 2024 • 18k • 135 • 1