Victor Gallego

vicgalle

https://github.com/vicgalle

AI & ML interests

Preference fine-tuning, alignment & synthetic data. Building LLMs in general!

Recent Activity

liked a model about 1 hour ago

sesame/csm-1b

upvoted a collection about 6 hours ago

liked a dataset about 13 hours ago

usail-hkust/JailJudge

vicgalle's activity

upvoted a collection about 6 hours ago

DeepHermes

Preview models of hybrid reasoner Hermes series • 6 items • Updated about 7 hours ago • 13

upvoted a collection 28 days ago

DPO

Various useful datasets for preference optimization • 16 items • Updated Jan 23 • 4

upvoted a paper 29 days ago

MetaSC: Test-Time Safety Specification Optimization for Language Models

Paper • 2502.07985 • Published about 1 month ago • 3

upvoted a paper about 1 month ago

Agency Is Frame-Dependent

Paper • 2502.04403 • Published Feb 6 • 22

upvoted a collection about 1 month ago

Toxic Commons

Tools for de-toxifying public domain data, especially multilingual and historical text data and data with OCR errors. • 3 items • Updated Oct 31, 2024 • 6

upvoted a collection 2 months ago

Cosmos

The collection of Cosmos models • 31 items • Updated Jan 17 • 269

upvoted an article 5 months ago

VLM Art Analysis

Article • Oct 4, 2024 • 11

upvoted a collection 5 months ago

steiner-preview

Reasoning models trained on synthetic data using reinforcement learning. • 3 items • Updated Oct 20, 2024 • 32

upvoted 2 papers 5 months ago

Do LLMs Have Political Correctness? Analyzing Ethical Biases and Jailbreak Vulnerabilities in AI Systems

Paper • 2410.13334 • Published Oct 17, 2024 • 13

Differential Transformer

Paper • 2410.05258 • Published Oct 7, 2024 • 171

upvoted a collection 6 months ago

Llama 3.2 Re-upload

10 items • Updated Sep 25, 2024 • 11

upvoted 2 papers 6 months ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 138

Hermes 3 Technical Report

Paper • 2408.11857 • Published Aug 15, 2024 • 48

upvoted an article 7 months ago

Tensor Parallelism

Article • Aug 20, 2024 • 16

upvoted a collection 7 months ago

Hermes 3

The Hermes 3 Series of Models • 12 items • Updated 28 days ago • 112

upvoted a paper 7 months ago

WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models

Paper • 2408.03837 • Published Aug 7, 2024 • 18

upvoted a collection 8 months ago

Llama 3.1

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 653

upvoted 3 articles 8 months ago

Announcing Finance Commons and the Bad Data Toolbox: Pioneering Open Data and Advanced Document Processing

Article • Jul 19, 2024 • 20

SmolLM - blazingly fast and remarkably powerful

Article • Jul 16, 2024 • 332

The Rise of Agentic Data Generation

Article • Jul 15, 2024 • 82