罗杰斯

rojasdiego

https://rojasdiego.com

AI & ML interests

LLMs for Code Generation

Recent Activity

updated a collection 3 minutes ago

liked a model 4 minutes ago

infly/OpenCoder-1.5B-Base

liked a model 5 minutes ago

infly/OpenCoder-8B-Instruct

rojasdiego's activity

upvoted a paper about 2 months ago

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

Paper • 2411.02337 • Published Nov 4, 2024 • 35

upvoted a paper 2 months ago

Why Does the Effective Context Length of LLMs Fall Short?

Paper • 2410.18745 • Published Oct 24, 2024 • 17

upvoted a paper 3 months ago

SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration

Paper • 2410.02367 • Published Oct 3, 2024 • 47

upvoted a collection 3 months ago

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated 28 days ago • 551

upvoted 3 papers 4 months ago

Attention Heads of Large Language Models: A Survey

Paper • 2409.03752 • Published Sep 5, 2024 • 88

Granite Code Models: A Family of Open Foundation Models for Code Intelligence

Paper • 2405.04324 • Published May 7, 2024 • 22

Scaling Granite Code Models to 128K Context

Paper • 2407.13739 • Published Jul 18, 2024 • 19

upvoted 3 collections 4 months ago

Code LLMs

6 items • Updated 3 minutes ago • 1

Arctic-embed

A collection of text embedding models optimized for retrieval accuracy and efficiency • 8 items • Updated 29 days ago • 17

MoEs papers reading list

60 items • Updated Nov 4, 2024 • 137

upvoted 3 papers 4 months ago

WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling

Paper • 2408.16532 • Published Aug 29, 2024 • 47

CogVLM2: Visual Language Models for Image and Video Understanding

Paper • 2408.16500 • Published Aug 29, 2024 • 56

Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published Aug 22, 2024 • 63

upvoted a paper 6 months ago

BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

Paper • 2406.15877 • Published Jun 22, 2024 • 45

upvoted a paper 7 months ago

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

Paper • 2406.12793 • Published Jun 18, 2024 • 31

upvoted 2 papers 10 months ago

MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs

Paper • 2402.15627 • Published Feb 23, 2024 • 34

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 605

upvoted a collection 11 months ago

Stable Code

Suite of developer assistant models • 5 items • Updated Apr 8, 2024 • 41

upvoted 2 papers about 1 year ago

Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws

Paper • 2401.00448 • Published Dec 31, 2023 • 28

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Paper • 2312.00752 • Published Dec 1, 2023 • 138