Sherman Chann

152334H

https://152334H.github.io

152334H's activity

commented 2 papers 3 months ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19 • 135

commented 2 papers 4 months ago

DeepSpeak Dataset v1.0

Paper • 2408.05366 • Published Aug 9 • 11

OpenResearcher: Unleashing AI for Accelerated Scientific Research

Paper • 2408.06941 • Published Aug 13 • 30

New activity in meta-llama/Llama-3.1-405B 4 months ago

8-kv-heads

#21 opened 5 months ago by

New activity in 152334H/miqu-1-70b-sf 5 months ago

Adding Evaluation Results

#23 opened 5 months ago by leaderboard-pr-bot

commented a paper 5 months ago

Integrating Large Language Models into a Tri-Modal Architecture for Automated Depression Classification

Paper • 2407.19340 • Published Jul 27 • 57

commented 3 papers 6 months ago

Unveiling Encoder-Free Vision-Language Models

Paper • 2406.11832 • Published Jun 17 • 50

A Simple and Effective $L_2$ Norm-Based Strategy for KV Cache Compression

Paper • 2406.11430 • Published Jun 17 • 22

WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences

Paper • 2406.11069 • Published Jun 16 • 13

commented a paper 10 months ago

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 603

New activity in 152334H/miqu-1-70b-hermes2.5-qlora 10 months ago

What are tokens 32002:32031

#1 opened 10 months ago by

New activity in 152334H/miqu-1-70b-sf 11 months ago

vllm support?

#19 opened 11 months ago by

Remove extra degrees of freedom by dequantizing the `q5_K_M`, `q4_K_M` and `q2_K` models together?

#18 opened 11 months ago by

Sticking a restrictive license on a model that's not even yours to begin with?

#14 opened 11 months ago by

2.4bpp exl2 waiting room

#3 opened 11 months ago by

Model load fail

#13 opened 11 months ago by

Chat template

#11 opened 11 months ago by

Commerical Use?

#12 opened 11 months ago by

more quantized versions？

#10 opened 11 months ago by