MutaGReP: Execution-Free Repository-Grounded Plan Search for Code-Use Paper • 2502.15872 • Published 17 days ago • 4
Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam Paper • 2502.17055 • Published 14 days ago • 16
Slam Collection All resources for SpeechLMs from "Slamming: Training a Speech Language Model on One GPU in a Day". We provide the tokenizer, LM, and datasets • 6 items • Updated 13 days ago • 13
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization Paper • 2502.19261 • Published 12 days ago • 6
IntelLabs/sqft-qa-sparsepeft-mistral-7b-v0.3-50-gptq-math-heu Text Generation • Updated 26 days ago • 158 • 3
Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models? Paper • 2502.11895 • Published 21 days ago • 1
Hamanasu Collection A brand-new series of models from yours truly, designed for intelligence, creativity, and roleplay. • 13 items • Updated 3 days ago • 4
ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization Paper • 2502.02631 • Published Feb 4 • 2
Unlocking Efficient Large Inference Models: One-Bit Unrolling Tips the Scales Paper • 2502.01908 • Published Feb 4 • 1
QuEST: Stable Training of LLMs with 1-Bit Weights and Activations Paper • 2502.05003 • Published about 1 month ago • 42