Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN Paper • 2412.13795 • Published 5 days ago • 18
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients Paper • 2407.08296 • Published Jul 11 • 31
PDEgym Collection A collection of datasets of solutions to partial differential equations. • 21 items • Updated May 30 • 5
DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer Paper • 2312.03724 • Published Nov 27, 2023 • 1
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression Paper • 2403.15447 • Published Mar 18 • 16
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark Paper • 2402.11592 • Published Feb 18 • 2