BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks Jun 18 • 41
view article Article Accelerating LLM Inference: Fast Sampling with Gumbel-Max Trick By cxdu • 29 days ago • 8
Agentless: Demystifying LLM-based Software Engineering Agents Paper • 2407.01489 • Published Jul 1 • 42
view article Article BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks Jun 18 • 41
view article Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation Apr 29 • 75
XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts Paper • 2404.15247 • Published Apr 23 • 3
NeuRI: Diversifying DNN Generation via Inductive Rule Inference Paper • 2302.02261 • Published Feb 4, 2023 • 3
Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models Paper • 2312.04724 • Published Dec 7, 2023 • 20
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation Paper • 2305.01210 • Published May 2, 2023 • 4
Relax: Composable Abstractions for End-to-End Dynamic Machine Learning Paper • 2311.02103 • Published Nov 1, 2023 • 16