SCBench: A KV Cache-Centric Analysis of Long-Context Methods Paper • 2412.10319 • Published 23 days ago • 9
Wolf: Captioning Everything with a World Summarization Framework Paper • 2407.18908 • Published Jul 26, 2024 • 32
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher Paper • 2407.20183 • Published Jul 29, 2024 • 41
Article MInference 1.0: 10x Faster Million Context Inference with a Single GPU By liyucheng • Jul 11, 2024 • 12
Article How to Optimize TTFT of 8B LLMs with 1M Tokens to 20s By iofu728 • Jul 21, 2024 • 2
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention Paper • 2407.02490 • Published Jul 2, 2024 • 23
microsoft/llmlingua-2-xlm-roberta-large-meetingbank Token Classification • Updated Apr 3, 2024 • 32.2k • 17
microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank Token Classification • Updated Apr 3, 2024 • 27.3k • 25