tony
tintegral
AI & ML interests
None yet
Recent Activity
Updated a collection 10 days ago: practical
Organizations
None yet
Collections
2
- LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks
  Paper • 2412.15204 • Published • 31
- TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
  Paper • 2412.14161 • Published • 44
- Alignment faking in large language models
  Paper • 2412.14093 • Published • 7
- The Open Source Advantage in Large Language Models (LLMs)
  Paper • 2412.12004 • Published • 9
models
None public yet
datasets
None public yet