Towards Best Practices for Open Datasets for LLM Training • Paper • 2501.08365 • Published 4 days ago • 40
Public Domain 12M: A Highly Aesthetic Image-Text Dataset with Novel Governance Mechanisms • Paper • 2410.23144 • Published Oct 30, 2024 • 4