mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus Paper • 2406.08707 • Published Jun 13, 2024
Semi-automatic staging area for high-quality structured data extraction from scientific literature Paper • 2309.10923 • Published Sep 19, 2023
Mining experimental data from Materials Science literature with Large Language Models: an evaluation study Paper • 2401.11052 • Published Jan 19, 2024
Mixture of Soft Prompts for Controllable Data Generation Paper • 2303.01580 • Published Mar 2, 2023
Semantic Consistency for Assuring Reliability of Large Language Models Paper • 2308.09138 • Published Aug 17, 2023
SuperMat: Construction of a linked annotated dataset from superconductors-related publications Paper • 2101.02455 • Published Jan 7, 2021
Automatic extraction of materials and properties from superconductors scientific literature Paper • 2210.15600 • Published Oct 26, 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022