Run evaluation tests with Selene and Selene-Mini models
Upload and analyze datasets with evaluation criteria
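
As a rough illustration of the dataset workflow described above, the sketch below loads a small CSV dataset and scores each row against one evaluation criterion. Everything in it is an assumption made for illustration: the column names (`input`, `response`, `reference`), the `evaluate_dataset` helper, the model labels, and `stub_judge`, which is a toy placeholder and not the Selene API. A real setup would replace the stub with a call to the evaluator model.

```python
import csv
import io
from statistics import mean
from typing import Callable

# Assumed judge signature: (model_name, criterion, row) -> score in [0, 1].
Judge = Callable[[str, str, dict], float]


def stub_judge(model_name: str, criterion: str, row: dict) -> float:
    """Hypothetical placeholder; a real judge would query the evaluator model."""
    # Toy rule: pass when the reference answer appears in the response.
    return 1.0 if row["reference"].lower() in row["response"].lower() else 0.0


def evaluate_dataset(csv_text: str, criterion: str, model_name: str,
                     judge: Judge = stub_judge) -> dict:
    """Score every row of a CSV dataset against a single criterion."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    scores = [judge(model_name, criterion, row) for row in rows]
    return {
        "model": model_name,
        "criterion": criterion,
        "n": len(rows),
        "mean_score": mean(scores) if scores else 0.0,
        "scores": scores,
    }


# Tiny in-memory dataset standing in for an uploaded file.
SAMPLE_CSV = """input,response,reference
What is 2+2?,The answer is 4.,4
Capital of France?,It is Berlin.,Paris
"""

if __name__ == "__main__":
    # Model labels are illustrative strings only.
    for model in ("selene", "selene-mini"):
        print(evaluate_dataset(SAMPLE_CSV, "Is the response factually correct?", model))
```

Passing the judge as a parameter keeps the dataset handling separate from the scoring call, so the same loop can be run against either model.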