Evaluating Large Language Models with Tests of Spanish as a Foreign Language: Pass or Fail?
Paper • 2409.15334 • Published
Democratizing NLP in Spanish by creating open resources in our language 🚀
(Llama-3-8B-instruct) to generate synthetic instructions and then fine-tune the base version (Llama-3-8B) on this dataset, you can improve even the instruction-tuned version
ollama models (initially phi and llama3) automatically and upload it to the Hugging Face Hub!
distilabel, so we implemented PrometheusEval.
PrometheusEval running their 7B variant with vLLM on a single L40 GPU on top of HuggingFaceH4/instruction-dataset, we got the 327 existing prompt-completion pairs evaluated and pushed to the Hub in less than 2 minutes!
distilabel on top of the awesome LDJnr/Capybara from @LDJnr
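To put the PrometheusEval figure quoted above in perspective, here is a small back-of-the-envelope sketch. It only uses the two numbers given in the post (327 pairs, under 2 minutes); the function names are illustrative and not part of distilabel or vLLM. Because "less than 2 minutes" is an upper bound on elapsed time, the results are bounds rather than measurements.

```python
# Figures quoted in the post; everything else here is illustrative.
PAIRS_EVALUATED = 327
MAX_ELAPSED_SECONDS = 2 * 60  # "less than 2 minutes" is an upper bound


def min_throughput(pairs: int, max_seconds: float) -> float:
    """Lower bound on prompt-completion pairs evaluated per second."""
    return pairs / max_seconds


def max_latency(pairs: int, max_seconds: float) -> float:
    """Upper bound on the average number of seconds spent per pair."""
    return max_seconds / pairs


if __name__ == "__main__":
    rate = min_throughput(PAIRS_EVALUATED, MAX_ELAPSED_SECONDS)
    latency = max_latency(PAIRS_EVALUATED, MAX_ELAPSED_SECONDS)
    print(f"at least {rate:.3f} pairs per second")
    print(f"at most {latency:.3f} seconds per pair on average")
```

The per-pair figure is an average over the whole run, including upload to the Hub, so it says nothing about the latency of any single evaluation.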