How to evaluate zero-shot Llama3.1 !!

#1
by adeem6 - opened
Hugging Face Discord Community org

I'm using Llama3.1 without dataset, do u please have any ideas of how to evaluate Llama3.1 without dataset

Hugging Face Discord Community org

You can ask several math questions that have one correct answer and compare the model's answers to a reference. For example, solve simple equations or arithmetic problems. You could also ask questions about historical dates where the answer is a specific number

Hugging Face Discord Community org

You can ask several math questions that have one correct answer and compare the model's answers to a reference. For example, solve simple equations or arithmetic problems. You could also ask questions about historical dates where the answer is a specific number

Thank u for ur response, I'm using the model to generate a structured report any ideas of how to evaluate this report?

Hugging Face Discord Community org

You can ask several math questions that have one correct answer and compare the model's answers to a reference. For example, solve simple equations or arithmetic problems. You could also ask questions about historical dates where the answer is a specific number

Thank u for ur response, I'm using the model to generate a structured report any ideas of how to evaluate this report?

You can create a separate agent that will clearly and accurately evaluate the report, checking its structure, logic, accuracy of information, and compliance with requirements. This agent can use predefined criteria for automated assessment.

Hugging Face Discord Community org

You can ask several math questions that have one correct answer and compare the model's answers to a reference. For example, solve simple equations or arithmetic problems. You could also ask questions about historical dates where the answer is a specific number

Thank u for ur response, I'm using the model to generate a structured report any ideas of how to evaluate this report?

You can create a separate agent that will clearly and accurately evaluate the report, checking its structure, logic, accuracy of information, and compliance with requirements. This agent can use predefined criteria for automated assessment.

Thank u so much!! I'm a beginner in this field, is there any existing agent I could use? or a tutorial of creating one?

Hugging Face Discord Community org

You can ask several math questions that have one correct answer and compare the model's answers to a reference. For example, solve simple equations or arithmetic problems. You could also ask questions about historical dates where the answer is a specific number

Thank u for ur response, I'm using the model to generate a structured report any ideas of how to evaluate this report?

You can create a separate agent that will clearly and accurately evaluate the report, checking its structure, logic, accuracy of information, and compliance with requirements. This agent can use predefined criteria for automated assessment.

Thank u so much!! I'm a beginner in this field, is there any existing agent I could use? or a tutorial of creating one?

You can use the standalone agent or via API in Python, which will evaluate the report. You can check the Hugging Face documentation for their API (https://huggingface.co/docs/api-inference/tasks/chat-completion) to learn more.

Hugging Face Discord Community org

You could possibly use a subset of training data, especially for the subject area you want it for work with and see how well its answers match the trains data set. Don't use the whole data set, just select some items from it.

Sign up or log in to comment