How to evaluate zero-shot Llama 3.1?
I'm using Llama 3.1 without a dataset. Do you have any ideas on how to evaluate Llama 3.1 without a dataset?
You can ask several math questions that have one correct answer and compare the model's answers to a reference. For example, have it solve simple equations or arithmetic problems. You could also ask questions about historical dates where the answer is a specific number.
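The comparison above can be sketched as a small exact-match scorer. This is a minimal sketch; the probe questions, reference answers, and the normalization rule are assumptions, not part of any standard benchmark.

```python
# Minimal sketch: score a model's answers against hand-written references.
# The question set and normalization rule below are assumptions.

def normalize(answer: str) -> str:
    """Lowercase and strip whitespace so '42' and ' 42 ' compare equal."""
    return answer.strip().lower()

def exact_match_accuracy(model_answers, reference_answers):
    """Fraction of model answers that exactly match the reference."""
    matches = sum(
        normalize(m) == normalize(r)
        for m, r in zip(model_answers, reference_answers)
    )
    return matches / len(reference_answers)

# Hypothetical probe questions with a single correct answer:
references = ["4", "1969", "x = 3"]
model_outputs = ["4", "1969", "x = 5"]  # imagine these came from Llama 3.1
print(exact_match_accuracy(model_outputs, references))  # 2 of 3 match
```

Keeping the questions short and single-answer makes the comparison unambiguous, which is what makes this work without a labeled dataset.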
Thank you for your response. I'm using the model to generate a structured report; any ideas on how to evaluate this report?
You can create a separate agent that will clearly and accurately evaluate the report, checking its structure, logic, accuracy of information, and compliance with requirements. This agent can use predefined criteria for automated assessment.
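A rule-based version of such an evaluator can be sketched like this. The section names and rubric are assumptions for illustration; a real evaluator agent could additionally call a second LLM to judge logic and factual accuracy.

```python
# Minimal sketch of a rule-based report evaluator.
# REQUIRED_SECTIONS is a hypothetical rubric, not a standard.

REQUIRED_SECTIONS = ["Summary", "Findings", "Recommendations"]

def evaluate_report(report: str) -> dict:
    """Score a report against predefined structural criteria."""
    checks = {
        f"has_{name.lower()}": name in report for name in REQUIRED_SECTIONS
    }
    checks["non_empty"] = len(report.strip()) > 0
    score = sum(checks.values()) / len(checks)
    return {"checks": checks, "score": score}

report = "Summary: ...\nFindings: ...\nRecommendations: ..."
result = evaluate_report(report)
print(result["score"])  # 1.0 when every criterion passes
```

Structural checks like these are cheap and deterministic, so they make a good first gate before any LLM-based judging.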
Thank you so much!! I'm a beginner in this field; is there an existing agent I could use, or a tutorial on creating one?
You can use a standalone agent, or call one via the API in Python, to evaluate the report. Check the Hugging Face documentation for the chat-completion API (https://huggingface.co/docs/api-inference/tasks/chat-completion) to learn more.
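An LLM-as-judge call through that API could look roughly like this. The judge model name, the rubric wording, and the 1-to-5 scale are all assumptions for illustration; check the linked docs for the actual client options.

```python
def build_judge_prompt(report: str) -> str:
    """Compose the instruction sent to the judge model (hypothetical rubric)."""
    return (
        "Rate the following report from 1 to 5 for structure, logic, "
        "and factual accuracy. Reply with only the number.\n\n" + report
    )

def judge_report(report: str, model: str = "meta-llama/Llama-3.1-8B-Instruct"):
    """Send the report to a judge model via the HF Inference API."""
    # Deferred import so the sketch loads even without the package installed.
    from huggingface_hub import InferenceClient

    client = InferenceClient(model)  # expects an HF token in the environment
    response = client.chat_completion(
        messages=[{"role": "user", "content": build_judge_prompt(report)}],
        max_tokens=8,
    )
    return response.choices[0].message.content

# judge_report("Summary: ...")  # requires network access and an HF token
```

Asking the judge to reply with only a number keeps the response easy to parse and compare across runs.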
You could also use a subset of training data, especially for the subject area you want it to work with, and see how well its answers match the training data. Don't use the whole dataset; just select some items from it.
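That spot-checking idea can be sketched as a small sampler plus scorer. The dataset field names (`question`, `answer`) and the toy data are assumptions; substitute your own records and model call.

```python
# Sketch: evaluate on a small reproducible sample instead of the full set.
# The "question"/"answer" field names are assumptions for illustration.
import random

def sample_eval_items(dataset, k=20, seed=0):
    """Pick k items reproducibly so reruns are comparable."""
    rng = random.Random(seed)
    return rng.sample(dataset, min(k, len(dataset)))

def spot_check(dataset, model_fn, k=20):
    """Fraction of sampled items where the model's answer matches the reference."""
    items = sample_eval_items(dataset, k)
    hits = sum(model_fn(it["question"]).strip() == it["answer"] for it in items)
    return hits / len(items)

# Hypothetical usage with a toy dataset and a stand-in for the model:
toy = [{"question": "2+2?", "answer": "4"}, {"question": "3+3?", "answer": "6"}]
print(spot_check(toy, lambda q: "4" if "2+2" in q else "6"))  # 1.0 on this toy set
```

Fixing the random seed means the same subset is drawn every run, so scores stay comparable when you change the model or prompt.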