How to evaluate zero-shot Llama 3.1?
I'm using Llama 3.1 without a dataset. Do you have any ideas on how to evaluate Llama 3.1 without a dataset?
You can ask several math questions that have one correct answer and compare the model's answers to a reference. For example, have it solve simple equations or arithmetic problems. You could also ask questions about historical dates where the answer is a specific number.
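The comparison above can be sketched as a small exact-match scorer. This is a minimal sketch; the probe questions, reference answers, and the normalization rule are assumptions, not part of any standard benchmark.

```python
# Minimal sketch: score a model's answers against hand-written references.
# The question set and normalization rule below are assumptions.

def normalize(answer: str) -> str:
    """Lowercase and strip whitespace so '42' and ' 42 ' compare equal."""
    return answer.strip().lower()

def exact_match_accuracy(model_answers, reference_answers):
    """Fraction of model answers that exactly match the reference."""
    matches = sum(
        normalize(m) == normalize(r)
        for m, r in zip(model_answers, reference_answers)
    )
    return matches / len(reference_answers)

# Hypothetical probe questions with a single correct answer:
references = ["4", "1969", "x = 3"]
model_outputs = ["4", "1969", "x = 5"]  # imagine these came from Llama 3.1
print(exact_match_accuracy(model_outputs, references))  # 2 of 3 match
```

Keeping the questions short and single-answer makes the comparison unambiguous, which is what makes this work without a labeled dataset.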
Thank you for your response. I'm using the model to generate a structured report; any ideas on how to evaluate this report?
You can create a separate agent that will clearly and accurately evaluate the report, checking its structure, logic, accuracy of information, and compliance with requirements. This agent can use predefined criteria for automated assessment.
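A rule-based version of such an evaluator can be sketched like this. The section names and rubric are assumptions for illustration; a real evaluator agent could additionally call a second LLM to judge logic and factual accuracy.

```python
# Minimal sketch of a rule-based report evaluator.
# REQUIRED_SECTIONS is a hypothetical rubric, not a standard.

REQUIRED_SECTIONS = ["Summary", "Findings", "Recommendations"]

def evaluate_report(report: str) -> dict:
    """Score a report against predefined structural criteria."""
    checks = {
        f"has_{name.lower()}": name in report for name in REQUIRED_SECTIONS
    }
    checks["non_empty"] = len(report.strip()) > 0
    score = sum(checks.values()) / len(checks)
    return {"checks": checks, "score": score}

report = "Summary: ...\nFindings: ...\nRecommendations: ..."
result = evaluate_report(report)
print(result["score"])  # 1.0 when every criterion passes
```

Structural checks like these are cheap and deterministic, so they make a good first gate before any LLM-based judging.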
Thank you so much!! I'm a beginner in this field; is there an existing agent I could use, or a tutorial on creating one?
You can use a standalone agent, or call one via the API in Python, to evaluate the report. Check the Hugging Face documentation for the chat-completion API (https://huggingface.co/docs/api-inference/tasks/chat-completion) to learn more.
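An LLM-as-judge call through that API could look roughly like this. The judge model name, the rubric wording, and the 1-to-5 scale are all assumptions for illustration; check the linked docs for the actual client options.

```python
def build_judge_prompt(report: str) -> str:
    """Compose the instruction sent to the judge model (hypothetical rubric)."""
    return (
        "Rate the following report from 1 to 5 for structure, logic, "
        "and factual accuracy. Reply with only the number.\n\n" + report
    )

def judge_report(report: str, model: str = "meta-llama/Llama-3.1-8B-Instruct"):
    """Send the report to a judge model via the HF Inference API."""
    # Deferred import so the sketch loads even without the package installed.
    from huggingface_hub import InferenceClient

    client = InferenceClient(model)  # expects an HF token in the environment
    response = client.chat_completion(
        messages=[{"role": "user", "content": build_judge_prompt(report)}],
        max_tokens=8,
    )
    return response.choices[0].message.content

# judge_report("Summary: ...")  # requires network access and an HF token
```

Asking the judge to reply with only a number keeps the response easy to parse and compare across runs.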
You could also use a subset of training data, especially for the subject area you want it to work with, and see how well its answers match the training data. Don't use the whole dataset; just select some items from it.
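That spot-checking idea can be sketched as a small sampler plus scorer. The dataset field names (`question`, `answer`) and the toy data are assumptions; substitute your own records and model call.

```python
# Sketch: evaluate on a small reproducible sample instead of the full set.
# The "question"/"answer" field names are assumptions for illustration.
import random

def sample_eval_items(dataset, k=20, seed=0):
    """Pick k items reproducibly so reruns are comparable."""
    rng = random.Random(seed)
    return rng.sample(dataset, min(k, len(dataset)))

def spot_check(dataset, model_fn, k=20):
    """Fraction of sampled items where the model's answer matches the reference."""
    items = sample_eval_items(dataset, k)
    hits = sum(model_fn(it["question"]).strip() == it["answer"] for it in items)
    return hits / len(items)

# Hypothetical usage with a toy dataset and a stand-in for the model:
toy = [{"question": "2+2?", "answer": "4"}, {"question": "3+3?", "answer": "6"}]
print(spot_check(toy, lambda q: "4" if "2+2" in q else "6"))  # 1.0 on this toy set
```

Fixing the random seed means the same subset is drawn every run, so scores stay comparable when you change the model or prompt.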