Post
1608
WorkerSafetyQAEval: A new benchmark to evaluate worker safety domain question and answering
Happy to share a new benchmark on question and answers for worker safety domain. The benchmark and leaderboard is available at
codelion/worker-safety-qa-eval
We evaluate popular generic chatbots like ChatGPT and HuggingChat on WorkerSafetyQAEval and compare it with a domain specific RAG bot called Securade.ai Safety Copilot - codelion/safety-copilot It highlights the importance of having domain specific knowledge for critical domains like worker safety that require high accuracy. Securade.ai Safety Copilot achieves ~97% on the benchmark setting a new SOTA.
You can read more about the Safety Copilot on https://securade.ai/blog/how-securade-ai-safety-copilot-transforms-worker-safety.html
Happy to share a new benchmark on question and answers for worker safety domain. The benchmark and leaderboard is available at
codelion/worker-safety-qa-eval
We evaluate popular generic chatbots like ChatGPT and HuggingChat on WorkerSafetyQAEval and compare it with a domain specific RAG bot called Securade.ai Safety Copilot - codelion/safety-copilot It highlights the importance of having domain specific knowledge for critical domains like worker safety that require high accuracy. Securade.ai Safety Copilot achieves ~97% on the benchmark setting a new SOTA.
You can read more about the Safety Copilot on https://securade.ai/blog/how-securade-ai-safety-copilot-transforms-worker-safety.html