---
title: Auto Evaluator
emoji: 🧠
colorFrom: blue
colorTo: yellow
sdk: streamlit
sdk_version: 1.19.0
app_file: app.py
pinned: false
license: mit
---
# `Auto-evaluator` :brain: :memo:
This is a lightweight evaluation tool for question answering. It uses `LangChain` to:
- Ask the user to input a set of documents of interest
- Apply an LLM (`GPT-3.5-turbo`) to auto-generate `question`-`answer` pairs from these docs
- Generate a question-answering chain with a specified set of UI-chosen configurations
- Use the chain to generate a response to each `question`
- Use an LLM (`GPT-3.5-turbo`) to score the response relative to the `answer`
- Explore scoring across various chain configurations
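
The generate, answer, and grade loop described above can be sketched in a few lines. This is a minimal illustration, not the app's actual code: `evaluate_chain`, `answer_fn`, `grade_fn`, and the toy stand-ins are hypothetical names introduced here, and a real run would back the two callables with a question-answering chain and an LLM grader rather than canned strings.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class GradedAnswer:
    question: str
    reference: str   # the auto-generated (or user-supplied) answer
    response: str    # what the chain returned
    correct: bool    # the grader's verdict


def evaluate_chain(
    eval_set: List[Dict[str, str]],
    answer_fn: Callable[[str], str],
    grade_fn: Callable[[str, str, str], bool],
) -> List[GradedAnswer]:
    """Run every question through the chain and grade each response."""
    results = []
    for pair in eval_set:
        response = answer_fn(pair["question"])
        correct = grade_fn(pair["question"], pair["answer"], response)
        results.append(
            GradedAnswer(pair["question"], pair["answer"], response, correct)
        )
    return results


def accuracy(results: List[GradedAnswer]) -> float:
    """Fraction of responses graded correct (0.0 for an empty eval set)."""
    return sum(r.correct for r in results) / len(results) if results else 0.0


# Toy stand-ins (assumptions) so the sketch runs without any API key.
eval_set = [
    {"question": "What color is the sky?", "answer": "blue"},
    {"question": "How many legs does a spider have?", "answer": "eight"},
]
canned = {
    "What color is the sky?": "The sky is blue.",
    "How many legs does a spider have?": "Six.",
}


def toy_answer(question: str) -> str:
    # Stands in for the question-answering chain.
    return canned[question]


def toy_grade(question: str, reference: str, response: str) -> bool:
    # Stands in for the LLM grader: a plain substring check.
    return reference.lower() in response.lower()


results = evaluate_chain(eval_set, toy_answer, toy_grade)
print(accuracy(results))
```

Comparing chain configurations then amounts to calling `evaluate_chain` once per configuration on the same `eval_set` and comparing the resulting scores.
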
**Run as Streamlit app**
`pip install -r requirements.txt`
`streamlit run app.py`
**Inputs**
- `num_eval_questions` - Number of questions to auto-generate (if the user does not supply an eval set)
- `split_method` - Method for text splitting
- `chunk_chars` - Chunk size for text splitting
- `overlap` - Chunk overlap for text splitting
- `embeddings` - Embedding method for chunks
- `retriever_type` - Chunk retrieval method
- `num_neighbors` - Neighbors for retrieval
- `model` - LLM for summarization of retrieved chunks
- `grade_prompt` - Prompt choice for model self-grading
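
To make `chunk_chars` and `overlap` concrete, here is a minimal character-based splitter. It is an illustrative sketch only: the function name `split_text` is hypothetical, and the app itself delegates splitting to whichever method `split_method` selects, which may cut on separators or tokens rather than at fixed character offsets.

```python
from typing import List


def split_text(text: str, chunk_chars: int, overlap: int) -> List[str]:
    """Split text into chunks of at most chunk_chars characters,
    where each chunk repeats the last `overlap` characters of the previous one."""
    if chunk_chars <= 0:
        raise ValueError("chunk_chars must be positive")
    if not 0 <= overlap < chunk_chars:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_chars")

    step = chunk_chars - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_chars])
        if start + chunk_chars >= len(text):
            break  # this chunk already reached the end of the text
    return chunks


chunks = split_text("abcdefghij", chunk_chars=4, overlap=2)
print(chunks)
```

A larger `overlap` keeps more shared context between neighboring chunks at the cost of more chunks to embed and retrieve from.
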
**Blog**
https://blog.langchain.dev/auto-eval-of-question-answering-tasks/
**UI**
![image](https://user-images.githubusercontent.com/122662504/233218347-de10cf41-6230-47a7-aa9e-8ab01673b87a.png)
**Hosted app**
See:
https://github.com/langchain-ai/auto-evaluator
And:
https://autoevaluator.langchain.com/
**Disclaimer**
You will need an OpenAI API key with access to `GPT-4` and an Anthropic API key to take advantage of all of the default dashboard model settings. However, additional models (e.g., from Hugging Face) can be easily added to the app.