VayuBuddy Question Curation
🎯 Aim
This template provides an automated interface for collecting and managing data-analytics questions for VayuBuddy.
Folder Structure
The project is organized as follows:
project_root/
├── app.py                      # Main Streamlit application
├── requirements.txt            # Dependencies list
├── README.md                   # Documentation
│
├── data/
│   ├── questions/              # Stores question-related data
│   │   ├── 0/                  # Folder for question ID 0
│   │   │   ├── question.txt    # Question text
│   │   │   ├── answer.txt      # Answer text
│   │   │   ├── code.py         # Reference code for the question
│   │   │   └── metadata.json   # Metadata for the question
│   │   ├── 1/                  # Folder for question ID 1
│   │   │   ├── question.txt    # Question text
│   │   │   ├── answer.txt      # Answer text
│   │   │   ├── code.py         # Reference code for the question
│   │   │   └── metadata.json   # Metadata for the question
│   │   ...                     # and so on...
│   │
│   └── raw_data/               # Stores the required CSVs
│       ├── NCAP_Funding.csv    # NCAP funding data
│       ├── State.csv           # State area and population data
│       └── Data.csv            # Main AQI data
│
├── pages/                      # Streamlit multipage support
│   ├── all_question.py         # Page to view questions
│   ├── execute_code.py         # Page to run the code of all questions
│   ├── add_question.py         # Page to add new questions
│   ├── edit_question.py        # Page to edit existing questions
│   └── delete_question.py      # Page to delete questions
│
├── utils/                      # Utility functions
│   ├── load_jsonl.py           # Function to load questions as a list
│   ├── data_to_jsonl.py        # Function to convert question folders into JSONL
│   ├── jsonl_to_data.py        # Function to convert JSONL into question folders
│   └── code_services.py        # Handles code formatting & execution
│
└── output.jsonl                # Processed question data in JSONL format
This structure keeps the project modular and maintainable.
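To make the per-question layout concrete, here is a minimal sketch of reading the four files of a question folder back into a Python dict. The function names `load_question` and `load_all_questions` are hypothetical and are not taken from the repository; the sketch only assumes the file names shown in the tree above.

```python
import json
from pathlib import Path


def load_question(folder):
    """Read one question folder into a dict (hypothetical helper, not the repo's API)."""
    folder = Path(folder)
    return {
        "question": (folder / "question.txt").read_text(encoding="utf-8").strip(),
        "answer": (folder / "answer.txt").read_text(encoding="utf-8").strip(),
        "code": (folder / "code.py").read_text(encoding="utf-8"),
        "metadata": json.loads((folder / "metadata.json").read_text(encoding="utf-8")),
    }


def load_all_questions(root="data/questions"):
    """Load every numerically named subfolder, ordered by question ID."""
    folders = [p for p in Path(root).iterdir() if p.is_dir() and p.name.isdigit()]
    return [load_question(p) for p in sorted(folders, key=lambda p: int(p.name))]
```

Sorting by `int(p.name)` rather than by the raw folder name keeps question 10 after question 2, which a plain string sort would not.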
How to use this App
- Add questions through the Add Questions page
- Edit questions through the Edit Questions page
- Delete questions through the Delete Questions page
- The data will not be saved if any field is missing or the code contains an error
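The kind of check implied by the last point can be sketched as follows. This is only an illustration under stated assumptions: `validate_submission` is a hypothetical name, and the sketch checks for empty fields and Python syntax errors only, whereas the app itself may also catch errors raised while the code runs.

```python
def validate_submission(question, answer, code, metadata):
    """Return a list of problems; an empty list means the entry could be saved.

    Hypothetical helper: the app's real checks may differ.
    """
    problems = []
    fields = {"question": question, "answer": answer, "code": code}
    for name, value in fields.items():
        if not value or not value.strip():
            problems.append(f"missing field: {name}")
    if not metadata:
        problems.append("missing field: metadata")
    if code and code.strip():
        try:
            compile(code, "code.py", "exec")  # syntax check only, nothing is executed
        except SyntaxError as exc:
            problems.append(f"syntax error in code: {exc.msg} (line {exc.lineno})")
    return problems
```

Returning a list of messages instead of raising on the first failure lets a form show every problem at once.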
NOTE
- When entering the contents of code.py on the Add Questions page or the Edit Questions page, use one of two formats: the true_code format, in which all code is written inside the true_code function and the function is called immediately after its definition, or the No true_code format, in which the same code is written at the top level.
true_code format
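def true_code():
    import pandas as pd

    # Load the main AQI data
    df = pd.read_csv('data/raw_data/Data.csv', sep=",")
    # Mean PM2.5 for each (state, station) pair
    data = df.groupby(['state','station'])['PM2.5'].mean()
    # idxmax() returns a (state, station) tuple; keep the state
    ans = data.idxmax()[0]
    print(ans)

true_code()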
No true_code format
import pandas as pd
df = pd.read_csv('data/raw_data/Data.csv', sep=",")
data = df.groupby(['state','station'])['PM2.5'].mean()
ans = data.idxmax()[0]
print(ans)
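The two formats are equivalent in what they print, so one can be converted into the other mechanically. The sketch below wraps top-level code in a true_code function and captures what it prints. The names `to_true_code_format` and `run_and_capture` are hypothetical; how code_services.py actually formats and executes code is not shown in this document.

```python
import ast
import contextlib
import io
import textwrap


def to_true_code_format(code):
    """Wrap plain code in a true_code() function unless it already defines one."""
    tree = ast.parse(code)
    already_wrapped = any(
        isinstance(node, ast.FunctionDef) and node.name == "true_code"
        for node in tree.body
    )
    if already_wrapped:
        return code
    # Indent every line by four spaces; fall back to "pass" for empty input
    body = textwrap.indent(code.strip() or "pass", "    ")
    return f"def true_code():\n{body}\n\ntrue_code()\n"


def run_and_capture(code):
    """Execute the code and return everything it printed (only run trusted code)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {"__name__": "__main__"})
    return buffer.getvalue().strip()
```

For example, `run_and_capture(to_true_code_format("x = 2 + 3\nprint(x)"))` returns `"5"`. Code that uses `from module import *` cannot be wrapped this way, because Python does not allow star imports inside a function.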
🧩 Sample Question
question.txt
Which state has the highest average PM2.5 concentration across all stations?
answer.txt
Delhi
code.py
def true_code():
    import pandas as pd
    df = pd.read_csv('data/raw_data/Data.csv', sep=",")
    data = df.groupby(['state','station'])['PM2.5'].mean()
    ans = data.idxmax()[0]
    print(ans)

true_code()
metadata.json
{
    "question_id": 0,
    "category": "spatial",
    "answer_category": "single",
    "plot": false,
    "libraries": [
        "pandas"
    ]
}
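In the sample, the "libraries" entry matches the package imported by code.py. Assuming that this field is meant to list the top-level packages the code imports (an assumption, since the document does not define it), the list could be derived from the code with the standard `ast` module. The helper name `imported_libraries` is hypothetical.

```python
import ast


def imported_libraries(code):
    """Return the sorted top-level packages imported anywhere in the code."""
    names = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            # "import a.b as c" contributes "a"
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # "from a.b import c" contributes "a"; relative imports are skipped
            names.add(node.module.split(".")[0])
    return sorted(names)


sample = """
def true_code():
    import pandas as pd
    print(pd.__name__)
true_code()
"""
print(imported_libraries(sample))  # ['pandas']
```

Because `ast.walk` visits nested nodes, imports placed inside the true_code function are found as well as top-level ones, and the code is only parsed, never executed.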
🛠️ How to Set Up the Project
Open a terminal in an empty folder and follow these steps:
1st step: clone the repo
git clone https://github.com/ratnesh003/VayuBuddy-Question-Curation.git .
2nd step: install the dependencies needed to run the code
pip install -r requirements.txt
3rd step: create a dummy /data folder from the existing output.jsonl
py .\utils\jsonl_to_data.py
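For orientation, the conversion in the last step can be pictured roughly as below. This is a sketch under a loud assumption: each line of the JSONL file is taken to be a JSON object with the hypothetical keys "question", "answer", "code" and "metadata". The real schema of output.jsonl and the real logic of jsonl_to_data.py are not shown in this document and may differ.

```python
import json
from pathlib import Path


def jsonl_to_folders(jsonl_path, out_root):
    """Write one folder per JSONL record and return the number of records written.

    Assumes hypothetical record keys: question, answer, code, metadata.
    """
    out_root = Path(out_root)
    count = 0
    with open(jsonl_path, encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            folder = out_root / str(record["metadata"]["question_id"])
            folder.mkdir(parents=True, exist_ok=True)
            (folder / "question.txt").write_text(record["question"], encoding="utf-8")
            (folder / "answer.txt").write_text(record["answer"], encoding="utf-8")
            (folder / "code.py").write_text(record["code"], encoding="utf-8")
            (folder / "metadata.json").write_text(
                json.dumps(record["metadata"], indent=4), encoding="utf-8"
            )
            count += 1
    return count
```

Under these assumptions, calling `jsonl_to_folders("output.jsonl", "data/questions")` would recreate folders named after each record's question_id.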