
VayuBuddy Question Curation

🎯 Aim

This template provides an automated interface for collecting and managing data-analytics questions for VayuBuddy.

📂 Folder Structure

The project is organized as follows:

project_root/
│── app.py                         # Main Streamlit application
│── requirements.txt               # Dependencies list
│── README.md                      # Documentation
│
├── data/
│   ├── questions/                 # Stores question-related data
│   │   ├── 0/                     # Folder for question ID 0
│   │   │   ├── question.txt       # Question text
│   │   │   ├── answer.txt         # Answer text
│   │   │   ├── code.py            # Reference code for the question
│   │   │   └── metadata.json      # Metadata for the question
│   │   ├── 1/                     # Folder for question ID 1
│   │   │   ├── question.txt       # Question text
│   │   │   ├── answer.txt         # Answer text
│   │   │   ├── code.py            # Reference code for the question
│   │   │   └── metadata.json      # Metadata for the question
│   ... ... ...                    # and so on...
│   │   ... ...
│   │
│   └── raw_data/                  # Stores the required CSV files
│       ├── NCAP_Funding.csv       # NCAP funding data
│       ├── State.csv              # State area and population data
│       └── Data.csv               # Main AQI data
│
├── pages/                         # Streamlit multipage support
│   ├── all_question.py            # Page to view questions
│   ├── execute_code.py            # Page to run the code of all questions
│   ├── add_question.py            # Page to add new questions
│   ├── edit_question.py           # Page to edit existing questions
│   └── delete_question.py         # Page to delete questions
│
├── utils/                         # Utility functions
│   ├── load_jsonl.py              # Function to load questions as a list
│   ├── data_to_jsonl.py           # Function to convert question folders into JSONL
│   ├── jsonl_to_data.py           # Function to convert JSONL into question folders
│   └── code_services.py           # Handles code formatting & execution
│
└── output.jsonl                   # Processed question data in JSONL format

This layout keeps the Streamlit pages, the utility functions, and the question data separate, which makes the project easier to maintain. 🚀
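To illustrate how the question folders relate to output.jsonl, the following minimal sketch reads each folder and writes one JSON object per line. It is not the code in utils/data_to_jsonl.py: the function names and the record keys (`question`, `answer`, `code`, `metadata`) are assumptions made for this example, and the real schema may differ.

```python
import json
from pathlib import Path


def read_question_folder(folder: Path) -> dict:
    """Collect the four files of one question folder into a single record.

    The record keys are hypothetical; the project's JSONL schema may differ.
    """
    return {
        "question": (folder / "question.txt").read_text(encoding="utf-8").strip(),
        "answer": (folder / "answer.txt").read_text(encoding="utf-8").strip(),
        "code": (folder / "code.py").read_text(encoding="utf-8"),
        "metadata": json.loads((folder / "metadata.json").read_text(encoding="utf-8")),
    }


def folders_to_jsonl(questions_dir: Path, output_path: Path) -> int:
    """Write one JSON line per numbered question folder; return the count."""
    folders = sorted(
        (p for p in questions_dir.iterdir() if p.is_dir() and p.name.isdigit()),
        key=lambda p: int(p.name),  # numeric order, so 2 comes before 10
    )
    with output_path.open("w", encoding="utf-8") as out:
        for folder in folders:
            record = read_question_folder(folder)
            out.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(folders)
```

Under these assumptions, a call such as `folders_to_jsonl(Path("data/questions"), Path("output.jsonl"))` would produce one line per question ID.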

📜 How to Use This App

  • Add questions through the Add Questions page
  • Edit questions through the Edit Questions page
  • Delete questions through the Delete Questions page
  • Data is not saved if any field is missing or the code contains an error
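The last rule can be pictured as a check that runs before anything is written to disk. The sketch below is an assumption about how such a check might look, not the app's actual logic: `validate_submission` and its field names are hypothetical. It runs the code with Python's built-in `exec`, which should only ever be used on trusted input.

```python
import contextlib
import io


def validate_submission(question: str, answer: str, code: str) -> list:
    """Return a list of problems; an empty list means the entry may be saved."""
    problems = []

    # Rule 1: every field must be filled in.
    for name, value in (("question", question), ("answer", answer), ("code", code)):
        if not value.strip():
            problems.append(f"missing field: {name}")
    if problems:
        return problems

    # Rule 2: the code must compile and run without raising.
    try:
        with contextlib.redirect_stdout(io.StringIO()):  # silence print() output
            exec(compile(code, "<code.py>", "exec"), {"__name__": "__main__"})
    except Exception as exc:  # also covers SyntaxError raised by compile()
        problems.append(f"error in code: {type(exc).__name__}: {exc}")
    return problems
```

A caller would save the question only when the returned list is empty and otherwise show the messages to the user.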

NOTE

  • When entering the contents of code.py on the Add Questions or Edit Questions page, use one of two formats: the true_code format, where all code sits inside a true_code function that is called right after its definition, or the No true_code format, where the same code is written at the top level without a wrapper function.

true_code format

def true_code():
    import pandas as pd
    
    df = pd.read_csv('data/raw_data/Data.csv', sep=",")
    
    data = df.groupby(['state','station'])['PM2.5'].mean()
    ans = data.idxmax()[0]
    print(ans)

true_code()

No true_code format

import pandas as pd

df = pd.read_csv('data/raw_data/Data.csv', sep=",")

data = df.groupby(['state','station'])['PM2.5'].mean()
ans = data.idxmax()[0]
print(ans)
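To make the difference between the two formats concrete, here is a small sketch that classifies a snippet using Python's standard `ast` module. The helper name `uses_true_code_format` is hypothetical, and the app itself may detect the format differently or not at all.

```python
import ast


def uses_true_code_format(source: str) -> bool:
    """True if `source` defines true_code() and calls it at the top level."""
    tree = ast.parse(source)

    # Look for `def true_code(...):` among the top-level statements.
    defines = any(
        isinstance(node, ast.FunctionDef) and node.name == "true_code"
        for node in tree.body
    )

    # Look for a bare `true_code(...)` call among the top-level statements.
    calls = any(
        isinstance(node, ast.Expr)
        and isinstance(node.value, ast.Call)
        and isinstance(node.value.func, ast.Name)
        and node.value.func.id == "true_code"
        for node in tree.body
    )
    return defines and calls
```

Because the snippet is only parsed and never executed, this check is safe to run on code that has not been validated yet.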

🧩 Sample Question

question.txt

Which state has the highest average PM2.5 concentration across all stations?

answer.txt

Delhi

code.py

def true_code():
    import pandas as pd
    
    df = pd.read_csv('data/raw_data/Data.csv', sep=",")
    
    data = df.groupby(['state','station'])['PM2.5'].mean()
    ans = data.idxmax()[0]
    print(ans)

true_code()

metadata.json

{
    "question_id": 0,
    "category": "spatial",
    "answer_category": "single",
    "plot": false,
    "libraries": [
        "pandas"
    ]
}
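In the sample, the `libraries` entry matches the module imported by code.py. Assuming that is the intended meaning of the field (the README does not state it explicitly), the list could be derived from the code instead of typed by hand. `imported_libraries` is a hypothetical helper and not part of the project.

```python
import ast


def imported_libraries(source: str) -> list:
    """Return the sorted top-level package names imported anywhere in `source`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import pandas as pd` or `import os.path` -> "pandas", "os"
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # `from matplotlib import pyplot` -> "matplotlib"; relative imports are skipped
            names.add(node.module.split(".")[0])
    return sorted(names)
```

`ast.walk` visits nested nodes too, so an import placed inside `true_code()` is found just like a top-level one.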

🛠️ How to Set Up the Project

Open a terminal in an empty folder and follow these steps:

1st step: clone the repository

git clone https://github.com/ratnesh003/VayuBuddy-Question-Curation.git .

2nd step: install the dependencies needed to run the code

pip install -r requirements.txt

3rd step: create a dummy /data folder from the output.jsonl file already present in the repository

py .\utils\jsonl_to_data.py
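For readers curious about what the third step involves, the sketch below rebuilds question folders from a JSONL file. It is not the contents of utils/jsonl_to_data.py: the function name, the record keys (`question`, `answer`, `code`, `metadata`), and the use of `question_id` as the folder name are all assumptions, and the real script may use a different schema.

```python
import json
from pathlib import Path


def jsonl_to_folders(jsonl_path: Path, questions_dir: Path) -> int:
    """Create one numbered folder per JSONL record; return the count."""
    count = 0
    with jsonl_path.open(encoding="utf-8") as lines:
        for line in lines:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            # Assumed: the folder is named after the question_id in the metadata.
            folder = questions_dir / str(record["metadata"]["question_id"])
            folder.mkdir(parents=True, exist_ok=True)
            (folder / "question.txt").write_text(record["question"], encoding="utf-8")
            (folder / "answer.txt").write_text(record["answer"], encoding="utf-8")
            (folder / "code.py").write_text(record["code"], encoding="utf-8")
            (folder / "metadata.json").write_text(
                json.dumps(record["metadata"], indent=4), encoding="utf-8"
            )
            count += 1
    return count
```

With these assumptions, `jsonl_to_folders(Path("output.jsonl"), Path("data/questions"))` would recreate the data/questions tree shown in the folder structure.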