import streamlit as st
import pandas as pd
import plotly.express as px


def main():
    st.title("📚 Project Documentation")

    # Custom CSS for better styling
    st.markdown("""
    """, unsafe_allow_html=True)

    # Q1: Development Timeline
    st.markdown("""
#### ⏱️ Q1: How long did it take to solve the problem?
The solution was developed in approximately 5 hours (excluding data collection and model training phases).
""", unsafe_allow_html=True) # Q2: Solution Explanation st.markdown("""
#### 🔍 Q2: Can you explain your solution approach?
The solution implements a multi-stage document classification pipeline:

1. Direct URL Text Approach:
2. Baseline Approach (ML Model):
3. BERT Approach (DL Model):
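
A minimal sketch of how such a cascade could be wired together (the keyword map, confidence threshold, and stage stubs below are illustrative assumptions, not the project's actual code):

```python
# Illustrative cascade: try the cheap URL-text check first, then fall
# back to the ML baseline, and finally to the DL model. The keyword map,
# threshold, and stage stubs are assumptions for demonstration only.
def classify_product(url: str) -> str:
    # Stage 1: direct URL text — keyword match, no model required.
    keywords = {"cable": "Cable", "fuse": "Fuses", "lamp": "Lighting"}
    for kw, label in keywords.items():
        if kw in url.lower():
            return label
    # Stage 2: baseline ML model (stub standing in for TF-IDF + LogReg).
    ml_label, ml_confidence = "Others", 0.40
    if ml_confidence >= 0.80:
        return ml_label
    # Stage 3: DL model as the final fallback (stub standing in for BERT).
    return "Others"

print(classify_product("https://shop.example/fuse-5a"))
```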
""", unsafe_allow_html=True) # Q3: Model Selection st.markdown("""
#### 🤖 Q3: Which models did you use and why?
A baseline was implemented using TF-IDF with Logistic Regression, followed by a BERT-based model:

Baseline Model:
BERT Model:
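
The baseline described above can be sketched roughly as follows (the sample texts, labels, and hyperparameters are made up for illustration; the real training data is not shown here):

```python
# Rough sketch of the TF-IDF + Logistic Regression baseline; the sample
# texts, labels, and hyperparameters are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

texts = ["copper power cable 3x1.5mm", "glass fuse 5A fast-blow",
         "LED ceiling lamp warm white", "universal travel adapter"]
labels = ["Cable", "Fuses", "Lighting", "Others"]

baseline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),  # word uni/bigram features
    ("clf", LogisticRegression(max_iter=1000)),
])
baseline.fit(texts, labels)
print(baseline.predict(["fast-blow fuse 5A"])[0])
```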
""", unsafe_allow_html=True) # Q4: Limitations and Improvements st.markdown("""
#### ⚠️ Q4: What are the current limitations and potential improvements?
Current Implementation & Limitations:
Proposed Improvements:
""", unsafe_allow_html=True) # Q5: Model Performance st.markdown("""
#### 📊 Q5: What is the model's performance on test data?
BERT Model Performance:

| Category     | Precision | Recall | F1-Score | Support |
|--------------|-----------|--------|----------|---------|
| Cable        | 1.00      | 1.00   | 1.00     | 92      |
| Fuses        | 0.95      | 1.00   | 0.98     | 42      |
| Lighting     | 0.94      | 1.00   | 0.97     | 74      |
| Others       | 1.00      | 0.92   | 0.96     | 83      |
| Accuracy     |           |        | 0.98     | 291     |
| Macro Avg    | 0.97      | 0.98   | 0.98     | 291     |
| Weighted Avg | 0.98      | 0.98   | 0.98     | 291     |
""", unsafe_allow_html=True) st.markdown("""
- ✨ Perfect performance (1.00) for the Cable category
- 📈 High recall (1.00) across most categories
- 🎯 Overall accuracy of 98%
- ⚖️ Balanced performance across all metrics
""", unsafe_allow_html=True) # Q6: Metric Selection st.markdown("""
#### 📈 Q6: Why did you choose these particular metrics?
Our metric selection was driven by the dataset's characteristics:

Key Considerations:
Selected Metrics:
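
As a concrete illustration of why the averaging choice matters on imbalanced data, compare macro and weighted F1 on a toy split (the labels and predictions below are invented for demonstration):

```python
# Toy example: macro-F1 treats each class equally, while weighted-F1
# scales each class's F1 by its support, so the two diverge whenever
# a rare class is handled worse than a frequent one.
from sklearn.metrics import f1_score

y_true = ["Cable"] * 8 + ["Fuses"] * 2
y_pred = ["Cable"] * 8 + ["Cable", "Fuses"]  # one rare Fuses item misclassified

macro = f1_score(y_true, y_pred, average="macro")        # ≈ 0.80
weighted = f1_score(y_true, y_pred, average="weighted")  # ≈ 0.89
print(round(macro, 2), round(weighted, 2))
```

Macro averaging is pulled down by the rare class, which is why it is the stricter choice when class balance matters.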
""", unsafe_allow_html=True) # Performance Visualization st.markdown("### 📊 Model Performance Comparison") metrics = { 'Metric': ['Accuracy', 'Precision', 'Recall', 'F1-Score'], 'Baseline': [0.85, 0.83, 0.84, 0.83], 'BERT': [0.98, 0.97, 0.98, 0.98] } df = pd.DataFrame(metrics) fig = px.bar( df, x='Metric', y=['Baseline', 'BERT'], barmode='group', title='Model Performance Comparison', color_discrete_sequence=['#2ecc71', '#3498db'], template='plotly_white' ) fig.update_layout( title_x=0.5, title_font_size=20, legend_title_text='Model Type', xaxis_title="Evaluation Metric", yaxis_title="Score", bargap=0.2, height=500 ) st.plotly_chart(fig, use_container_width=True) main()