Credit Risk Modelling
About
An interactive tool demonstrating credit risk modelling.
Emphasis on:
- Building models
- Comparing techniques
- Interpreting results
Built With
Hardware initially built on:
Processor: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80 GHz, 2803 MHz, 4 Core(s), 8 Logical Processor(s)
Memory (RAM): 16GB
Local setup
Obtain the repo locally and open its root folder
If you plan to contribute:
git clone https://github.com/pkiage/tool-credit-risk-modelling.git
or
gh repo clone pkiage/tool-credit-risk-modelling
If you only want to run the app locally:
Download ZIP
(optional) Setup virtual environment:
python -m venv venv
(optional) Activate virtual environment:
If using a Unix-based OS, run the following in terminal:
source venv/bin/activate
If using Windows run the following in terminal:
.\venv\Scripts\activate
Install requirements by running the following in terminal:
Required packages
pip install -r requirements.txt
Complete graphviz installation
https://graphviz.org/download/
Build and install local package
python setup.py build
python setup.py install
Run the streamlit app (app.py) by running the following in terminal (from repository root folder):
streamlit run src/app.py
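If the app fails to start, two setup steps above are common culprits: the virtual environment not being active and Graphviz not being on the PATH. The following is a minimal sketch of a check for both, not part of the repository. The function names are hypothetical, and it assumes the Graphviz install provides the `dot` executable.

```python
import shutil
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def graphviz_on_path() -> bool:
    """Return True when the Graphviz `dot` executable is found on PATH."""
    # Assumption: a completed Graphviz installation exposes `dot`.
    return shutil.which("dot") is not None


if __name__ == "__main__":
    print(f"virtual environment active: {in_virtualenv()}")
    print(f"Graphviz 'dot' found on PATH: {graphviz_on_path()}")
```

Running it from the repository root before `streamlit run src/app.py` shows which of the two steps, if any, still needs attention.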
Deployed setup details
For faster model building and testing (particularly with XGBoost), a local setup or a server more powerful than the free Heroku dyno type is recommended. (tutorials on servers for data science & ML)
Free Heroku dyno type was used to deploy the app
Memory (RAM): 512 MB
CPU Share: 1x
Compute: 1x-4x
Dedicated: no
Sleeps: yes
Enabled automatic deploys from GitHub
From main branch:
heroku login
git push heroku main
From a branch other than main:
heroku login
git push heroku branch_name:main
Roadmap
To view or submit ideas, or to contribute, please see the issues.
Docs creation
pydeps Python module dependency visualization
Delete __init__.py and __main__.py then run the following
App and clusters
pydeps src/app.py --max-bacon=5 --cluster --rankdir BT -o docs/module-dependency-graph/src-app-clustered.svg
App and links
Features, models, & visualization links:
pydeps src/app.py --only features models visualization --max-bacon=4 --rankdir BT -o docs/module-dependency-graph/src-feature-model-visualization.svg
Only features
pydeps src/app.py --only features --max-bacon=5 --cluster --max-cluster-size=3 --rankdir BT -o docs/module-dependency-graph/src-features.svg
Only models
pydeps src/app.py --only models --max-bacon=5 --cluster --max-cluster-size=15 --rankdir BT -o docs/module-dependency-graph/src-models.svg
code2flow Call graphs for a pretty good estimate of project structure
Logistic
code2flow src/models/logistic_train_model.py -o docs/call-graph/logistic_train_model.svg
code2flow src/models/logistic_model.py -o docs/call-graph/logistic_model.svg
XGBoost
code2flow src/models/xgboost_train_model.py -o docs/call-graph/xgboost_train_model.svg
code2flow src/models/xgboost_model.py -o docs/call-graph/xgboost_model.svg
Utils
code2flow src/models/util_test.py -o docs/call-graph/util_test.svg
code2flow src/models/util_predict_model_threshold.py -o docs/call-graph/util_predict_model_threshold.svg
code2flow src/models/util_predict_model.py -o docs/call-graph/util_predict_model.svg
code2flow src/models/util_model_comparison.py -o docs/call-graph/util_model_comparison.svg
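The code2flow commands above all follow one pattern: a module under `src/models` is written to an SVG of the same name under `docs/call-graph`. As a rough sketch (not a script from the repository; `MODULES` and `code2flow_command` are hypothetical names), the commands could be generated rather than typed out:

```python
from pathlib import PurePosixPath
from typing import List

# Module names copied from the commands listed above.
MODULES = [
    "logistic_train_model",
    "logistic_model",
    "xgboost_train_model",
    "xgboost_model",
    "util_test",
    "util_predict_model_threshold",
    "util_predict_model",
    "util_model_comparison",
]


def code2flow_command(
    module: str,
    src_dir: str = "src/models",
    out_dir: str = "docs/call-graph",
) -> List[str]:
    """Build the code2flow argument list for one module."""
    source = PurePosixPath(src_dir) / f"{module}.py"
    target = PurePosixPath(out_dir) / f"{module}.svg"
    return ["code2flow", str(source), "-o", str(target)]


if __name__ == "__main__":
    # Print the commands; pass each list to subprocess.run to execute them.
    for module in MODULES:
        print(" ".join(code2flow_command(module)))
```

Printing first and executing second keeps the sketch safe to run on a machine where code2flow is not installed.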
References
Inspiration:
Credit Risk Modeling in Python by DataCamp
- General Methodology
- Data
A Gentle Introduction to Threshold-Moving for Imbalanced Classification
- Selecting optimal threshold using Youden's J statistic
- Project structure
- Buildpack used for Heroku deployment
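For readers unfamiliar with the Youden's J statistic mentioned in the references above: it is J = TPR - FPR, and the selected threshold is the one that maximises J. The following dependency-free sketch illustrates the idea. It is not taken from the repository's code, and the function name and toy data are hypothetical.

```python
from typing import Sequence, Tuple


def youden_j_threshold(
    y_true: Sequence[int], y_score: Sequence[float]
) -> Tuple[float, float]:
    """Return (threshold, J) maximising J = TPR - FPR.

    A sample is predicted positive when its score is >= threshold.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("y_true must contain both classes")

    best_threshold, best_j = 0.0, float("-inf")
    # Every distinct score is a candidate threshold; try the highest first.
    for threshold in sorted(set(y_score), reverse=True):
        tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= threshold)
        j = tp / positives - fp / negatives
        if j > best_j:  # strict comparison keeps the highest threshold on ties
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


if __name__ == "__main__":
    # Toy data: 1 = default, 0 = non-default; scores are predicted probabilities.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.3, 0.6, 0.9]
    print(youden_j_threshold(labels, scores))  # (0.6, 1.0)
```

On real data the same result is usually obtained from an ROC curve routine; the loop here is quadratic in the number of samples and is meant only to show the calculation.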