import streamlit as st
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
from PIL import Image

st.set_page_config(
    page_title='Churn Predictor',
    layout='wide',
    initial_sidebar_state='expanded'
)


def run():
    # title
    st.title('**Churn Exploration**')
    st.subheader('Explore The Churn Dataset')

    # add image
    image = Image.open('churn.jpg')
    st.image(image)
    st.markdown('---')

    markdown_text = '''
    ## Background
    In today's competitive business landscape, customer churn has become a 
    significant concern for many companies. Customer churn refers to the 
    phenomenon where customers discontinue using a company's products or 
    services. Churn can have a negative impact on a company's revenue, 
    growth, and overall success. Therefore, companies are increasingly 
    focused on identifying customers who are likely to churn so that they 
    can take proactive measures to retain them.
    
    ## Objective
    The objective of this project is to develop a deep learning model for 
    churn prediction. The company wants to minimize the risk of customer 
    churn by accurately predicting which customers are likely to stop using 
    their products or services. By identifying potential churners in advance,
    the company can take targeted actions and implement retention strategies 
    to reduce churn rates and maximize customer loyalty.
    
    ## About Dataset
    
    |         Variable        |                                         Description                                           |
    |-------------------------|-----------------------------------------------------------------------------------------------|
    | user_id                 | ID of a customer                                                                              |
    | age                     | Age of a customer                                                                             |
    | gender                  | Gender of a customer                                                                          |
    | region category         | Region that a customer belongs to                                                             |
    | membership category     | Category of the membership that a customer is using                                           |
    | joining date            | Date when a customer became a member                                                          |
    | joined through referral | Whether a customer joined using any referral code or ID                                       |
    | preferred_offer types   | Type of offer that a customer prefers                                                         |
    | medium_of operation     | Medium of operation that a customer uses for transactions                                     |
    | internet option         | Type of internet service a customer uses                                                      |
    | last visit time         | The last time a customer visited the website                                                  |
    | days since last login   | Number of days since a customer last logged into the website                                  |
    | average time spent      | Average time spent by a customer on the website                                               |
    | average transaction     | Average transaction value of a customer                                                       |
    | average freq login days | Number of times a customer has logged in to the website                                       |
    | point in wallet         | Points awarded to a customer on each transaction                                              |
    | used special discount   | Whether a customer uses special discounts offered                                             |
    | offer app preference    | Whether a customer prefers offers                                                             |
    | past complaint          | Whether a customer has raised any complaints                                                  |
    | complaint status        | Whether the complaints raised by a customer were resolved                                     |
    | feedback                | Feedback provided by a customer                                                               |
    | churn risk score        | Churn risk score of a customer (prediction target)                                            |
    
    '''
    st.markdown(markdown_text)
    st.markdown('---')

    st.subheader('**Exploratory Data Analysis**')
    st.markdown('---')

    st.write('### Customer Information')

    # show dataset
    data = pd.read_csv('churn.csv')
    st.dataframe(data)
    st.markdown('---')

    st.write("### Today's Condition")
    st.markdown('---')

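    # show customer churn distribution
    # value_counts() sorts by frequency, so derive the labels from its index
    # instead of hard-coding their order (assumes 1 = churn, 0 = not churn)
    churn_counts = data['churn_risk_score'].value_counts()
    churn_labels = list(churn_counts.index.map({1: 'Churn', 0: 'Not-Churn'}))
    fig, ax = plt.subplots()
    ax.pie(churn_counts,
           labels=churn_labels,
           autopct='%1.1f%%',
           colors=['Grey', 'Orange'],
           startangle=40,
           explode=[0.05, 0])
    ax.set_title('Customer Churn Percentage')
    ax.axis('equal')
    st.pyplot(fig)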

    st.markdown('''
    Based on the above chart, customers are ***fairly evenly divided*** between the
    two Churn Risk Score classes, indicating that the target variable is reasonably
    balanced.
    ''')

    # visual numerical
    st.subheader('Chart Based on Metrics')
    st.markdown('---')

    choice = st.selectbox('Pick Numeric Column: ',
                          ('age', 'days_since_last_login', 'avg_time_spent',
                           'avg_transaction_value', 'avg_frequency_login_days',
                           'points_in_wallet'))
    fig, ax = plt.subplots(figsize=(8, 6))
    # draw on the figure's own axes so the plot and its title stay together
    sns.kdeplot(data=data, x=choice, fill=True,
                hue='churn_risk_score', palette='inferno', ax=ax)
    ax.set_title(choice.capitalize() + ' Distribution')
    st.pyplot(fig)
    st.markdown('---')

    # visual categorical
    choice_2 = st.selectbox('Pick Category Column : ',
                            ('gender', 'region_category', 'membership_category',
                             'joined_through_referral', 'preferred_offer_types',
                             'medium_of_operation', 'internet_option',
                             'used_special_discount', 'offer_application_preference',
                             'past_complaint', 'complaint_status', 'feedback'))
    fig = plt.figure(figsize=(15, 10))
    sns.countplot(data=data, x=choice_2,
                  hue='churn_risk_score', palette='viridis')
    plt.xlabel(choice_2.capitalize())
    plt.ylabel('Count')
    plt.title(choice_2.capitalize()+' Ratio')
    plt.legend(title='Churn Risk Score')
    st.pyplot(fig)


if __name__ == '__main__':
    run()