Jensen-holm committed
Commit 6307b4f • 1 Parent(s): d04aaf5

switching to gradio and a complete rewrite that I did over in the

Files changed (15)
  1. .gitmodules +0 -3
  2. README.md +14 -34
  3. app.py +105 -45
  4. example/iris.csv +0 -151
  5. example/main.py +0 -35
  6. example/mushrooms.csv +0 -0
  7. ml-vis +0 -1
  8. nn/__init__.py +3 -0
  9. nn/activation.py +42 -29
  10. nn/loss.py +50 -0
  11. nn/nn.py +153 -53
  12. nn/test.py +30 -0
  13. nn/train.py +0 -127
  14. requirements.txt +4 -8
  15. vis.py +20 -0
.gitmodules DELETED
@@ -1,3 +0,0 @@
- [submodule "ml-vis"]
- 	path = ml-vis
- 	url = git@github.com:Jensen-holm/ml-vis.git

README.md CHANGED
@@ -1,34 +1,14 @@
- # Neural Network Classification (from-scratch)
-
- ## Parameters
- Think of epochs as rounds of training for your neural network. Each epoch means the network has gone through the entire dataset once, learning and adjusting its parameters. More epochs can lead to better accuracy, but too many can also overfit the model to your training data.
-
- #### Activation functions
- introduce non-linearity to your neural network, allowing it to model complex relationships in data. The choice of activation function (like sigmoid, ReLU, or tanh) affects how the network processes and passes information between its layers.
-
- #### Hidden Size
- This refers to the number of neurons or units in the hidden layer(s) of your neural network. More hidden units can make the network more capable of learning complex patterns, but it can also make training slower and increase the risk of overfitting.
-
- #### Learning Rate
- Imagine this as the step size your neural network takes during training. It determines how much the network's parameters are updated based on the error it observes. A higher learning rate means bigger steps but can lead to overshooting the optimal values, while a smaller learning rate may take longer to converge or find the best values.
-
- #### Test Size
- When training a neural network, or any machine learning model for that matter, it is important to split the data into training and testing sets. The test size parameter specifys how to split up the data into these two sets. a test size of 0.2 will split it up so that 80% of the data is used for training, and 20% of the data is used for testing.
-
-
- ## Backprop Algorithm
- Backpropagation, short for "backward propagation of errors," is the cornerstone of training artificial neural networks. It begins by initializing the network's weights and biases. During the forward pass, input data flows through the network's layers, undergoing weighted sum calculations and activation functions, eventually producing predictions. The algorithm then computes an error or loss by comparing these predictions to the actual target values. In the critical backward pass, starting from the output layer and moving in reverse, gradients of the loss with respect to each layer's outputs, weights, and biases are calculated using calculus and the chain rule. These gradients guide the adjustment of weights and biases in each layer, with the goal of minimizing the loss. This iterative process repeats for multiple epochs, refining the network's parameters until the error reaches an acceptable level or a fixed number of training iterations is completed, ultimately enabling the network to improve its predictions on new data.
-
- ## Implementation
- Behind the scenes, my API implements the backprop algorithm. The main loop first initializes weights and biases randomly. The algorithm starts by iterating n times where n is the number of epochs you specify above. During each iteration, starting with the randomly initialized weights and biases, the activation function that you choose will be run inside of this compute node function below:
-
- The activation function plays a crucial role in the behavior of your neural network. The compute node function, which we've discussed earlier, calculates the network's output. In each iteration of the training process, we compare this output to the actual data, which, in this case, represents the iris flower type. The difference between the predicted and actual values guides the algorithm in determining how much to adjust the network's weights and biases for better predictions. However, we must be careful to prevent the neural network from memorizing the training data, a problem in machine learning known as overfitting. To address this, we scale down the derivatives computed for weights and biases by the learning rate you specify, ensuring that the network learns in a controlled and meaningful manner. You'll notice that if you use 1 for the learning rate, the graph on loss/epoch is a lot choppier than it is if you have a lower learning rate like 0.01. The smoother the curve, the better. The process repeats for n epochs, then the final results are calculated, and our final weights and biases saved.
-
- ## Results
-
- #### Log Loss
-
- #### Accuracy Score
-
-
-
+ ---
+ title: Backprop Playground
+ emoji: 🔙
+ colorFrom: yellow
+ colorTo: blue
+ sdk: gradio
+ sdk_version: 4.26.0
+ app_file: app.py
+ pinned: false
+ license: mit
+ ---
+
+ This web app uses a neural network framework that I built from scratch in <br>
+ python, using numpy as the only 3rd party library in the framework itself. <br>
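
The README that this commit removes walked through epochs, learning rate, and the backprop loop in prose. A minimal numpy sketch of the weight update it described (hypothetical values, not part of the commit) shows how the learning rate scales each step:

```python
import numpy as np

# One gradient-descent step on a single weight matrix (hypothetical values).
# A larger learning_rate takes a bigger step along the negative gradient,
# which is why the loss-per-epoch curve gets choppier as the rate approaches 1.
learning_rate = 0.01
w = np.array([[0.5, -0.3]])       # current weights
grad_w = np.array([[0.2, -0.1]])  # dLoss/dw from the backward pass
w = w - learning_rate * grad_w
print(w)  # [[ 0.498 -0.299]]
```
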
app.py CHANGED
@@ -1,48 +1,108 @@
- from flask import Flask, request, jsonify, Response
- from flask_cors import CORS
- from nn.nn import NN
- from nn import train as train_nn
- from nn import activation
- import pandas as pd
- import io
-
- app = Flask(__name__)
-
- CORS(app, origins="*")
-
-
- @app.route("/neural-network", methods=["POST"])
- def neural_net():
-     args = request.json
-
-     try:
-         net = NN.from_dict(args)
-     except Exception as e:
-         return Response(
-             response=f"issue with request args: {e}",
-             status=400,
-         )
-
-     try:
-         df = pd.read_csv(io.StringIO(net.data))
-         net.set_df(df=df)
-     except Exception as e:
-         return Response(
-             response=f"error reading csv data: {e}",
-             status=400,
-         )
-
-     try:
-         activation.get_activation(nn=net)
-     except Exception:
-         return Response(
-             response="invalid activation function",
-             status=400,
-         )
-
-     result = train_nn.train(nn=net)
-     return jsonify(result)


  if __name__ == "__main__":
-     app.run()

+ import plotly.express as px
+ from sklearn import datasets
+ from sklearn.preprocessing import StandardScaler, OneHotEncoder
+ from sklearn.model_selection import train_test_split
+ import numpy as np
+ import gradio as gr
+ from vis import iris_3d_scatter
+ import nn # custom neural network module
+
+
+ def _preprocess_iris_data(
+     seed: int,
+ ) -> tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
+     iris = datasets.load_iris()
+     X = iris["data"]
+     y = iris["target"]
+     # normalize the features
+     X = StandardScaler().fit_transform(X)
+     # one hot encode the target variables
+     y = OneHotEncoder().fit_transform(y.reshape(-1, 1)).toarray()
+     return train_test_split(
+         X,
+         y,
+         test_size=0.2,
+         random_state=seed,
+     )
+
+
+ X_train, X_test, y_train, y_test = _preprocess_iris_data(seed=1)
+
+
+ def main(
+     Seed: int = 0,
+     Activation_Func: str = "SoftMax",
+     Loss_Func: str = "CrossEntropy",
+     Epochs: int = 100,
+     Hidden_Size: int = 8,
+     Learning_Rate: float = 0.01,
+ ) -> gr.Plot:
+
+     iris_classifier = nn.NN(
+         epochs=Epochs,
+         learning_rate=Learning_Rate,
+         activation_fn=Activation_Func,
+         loss_fn=Loss_Func,
+         hidden_size=Hidden_Size,
+         input_size=4, # number of features in iris dataset
+         output_size=3, # three classes in iris dataset
+         seed=Seed,
+     )
+
+     iris_classifier.train(X_train=X_train, y_train=y_train)
+     loss_fig = px.line(
+         x=[i for i in range(len(iris_classifier._loss_history))],
+         y=iris_classifier._loss_history,
+     )
+
+     return gr.Plot(loss_fig)


  if __name__ == "__main__":
+     with gr.Blocks() as interface:
+         gr.Markdown("# Backpropagation Playground")
+
+         with gr.Tab("Classification"):
+
+             with gr.Row():
+                 data_plt = iris_3d_scatter()
+                 gr.Plot(data_plt)
+
+             with gr.Row():
+                 seed_input = [gr.Number(minimum=0, label="Random Seed")]
+
+             # inputs in the same row
+             with gr.Row():
+                 with gr.Column():
+                     numeric_inputs = [
+                         gr.Slider(minimum=100, maximum=10_000, step=50, label="Epochs"),
+                         gr.Slider(
+                             minimum=2, maximum=64, step=2, label="Hidden Network Size"
+                         ),
+                         gr.Number(minimum=0.00001, maximum=1.5, label="Learning Rate"),
+                     ]
+                 with gr.Column():
+                     fn_inputs = [
+                         gr.Dropdown(
+                             choices=["SoftMax"], label="Activation Function"
+                         ),
+                         gr.Dropdown(choices=["CrossEntropy"], label="Loss Function"),
+                     ]
+
+             with gr.Row():
+                 train_btn = gr.Button("Train", variant="primary")
+
+             # outputs in row below inputs
+             with gr.Row():
+                 plt_outputs = [gr.Plot()]
+
+             train_btn.click(
+                 fn=main,
+                 inputs=seed_input + fn_inputs + numeric_inputs,
+                 outputs=plt_outputs,
+             )
+
+         with gr.Tab("Regression"):
+             ...
+
+     interface.launch(show_error=True)

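
One detail worth noting in the new `app.py`: `train_btn.click` passes the component values to `main` positionally, so the concatenated list `seed_input + fn_inputs + numeric_inputs` has to line up with `main(Seed, Activation_Func, Loss_Func, Epochs, Hidden_Size, Learning_Rate)`. A minimal sketch of that wiring pattern, with hypothetical components that are not part of this commit:

```python
import gradio as gr

def greet(name: str, excited: bool) -> str:
    # values arrive in the same order as the `inputs` list below
    return f"Hello, {name}{'!' if excited else '.'}"

with gr.Blocks() as demo:
    name_box = gr.Textbox(label="Name")
    excited_box = gr.Checkbox(label="Excited?")
    out = gr.Textbox(label="Greeting")
    gr.Button("Greet").click(fn=greet, inputs=[name_box, excited_box], outputs=out)

if __name__ == "__main__":
    demo.launch()
```
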
example/iris.csv DELETED
@@ -1,151 +0,0 @@
- sepal length,sepal width,petal length,petal width,species
- 5.1,3.5,1.4,0.2,Iris-setosa
- 4.9,3.0,1.4,0.2,Iris-setosa
- 4.7,3.2,1.3,0.2,Iris-setosa
- 4.6,3.1,1.5,0.2,Iris-setosa
- 5.0,3.6,1.4,0.2,Iris-setosa
- 5.4,3.9,1.7,0.4,Iris-setosa
- 4.6,3.4,1.4,0.3,Iris-setosa
- 5.0,3.4,1.5,0.2,Iris-setosa
- 4.4,2.9,1.4,0.2,Iris-setosa
- 4.9,3.1,1.5,0.1,Iris-setosa
- 5.4,3.7,1.5,0.2,Iris-setosa
- 4.8,3.4,1.6,0.2,Iris-setosa
- 4.8,3.0,1.4,0.1,Iris-setosa
- 4.3,3.0,1.1,0.1,Iris-setosa
- 5.8,4.0,1.2,0.2,Iris-setosa
- 5.7,4.4,1.5,0.4,Iris-setosa
- 5.4,3.9,1.3,0.4,Iris-setosa
- 5.1,3.5,1.4,0.3,Iris-setosa
- 5.7,3.8,1.7,0.3,Iris-setosa
- 5.1,3.8,1.5,0.3,Iris-setosa
- 5.4,3.4,1.7,0.2,Iris-setosa
- 5.1,3.7,1.5,0.4,Iris-setosa
- 4.6,3.6,1.0,0.2,Iris-setosa
- 5.1,3.3,1.7,0.5,Iris-setosa
- 4.8,3.4,1.9,0.2,Iris-setosa
- 5.0,3.0,1.6,0.2,Iris-setosa
- 5.0,3.4,1.6,0.4,Iris-setosa
- 5.2,3.5,1.5,0.2,Iris-setosa
- 5.2,3.4,1.4,0.2,Iris-setosa
- 4.7,3.2,1.6,0.2,Iris-setosa
- 4.8,3.1,1.6,0.2,Iris-setosa
- 5.4,3.4,1.5,0.4,Iris-setosa
- 5.2,4.1,1.5,0.1,Iris-setosa
- 5.5,4.2,1.4,0.2,Iris-setosa
- 4.9,3.1,1.5,0.2,Iris-setosa
- 5.0,3.2,1.2,0.2,Iris-setosa
- 5.5,3.5,1.3,0.2,Iris-setosa
- 4.9,3.6,1.4,0.1,Iris-setosa
- 4.4,3.0,1.3,0.2,Iris-setosa
- 5.1,3.4,1.5,0.2,Iris-setosa
- 5.0,3.5,1.3,0.3,Iris-setosa
- 4.5,2.3,1.3,0.3,Iris-setosa
- 4.4,3.2,1.3,0.2,Iris-setosa
- 5.0,3.5,1.6,0.6,Iris-setosa
- 5.1,3.8,1.9,0.4,Iris-setosa
- 4.8,3.0,1.4,0.3,Iris-setosa
- 5.1,3.8,1.6,0.2,Iris-setosa
- 4.6,3.2,1.4,0.2,Iris-setosa
- 5.3,3.7,1.5,0.2,Iris-setosa
- 5.0,3.3,1.4,0.2,Iris-setosa
- 7.0,3.2,4.7,1.4,Iris-versicolor
- 6.4,3.2,4.5,1.5,Iris-versicolor
- 6.9,3.1,4.9,1.5,Iris-versicolor
- 5.5,2.3,4.0,1.3,Iris-versicolor
- 6.5,2.8,4.6,1.5,Iris-versicolor
- 5.7,2.8,4.5,1.3,Iris-versicolor
- 6.3,3.3,4.7,1.6,Iris-versicolor
- 4.9,2.4,3.3,1.0,Iris-versicolor
- 6.6,2.9,4.6,1.3,Iris-versicolor
- 5.2,2.7,3.9,1.4,Iris-versicolor
- 5.0,2.0,3.5,1.0,Iris-versicolor
- 5.9,3.0,4.2,1.5,Iris-versicolor
- 6.0,2.2,4.0,1.0,Iris-versicolor
- 6.1,2.9,4.7,1.4,Iris-versicolor
- 5.6,2.9,3.6,1.3,Iris-versicolor
- 6.7,3.1,4.4,1.4,Iris-versicolor
- 5.6,3.0,4.5,1.5,Iris-versicolor
- 5.8,2.7,4.1,1.0,Iris-versicolor
- 6.2,2.2,4.5,1.5,Iris-versicolor
- 5.6,2.5,3.9,1.1,Iris-versicolor
- 5.9,3.2,4.8,1.8,Iris-versicolor
- 6.1,2.8,4.0,1.3,Iris-versicolor
- 6.3,2.5,4.9,1.5,Iris-versicolor
- 6.1,2.8,4.7,1.2,Iris-versicolor
- 6.4,2.9,4.3,1.3,Iris-versicolor
- 6.6,3.0,4.4,1.4,Iris-versicolor
- 6.8,2.8,4.8,1.4,Iris-versicolor
- 6.7,3.0,5.0,1.7,Iris-versicolor
- 6.0,2.9,4.5,1.5,Iris-versicolor
- 5.7,2.6,3.5,1.0,Iris-versicolor
- 5.5,2.4,3.8,1.1,Iris-versicolor
- 5.5,2.4,3.7,1.0,Iris-versicolor
- 5.8,2.7,3.9,1.2,Iris-versicolor
- 6.0,2.7,5.1,1.6,Iris-versicolor
- 5.4,3.0,4.5,1.5,Iris-versicolor
- 6.0,3.4,4.5,1.6,Iris-versicolor
- 6.7,3.1,4.7,1.5,Iris-versicolor
- 6.3,2.3,4.4,1.3,Iris-versicolor
- 5.6,3.0,4.1,1.3,Iris-versicolor
- 5.5,2.5,4.0,1.3,Iris-versicolor
- 5.5,2.6,4.4,1.2,Iris-versicolor
- 6.1,3.0,4.6,1.4,Iris-versicolor
- 5.8,2.6,4.0,1.2,Iris-versicolor
- 5.0,2.3,3.3,1.0,Iris-versicolor
- 5.6,2.7,4.2,1.3,Iris-versicolor
- 5.7,3.0,4.2,1.2,Iris-versicolor
- 5.7,2.9,4.2,1.3,Iris-versicolor
- 6.2,2.9,4.3,1.3,Iris-versicolor
- 5.1,2.5,3.0,1.1,Iris-versicolor
- 5.7,2.8,4.1,1.3,Iris-versicolor
- 6.3,3.3,6.0,2.5,Iris-virginica
- 5.8,2.7,5.1,1.9,Iris-virginica
- 7.1,3.0,5.9,2.1,Iris-virginica
- 6.3,2.9,5.6,1.8,Iris-virginica
- 6.5,3.0,5.8,2.2,Iris-virginica
- 7.6,3.0,6.6,2.1,Iris-virginica
- 4.9,2.5,4.5,1.7,Iris-virginica
- 7.3,2.9,6.3,1.8,Iris-virginica
- 6.7,2.5,5.8,1.8,Iris-virginica
- 7.2,3.6,6.1,2.5,Iris-virginica
- 6.5,3.2,5.1,2.0,Iris-virginica
- 6.4,2.7,5.3,1.9,Iris-virginica
- 6.8,3.0,5.5,2.1,Iris-virginica
- 5.7,2.5,5.0,2.0,Iris-virginica
- 5.8,2.8,5.1,2.4,Iris-virginica
- 6.4,3.2,5.3,2.3,Iris-virginica
- 6.5,3.0,5.5,1.8,Iris-virginica
- 7.7,3.8,6.7,2.2,Iris-virginica
- 7.7,2.6,6.9,2.3,Iris-virginica
- 6.0,2.2,5.0,1.5,Iris-virginica
- 6.9,3.2,5.7,2.3,Iris-virginica
- 5.6,2.8,4.9,2.0,Iris-virginica
- 7.7,2.8,6.7,2.0,Iris-virginica
- 6.3,2.7,4.9,1.8,Iris-virginica
- 6.7,3.3,5.7,2.1,Iris-virginica
- 7.2,3.2,6.0,1.8,Iris-virginica
- 6.2,2.8,4.8,1.8,Iris-virginica
- 6.1,3.0,4.9,1.8,Iris-virginica
- 6.4,2.8,5.6,2.1,Iris-virginica
- 7.2,3.0,5.8,1.6,Iris-virginica
- 7.4,2.8,6.1,1.9,Iris-virginica
- 7.9,3.8,6.4,2.0,Iris-virginica
- 6.4,2.8,5.6,2.2,Iris-virginica
- 6.3,2.8,5.1,1.5,Iris-virginica
- 6.1,2.6,5.6,1.4,Iris-virginica
- 7.7,3.0,6.1,2.3,Iris-virginica
- 6.3,3.4,5.6,2.4,Iris-virginica
- 6.4,3.1,5.5,1.8,Iris-virginica
- 6.0,3.0,4.8,1.8,Iris-virginica
- 6.9,3.1,5.4,2.1,Iris-virginica
- 6.7,3.1,5.6,2.4,Iris-virginica
- 6.9,3.1,5.1,2.3,Iris-virginica
- 5.8,2.7,5.1,1.9,Iris-virginica
- 6.8,3.2,5.9,2.3,Iris-virginica
- 6.7,3.3,5.7,2.5,Iris-virginica
- 6.7,3.0,5.2,2.3,Iris-virginica
- 6.3,2.5,5.0,1.9,Iris-virginica
- 6.5,3.0,5.2,2.0,Iris-virginica
- 6.2,3.4,5.4,2.3,Iris-virginica
- 5.9,3.0,5.1,1.8,Iris-virginica

example/main.py DELETED
@@ -1,35 +0,0 @@
- import requests
-
- with open("mushrooms.csv", "rb") as csv:
-     data = csv.read()
-
- # class,cap-shape,cap-surface,cap-color,bruises,odor,gill-attachment,gill-spacing,gill-size,gill-color,stalk-shape,stalk-root,stalk-surface-above-ring,stalk-surface-below-ring,stalk-color-above-ring,stalk-color-below-ring,veil-type,veil-color,ring-number,ring-type,spore-print-color,population,habitat
-
- ARGS = {
-     "epochs": 1_000,
-     "hidden_size": 8,
-     "learning_rate": 0.0001,
-     "test_size": 0.1,
-     "activation": "relu",
-     "features": [
-         "cap-shape",
-         "cap-surface",
-         "cap-color",
-         "bruises",
-         "odor",
-         "gill-attachment",
-         "gill-spacing",
-         "gill-size",
-         "gill-color",
-     ],
-     "target": "class",
-     "data": data.decode("utf-8"),
- }
-
- if __name__ == "__main__":
-     r = requests.post(
-         "http://127.0.0.1:5000/neural-network",
-         json=ARGS, # Send the data as a JSON object
-     )
-
-     print(r.text)

example/mushrooms.csv DELETED
The diff for this file is too large to render.
 
ml-vis DELETED
@@ -1 +0,0 @@
- Subproject commit bebe25b27a895c1de71743fbf808b8e592e80806

nn/__init__.py ADDED
@@ -0,0 +1,3 @@
+ from nn.nn import NN
+ from nn.activation import ACTIVATIONS
+ from nn.loss import LOSSES

nn/activation.py CHANGED
@@ -1,46 +1,59 @@
- from typing import Callable
- from nn.nn import NN
  import numpy as np


- def get_activation(nn: NN) -> Callable:
-     a = nn.activation
-     funcs = {
-         "relu": relu,
-         "sigmoid": sigmoid,
-         "tanh": tanh,
-     }
-
-     prime_funcs = {
-         "sigmoid": sigmoid_prime,
-         "tanh": tanh_prime,
-         "relu": relu_prime,
-     }
-
-     nn.set_func(funcs[a])
-     nn.set_func_prime(prime_funcs[a])


- def relu(x):
-     return np.maximum(0.0, x)


- def relu_prime(x):
-     return np.maximum(0, x)


- def sigmoid(x):
-     return 1.0 / (1.0 + np.exp(-x))


- def sigmoid_prime(x):
-     s = sigmoid(x)
-     return s * (1 - s)


- def tanh(x):
-     return np.tanh(x)


- def tanh_prime(x):
-     return 1 - np.tanh(x)**2

  import numpy as np
+ from abc import abstractmethod, ABC
+
+
+ __all__ = ["Activation", "Relu", "TanH", "Sigmoid", "SoftMax", "ACTIVATIONS"]
+
+
+ class Activation(ABC):
+     @abstractmethod
+     def forward(self, X: np.ndarray) -> np.ndarray:
+         pass
+
+     @abstractmethod
+     def backward(self, X: np.ndarray) -> np.ndarray:
+         pass
+
+
+ class Relu(Activation):
+     def forward(self, X: np.ndarray) -> np.ndarray:
+         return np.maximum(0, X)
+
+     def backward(self, X: np.ndarray) -> np.ndarray:
+         return np.where(X > 0, 1, 0)
+
+
+ class TanH(Activation):
+     def forward(self, X: np.ndarray) -> np.ndarray:
+         return np.tanh(X)
+
+     def backward(self, X: np.ndarray) -> np.ndarray:
+         return 1 - self.forward(X) ** 2
+
+
+ class Sigmoid(Activation):
+     def forward(self, X: np.ndarray) -> np.ndarray:
+         return 1.0 / (1.0 + np.exp(-X))
+
+     def backward(self, X: np.ndarray) -> np.ndarray:
+         s = self.forward(X)
+         return s - (1 - s)
+
+
+ class SoftMax(Activation):
+     def forward(self, X: np.ndarray) -> np.ndarray:
+         exps = np.exp(
+             X - np.max(X, axis=1, keepdims=True)
+         ) # Avoid numerical instability
+         return exps / np.sum(exps, axis=1, keepdims=True)
+
+     def backward(self, X: np.ndarray) -> np.ndarray:
+         return X
+
+
+ ACTIVATIONS: dict[str, Activation] = {
+     "Relu": Relu(),
+     "Sigmoid": Sigmoid(),
+     "Tanh": TanH(),
+     "SoftMax": SoftMax(),
+ }

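
A quick sanity check of the new activation classes, assuming the repo's `nn` package is importable (as it is in `app.py`); the input values here are hypothetical:

```python
import numpy as np
from nn.activation import SoftMax, Relu  # assumes the repo's nn package is on the path

X = np.array([[2.0, 1.0, 0.1],
              [-1.0, 0.0, 3.0]])

print(SoftMax().forward(X).sum(axis=1))  # each row sums to 1.0

relu = Relu()
print(relu.forward(np.array([[-2.0, 3.0]])))   # [[0. 3.]]
print(relu.backward(np.array([[-2.0, 3.0]])))  # [[0 1]] -- gradient mask
```
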
nn/loss.py ADDED
@@ -0,0 +1,50 @@
+ from abc import ABC, abstractmethod
+ from nn.activation import SoftMax
+ import numpy as np
+
+
+ __all__ = ["Loss", "MSE", "CrossEntropy", "LOSSES"]
+
+
+ class Loss(ABC):
+     @abstractmethod
+     def forward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+         pass
+
+     @abstractmethod
+     def backward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+         pass
+
+
+ class MSE(Loss):
+     def forward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+         return np.sum(np.square(y_hat - y_true)) / y_true.shape[0]
+
+     def backward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+         return (y_hat - y_true) * (2 / y_true.shape[0])
+
+
+ class CrossEntropy(Loss):
+     def forward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+         y_hat = np.asarray(y_hat)
+         y_true = np.asarray(y_true)
+         m = y_true.shape[0]
+         p = self._softmax(y_hat)
+         log_likelihood = -np.log(p[range(m), y_true.argmax(axis=1)])
+         loss = np.sum(log_likelihood) / m
+         return loss
+
+     def backward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+         y_hat = np.asarray(y_hat)
+         y_true = np.asarray(y_true)
+         return (y_hat - y_true) / y_true.shape[0]
+
+     @staticmethod
+     def _softmax(X: np.ndarray) -> np.ndarray:
+         return SoftMax().forward(X)
+
+
+ LOSSES: dict[str, Loss] = {
+     "MSE": MSE(),
+     "CrossEntropy": CrossEntropy(),
+ }

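
The loss classes expose the same forward/backward interface as the activations. A small hand-checkable sketch using `MSE` (hypothetical inputs; `LOSSES["CrossEntropy"]` is looked up the same way by `NN`):

```python
import numpy as np
from nn.loss import MSE  # assumes the repo's nn package is on the path

y_hat = np.array([[0.9], [0.2]])
y_true = np.array([[1.0], [0.0]])

mse = MSE()
print(mse.forward(y_hat, y_true))   # (0.01 + 0.04) / 2 = 0.025
print(mse.backward(y_hat, y_true))  # (y_hat - y_true) * (2 / 2) -> [[-0.1], [0.2]]
```
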
nn/nn.py CHANGED
@@ -1,63 +1,163 @@
- from typing import Callable
- from sklearn.preprocessing import StandardScaler
- import pandas as pd


  class NN:
      def __init__(
          self,
          epochs: int,
-         hidden_size: int,
          learning_rate: float,
-         test_size: float,
-         activation: str,
-         features: list[str],
-         target: str,
-         data: str,
-     ):
          self.epochs = epochs
-         self.hidden_size = hidden_size
          self.learning_rate = learning_rate
-         self.test_size = test_size
-         self.activation = activation
-         self.features = features
-         self.target = target
-         self.data = data
-
-         self.loss_hist: list[float] = None
-         self.func_prime: Callable = None
-         self.func: Callable = None
-         self.X: pd.DataFrame = None
-         self.y: pd.DataFrame = None
-         self.y_dummy: pd.DataFrame = None
-         self.input_size: int = None
-         self.output_size: int = None
-
-     def set_df(self, df: pd.DataFrame) -> None:
-         assert isinstance(df, pd.DataFrame)
-         x = df[self.features]
-         y = df[self.target]
-         self.X = pd.get_dummies(x, columns=self.features)
-         self.y_dummy = pd.get_dummies(y, columns=self.target)
-         self.input_size = len(self.X.columns)
-         self.output_size = len(self.y_dummy.columns)
-
-     def normalize(self):
-         scaler = StandardScaler()
-         self.y_dummy = scaler.fit_transform(self.y_dummy)
-         self.X = scaler.fit_transform(self.X)
-
-     def set_func(self, f: Callable) -> None:
-         assert isinstance(f, Callable)
-         self.func = f
-
-     def set_func_prime(self, f: Callable) -> None:
-         assert isinstance(f, Callable)
-         self.func_prime = f
-
-     @classmethod
-     def from_dict(cls, dct):
-         """ Creates an instance of NN given a dictionary
-         we can use this to make sure that the arguments are right
          """
-         return cls(**dct)

+ from typing import Optional
+ from nn.activation import ACTIVATIONS, Activation
+ from nn.loss import LOSSES, Loss
+ import numpy as np
+
+ import gradio as gr
+
+
+ DTYPE = np.float32


  class NN:
      def __init__(
          self,
          epochs: int,
          learning_rate: float,
+         hidden_size: int,
+         input_size: int,
+         output_size: int,
+         activation_fn: str,
+         loss_fn: str,
+         seed: int,
+     ) -> None:
          self.epochs = epochs
          self.learning_rate = learning_rate
+         self.hidden_size = hidden_size
+         self.input_size = input_size
+         self.output_size = output_size
+         self.seed = seed
+
+         # try to get activation function and loss function
+         act_fn = ACTIVATIONS.get(activation_fn, None)
+         if act_fn is None:
+             raise KeyError(f"Invalid Activation function '{activation_fn}'")
+         loss_fn = LOSSES.get(loss_fn, None)
+         if loss_fn is None:
+             raise KeyError(f"Invalid Activation function '{activation_fn}'")
+         self._activation_fn: Activation = act_fn
+         self._loss_fn: Loss = loss_fn
+
+         self._loss_history = list()
+         self._weight_history = {
+             "wo": [],
+             "wh": [],
+             "bo": [],
+             "bh": [],
+         }
+
+         self._wo: Optional[np.ndarray] = None
+         self._wh: Optional[np.ndarray] = None
+         self._bo: Optional[np.ndarray] = None
+         self._bh: Optional[np.ndarray] = None
+         self._init_weights_and_biases()
+
+     def _init_weights_and_biases(self) -> None:
+         """
+         NN._init_weights_and_biases(): Should only be run once, right before the training loop,
+         in order to initialize the weights and biases randomly.
+
+         params:
+             NN object with hidden layer size, output size, and input size
+             defined.
+
+         returns:
+             self, modifies _bh, _bo, _wo, _wh NN attributes in place.
+         """
+         np.random.seed(self.seed)
+         self._bh = np.zeros((1, self.hidden_size), dtype=DTYPE)
+         self._bo = np.zeros((1, self.output_size), dtype=DTYPE)
+         self._wh = np.asarray(
+             np.random.randn(self.input_size, self.hidden_size)
+             * np.sqrt(2 / self.input_size),
+             dtype=DTYPE,
+         )
+         self._wo = np.asarray(
+             np.random.randn(self.hidden_size, self.output_size)
+             * np.sqrt(2 / self.hidden_size),
+             dtype=DTYPE,
+         )
+         return
+
+     def _forward(self, X_train: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
          """
+         _forward(X_train): run as the first step of each epoch during training.
+
+         params:
+             X_train: np.ndarray -> data that we are training the NN on.
+
+         returns:
+             output layer np array containing the predicted outputs calculated using
+             the weights and biases of the current epoch.
+         """
+         assert self._activation_fn is not None
+
+         # hidden layer
+         hidden_layer_output = self._activation_fn.forward(
+             np.dot(X_train, self._wh) + self._bh
+         )
+         # output layer (prediction layer)
+         y_hat = self._activation_fn.forward(
+             np.dot(hidden_layer_output, self._wo) + self._bo
+         )
+         return y_hat, hidden_layer_output
+
+     def _backward(
+         self,
+         X_train: np.ndarray,
+         y_hat: np.ndarray,
+         y_train: np.ndarray,
+         hidden_output: np.ndarray,
+     ) -> None:
+         assert self._activation_fn is not None
+         assert self._wo is not None
+         assert self._loss_fn is not None
+
+         # Calculate the error at the output
+         # This should be the derivative of the loss function with respect to the output of the network
+         error_output = self._loss_fn.backward(
+             y_hat, y_train
+         ) * self._activation_fn.backward(y_hat)
+
+         # Calculate gradients for output layer weights and biases
+         wo_prime = np.dot(hidden_output.T, error_output) * self.learning_rate
+         bo_prime = np.sum(error_output, axis=0, keepdims=True) * self.learning_rate
+
+         # Propagate the error back to the hidden layer
+         error_hidden = np.dot(error_output, self._wo.T) * self._activation_fn.backward(
+             hidden_output
+         )
+
+         # Calculate gradients for hidden layer weights and biases
+         wh_prime = np.dot(X_train.T, error_hidden) * self.learning_rate
+         bh_prime = np.sum(error_hidden, axis=0, keepdims=True) * self.learning_rate
+
+         # Update weights and biases
+         self._wo -= wo_prime
+         self._wh -= wh_prime
+         self._bo -= bo_prime
+         self._bh -= bh_prime
+
+     def train(self, X_train: np.ndarray, y_train: np.ndarray) -> "NN":
+         assert self._loss_fn is not None
+
+         for _ in gr.Progress().tqdm(range(self.epochs)):
+             y_hat, hidden_output = self._forward(X_train=X_train)
+             loss = self._loss_fn.forward(y_hat=y_hat, y_true=y_train)
+             self._loss_history.append(loss)
+             self._backward(
+                 X_train=X_train,
+                 y_hat=y_hat,
+                 y_train=y_train,
+                 hidden_output=hidden_output,
+             )
+
+             # keep track of weights and biases at each epoch for visualization
+             self._weight_history["wo"].append(self._wo[0, 0])
+             self._weight_history["wh"].append(self._wh[0, 0])
+             self._weight_history["bo"].append(self._bo[0, 0])
+             self._weight_history["bh"].append(self._bh[0, 0])
+         return self
+
+     def predict(self, X_test: np.ndarray) -> np.ndarray:
+         return self._forward(X_train=X_test)[0]

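
The rewritten `NN` is a single-hidden-layer network: `_forward` computes `activation(X @ wh + bh)` for the hidden layer, then the same thing again with `wo`/`bo` for the output. A plain-numpy shape sketch of that pass on iris-sized arrays (hypothetical values; it mirrors `_forward` rather than calling it, so it runs without a Gradio context):

```python
import numpy as np

rng = np.random.default_rng(0)
n, input_size, hidden_size, output_size = 5, 4, 8, 3

X = rng.normal(size=(n, input_size))
# He-style scaling, matching _init_weights_and_biases
wh = rng.normal(size=(input_size, hidden_size)) * np.sqrt(2 / input_size)
wo = rng.normal(size=(hidden_size, output_size)) * np.sqrt(2 / hidden_size)
bh = np.zeros((1, hidden_size))
bo = np.zeros((1, output_size))

hidden = np.tanh(X @ wh + bh)     # stand-in activation
y_hat = np.tanh(hidden @ wo + bo)
print(hidden.shape, y_hat.shape)  # (5, 8) (5, 3)
```
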
nn/test.py ADDED
@@ -0,0 +1,30 @@
+ from nn.nn import NN
+ import unittest
+
+ TEST_NN = NN(
+     epochs=100,
+     learning_rate=0.001,
+     hidden_size=8,
+     input_size=2,
+     output_size=1,
+     activation_fn="Sigmoid",
+     loss_fn="MSE",
+ )
+
+
+ class TestNN(unittest.TestCase):
+     def test_init_w_b(self) -> None:
+         return
+
+     def test_forward(self) -> None:
+         return
+
+     def test_backward(self) -> None:
+         return
+
+     def test_train(self) -> None:
+         return
+
+
+ if __name__ == "__main__":
+     unittest.main()
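
The test methods are still stubs. One way a shape check could be filled in, as a hypothetical sketch (note that `NN.__init__` also takes a `seed` argument, which is passed explicitly here):

```python
import unittest
from nn import NN  # assumes the repo's nn package is importable

class TestNNShapes(unittest.TestCase):
    """Hypothetical shape checks that could be dropped into nn/test.py."""

    def test_init_w_b(self) -> None:
        net = NN(
            epochs=10, learning_rate=0.001, hidden_size=8,
            input_size=2, output_size=1,
            activation_fn="Sigmoid", loss_fn="MSE", seed=0,
        )
        self.assertEqual(net._wh.shape, (2, 8))  # (input_size, hidden_size)
        self.assertEqual(net._wo.shape, (8, 1))  # (hidden_size, output_size)
        self.assertEqual(net._bh.shape, (1, 8))
        self.assertEqual(net._bo.shape, (1, 1))

if __name__ == "__main__":
    unittest.main()
```
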
nn/train.py DELETED
@@ -1,127 +0,0 @@
- from sklearn.model_selection import train_test_split
- from sklearn.metrics import log_loss
- from typing import Callable
- from nn.nn import NN
- import numpy as np
-
-
- def init_weights_biases(nn: NN):
-     np.random.seed(0)
-     bh = np.zeros((1, nn.hidden_size))
-     bo = np.zeros((1, nn.output_size))
-     wh = np.random.randn(nn.input_size, nn.hidden_size) * \
-         np.sqrt(2 / nn.input_size)
-     wo = np.random.randn(nn.hidden_size, nn.output_size) * \
-         np.sqrt(2 / nn.hidden_size)
-     return wh, wo, bh, bo
-
-
- def train(nn: NN) -> dict:
-     wh, wo, bh, bo = init_weights_biases(nn=nn)
-
-     X_train, X_test, y_train, y_test = train_test_split(
-         nn.X.to_numpy(),
-         nn.y_dummy.to_numpy(),
-         test_size=nn.test_size,
-         random_state=0,
-     )
-
-     accuracy_scores = []
-     loss_hist: list[float] = []
-     for _ in range(nn.epochs):
-         # compute hidden output
-         hidden_output = compute_node(
-             data=X_train,
-             weights=wh,
-             biases=bh,
-             func=nn.func,
-         )
-
-         # compute output layer
-         y_hat = compute_node(
-             data=hidden_output,
-             weights=wo,
-             biases=bo,
-             func=nn.func,
-         )
-         # compute error & store it
-         error = y_hat - y_train
-         loss = log_loss(y_true=y_train, y_pred=y_hat)
-         accuracy = accuracy_score(y_true=y_train, y_pred=y_hat)
-         accuracy_scores.append(accuracy)
-         loss_hist.append(loss)
-
-         # compute derivatives of weights & biases
-         # update weights & biases using gradient descent after
-         # computing derivatives.
-         dwo = nn.learning_rate * output_weight_prime(hidden_output, error)
-
-         # Use NumPy to sum along the first axis (axis=0)
-         # and then reshape to match the shape of bo
-         dbo = nn.learning_rate * np.sum(output_bias_prime(error), axis=0)
-
-         dhidden = np.dot(error, wo.T) * nn.func_prime(hidden_output)
-         dwh = nn.learning_rate * hidden_weight_prime(X_train, dhidden)
-         dbh = nn.learning_rate * hidden_bias_prime(dhidden)
-
-         wh -= dwh
-         wo -= dwo
-         bh -= dbh
-         bo -= dbo
-
-     # compute final predictions on data not seen
-     hidden_output_test = compute_node(
-         data=X_test,
-         weights=wh,
-         biases=bh,
-         func=nn.func,
-     )
-     y_hat = compute_node(
-         data=hidden_output_test,
-         weights=wo,
-         biases=bo,
-         func=nn.func,
-     )
-
-     return {
-         "loss_hist": loss_hist,
-         "log_loss": log_loss(y_true=y_test, y_pred=y_hat),
-         "accuracy_scores": accuracy_scores,
-         "test_accuracy": accuracy_score(y_true=y_test, y_pred=y_hat)
-     }
-
-
- def compute_node(data: np.array, weights: np.array, biases: np.array, func: Callable) -> np.array:
-     return func(np.dot(data, weights) + biases)
-
-
- def mean_squared_error(y: np.array, y_hat: np.array) -> np.array:
-     return np.mean((y - y_hat) ** 2)
-
-
- def hidden_bias_prime(error):
-     return np.sum(error, axis=0)
-
-
- def output_bias_prime(error):
-     return np.sum(error, axis=0)
-
-
- def hidden_weight_prime(data, error):
-     return np.dot(data.T, error)
-
-
- def output_weight_prime(hidden_output, error):
-     return np.dot(hidden_output.T, error)
-
-
- def accuracy_score(y_true, y_pred):
-     # Ensure y_true and y_pred have the same shape
-     if y_true.shape != y_pred.shape:
-         raise ValueError("Input shapes do not match.")
-
-     # Calculate the accuracy
-     num_samples = len(y_true)
-     num_correct = np.sum(y_true == y_pred)
-
-     return num_correct / num_samples

requirements.txt CHANGED
@@ -1,8 +1,4 @@
- Flask==2.2.3
- numpy==1.25.2
- pandas==1.5.3
- requests==2.28.2
- scikit_learn==1.3.1
- gunicorn==21.2.0
- Werkzeug==2.2.2
- Flask_Cors==3.0.10
+ gradio==4.26.0
+ numpy==1.26.4
+ plotly==5.20.0
+ scikit_learn==1.4.1.post1

vis.py ADDED
@@ -0,0 +1,20 @@
+ import plotly.express as px
+ from sklearn import datasets
+ from sklearn.preprocessing import StandardScaler, OneHotEncoder
+ import numpy as np
+ import os
+
+
+ def iris_3d_scatter():
+     df = px.data.iris()
+     fig = px.scatter_3d(
+         df,
+         x="sepal_length",
+         y="sepal_width",
+         z="petal_width",
+         color="species",
+         size="petal_length",
+         size_max=18,
+     )
+     fig.update_layout(margin=dict(l=0, r=0, b=0, t=0))
+     return fig
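
For poking at the scatter plot outside the Gradio app, the figure can be rendered directly (a minimal sketch):

```python
from vis import iris_3d_scatter

if __name__ == "__main__":
    iris_3d_scatter().show()  # opens the Plotly figure in a browser tab
```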