Spaces · Jensen-holm (Sleeping)

Jensen-holm committed · Commit 6307b4f · 1 Parent(s): d04aaf5
switching to gradio and a complete rewrite that I did over in the
Browse files
- .gitmodules +0 -3
- README.md +14 -34
- app.py +105 -45
- example/iris.csv +0 -151
- example/main.py +0 -35
- example/mushrooms.csv +0 -0
- ml-vis +0 -1
- nn/__init__.py +3 -0
- nn/activation.py +42 -29
- nn/loss.py +50 -0
- nn/nn.py +153 -53
- nn/test.py +30 -0
- nn/train.py +0 -127
- requirements.txt +4 -8
- vis.py +20 -0
.gitmodules
DELETED
@@ -1,3 +0,0 @@
-[submodule "ml-vis"]
-    path = ml-vis
-    url = git@github.com:Jensen-holm/ml-vis.git
README.md
CHANGED
@@ -1,34 +1,14 @@
-… (old lines 1-14 are not shown in this view)
-#### Test Size
-When training a neural network, or any machine learning model for that matter, it is important to split the data into training and testing sets. The test size parameter specifys how to split up the data into these two sets. a test size of 0.2 will split it up so that 80% of the data is used for training, and 20% of the data is used for testing.
-
-
-## Backprop Algorithm
-Backpropagation, short for "backward propagation of errors," is the cornerstone of training artificial neural networks. It begins by initializing the network's weights and biases. During the forward pass, input data flows through the network's layers, undergoing weighted sum calculations and activation functions, eventually producing predictions. The algorithm then computes an error or loss by comparing these predictions to the actual target values. In the critical backward pass, starting from the output layer and moving in reverse, gradients of the loss with respect to each layer's outputs, weights, and biases are calculated using calculus and the chain rule. These gradients guide the adjustment of weights and biases in each layer, with the goal of minimizing the loss. This iterative process repeats for multiple epochs, refining the network's parameters until the error reaches an acceptable level or a fixed number of training iterations is completed, ultimately enabling the network to improve its predictions on new data.
-
-## Implementation
-Behind the scenes, my API implements the backprop algorithm. The main loop first initializes weights and biases randomly. The algorithm starts by iterating n times where n is the number of epochs you specify above. During each iteration, starting with the randomly initialized weights and biases, the activation function that you choose will be run inside of this compute node function below:
-
-The activation function plays a crucial role in the behavior of your neural network. The compute node function, which we've discussed earlier, calculates the network's output. In each iteration of the training process, we compare this output to the actual data, which, in this case, represents the iris flower type. The difference between the predicted and actual values guides the algorithm in determining how much to adjust the network's weights and biases for better predictions. However, we must be careful to prevent the neural network from memorizing the training data, a problem in machine learning known as overfitting. To address this, we scale down the derivatives computed for weights and biases by the learning rate you specify, ensuring that the network learns in a controlled and meaningful manner. You'll notice that if you use 1 for the learning rate, the graph on loss/epoch is a lot choppier than it is if you have a lower learning rate like 0.01. The smoother the curve, the better. The process repeats for n epochs, then the final results are calculated, and our final weights and biases saved.
-
-## Results
-
-#### Log Loss
-
-#### Accuracy Score
-
-
-
+---
+title: Backprop Playground
+emoji: π
+colorFrom: yellow
+colorTo: blue
+sdk: gradio
+sdk_version: 4.26.0
+app_file: app.py
+pinned: false
+license: mit
+---
+
+This web app uses a neural network framework that I built from scratch in <br>
+python, using numpy as the only 3rd party library in the framework itself. <br>
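The removed README text above describes scaling the derivatives of the weights and biases by the learning rate before applying them. For reference, a minimal, self-contained sketch of that update step (made-up shapes and variable names, not the repository's exact code):

import numpy as np

rng = np.random.default_rng(0)
hidden_output = rng.random((5, 8))             # activations coming out of the hidden layer
error = rng.random((5, 3)) - 0.5               # stand-in for dLoss/dOutput from the backward pass
wo, bo = rng.random((8, 3)), np.zeros((1, 3))  # output-layer weights and biases

learning_rate = 0.01                           # a large value (e.g. 1) makes the loss/epoch curve choppier
wo -= learning_rate * (hidden_output.T @ error)            # scaled weight gradient
bo -= learning_rate * error.sum(axis=0, keepdims=True)     # scaled bias gradient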
app.py
CHANGED
@@ -1,48 +1,108 @@
-
-from
-from
-from
-
-import
-import
-… (old lines 8-44 are not shown in this view)
-
-
-if __name__ == "__main__":
-… (old line 48 is not shown in this view)
+import plotly.express as px
+from sklearn import datasets
+from sklearn.preprocessing import StandardScaler, OneHotEncoder
+from sklearn.model_selection import train_test_split
+import numpy as np
+import gradio as gr
+from vis import iris_3d_scatter
+import nn  # custom neural network module
+
+
+def _preprocess_iris_data(
+    seed: int,
+) -> tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
+    iris = datasets.load_iris()
+    X = iris["data"]
+    y = iris["target"]
+    # normalize the features
+    X = StandardScaler().fit_transform(X)
+    # one hot encode the target variables
+    y = OneHotEncoder().fit_transform(y.reshape(-1, 1)).toarray()
+    return train_test_split(
+        X,
+        y,
+        test_size=0.2,
+        random_state=seed,
+    )
+
+
+X_train, X_test, y_train, y_test = _preprocess_iris_data(seed=1)
+
+
+def main(
+    Seed: int = 0,
+    Activation_Func: str = "SoftMax",
+    Loss_Func: str = "CrossEntropy",
+    Epochs: int = 100,
+    Hidden_Size: int = 8,
+    Learning_Rate: float = 0.01,
+) -> gr.Plot:
+
+    iris_classifier = nn.NN(
+        epochs=Epochs,
+        learning_rate=Learning_Rate,
+        activation_fn=Activation_Func,
+        loss_fn=Loss_Func,
+        hidden_size=Hidden_Size,
+        input_size=4,  # number of features in iris dataset
+        output_size=3,  # three classes in iris dataset
+        seed=Seed,
+    )
+
+    iris_classifier.train(X_train=X_train, y_train=y_train)
+    loss_fig = px.line(
+        x=[i for i in range(len(iris_classifier._loss_history))],
+        y=iris_classifier._loss_history,
+    )
+
+    return gr.Plot(loss_fig)
+
+
+if __name__ == "__main__":
+    with gr.Blocks() as interface:
+        gr.Markdown("# Backpropagation Playground")
+
+        with gr.Tab("Classification"):
+
+            with gr.Row():
+                data_plt = iris_3d_scatter()
+                gr.Plot(data_plt)
+
+            with gr.Row():
+                seed_input = [gr.Number(minimum=0, label="Random Seed")]
+
+            # inputs in the same row
+            with gr.Row():
+                with gr.Column():
+                    numeric_inputs = [
+                        gr.Slider(minimum=100, maximum=10_000, step=50, label="Epochs"),
+                        gr.Slider(
+                            minimum=2, maximum=64, step=2, label="Hidden Network Size"
+                        ),
+                        gr.Number(minimum=0.00001, maximum=1.5, label="Learning Rate"),
+                    ]
+                with gr.Column():
+                    fn_inputs = [
+                        gr.Dropdown(
+                            choices=["SoftMax"], label="Activation Function"
+                        ),
+                        gr.Dropdown(choices=["CrossEntropy"], label="Loss Function"),
+                    ]
+
+            with gr.Row():
+                train_btn = gr.Button("Train", variant="primary")
+
+            # outputs in row below inputs
+            with gr.Row():
+                plt_outputs = [gr.Plot()]
+
+            train_btn.click(
+                fn=main,
+                inputs=seed_input + fn_inputs + numeric_inputs,
+                outputs=plt_outputs,
+            )
+
+        with gr.Tab("Regression"):
+            ...
+
+    interface.launch(show_error=True)
example/iris.csv
DELETED
@@ -1,151 +0,0 @@
-sepal length,sepal width,petal length,petal width,species
-5.1,3.5,1.4,0.2,Iris-setosa
-4.9,3.0,1.4,0.2,Iris-setosa
-4.7,3.2,1.3,0.2,Iris-setosa
-4.6,3.1,1.5,0.2,Iris-setosa
-5.0,3.6,1.4,0.2,Iris-setosa
-5.4,3.9,1.7,0.4,Iris-setosa
-4.6,3.4,1.4,0.3,Iris-setosa
-5.0,3.4,1.5,0.2,Iris-setosa
-4.4,2.9,1.4,0.2,Iris-setosa
-4.9,3.1,1.5,0.1,Iris-setosa
-5.4,3.7,1.5,0.2,Iris-setosa
-4.8,3.4,1.6,0.2,Iris-setosa
-4.8,3.0,1.4,0.1,Iris-setosa
-4.3,3.0,1.1,0.1,Iris-setosa
-5.8,4.0,1.2,0.2,Iris-setosa
-5.7,4.4,1.5,0.4,Iris-setosa
-5.4,3.9,1.3,0.4,Iris-setosa
-5.1,3.5,1.4,0.3,Iris-setosa
-5.7,3.8,1.7,0.3,Iris-setosa
-5.1,3.8,1.5,0.3,Iris-setosa
-5.4,3.4,1.7,0.2,Iris-setosa
-5.1,3.7,1.5,0.4,Iris-setosa
-4.6,3.6,1.0,0.2,Iris-setosa
-5.1,3.3,1.7,0.5,Iris-setosa
-4.8,3.4,1.9,0.2,Iris-setosa
-5.0,3.0,1.6,0.2,Iris-setosa
-5.0,3.4,1.6,0.4,Iris-setosa
-5.2,3.5,1.5,0.2,Iris-setosa
-5.2,3.4,1.4,0.2,Iris-setosa
-4.7,3.2,1.6,0.2,Iris-setosa
-4.8,3.1,1.6,0.2,Iris-setosa
-5.4,3.4,1.5,0.4,Iris-setosa
-5.2,4.1,1.5,0.1,Iris-setosa
-5.5,4.2,1.4,0.2,Iris-setosa
-4.9,3.1,1.5,0.2,Iris-setosa
-5.0,3.2,1.2,0.2,Iris-setosa
-5.5,3.5,1.3,0.2,Iris-setosa
-4.9,3.6,1.4,0.1,Iris-setosa
-4.4,3.0,1.3,0.2,Iris-setosa
-5.1,3.4,1.5,0.2,Iris-setosa
-5.0,3.5,1.3,0.3,Iris-setosa
-4.5,2.3,1.3,0.3,Iris-setosa
-4.4,3.2,1.3,0.2,Iris-setosa
-5.0,3.5,1.6,0.6,Iris-setosa
-5.1,3.8,1.9,0.4,Iris-setosa
-4.8,3.0,1.4,0.3,Iris-setosa
-5.1,3.8,1.6,0.2,Iris-setosa
-4.6,3.2,1.4,0.2,Iris-setosa
-5.3,3.7,1.5,0.2,Iris-setosa
-5.0,3.3,1.4,0.2,Iris-setosa
-7.0,3.2,4.7,1.4,Iris-versicolor
-6.4,3.2,4.5,1.5,Iris-versicolor
-6.9,3.1,4.9,1.5,Iris-versicolor
-5.5,2.3,4.0,1.3,Iris-versicolor
-6.5,2.8,4.6,1.5,Iris-versicolor
-5.7,2.8,4.5,1.3,Iris-versicolor
-6.3,3.3,4.7,1.6,Iris-versicolor
-4.9,2.4,3.3,1.0,Iris-versicolor
-6.6,2.9,4.6,1.3,Iris-versicolor
-5.2,2.7,3.9,1.4,Iris-versicolor
-5.0,2.0,3.5,1.0,Iris-versicolor
-5.9,3.0,4.2,1.5,Iris-versicolor
-6.0,2.2,4.0,1.0,Iris-versicolor
-6.1,2.9,4.7,1.4,Iris-versicolor
-5.6,2.9,3.6,1.3,Iris-versicolor
-6.7,3.1,4.4,1.4,Iris-versicolor
-5.6,3.0,4.5,1.5,Iris-versicolor
-5.8,2.7,4.1,1.0,Iris-versicolor
-6.2,2.2,4.5,1.5,Iris-versicolor
-5.6,2.5,3.9,1.1,Iris-versicolor
-5.9,3.2,4.8,1.8,Iris-versicolor
-6.1,2.8,4.0,1.3,Iris-versicolor
-6.3,2.5,4.9,1.5,Iris-versicolor
-6.1,2.8,4.7,1.2,Iris-versicolor
-6.4,2.9,4.3,1.3,Iris-versicolor
-6.6,3.0,4.4,1.4,Iris-versicolor
-6.8,2.8,4.8,1.4,Iris-versicolor
-6.7,3.0,5.0,1.7,Iris-versicolor
-6.0,2.9,4.5,1.5,Iris-versicolor
-5.7,2.6,3.5,1.0,Iris-versicolor
-5.5,2.4,3.8,1.1,Iris-versicolor
-5.5,2.4,3.7,1.0,Iris-versicolor
-5.8,2.7,3.9,1.2,Iris-versicolor
-6.0,2.7,5.1,1.6,Iris-versicolor
-5.4,3.0,4.5,1.5,Iris-versicolor
-6.0,3.4,4.5,1.6,Iris-versicolor
-6.7,3.1,4.7,1.5,Iris-versicolor
-6.3,2.3,4.4,1.3,Iris-versicolor
-5.6,3.0,4.1,1.3,Iris-versicolor
-5.5,2.5,4.0,1.3,Iris-versicolor
-5.5,2.6,4.4,1.2,Iris-versicolor
-6.1,3.0,4.6,1.4,Iris-versicolor
-5.8,2.6,4.0,1.2,Iris-versicolor
-5.0,2.3,3.3,1.0,Iris-versicolor
-5.6,2.7,4.2,1.3,Iris-versicolor
-5.7,3.0,4.2,1.2,Iris-versicolor
-5.7,2.9,4.2,1.3,Iris-versicolor
-6.2,2.9,4.3,1.3,Iris-versicolor
-5.1,2.5,3.0,1.1,Iris-versicolor
-5.7,2.8,4.1,1.3,Iris-versicolor
-6.3,3.3,6.0,2.5,Iris-virginica
-5.8,2.7,5.1,1.9,Iris-virginica
-7.1,3.0,5.9,2.1,Iris-virginica
-6.3,2.9,5.6,1.8,Iris-virginica
-6.5,3.0,5.8,2.2,Iris-virginica
-7.6,3.0,6.6,2.1,Iris-virginica
-4.9,2.5,4.5,1.7,Iris-virginica
-7.3,2.9,6.3,1.8,Iris-virginica
-6.7,2.5,5.8,1.8,Iris-virginica
-7.2,3.6,6.1,2.5,Iris-virginica
-6.5,3.2,5.1,2.0,Iris-virginica
-6.4,2.7,5.3,1.9,Iris-virginica
-6.8,3.0,5.5,2.1,Iris-virginica
-5.7,2.5,5.0,2.0,Iris-virginica
-5.8,2.8,5.1,2.4,Iris-virginica
-6.4,3.2,5.3,2.3,Iris-virginica
-6.5,3.0,5.5,1.8,Iris-virginica
-7.7,3.8,6.7,2.2,Iris-virginica
-7.7,2.6,6.9,2.3,Iris-virginica
-6.0,2.2,5.0,1.5,Iris-virginica
-6.9,3.2,5.7,2.3,Iris-virginica
-5.6,2.8,4.9,2.0,Iris-virginica
-7.7,2.8,6.7,2.0,Iris-virginica
-6.3,2.7,4.9,1.8,Iris-virginica
-6.7,3.3,5.7,2.1,Iris-virginica
-7.2,3.2,6.0,1.8,Iris-virginica
-6.2,2.8,4.8,1.8,Iris-virginica
-6.1,3.0,4.9,1.8,Iris-virginica
-6.4,2.8,5.6,2.1,Iris-virginica
-7.2,3.0,5.8,1.6,Iris-virginica
-7.4,2.8,6.1,1.9,Iris-virginica
-7.9,3.8,6.4,2.0,Iris-virginica
-6.4,2.8,5.6,2.2,Iris-virginica
-6.3,2.8,5.1,1.5,Iris-virginica
-6.1,2.6,5.6,1.4,Iris-virginica
-7.7,3.0,6.1,2.3,Iris-virginica
-6.3,3.4,5.6,2.4,Iris-virginica
-6.4,3.1,5.5,1.8,Iris-virginica
-6.0,3.0,4.8,1.8,Iris-virginica
-6.9,3.1,5.4,2.1,Iris-virginica
-6.7,3.1,5.6,2.4,Iris-virginica
-6.9,3.1,5.1,2.3,Iris-virginica
-5.8,2.7,5.1,1.9,Iris-virginica
-6.8,3.2,5.9,2.3,Iris-virginica
-6.7,3.3,5.7,2.5,Iris-virginica
-6.7,3.0,5.2,2.3,Iris-virginica
-6.3,2.5,5.0,1.9,Iris-virginica
-6.5,3.0,5.2,2.0,Iris-virginica
-6.2,3.4,5.4,2.3,Iris-virginica
-5.9,3.0,5.1,1.8,Iris-virginica
example/main.py
DELETED
@@ -1,35 +0,0 @@
-import requests
-
-with open("mushrooms.csv", "rb") as csv:
-    data = csv.read()
-
-# class,cap-shape,cap-surface,cap-color,bruises,odor,gill-attachment,gill-spacing,gill-size,gill-color,stalk-shape,stalk-root,stalk-surface-above-ring,stalk-surface-below-ring,stalk-color-above-ring,stalk-color-below-ring,veil-type,veil-color,ring-number,ring-type,spore-print-color,population,habitat
-
-ARGS = {
-    "epochs": 1_000,
-    "hidden_size": 8,
-    "learning_rate": 0.0001,
-    "test_size": 0.1,
-    "activation": "relu",
-    "features": [
-        "cap-shape",
-        "cap-surface",
-        "cap-color",
-        "bruises",
-        "odor",
-        "gill-attachment",
-        "gill-spacing",
-        "gill-size",
-        "gill-color",
-    ],
-    "target": "class",
-    "data": data.decode("utf-8"),
-}
-
-if __name__ == "__main__":
-    r = requests.post(
-        "http://127.0.0.1:5000/neural-network",
-        json=ARGS,  # Send the data as a JSON object
-    )
-
-    print(r.text)
example/mushrooms.csv
DELETED
The diff for this file is too large to render.
See raw diff
ml-vis
DELETED
@@ -1 +0,0 @@
-Subproject commit bebe25b27a895c1de71743fbf808b8e592e80806
nn/__init__.py
ADDED
@@ -0,0 +1,3 @@
+from nn.nn import NN
+from nn.activation import ACTIVATIONS
+from nn.loss import LOSSES
nn/activation.py
CHANGED
@@ -1,46 +1,59 @@
-from typing import Callable
-from nn.nn import NN
-import numpy as np
-
-
-a = nn.activation
-funcs = {
-    "relu": relu,
-    "sigmoid": sigmoid,
-    "tanh": tanh,
-}
-
-prime_funcs = {
-    "sigmoid": sigmoid_prime,
-    "tanh": tanh_prime,
-    "relu": relu_prime,
-}
-
-
-def relu(x):
-    return np.maximum(0.0, x)
-
-
-def … (remaining removed function not shown in this view)
-
-
-def sigmoid_prime(x):
-    s = sigmoid(x)
-    return s * (1 - s)
-
-
-def … (remaining removed function not shown in this view)
+import numpy as np
+from abc import abstractmethod, ABC
+
+
+__all__ = ["Activation", "Relu", "TanH", "Sigmoid", "SoftMax", "ACTIVATIONS"]
+
+
+class Activation(ABC):
+    @abstractmethod
+    def forward(self, X: np.ndarray) -> np.ndarray:
+        pass
+
+    @abstractmethod
+    def backward(self, X: np.ndarray) -> np.ndarray:
+        pass
+
+
+class Relu(Activation):
+    def forward(self, X: np.ndarray) -> np.ndarray:
+        return np.maximum(0, X)
+
+    def backward(self, X: np.ndarray) -> np.ndarray:
+        return np.where(X > 0, 1, 0)
+
+
+class TanH(Activation):
+    def forward(self, X: np.ndarray) -> np.ndarray:
+        return np.tanh(X)
+
+    def backward(self, X: np.ndarray) -> np.ndarray:
+        return 1 - self.forward(X) ** 2
+
+
+class Sigmoid(Activation):
+    def forward(self, X: np.ndarray) -> np.ndarray:
+        return 1.0 / (1.0 + np.exp(-X))
+
+    def backward(self, X: np.ndarray) -> np.ndarray:
+        s = self.forward(X)
+        return s - (1 - s)
+
+
+class SoftMax(Activation):
+    def forward(self, X: np.ndarray) -> np.ndarray:
+        exps = np.exp(
+            X - np.max(X, axis=1, keepdims=True)
+        )  # Avoid numerical instability
+        return exps / np.sum(exps, axis=1, keepdims=True)
+
+    def backward(self, X: np.ndarray) -> np.ndarray:
+        return X
+
+
+ACTIVATIONS: dict[str, Activation] = {
+    "Relu": Relu(),
+    "Sigmoid": Sigmoid(),
+    "Tanh": TanH(),
+    "SoftMax": SoftMax(),
+}
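A quick usage sketch of the new ACTIVATIONS registry (not part of the commit; assumes the nn package above is installed and importable):

import numpy as np
from nn.activation import ACTIVATIONS

scores = np.array([[2.0, 0.5, 0.1]])
probs = ACTIVATIONS["SoftMax"].forward(scores)                       # each row sums to 1
grads = ACTIVATIONS["Relu"].backward(np.array([[-1.0, 0.0, 3.0]]))   # -> [[0, 0, 1]]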
nn/loss.py
ADDED
@@ -0,0 +1,50 @@
+from abc import ABC, abstractmethod
+from nn.activation import SoftMax
+import numpy as np
+
+
+__all__ = ["Loss", "MSE", "CrossEntropy", "LOSSES"]
+
+
+class Loss(ABC):
+    @abstractmethod
+    def forward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+        pass
+
+    @abstractmethod
+    def backward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+        pass
+
+
+class MSE(Loss):
+    def forward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+        return np.sum(np.square(y_hat - y_true)) / y_true.shape[0]
+
+    def backward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+        return (y_hat - y_true) * (2 / y_true.shape[0])
+
+
+class CrossEntropy(Loss):
+    def forward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+        y_hat = np.asarray(y_hat)
+        y_true = np.asarray(y_true)
+        m = y_true.shape[0]
+        p = self._softmax(y_hat)
+        log_likelihood = -np.log(p[range(m), y_true.argmax(axis=1)])
+        loss = np.sum(log_likelihood) / m
+        return loss
+
+    def backward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+        y_hat = np.asarray(y_hat)
+        y_true = np.asarray(y_true)
+        return (y_hat - y_true) / y_true.shape[0]
+
+    @staticmethod
+    def _softmax(X: np.ndarray) -> np.ndarray:
+        return SoftMax().forward(X)
+
+
+LOSSES: dict[str, Loss] = {
+    "MSE": MSE(),
+    "CrossEntropy": CrossEntropy(),
+}
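A small sanity check of the new CrossEntropy loss (not part of the commit): forward() applies a softmax to y_hat internally, so raw scores can be passed alongside one-hot labels.

import numpy as np
from nn.loss import LOSSES

y_true = np.array([[1, 0, 0], [0, 0, 1]])               # one-hot labels
scores = np.array([[3.0, 0.2, 0.1], [0.3, 0.1, 2.5]])   # made-up raw scores
ce = LOSSES["CrossEntropy"]
loss = ce.forward(y_hat=scores, y_true=y_true)          # small, since each row's argmax matches its label
grad = ce.backward(y_hat=scores, y_true=y_true)         # shape (2, 3)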
nn/nn.py
CHANGED
@@ -1,63 +1,163 @@
-from typing import
-from
-import
-
-
-class NN:
-    def __init__(
-        self,
-        epochs: int,
-        hidden_size: int,
-        learning_rate: float,
-        … (several removed lines are not shown in this view)
-        self.epochs = epochs
-        self.hidden_size = hidden_size
-        self.learning_rate = learning_rate
-        … (the rest of the removed NN class, including its other attributes and methods, is not shown in this view)
+from typing import Optional
+from nn.activation import ACTIVATIONS, Activation
+from nn.loss import LOSSES, Loss
+import numpy as np
+
+import gradio as gr
+
+
+DTYPE = np.float32
+
+
+class NN:
+    def __init__(
+        self,
+        epochs: int,
+        learning_rate: float,
+        hidden_size: int,
+        input_size: int,
+        output_size: int,
+        activation_fn: str,
+        loss_fn: str,
+        seed: int,
+    ) -> None:
+        self.epochs = epochs
+        self.learning_rate = learning_rate
+        self.hidden_size = hidden_size
+        self.input_size = input_size
+        self.output_size = output_size
+        self.seed = seed
+
+        # try to get activation function and loss funciton
+        act_fn = ACTIVATIONS.get(activation_fn, None)
+        if act_fn is None:
+            raise KeyError(f"Invalid Activation function '{activation_fn}'")
+        loss_fn = LOSSES.get(loss_fn, None)
+        if loss_fn is None:
+            raise KeyError(f"Invalid Activation function '{activation_fn}'")
+        self._activation_fn: Activation = act_fn
+        self._loss_fn: Loss = loss_fn
+
+        self._loss_history = list()
+        self._weight_history = {
+            "wo": [],
+            "wh": [],
+            "bo": [],
+            "bh": [],
+        }
+
+        self._wo: Optional[np.ndarray] = None
+        self._wh: Optional[np.ndarray] = None
+        self._bo: Optional[np.ndarray] = None
+        self._bh: Optional[np.ndarray] = None
+        self._init_weights_and_biases()
+
+    def _init_weights_and_biases(self) -> None:
+        """
+        NN._init_weights_and_biases(): Should only be ran once, right before training loop
+        in order to initialize the weights and biases randomly.
+
+        params:
+            NN object with hidden layer size, output size, and input size
+            defined.
+
+        returns:
+            self, modifies _bh, _bo, _wo, _wh NN attributes in place.
+        """
+        np.random.seed(self.seed)
+        self._bh = np.zeros((1, self.hidden_size), dtype=DTYPE)
+        self._bo = np.zeros((1, self.output_size), dtype=DTYPE)
+        self._wh = np.asarray(
+            np.random.randn(self.input_size, self.hidden_size)
+            * np.sqrt(2 / self.input_size),
+            dtype=DTYPE,
+        )
+        self._wo = np.asarray(
+            np.random.randn(self.hidden_size, self.output_size)
+            * np.sqrt(2 / self.hidden_size),
+            dtype=DTYPE,
+        )
+        return
+
+    def _forward(self, X_train: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
+        """
+        _forward(X_train): ran as the first step of each epoch during training.
+
+        params:
+            X_train: np.ndarray -> data that we are training the NN on.
+
+        returns:
+            output layer np array containing the predicted outputs calculated using
+            the weights and biases of the current epoch.
+        """
+        assert self._activation_fn is not None
+
+        # hidden layer
+        hidden_layer_output = self._activation_fn.forward(
+            np.dot(X_train, self._wh) + self._bh
+        )
+        # output layer (prediction layer)
+        y_hat = self._activation_fn.forward(
+            np.dot(hidden_layer_output, self._wo) + self._bo
+        )
+        return y_hat, hidden_layer_output
+
+    def _backward(
+        self,
+        X_train: np.ndarray,
+        y_hat: np.ndarray,
+        y_train: np.ndarray,
+        hidden_output: np.ndarray,
+    ) -> None:
+        assert self._activation_fn is not None
+        assert self._wo is not None
+        assert self._loss_fn is not None
+
+        # Calculate the error at the output
+        # This should be the derivative of the loss function with respect to the output of the network
+        error_output = self._loss_fn.backward(
+            y_hat, y_train
+        ) * self._activation_fn.backward(y_hat)
+
+        # Calculate gradients for output layer weights and biases
+        wo_prime = np.dot(hidden_output.T, error_output) * self.learning_rate
+        bo_prime = np.sum(error_output, axis=0, keepdims=True) * self.learning_rate
+
+        # Propagate the error back to the hidden layer
+        error_hidden = np.dot(error_output, self._wo.T) * self._activation_fn.backward(
+            hidden_output
+        )
+
+        # Calculate gradients for hidden layer weights and biases
+        wh_prime = np.dot(X_train.T, error_hidden) * self.learning_rate
+        bh_prime = np.sum(error_hidden, axis=0, keepdims=True) * self.learning_rate
+
+        # Update weights and biases
+        self._wo -= wo_prime
+        self._wh -= wh_prime
+        self._bo -= bo_prime
+        self._bh -= bh_prime
+
+    def train(self, X_train: np.ndarray, y_train: np.ndarray) -> "NN":
+        assert self._loss_fn is not None
+
+        for _ in gr.Progress().tqdm(range(self.epochs)):
+            y_hat, hidden_output = self._forward(X_train=X_train)
+            loss = self._loss_fn.forward(y_hat=y_hat, y_true=y_train)
+            self._loss_history.append(loss)
+            self._backward(
+                X_train=X_train,
+                y_hat=y_hat,
+                y_train=y_train,
+                hidden_output=hidden_output,
+            )
+
+            # keep track of weights an biases at each epoch for visualization
+            self._weight_history["wo"].append(self._wo[0, 0])
+            self._weight_history["wh"].append(self._wh[0, 0])
+            self._weight_history["bo"].append(self._bo[0, 0])
+            self._weight_history["bh"].append(self._bh[0, 0])
+        return self
+
+    def predict(self, X_test: np.ndarray) -> np.ndarray:
+        return self._forward(X_train=X_test)[0]
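For orientation, a self-contained sketch of the same two-layer forward pass and its shapes (made-up sizes, independent of the NN class above; not part of the commit):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4)).astype(np.float32)    # 5 iris rows, 4 features
wh = rng.normal(size=(4, 8)).astype(np.float32)   # input -> hidden weights
bh = np.zeros((1, 8), dtype=np.float32)
wo = rng.normal(size=(8, 3)).astype(np.float32)   # hidden -> output weights (3 classes)
bo = np.zeros((1, 3), dtype=np.float32)

hidden = np.maximum(0, X @ wh + bh)               # ReLU stand-in for the configured activation
y_hat = hidden @ wo + bo                          # shape (5, 3): one row of class scores per sample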
nn/test.py
ADDED
@@ -0,0 +1,30 @@
+from nn.nn import NN
+import unittest
+
+TEST_NN = NN(
+    epochs=100,
+    learning_rate=0.001,
+    hidden_size=8,
+    input_size=2,
+    output_size=1,
+    activation_fn="Sigmoid",
+    loss_fn="MSE",
+)
+
+
+class TestNN(unittest.TestCase):
+    def test_init_w_b(self) -> None:
+        return
+
+    def test_forward(self) -> None:
+        return
+
+    def test_backward(self) -> None:
+        return
+
+    def test_train(self) -> None:
+        return
+
+
+if __name__ == "__main__":
+    unittest.main()
nn/train.py
DELETED
@@ -1,127 +0,0 @@
-from sklearn.model_selection import train_test_split
-from sklearn.metrics import log_loss
-from typing import Callable
-from nn.nn import NN
-import numpy as np
-
-
-def init_weights_biases(nn: NN):
-    np.random.seed(0)
-    bh = np.zeros((1, nn.hidden_size))
-    bo = np.zeros((1, nn.output_size))
-    wh = np.random.randn(nn.input_size, nn.hidden_size) * \
-        np.sqrt(2 / nn.input_size)
-    wo = np.random.randn(nn.hidden_size, nn.output_size) * \
-        np.sqrt(2 / nn.hidden_size)
-    return wh, wo, bh, bo
-
-
-def train(nn: NN) -> dict:
-    wh, wo, bh, bo = init_weights_biases(nn=nn)
-
-    X_train, X_test, y_train, y_test = train_test_split(
-        nn.X.to_numpy(),
-        nn.y_dummy.to_numpy(),
-        test_size=nn.test_size,
-        random_state=0,
-    )
-
-    accuracy_scores = []
-    loss_hist: list[float] = []
-    for _ in range(nn.epochs):
-        # compute hidden output
-        hidden_output = compute_node(
-            data=X_train,
-            weights=wh,
-            biases=bh,
-            func=nn.func,
-        )
-
-        # compute output layer
-        y_hat = compute_node(
-            data=hidden_output,
-            weights=wo,
-            biases=bo,
-            func=nn.func,
-        )
-        # compute error & store it
-        error = y_hat - y_train
-        loss = log_loss(y_true=y_train, y_pred=y_hat)
-        accuracy = accuracy_score(y_true=y_train, y_pred=y_hat)
-        accuracy_scores.append(accuracy)
-        loss_hist.append(loss)
-
-        # compute derivatives of weights & biases
-        # update weights & biases using gradient descent after
-        # computing derivatives.
-        dwo = nn.learning_rate * output_weight_prime(hidden_output, error)
-
-        # Use NumPy to sum along the first axis (axis=0)
-        # and then reshape to match the shape of bo
-        dbo = nn.learning_rate * np.sum(output_bias_prime(error), axis=0)
-
-        dhidden = np.dot(error, wo.T) * nn.func_prime(hidden_output)
-        dwh = nn.learning_rate * hidden_weight_prime(X_train, dhidden)
-        dbh = nn.learning_rate * hidden_bias_prime(dhidden)
-
-        wh -= dwh
-        wo -= dwo
-        bh -= dbh
-        bo -= dbo
-
-    # compute final predictions on data not seen
-    hidden_output_test = compute_node(
-        data=X_test,
-        weights=wh,
-        biases=bh,
-        func=nn.func,
-    )
-    y_hat = compute_node(
-        data=hidden_output_test,
-        weights=wo,
-        biases=bo,
-        func=nn.func,
-    )
-
-    return {
-        "loss_hist": loss_hist,
-        "log_loss": log_loss(y_true=y_test, y_pred=y_hat),
-        "accuracy_scores": accuracy_scores,
-        "test_accuracy": accuracy_score(y_true=y_test, y_pred=y_hat)
-    }
-
-
-def compute_node(data: np.array, weights: np.array, biases: np.array, func: Callable) -> np.array:
-    return func(np.dot(data, weights) + biases)
-
-
-def mean_squared_error(y: np.array, y_hat: np.array) -> np.array:
-    return np.mean((y - y_hat) ** 2)
-
-
-def hidden_bias_prime(error):
-    return np.sum(error, axis=0)
-
-
-def output_bias_prime(error):
-    return np.sum(error, axis=0)
-
-
-def hidden_weight_prime(data, error):
-    return np.dot(data.T, error)
-
-
-def output_weight_prime(hidden_output, error):
-    return np.dot(hidden_output.T, error)
-
-
-def accuracy_score(y_true, y_pred):
-    # Ensure y_true and y_pred have the same shape
-    if y_true.shape != y_pred.shape:
-        raise ValueError("Input shapes do not match.")
-
-    # Calculate the accuracy
-    num_samples = len(y_true)
-    num_correct = np.sum(y_true == y_pred)
-
-    return num_correct / num_samples
requirements.txt
CHANGED
@@ -1,8 +1,4 @@
-
-numpy==1.
-
-
-scikit_learn==1.3.1
-gunicorn==21.2.0
-Werkzeug==2.2.2
-Flask_Cors==3.0.10
+gradio==4.26.0
+numpy==1.26.4
+plotly==5.20.0
+scikit_learn==1.4.1.post1
vis.py
ADDED
@@ -0,0 +1,20 @@
+import plotly.express as px
+from sklearn import datasets
+from sklearn.preprocessing import StandardScaler, OneHotEncoder
+import numpy as np
+import os
+
+
+def iris_3d_scatter():
+    df = px.data.iris()
+    fig = px.scatter_3d(
+        df,
+        x="sepal_length",
+        y="sepal_width",
+        z="petal_width",
+        color="species",
+        size="petal_length",
+        size_max=18,
+    )
+    fig.update_layout(margin=dict(l=0, r=0, b=0, t=0))
+    return fig
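A quick way to preview the new figure outside the app (not part of the commit; it uses plotly's bundled iris sample, exactly as vis.py does):

from vis import iris_3d_scatter

fig = iris_3d_scatter()
fig.show()  # opens the 3D scatter in a browser when run locally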