Jensen-holm committed on
Commit
e11b37a
1 Parent(s): 28a7ac6

features added:


- batch size argument
- new default example that trains faster and gives better results

Files changed (6)
  1. README.md +5 -30
  2. app.py +54 -64
  3. nn/__init__.py +3 -3
  4. nn/nn.py +36 -22
  5. vis.py +3 -7
  6. warning.md +18 -0
README.md CHANGED
@@ -10,35 +10,10 @@ pinned: false
 license: mit
 ---
 
-## What is this? <br>
-
-The Numpy-Neuron is a GUI built around a neural network framework that I have built from scratch
-in [numpy](https://numpy.org/). In this GUI, you can test different hyper parameters that will be fed to this framework and used
-to train a neural network on the [MNIST](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html) dataset of 8x8 pixel images.
-
-## ⚠️ PLEASE READ ⚠️
-This application is impossibly slow on the HuggingFace CPU instance that it is running on. It is advised to clone the
-repository and run it locally.
-
-In order to get a decent classification score on the validation set of the MNIST data (hard coded to 20%), you will have to
-do somewhere between 15,000 epochs and 50,000 epochs with a learning rate around 0.001, and a hidden layer size
-over 10. (roughly the example that I have provided). Running this many epochs with a hidden layer of that size
-is pretty expensive on 2 cpu cores that this space has. So if you are actually curious, you might want to clone
-this and run it locally because it will be much much faster.
-
-`git clone https://huggingface.co/spaces/Jensen-holm/Numpy-Neuron`
-
-After cloning, you will have to install the dependencies from requirements.txt into your environment. (venv reccommended)
-
-`pip3 install -r requirements.txt`
-
-Then, you can run the application on local host with the following command.
-
-`python3 app.py`
-
-
-## Development
-
-In order to push from this GitHub repo to the hugging face space:
-
-`git push --force space main`
+## Dev Notes
+
+The remote added to this repo so that it runs on hugging face spaces
+`git remote add space git@hf.co:spaces/Jensen-holm/Numpy-Neuron`
+
+The command to force push to that space
+`git push --force space main`
app.py CHANGED
@@ -13,10 +13,13 @@ from vis import (  # classification visualization funcitons
 )
 
 
+type number = float | int
+
+
 def _preprocess_digits(
     seed: int,
-) -> tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
-    digits = datasets.load_digits()
+) -> tuple[np.ndarray, ...]:
+    digits = datasets.load_digits(as_frame=False)
     n_samples = len(digits.images)
     data = digits.images.reshape((n_samples, -1))
     y = OneHotEncoder().fit_transform(digits.target.reshape(-1, 1)).toarray()
@@ -33,36 +36,43 @@ X_train, X_test, y_train, y_test = _preprocess_digits(seed=1)
 
 
 def classification(
-    Seed: int = 0,
-    Hidden_Layer_Activation: str = "Relu",
-    Activation_Func: str = "SoftMax",
-    Loss_Func: str = "CrossEntropyWithLogitsLoss",
-    Epochs: int = 100,
-    Hidden_Size: int = 8,
-    Learning_Rate: float = 0.001,
+    seed: int,
+    hidden_layer_activation_fn: str,
+    output_layer_activation_fn: str,
+    loss_fn_str: str,
+    epochs: int,
+    hidden_size: int,
+    batch_size: number,
+    learning_rate: number,
 ) -> tuple[gr.Plot, gr.Plot, gr.Label]:
-    assert Activation_Func in nn.ACTIVATIONS
-    assert Hidden_Layer_Activation in nn.ACTIVATIONS
-    assert Loss_Func in nn.LOSSES
-
-    classifier = nn.NN(
-        epochs=Epochs,
-        learning_rate=Learning_Rate,
-        hidden_activation_fn=nn.ACTIVATIONS[Hidden_Layer_Activation],
-        activation_fn=nn.ACTIVATIONS[Activation_Func],
-        loss_fn=nn.LOSSES[Loss_Func],
-        hidden_size=Hidden_Size,
-        input_size=64,  # 8x8 image of pixels
+    assert hidden_layer_activation_fn in nn.ACTIVATIONS
+    assert output_layer_activation_fn in nn.ACTIVATIONS
+    assert loss_fn_str in nn.LOSSES
+
+    loss_fn: nn.Loss = nn.LOSSES[loss_fn_str]
+    h_act_fn: nn.Activation = nn.ACTIVATIONS[hidden_layer_activation_fn]
+    o_act_fn: nn.Activation = nn.ACTIVATIONS[output_layer_activation_fn]
+
+    nn_classifier = nn.NN(
+        epochs=epochs,
+        hidden_size=hidden_size,
+        batch_size=batch_size,
+        learning_rate=learning_rate,
+        loss_fn=loss_fn,
+        hidden_activation_fn=h_act_fn,
+        output_activation_fn=o_act_fn,
+        input_size=64,  # 8x8 pixel grid images
        output_size=10,  # digits 0-9
-        seed=Seed,
+        seed=seed,
     )
-    classifier.train(X_train=X_train, y_train=y_train)
 
-    pred = classifier.predict(X_test=X_test)
+    nn_classifier.train(X_train=X_train, y_train=y_train)
+
+    pred = nn_classifier.predict(X_test=X_test)
     hits_and_misses_fig = hits_and_misses(y_pred=pred, y_true=y_test)
     loss_fig = loss_history_plt(
-        loss_history=classifier._loss_history,
-        loss_fn_name=classifier.loss_fn.__class__.__name__,
+        loss_history=nn_classifier._loss_history,
+        loss_fn_name=nn_classifier.loss_fn.__class__.__name__,
     )
 
     label_dict = make_confidence_label(y_pred=pred, y_test=y_test)
@@ -74,38 +84,13 @@ def classification(
 
 
 if __name__ == "__main__":
+    def _open_warning() -> str:
+        with open("warning.md", "r") as f:
+            return f.read()
+
     with gr.Blocks() as interface:
         gr.Markdown("# Numpy Neuron")
-        gr.Markdown(
-            """
-            ## What is this? <br>
-
-            The Backpropagation Playground is a GUI built around a neural network framework that I have built from scratch
-            in [numpy](https://numpy.org/). In this GUI, you can test different hyper parameters that will be fed to this framework and used
-            to train a neural network on the [MNIST](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html) dataset of 8x8 pixel images.
-
-            ## ⚠️ PLEASE READ ⚠️
-            This application is impossibly slow on the HuggingFace CPU instance that it is running on. It is advised to clone the
-            repository and run it locally.
-
-            In order to get a decent classification score on the validation set of the MNIST data (hard coded to 20%), you will have to
-            do somewhere between 15,000 epochs and 50,000 epochs with a learning rate around 0.001, and a hidden layer size
-            over 10. (roughly the example that I have provided). Running this many epochs with a hidden layer of that size
-            is pretty expensive on 2 cpu cores that this space has. So if you are actually curious, you might want to clone
-            this and run it locally because it will be much much faster.
-
-            `git clone https://huggingface.co/spaces/Jensen-holm/Numpy-Neuron`
-
-            After cloning, you will have to install the dependencies from requirements.txt into your environment. (venv reccommended)
-
-            `pip3 install -r requirements.txt`
-
-            Then, you can run the application on localhost with the following command.
-
-            `python3 app.py`
-
-            """
-        )
+        gr.Markdown(_open_warning())
 
         with gr.Tab("Classification"):
             with gr.Row():
@@ -120,11 +105,12 @@ if __name__ == "__main__":
                 with gr.Column():
                     numeric_inputs = [
                         gr.Slider(
-                            minimum=100, maximum=100_000, step=50, label="Epochs"
+                            minimum=100, maximum=10_000, step=50, label="Epochs"
                        ),
                         gr.Slider(
                             minimum=2, maximum=64, step=2, label="Hidden Network Size"
                         ),
+                        gr.Slider(minimum=0.1, maximum=1, step=0.1, label="Batch Size"),
                         gr.Number(minimum=0.00001, maximum=1.5, label="Learning Rate"),
                     ]
 
@@ -132,9 +118,12 @@ if __name__ == "__main__":
                    fn_inputs = [
                         gr.Dropdown(
                             choices=["Relu", "Sigmoid", "TanH"],
-                            label="Hidden Layer Activation",
+                            label="Hidden Layer Activation Function",
+                        ),
+                        gr.Dropdown(
+                            choices=["SoftMax", "Sigmoid"],
+                            label="Output Activation Function",
                         ),
-                        gr.Dropdown(choices=["SoftMax", "Sigmoid"], label="Output Activation"),
                         gr.Dropdown(
                             choices=["CrossEntropy", "CrossEntropyWithLogitsLoss"],
                             label="Loss Function",
@@ -151,12 +140,13 @@ if __name__ == "__main__":
                    [
                         2,
                         "Relu",
+                        "Sigmoid",
                         "CrossEntropyWithLogitsLoss",
-                        15_000,
-                        14,
-                        0.001,
-                    ]
+                        2_000,
+                        16,
+                        1.0,
+                        0.01,
+                    ],
                 ],
                 inputs=inputs,
             )
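
Note: the new default example row maps onto the reworked `classification()` signature in the order the inputs are declared above (Gradio passes the example values positionally). As a rough illustration only, since this exact call does not appear in app.py:

```python
# Hypothetical call with the new example values; batch_size here is the
# fraction of the training split sampled each epoch, not an absolute count.
hits_fig, loss_fig, confidence_label = classification(
    seed=2,
    hidden_layer_activation_fn="Relu",
    output_layer_activation_fn="Sigmoid",
    loss_fn_str="CrossEntropyWithLogitsLoss",
    epochs=2_000,
    hidden_size=16,
    batch_size=1.0,
    learning_rate=0.01,
)
```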
nn/__init__.py CHANGED
@@ -1,3 +1,3 @@
-from nn.nn import NN
-from nn.activation import ACTIVATIONS
-from nn.loss import LOSSES
+from nn.loss import *
+from nn.activation import *
+from nn.nn import *
nn/nn.py CHANGED
@@ -15,9 +15,10 @@ class NN:
     learning_rate: float
     hidden_size: int
     input_size: int
+    batch_size: float
     output_size: int
     hidden_activation_fn: Activation
-    activation_fn: Activation
+    output_activation_fn: Activation
     loss_fn: Loss
     seed: int
 
@@ -26,19 +27,26 @@
     _wh: np.ndarray = field(default_factory=lambda: np.ndarray([]), init=False)
     _bo: np.ndarray = field(default_factory=lambda: np.ndarray([]), init=False)
     _bh: np.ndarray = field(default_factory=lambda: np.ndarray([]), init=False)
-    _weight_history: dict[str, list[np.ndarray]] = field(
-        default_factory=lambda: {
-            "wo": [],
-            "wh": [],
-            "bo": [],
-            "bh": [],
-        },
-        init=False,
-    )
+
+    # not currently using this, see TODO: at bottom of this file
+    # _weight_history: dict[str, list[np.ndarray]] = field(
+    #     default_factory=lambda: {
+    #         "wo": [],
+    #         "wh": [],
+    #         "bo": [],
+    #         "bh": [],
+    #     },
+    #     init=False,
+    # )
 
     def __post_init__(self) -> None:
+        assert 0 < self.batch_size <= 1
         self._init_weights_and_biases()
 
+    @classmethod
+    def from_dict(cls, args: dict) -> "NN":
+        return cls(**args)
+
     def _init_weights_and_biases(self) -> None:
         """
         NN._init_weights_and_biases(): Should only be ran once, right before training loop
@@ -64,7 +72,6 @@ class NN:
             * np.sqrt(2 / self.hidden_size),
             dtype=DTYPE,
         )
-        return
 
     # def _forward(self, X_train: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
     #     # Determine the activation function for the hidden layer
@@ -116,16 +123,16 @@ class NN:
         bo_prime = np.sum(error_output, axis=0, keepdims=True) * self.learning_rate
 
         # Propagate the error back to the hidden layer
-        error_hidden = np.dot(error_output, self._wo.T) * self.activation_fn.backward(
-            hidden_output
-        )
+        error_hidden = np.dot(
+            error_output, self._wo.T
+        ) * self.output_activation_fn.backward(hidden_output)
 
         # Calculate gradients for hidden layer weights and biases
         wh_prime = np.dot(X_train.T, error_hidden) * self.learning_rate
         bh_prime = np.sum(error_hidden, axis=0, keepdims=True) * self.learning_rate
 
         # Gradient clipping to prevent overflow
-        max_norm = 1.0  # You can adjust this threshold
+        max_norm = 1.0  # this is an adjustable threshold
         wo_prime = np.clip(wo_prime, -max_norm, max_norm)
         bo_prime = np.clip(bo_prime, -max_norm, max_norm)
         wh_prime = np.clip(wh_prime, -max_norm, max_norm)
@@ -137,17 +144,24 @@ class NN:
         self._bo -= bo_prime
         self._bh -= bh_prime
 
-    # TODO: implement batch size in training, this will speed up the training loop
-    # quite a bit I believe
     def train(self, X_train: np.ndarray, y_train: np.ndarray) -> "NN":
         for _ in gr.Progress().tqdm(range(self.epochs)):
-            y_hat, hidden_output = self._forward(X_train=X_train)
-            loss = self.loss_fn.forward(y_hat=y_hat, y_true=y_train)
+
+            n_samples = int(self.batch_size * X_train.shape[0])
+            batch_indeces = np.random.choice(
+                X_train.shape[0], size=n_samples, replace=False
+            )
+
+            X_train_batch = X_train[batch_indeces]
+            y_train_batch = y_train[batch_indeces]
+
+            y_hat, hidden_output = self._forward(X_train=X_train_batch)
+            loss = self.loss_fn.forward(y_hat=y_hat, y_true=y_train_batch)
             self._loss_history.append(loss)
             self._backward(
-                X_train=X_train,
+                X_train=X_train_batch,
                 y_hat=y_hat,
-                y_train=y_train,
+                y_train=y_train_batch,
                 hidden_output=hidden_output,
             )
 
@@ -162,4 +176,4 @@ class NN:
 
     def predict(self, X_test: np.ndarray) -> np.ndarray:
         pred, _ = self._forward(X_test)
-        return self.activation_fn.forward(pred)
+        return self.output_activation_fn.forward(pred)
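
Note: `batch_size` is a fraction of the training set rather than a row count; every epoch, `train()` draws that fraction without replacement and runs the forward/backward pass on the sample only. A standalone sketch of the same sampling step, with assumed shapes (the 8x8 digits flatten to 64 features, one-hot targets cover 10 classes):

```python
import numpy as np

# Stand-in data; the real arrays come from _preprocess_digits() in app.py.
X_train = np.random.rand(1_000, 64)
y_train = np.eye(10)[np.random.randint(0, 10, size=1_000)]
batch_size = 0.5  # NN.__post_init__ asserts 0 < batch_size <= 1

n_samples = int(batch_size * X_train.shape[0])
batch_idx = np.random.choice(X_train.shape[0], size=n_samples, replace=False)
X_batch, y_batch = X_train[batch_idx], y_train[batch_idx]
print(X_batch.shape, y_batch.shape)  # (500, 64) (500, 10)
```

The new `from_dict` classmethod is a thin wrapper around `cls(**args)`, so a full hyperparameter dict can be unpacked into a classifier in one call.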
vis.py CHANGED
@@ -1,6 +1,5 @@
 import matplotlib
 from sklearn import datasets
-import plotly.graph_objects as go
 import plotly.express as px
 import matplotlib.pyplot as plt
 import matplotlib
@@ -15,13 +14,13 @@ def show_digits():
     for ax, image, label in zip(axes, digits.images, digits.target):
         ax.set_axis_off()
         ax.imshow(image, cmap=plt.cm.gray_r, interpolation="nearest")
-        ax.set_title("Training: %i" % label)
+        ax.set_title(f"Training: {label}")
     return fig
 
 
 def loss_history_plt(loss_history: list[float], loss_fn_name: str):
     return px.line(
-        x=[i for i in range(len(loss_history))],
+        x=list(range(len(loss_history))),
         y=loss_history,
         title=f"{loss_fn_name} Loss vs. Training Epoch",
         labels={
@@ -42,12 +41,11 @@ def hits_and_misses(y_pred: np.ndarray, y_true: np.ndarray):
         "True: " + str(y_true_decoded[i]) + ", Pred: " + str(y_pred_decoded[i])
         for i in range(len(y_pred_decoded))
     ]
-
     return px.scatter(
         x=np.arange(len(y_pred_decoded)),
         y=y_true_decoded,
         color=color,
-        title="Hits and Misses of Predictions",
+        title="Hits and Misses of Predictions on Validation Set",
         labels={
             "color": "Prediction Correctness",
             "x": "Sample Index",
@@ -59,8 +57,6 @@
 
 
 def make_confidence_label(y_pred: np.ndarray, y_test: np.ndarray):
-    # decode the one hot endoced predictions
-    y_pred_labels = np.argmax(y_pred, axis=1)
     y_test_labels = np.argmax(y_test, axis=1)
     confidence_dict: dict[str, float] = {}
     for idx, class_name in enumerate([str(i) for i in range(10)]):
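
Note: `hits_and_misses` and `make_confidence_label` both take one-hot (or probability) arrays and argmax-decode them before plotting. A tiny illustration of that decode step with made-up values:

```python
import numpy as np

y_pred = np.array([[0.1, 0.8, 0.1], [0.6, 0.3, 0.1]])  # made-up class probabilities
y_true = np.array([[0, 1, 0], [0, 0, 1]])               # one-hot ground truth
y_pred_decoded = np.argmax(y_pred, axis=1)  # array([1, 0])
y_true_decoded = np.argmax(y_true, axis=1)  # array([1, 2])
print(y_pred_decoded == y_true_decoded)     # [ True False] -> one hit, one miss
```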
warning.md ADDED
@@ -0,0 +1,18 @@
+## What is this?
+
+This is a no-code platform for interacting with Numpy-Neuron, a neural network framework that I have built from scratch
+using only [numpy](https://numpy.org/). Here, you can test different hyperparameters that will be fed to Numpy-Neuron and used to train a neural network for classification on the [MNIST](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html) dataset of 8x8 pixel images of hand-drawn digits.
+
+Once training is done, the final model is tested by making predictions on an unseen subset of the dataset called the validation set. A plot of hits vs. misses shows the accuracy of the final model on images that it did not see during training, and a label at the bottom shows the final model's average prediction confidence on that unseen data for each label (digits 0-9).
+
+## ⚠️ Warning ⚠️
+This application is impossibly slow on the HuggingFace CPU instance that it is running on. It is advised to clone the
+repository and run it locally.
+
+## Steps for running locally:
+
+1. `git clone https://huggingface.co/spaces/Jensen-holm/Numpy-Neuron`
+
+2. `pip3 install -r requirements.txt`
+
+3. `python3 app.py`
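
Note: the confidence label described in warning.md averages the model's predicted probability per digit over the validation samples. The real logic lives in `vis.make_confidence_label`; the sketch below only shows the general idea under assumed shapes, not the repo's exact computation:

```python
import numpy as np

def average_confidence_per_digit(y_pred: np.ndarray, y_test: np.ndarray) -> dict[str, float]:
    # y_pred: predicted probabilities, shape (n_samples, 10); y_test: one-hot labels.
    true_labels = np.argmax(y_test, axis=1)
    return {
        str(digit): float(y_pred[true_labels == digit, digit].mean())
        for digit in range(10)
        if np.any(true_labels == digit)  # skip digits missing from the validation set
    }
```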