Jensen-holm committed • Commit e11b37a • 1 Parent(s): 28a7ac6

features added:
- batch size argument
- new example that is more performant and better actually

Files changed:
- README.md +5 -30
- app.py +54 -64
- nn/__init__.py +3 -3
- nn/nn.py +36 -22
- vis.py +3 -7
- warning.md +18 -0
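
The headline change is the new batch size argument that now threads through `app.py` and `nn/nn.py`. For orientation, here is a minimal sketch of constructing the reworked `NN` with it, using the keyword names and example hyperparameters that appear in the `app.py` diff below (the `nn.LOSSES` and `nn.ACTIVATIONS` lookups are likewise taken from that diff; treat this as a sketch, not the Space's exact code):

```python
import nn  # the Numpy-Neuron package in this repo

# Mirrors the nn.NN(...) call added in app.py, filled in with the values
# from the new Gradio example row.
clf = nn.NN(
    epochs=2_000,
    hidden_size=16,
    batch_size=1.0,  # fraction of the training set sampled each epoch, 0 < batch_size <= 1
    learning_rate=0.01,
    loss_fn=nn.LOSSES["CrossEntropyWithLogitsLoss"],
    hidden_activation_fn=nn.ACTIVATIONS["Relu"],
    output_activation_fn=nn.ACTIVATIONS["Sigmoid"],
    input_size=64,   # 8x8 pixel grid images, flattened
    output_size=10,  # digits 0-9
    seed=2,
)
```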
README.md
CHANGED
@@ -10,35 +10,10 @@ pinned: false
 license: mit
 ---
 
-##
-
-The
-
-to train a neural network on the [MNIST](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html) dataset of 8x8 pixel images.
-
-
-
-repository and run it locally.
-
-In order to get a decent classification score on the validation set of the MNIST data (hard coded to 20%), you will have to
-do somewhere between 15,000 epochs and 50,000 epochs with a learning rate around 0.001, and a hidden layer size
-over 10. (roughly the example that I have provided). Running this many epochs with a hidden layer of that size
-is pretty expensive on 2 cpu cores that this space has. So if you are actually curious, you might want to clone
-this and run it locally because it will be much much faster.
-
-`git clone https://huggingface.co/spaces/Jensen-holm/Numpy-Neuron`
-
-After cloning, you will have to install the dependencies from requirements.txt into your environment. (venv reccommended)
-
-`pip3 install -r requirements.txt`
-
-Then, you can run the application on local host with the following command.
-
-`python3 app.py`
-
-
-## Development
-
-In order to push from this GitHub repo to the hugging face space:
-
-`git push --force space main`
+## Dev Notes
+
+The remote added to this repo so that it runs on hugging face spaces
+
+`git remote add space git@hf.co:spaces/Jensen-holm/Numpy-Neuron`
+
+The command to force push to that space
+
+`git push --force space main`
app.py
CHANGED
@@ -13,10 +13,13 @@ from vis import ( # classification visualization funcitons
 )
 
 
+type number = float | int
+
+
 def _preprocess_digits(
     seed: int,
-) -> tuple[np.ndarray,
-    digits = datasets.load_digits()
+) -> tuple[np.ndarray, ...]:
+    digits = datasets.load_digits(as_frame=False)
     n_samples = len(digits.images)
     data = digits.images.reshape((n_samples, -1))
     y = OneHotEncoder().fit_transform(digits.target.reshape(-1, 1)).toarray()
@@ -33,36 +36,43 @@ X_train, X_test, y_train, y_test = _preprocess_digits(seed=1)
 
 
 def classification(
-
-
-
-
-
-
-
+    seed: int,
+    hidden_layer_activation_fn: str,
+    output_layer_activation_fn: str,
+    loss_fn_str: str,
+    epochs: int,
+    hidden_size: int,
+    batch_size: number,
+    learning_rate: number,
 ) -> tuple[gr.Plot, gr.Plot, gr.Label]:
-    assert
-    assert
-    assert
-
-
-
-
-
-
-
-    hidden_size=
-
+    assert hidden_layer_activation_fn in nn.ACTIVATIONS
+    assert output_layer_activation_fn in nn.ACTIVATIONS
+    assert loss_fn_str in nn.LOSSES
+
+    loss_fn: nn.Loss = nn.LOSSES[loss_fn_str]
+    h_act_fn: nn.Activation = nn.ACTIVATIONS[hidden_layer_activation_fn]
+    o_act_fn: nn.Activation = nn.ACTIVATIONS[output_layer_activation_fn]
+
+    nn_classifier = nn.NN(
+        epochs=epochs,
+        hidden_size=hidden_size,
+        batch_size=batch_size,
+        learning_rate=learning_rate,
+        loss_fn=loss_fn,
+        hidden_activation_fn=h_act_fn,
+        output_activation_fn=o_act_fn,
+        input_size=64,  # 8x8 pixel grid images
         output_size=10,  # digits 0-9
-    seed=
+        seed=seed,
     )
-    classifier.train(X_train=X_train, y_train=y_train)
 
-
+    nn_classifier.train(X_train=X_train, y_train=y_train)
+
+    pred = nn_classifier.predict(X_test=X_test)
     hits_and_misses_fig = hits_and_misses(y_pred=pred, y_true=y_test)
     loss_fig = loss_history_plt(
-        loss_history=
-        loss_fn_name=
+        loss_history=nn_classifier._loss_history,
+        loss_fn_name=nn_classifier.loss_fn.__class__.__name__,
     )
 
     label_dict = make_confidence_label(y_pred=pred, y_test=y_test)
@@ -74,38 +84,13 @@ def classification(
 
 
 if __name__ == "__main__":
+    def _open_warning() -> str:
+        with open("warning.md", "r") as f:
+            return f.read()
+
     with gr.Blocks() as interface:
         gr.Markdown("# Numpy Neuron")
-        gr.Markdown(
-            """
-            ## What is this? <br>
-
-            The Backpropagation Playground is a GUI built around a neural network framework that I have built from scratch
-            in [numpy](https://numpy.org/). In this GUI, you can test different hyper parameters that will be fed to this framework and used
-            to train a neural network on the [MNIST](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html) dataset of 8x8 pixel images.
-
-            ## ⚠️ PLEASE READ ⚠️
-            This application is impossibly slow on the HuggingFace CPU instance that it is running on. It is advised to clone the
-            repository and run it locally.
-
-            In order to get a decent classification score on the validation set of the MNIST data (hard coded to 20%), you will have to
-            do somewhere between 15,000 epochs and 50,000 epochs with a learning rate around 0.001, and a hidden layer size
-            over 10. (roughly the example that I have provided). Running this many epochs with a hidden layer of that size
-            is pretty expensive on 2 cpu cores that this space has. So if you are actually curious, you might want to clone
-            this and run it locally because it will be much much faster.
-
-            `git clone https://huggingface.co/spaces/Jensen-holm/Numpy-Neuron`
-
-            After cloning, you will have to install the dependencies from requirements.txt into your environment. (venv reccommended)
-
-            `pip3 install -r requirements.txt`
-
-            Then, you can run the application on localhost with the following command.
-
-            `python3 app.py`
-
-            """
-        )
+        gr.Markdown(_open_warning())
 
         with gr.Tab("Classification"):
            with gr.Row():
@@ -120,11 +105,12 @@ if __name__ == "__main__":
                with gr.Column():
                    numeric_inputs = [
                        gr.Slider(
-                            minimum=100, maximum=
+                            minimum=100, maximum=10_000, step=50, label="Epochs"
                        ),
                        gr.Slider(
                            minimum=2, maximum=64, step=2, label="Hidden Network Size"
                        ),
+                        gr.Slider(minimum=0.1, maximum=1, step=0.1, label="Batch Size"),
                        gr.Number(minimum=0.00001, maximum=1.5, label="Learning Rate"),
                    ]
 
@@ -132,9 +118,12 @@ if __name__ == "__main__":
                    fn_inputs = [
                        gr.Dropdown(
                            choices=["Relu", "Sigmoid", "TanH"],
-                            label="Hidden Layer Activation",
+                            label="Hidden Layer Activation Function",
+                        ),
+                        gr.Dropdown(
+                            choices=["SoftMax", "Sigmoid"],
+                            label="Output Activation Function",
                        ),
-                        gr.Dropdown(choices=["SoftMax", "Sigmoid"], label="Output Activation"),
                        gr.Dropdown(
                            choices=["CrossEntropy", "CrossEntropyWithLogitsLoss"],
                            label="Loss Function",
@@ -151,12 +140,13 @@ if __name__ == "__main__":
                        [
                            2,
                            "Relu",
-                            "
+                            "Sigmoid",
                            "CrossEntropyWithLogitsLoss",
-
-
-                            0
-
+                            2_000,
+                            16,
+                            1.0,
+                            0.01,
+                        ],
                        ],
                        inputs=inputs,
                    )
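
The new `gr.Examples` row appears to map positionally onto the reworked `classification` signature. A hedged sketch of the equivalent direct call, assuming that ordering holds and that `app.py` is importable (importing it runs the module-level `_preprocess_digits(seed=1)`):

```python
from app import classification

# Same values as the example row above, in the assumed signature order.
hits_fig, loss_fig, label = classification(
    2,                             # seed
    "Relu",                        # hidden_layer_activation_fn
    "Sigmoid",                     # output_layer_activation_fn
    "CrossEntropyWithLogitsLoss",  # loss_fn_str
    2_000,                         # epochs
    16,                            # hidden_size
    1.0,                           # batch_size, fraction of training rows per epoch
    0.01,                          # learning_rate
)
```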
nn/__init__.py
CHANGED
@@ -1,3 +1,3 @@
-from nn.
-from nn.activation import
-from nn.
+from nn.loss import *
+from nn.activation import *
+from nn.nn import *
nn/nn.py
CHANGED
@@ -15,9 +15,10 @@ class NN:
     learning_rate: float
     hidden_size: int
     input_size: int
+    batch_size: float
     output_size: int
     hidden_activation_fn: Activation
-
+    output_activation_fn: Activation
     loss_fn: Loss
     seed: int
 
@@ -26,19 +27,26 @@
     _wh: np.ndarray = field(default_factory=lambda: np.ndarray([]), init=False)
     _bo: np.ndarray = field(default_factory=lambda: np.ndarray([]), init=False)
     _bh: np.ndarray = field(default_factory=lambda: np.ndarray([]), init=False)
-
-
-
-
-
-
-
-
-
+
+    # not currently using this, see TODO: at bottom of this file
+    # _weight_history: dict[str, list[np.ndarray]] = field(
+    #     default_factory=lambda: {
+    #         "wo": [],
+    #         "wh": [],
+    #         "bo": [],
+    #         "bh": [],
+    #     },
+    #     init=False,
+    # )
 
     def __post_init__(self) -> None:
+        assert 0 < self.batch_size <= 1
         self._init_weights_and_biases()
 
+    @classmethod
+    def from_dict(cls, args: dict) -> "NN":
+        return cls(**args)
+
     def _init_weights_and_biases(self) -> None:
         """
         NN._init_weights_and_biases(): Should only be ran once, right before training loop
@@ -64,7 +72,6 @@
             * np.sqrt(2 / self.hidden_size),
             dtype=DTYPE,
         )
-        return
 
     # def _forward(self, X_train: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
     #     # Determine the activation function for the hidden layer
@@ -116,16 +123,16 @@
         bo_prime = np.sum(error_output, axis=0, keepdims=True) * self.learning_rate
 
         # Propagate the error back to the hidden layer
-        error_hidden = np.dot(
-
-        )
+        error_hidden = np.dot(
+            error_output, self._wo.T
+        ) * self.output_activation_fn.backward(hidden_output)
 
         # Calculate gradients for hidden layer weights and biases
         wh_prime = np.dot(X_train.T, error_hidden) * self.learning_rate
         bh_prime = np.sum(error_hidden, axis=0, keepdims=True) * self.learning_rate
 
         # Gradient clipping to prevent overflow
-        max_norm = 1.0  #
+        max_norm = 1.0  # this is an adjustable threshold
         wo_prime = np.clip(wo_prime, -max_norm, max_norm)
         bo_prime = np.clip(bo_prime, -max_norm, max_norm)
         wh_prime = np.clip(wh_prime, -max_norm, max_norm)
@@ -137,17 +144,24 @@
         self._bo -= bo_prime
         self._bh -= bh_prime
 
-    # TODO: implement batch size in training, this will speed up the training loop
-    # quite a bit I believe
     def train(self, X_train: np.ndarray, y_train: np.ndarray) -> "NN":
         for _ in gr.Progress().tqdm(range(self.epochs)):
-
-
+
+            n_samples = int(self.batch_size * X_train.shape[0])
+            batch_indeces = np.random.choice(
+                X_train.shape[0], size=n_samples, replace=False
+            )
+
+            X_train_batch = X_train[batch_indeces]
+            y_train_batch = y_train[batch_indeces]
+
+            y_hat, hidden_output = self._forward(X_train=X_train_batch)
+            loss = self.loss_fn.forward(y_hat=y_hat, y_true=y_train_batch)
             self._loss_history.append(loss)
             self._backward(
-                X_train=
+                X_train=X_train_batch,
                 y_hat=y_hat,
-                y_train=
+                y_train=y_train_batch,
                 hidden_output=hidden_output,
             )
 
@@ -162,4 +176,4 @@
 
     def predict(self, X_test: np.ndarray) -> np.ndarray:
         pred, _ = self._forward(X_test)
-        return self.
+        return self.output_activation_fn.forward(pred)
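
The reworked `train` loop now samples a fraction of the training rows each epoch instead of always using the full set. A standalone sketch of just that sampling step, with toy arrays standing in for the real data (shapes chosen to match `load_digits`, which has 1,797 samples of 64 pixels):

```python
import numpy as np

# Toy stand-ins for X_train / y_train used in the Space.
X_train = np.random.rand(1797, 64)                         # flattened 8x8 images
y_train = np.eye(10)[np.random.randint(0, 10, size=1797)]  # one-hot labels

batch_size = 0.5  # NN.batch_size: fraction of the training set, 0 < batch_size <= 1

# Same logic as the lines added to NN.train above.
n_samples = int(batch_size * X_train.shape[0])
batch_indeces = np.random.choice(X_train.shape[0], size=n_samples, replace=False)

X_batch, y_batch = X_train[batch_indeces], y_train[batch_indeces]
print(X_batch.shape, y_batch.shape)  # (898, 64) (898, 10)
```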
vis.py
CHANGED
@@ -1,6 +1,5 @@
 import matplotlib
 from sklearn import datasets
-import plotly.graph_objects as go
 import plotly.express as px
 import matplotlib.pyplot as plt
 import matplotlib
@@ -15,13 +14,13 @@ def show_digits():
     for ax, image, label in zip(axes, digits.images, digits.target):
         ax.set_axis_off()
         ax.imshow(image, cmap=plt.cm.gray_r, interpolation="nearest")
-        ax.set_title("Training:
+        ax.set_title(f"Training: {label}")
     return fig
 
 
 def loss_history_plt(loss_history: list[float], loss_fn_name: str):
     return px.line(
-        x=
+        x=list(range(len(loss_history))),
         y=loss_history,
         title=f"{loss_fn_name} Loss vs. Training Epoch",
         labels={
@@ -42,12 +41,11 @@ def hits_and_misses(y_pred: np.ndarray, y_true: np.ndarray):
         "True: " + str(y_true_decoded[i]) + ", Pred: " + str(y_pred_decoded[i])
         for i in range(len(y_pred_decoded))
     ]
-
     return px.scatter(
         x=np.arange(len(y_pred_decoded)),
         y=y_true_decoded,
         color=color,
-        title="Hits and Misses of Predictions",
+        title="Hits and Misses of Predictions on Validation Set",
         labels={
             "color": "Prediction Correctness",
             "x": "Sample Index",
@@ -59,8 +57,6 @@ def hits_and_misses(y_pred: np.ndarray, y_true: np.ndarray):
 
 
 def make_confidence_label(y_pred: np.ndarray, y_test: np.ndarray):
-    # decode the one hot endoced predictions
-    y_pred_labels = np.argmax(y_pred, axis=1)
     y_test_labels = np.argmax(y_test, axis=1)
     confidence_dict: dict[str, float] = {}
     for idx, class_name in enumerate([str(i) for i in range(10)]):
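
A quick way to exercise the fixed `loss_history_plt` x-axis on its own, with a fake loss curve in place of `NN._loss_history` (a sketch; assumes the repo's `vis` module and its plotly/matplotlib dependencies are installed):

```python
from vis import loss_history_plt

# Fake, decaying loss values standing in for NN._loss_history.
toy_loss = [1.0 / (1 + epoch) for epoch in range(200)]

fig = loss_history_plt(loss_history=toy_loss, loss_fn_name="CrossEntropy")
fig.show()  # epoch index on the x-axis, loss on the y-axis
```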
warning.md
ADDED
@@ -0,0 +1,18 @@
+## What is this?
+
+This is a no-code platform for interacting with Numpy-Neuron, a neural network framework that I have built from scratch
+using only [numpy](https://numpy.org/). Here, you can test different hyperparameters that will be fed to Numpy-Neuron and used to train a neural network for classification on the [MNIST](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html) dataset of 8x8 pixel images of hand-drawn numbers.
+
+Once training is done, the final model will be tested by making predictions on an unseen subset of the dataset called the validation set. There will be a plot of hits vs. misses, measuring the accuracy of the final model on images that it did not see in training. There will also be a label at the bottom that shows the average confidence of the final model when it was making its predictions on unseen data, across the different labels (digits 0-9).
+
+## ⚠️ Warning ⚠️
+This application is impossibly slow on the HuggingFace CPU instance that it is running on. It is advised to clone the
+repository and run it locally.
+
+## Steps for running locally:
+
+1. `git clone https://huggingface.co/spaces/Jensen-holm/Numpy-Neuron`
+
+2. `pip3 install -r requirements.txt`
+
+3. `python3 app.py`
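
The per-digit confidence label described in `warning.md` can be approximated as the mean probability the model assigns to each true digit. A standalone sketch with random stand-in predictions; this is not the repo's exact `make_confidence_label` implementation (only its first lines appear in the diff above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: softmax-like predictions and one-hot labels for 200 validation samples.
y_pred = rng.random((200, 10))
y_pred /= y_pred.sum(axis=1, keepdims=True)         # rows sum to 1, like SoftMax output
y_test = np.eye(10)[rng.integers(0, 10, size=200)]  # one-hot ground truth

y_test_labels = np.argmax(y_test, axis=1)

confidence_dict: dict[str, float] = {}
for idx, class_name in enumerate([str(i) for i in range(10)]):
    mask = y_test_labels == idx
    # mean probability assigned to digit `idx` on samples whose true label is `idx`
    confidence_dict[class_name] = float(y_pred[mask, idx].mean()) if mask.any() else 0.0

print(confidence_dict)
```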