Jensen-holm committed
Commit 6307b4f • 1 Parent(s): d04aaf5

switching to gradio and a complete rewrite that I did over in the

Files changed (15)
  1. .gitmodules +0 -3
  2. README.md +14 -34
  3. app.py +105 -45
  4. example/iris.csv +0 -151
  5. example/main.py +0 -35
  6. example/mushrooms.csv +0 -0
  7. ml-vis +0 -1
  8. nn/__init__.py +3 -0
  9. nn/activation.py +42 -29
  10. nn/loss.py +50 -0
  11. nn/nn.py +153 -53
  12. nn/test.py +30 -0
  13. nn/train.py +0 -127
  14. requirements.txt +4 -8
  15. vis.py +20 -0
.gitmodules DELETED
@@ -1,3 +0,0 @@
- [submodule "ml-vis"]
- 	path = ml-vis
- 	url = git@github.com:Jensen-holm/ml-vis.git

README.md CHANGED
@@ -1,34 +1,14 @@
- # Neural Network Classification (from-scratch)
-
- ## Parameters
- Think of epochs as rounds of training for your neural network. Each epoch means the network has gone through the entire dataset once, learning and adjusting its parameters. More epochs can lead to better accuracy, but too many can also overfit the model to your training data.
-
- #### Activation functions
- introduce non-linearity to your neural network, allowing it to model complex relationships in data. The choice of activation function (like sigmoid, ReLU, or tanh) affects how the network processes and passes information between its layers.
-
- #### Hidden Size
- This refers to the number of neurons or units in the hidden layer(s) of your neural network. More hidden units can make the network more capable of learning complex patterns, but it can also make training slower and increase the risk of overfitting.
-
- #### Learning Rate
- Imagine this as the step size your neural network takes during training. It determines how much the network's parameters are updated based on the error it observes. A higher learning rate means bigger steps but can lead to overshooting the optimal values, while a smaller learning rate may take longer to converge or find the best values.
-
- #### Test Size
- When training a neural network, or any machine learning model for that matter, it is important to split the data into training and testing sets. The test size parameter specifys how to split up the data into these two sets. a test size of 0.2 will split it up so that 80% of the data is used for training, and 20% of the data is used for testing.
-
-
- ## Backprop Algorithm
- Backpropagation, short for "backward propagation of errors," is the cornerstone of training artificial neural networks. It begins by initializing the network's weights and biases. During the forward pass, input data flows through the network's layers, undergoing weighted sum calculations and activation functions, eventually producing predictions. The algorithm then computes an error or loss by comparing these predictions to the actual target values. In the critical backward pass, starting from the output layer and moving in reverse, gradients of the loss with respect to each layer's outputs, weights, and biases are calculated using calculus and the chain rule. These gradients guide the adjustment of weights and biases in each layer, with the goal of minimizing the loss. This iterative process repeats for multiple epochs, refining the network's parameters until the error reaches an acceptable level or a fixed number of training iterations is completed, ultimately enabling the network to improve its predictions on new data.
-
- ## Implementation
- Behind the scenes, my API implements the backprop algorithm. The main loop first initializes weights and biases randomly. The algorithm starts by iterating n times where n is the number of epochs you specify above. During each iteration, starting with the randomly initialized weights and biases, the activation function that you choose will be run inside of this compute node function below:
-
- The activation function plays a crucial role in the behavior of your neural network. The compute node function, which we've discussed earlier, calculates the network's output. In each iteration of the training process, we compare this output to the actual data, which, in this case, represents the iris flower type. The difference between the predicted and actual values guides the algorithm in determining how much to adjust the network's weights and biases for better predictions. However, we must be careful to prevent the neural network from memorizing the training data, a problem in machine learning known as overfitting. To address this, we scale down the derivatives computed for weights and biases by the learning rate you specify, ensuring that the network learns in a controlled and meaningful manner. You'll notice that if you use 1 for the learning rate, the graph on loss/epoch is a lot choppier than it is if you have a lower learning rate like 0.01. The smoother the curve, the better. The process repeats for n epochs, then the final results are calculated, and our final weights and biases saved.
-
- ## Results
-
- #### Log Loss
-
- #### Accuracy Score
-
-
-
+ ---
+ title: Backprop Playground
+ emoji: 🔙
+ colorFrom: yellow
+ colorTo: blue
+ sdk: gradio
+ sdk_version: 4.26.0
+ app_file: app.py
+ pinned: false
+ license: mit
+ ---
+
+ This web app uses a neural network framework that I built from scratch in <br>
+ python, using numpy as the only 3rd party library in the framework itself. <br>
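
The README that this commit removes walked through epochs, learning rate, and the backprop loop in prose. A minimal numpy sketch of the weight update it described (hypothetical values, not part of the commit) shows how the learning rate scales each step:

```python
import numpy as np

# One gradient-descent step on a single weight matrix (hypothetical values).
# A larger learning_rate takes a bigger step along the negative gradient,
# which is why the loss-per-epoch curve gets choppier as the rate approaches 1.
learning_rate = 0.01
w = np.array([[0.5, -0.3]])       # current weights
grad_w = np.array([[0.2, -0.1]])  # dLoss/dw from the backward pass
w = w - learning_rate * grad_w
print(w)  # [[ 0.498 -0.299]]
```
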
app.py CHANGED
@@ -1,48 +1,108 @@
- from flask import Flask, request, jsonify, Response
- from flask_cors import CORS
- from nn.nn import NN
- from nn import train as train_nn
- from nn import activation
- import pandas as pd
- import io
-
- app = Flask(__name__)
-
- CORS(app, origins="*")
-
-
- @app.route("/neural-network", methods=["POST"])
- def neural_net():
-     args = request.json
-
-     try:
-         net = NN.from_dict(args)
-     except Exception as e:
-         return Response(
-             response=f"issue with request args: {e}",
-             status=400,
-         )
-
-     try:
-         df = pd.read_csv(io.StringIO(net.data))
-         net.set_df(df=df)
-     except Exception as e:
-         return Response(
-             response=f"error reading csv data: {e}",
-             status=400,
-         )
-
-     try:
-         activation.get_activation(nn=net)
-     except Exception:
-         return Response(
-             response="invalid activation function",
-             status=400,
-         )
-
-     result = train_nn.train(nn=net)
-     return jsonify(result)


  if __name__ == "__main__":
-     app.run()

+ import plotly.express as px
+ from sklearn import datasets
+ from sklearn.preprocessing import StandardScaler, OneHotEncoder
+ from sklearn.model_selection import train_test_split
+ import numpy as np
+ import gradio as gr
+ from vis import iris_3d_scatter
+ import nn # custom neural network module
+
+
+ def _preprocess_iris_data(
+     seed: int,
+ ) -> tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
+     iris = datasets.load_iris()
+     X = iris["data"]
+     y = iris["target"]
+     # normalize the features
+     X = StandardScaler().fit_transform(X)
+     # one hot encode the target variables
+     y = OneHotEncoder().fit_transform(y.reshape(-1, 1)).toarray()
+     return train_test_split(
+         X,
+         y,
+         test_size=0.2,
+         random_state=seed,
+     )
+
+
+ X_train, X_test, y_train, y_test = _preprocess_iris_data(seed=1)
+
+
+ def main(
+     Seed: int = 0,
+     Activation_Func: str = "SoftMax",
+     Loss_Func: str = "CrossEntropy",
+     Epochs: int = 100,
+     Hidden_Size: int = 8,
+     Learning_Rate: float = 0.01,
+ ) -> gr.Plot:
+
+     iris_classifier = nn.NN(
+         epochs=Epochs,
+         learning_rate=Learning_Rate,
+         activation_fn=Activation_Func,
+         loss_fn=Loss_Func,
+         hidden_size=Hidden_Size,
+         input_size=4, # number of features in iris dataset
+         output_size=3, # three classes in iris dataset
+         seed=Seed,
+     )
+
+     iris_classifier.train(X_train=X_train, y_train=y_train)
+     loss_fig = px.line(
+         x=[i for i in range(len(iris_classifier._loss_history))],
+         y=iris_classifier._loss_history,
+     )
+
+     return gr.Plot(loss_fig)


  if __name__ == "__main__":
+     with gr.Blocks() as interface:
+         gr.Markdown("# Backpropagation Playground")
+
+         with gr.Tab("Classification"):
+
+             with gr.Row():
+                 data_plt = iris_3d_scatter()
+                 gr.Plot(data_plt)
+
+             with gr.Row():
+                 seed_input = [gr.Number(minimum=0, label="Random Seed")]
+
+             # inputs in the same row
+             with gr.Row():
+                 with gr.Column():
+                     numeric_inputs = [
+                         gr.Slider(minimum=100, maximum=10_000, step=50, label="Epochs"),
+                         gr.Slider(
+                             minimum=2, maximum=64, step=2, label="Hidden Network Size"
+                         ),
+                         gr.Number(minimum=0.00001, maximum=1.5, label="Learning Rate"),
+                     ]
+                 with gr.Column():
+                     fn_inputs = [
+                         gr.Dropdown(
+                             choices=["SoftMax"], label="Activation Function"
+                         ),
+                         gr.Dropdown(choices=["CrossEntropy"], label="Loss Function"),
+                     ]
+
+             with gr.Row():
+                 train_btn = gr.Button("Train", variant="primary")
+
+             # outputs in row below inputs
+             with gr.Row():
+                 plt_outputs = [gr.Plot()]
+
+             train_btn.click(
+                 fn=main,
+                 inputs=seed_input + fn_inputs + numeric_inputs,
+                 outputs=plt_outputs,
+             )
+
+         with gr.Tab("Regression"):
+             ...
+
+     interface.launch(show_error=True)

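
One detail worth noting in the new `app.py`: `train_btn.click` passes the component values to `main` positionally, so the concatenated list `seed_input + fn_inputs + numeric_inputs` has to line up with `main(Seed, Activation_Func, Loss_Func, Epochs, Hidden_Size, Learning_Rate)`. A minimal sketch of that wiring pattern, with hypothetical components that are not part of this commit:

```python
import gradio as gr

def greet(name: str, excited: bool) -> str:
    # values arrive in the same order as the `inputs` list below
    return f"Hello, {name}{'!' if excited else '.'}"

with gr.Blocks() as demo:
    name_box = gr.Textbox(label="Name")
    excited_box = gr.Checkbox(label="Excited?")
    out = gr.Textbox(label="Greeting")
    gr.Button("Greet").click(fn=greet, inputs=[name_box, excited_box], outputs=out)

if __name__ == "__main__":
    demo.launch()
```
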
example/iris.csv DELETED
@@ -1,151 +0,0 @@
- sepal length,sepal width,petal length,petal width,species
- 5.1,3.5,1.4,0.2,Iris-setosa
- 4.9,3.0,1.4,0.2,Iris-setosa
- 4.7,3.2,1.3,0.2,Iris-setosa
- 4.6,3.1,1.5,0.2,Iris-setosa
- 5.0,3.6,1.4,0.2,Iris-setosa
- 5.4,3.9,1.7,0.4,Iris-setosa
- 4.6,3.4,1.4,0.3,Iris-setosa
- 5.0,3.4,1.5,0.2,Iris-setosa
- 4.4,2.9,1.4,0.2,Iris-setosa
- 4.9,3.1,1.5,0.1,Iris-setosa
- 5.4,3.7,1.5,0.2,Iris-setosa
- 4.8,3.4,1.6,0.2,Iris-setosa
- 4.8,3.0,1.4,0.1,Iris-setosa
- 4.3,3.0,1.1,0.1,Iris-setosa
- 5.8,4.0,1.2,0.2,Iris-setosa
- 5.7,4.4,1.5,0.4,Iris-setosa
- 5.4,3.9,1.3,0.4,Iris-setosa
- 5.1,3.5,1.4,0.3,Iris-setosa
- 5.7,3.8,1.7,0.3,Iris-setosa
- 5.1,3.8,1.5,0.3,Iris-setosa
- 5.4,3.4,1.7,0.2,Iris-setosa
- 5.1,3.7,1.5,0.4,Iris-setosa
- 4.6,3.6,1.0,0.2,Iris-setosa
- 5.1,3.3,1.7,0.5,Iris-setosa
- 4.8,3.4,1.9,0.2,Iris-setosa
- 5.0,3.0,1.6,0.2,Iris-setosa
- 5.0,3.4,1.6,0.4,Iris-setosa
- 5.2,3.5,1.5,0.2,Iris-setosa
- 5.2,3.4,1.4,0.2,Iris-setosa
- 4.7,3.2,1.6,0.2,Iris-setosa
- 4.8,3.1,1.6,0.2,Iris-setosa
- 5.4,3.4,1.5,0.4,Iris-setosa
- 5.2,4.1,1.5,0.1,Iris-setosa
- 5.5,4.2,1.4,0.2,Iris-setosa
- 4.9,3.1,1.5,0.2,Iris-setosa
- 5.0,3.2,1.2,0.2,Iris-setosa
- 5.5,3.5,1.3,0.2,Iris-setosa
- 4.9,3.6,1.4,0.1,Iris-setosa
- 4.4,3.0,1.3,0.2,Iris-setosa
- 5.1,3.4,1.5,0.2,Iris-setosa
- 5.0,3.5,1.3,0.3,Iris-setosa
- 4.5,2.3,1.3,0.3,Iris-setosa
- 4.4,3.2,1.3,0.2,Iris-setosa
- 5.0,3.5,1.6,0.6,Iris-setosa
- 5.1,3.8,1.9,0.4,Iris-setosa
- 4.8,3.0,1.4,0.3,Iris-setosa
- 5.1,3.8,1.6,0.2,Iris-setosa
- 4.6,3.2,1.4,0.2,Iris-setosa
- 5.3,3.7,1.5,0.2,Iris-setosa
- 5.0,3.3,1.4,0.2,Iris-setosa
- 7.0,3.2,4.7,1.4,Iris-versicolor
- 6.4,3.2,4.5,1.5,Iris-versicolor
- 6.9,3.1,4.9,1.5,Iris-versicolor
- 5.5,2.3,4.0,1.3,Iris-versicolor
- 6.5,2.8,4.6,1.5,Iris-versicolor
- 5.7,2.8,4.5,1.3,Iris-versicolor
- 6.3,3.3,4.7,1.6,Iris-versicolor
- 4.9,2.4,3.3,1.0,Iris-versicolor
- 6.6,2.9,4.6,1.3,Iris-versicolor
- 5.2,2.7,3.9,1.4,Iris-versicolor
- 5.0,2.0,3.5,1.0,Iris-versicolor
- 5.9,3.0,4.2,1.5,Iris-versicolor
- 6.0,2.2,4.0,1.0,Iris-versicolor
- 6.1,2.9,4.7,1.4,Iris-versicolor
- 5.6,2.9,3.6,1.3,Iris-versicolor
- 6.7,3.1,4.4,1.4,Iris-versicolor
- 5.6,3.0,4.5,1.5,Iris-versicolor
- 5.8,2.7,4.1,1.0,Iris-versicolor
- 6.2,2.2,4.5,1.5,Iris-versicolor
- 5.6,2.5,3.9,1.1,Iris-versicolor
- 5.9,3.2,4.8,1.8,Iris-versicolor
- 6.1,2.8,4.0,1.3,Iris-versicolor
- 6.3,2.5,4.9,1.5,Iris-versicolor
- 6.1,2.8,4.7,1.2,Iris-versicolor
- 6.4,2.9,4.3,1.3,Iris-versicolor
- 6.6,3.0,4.4,1.4,Iris-versicolor
- 6.8,2.8,4.8,1.4,Iris-versicolor
- 6.7,3.0,5.0,1.7,Iris-versicolor
- 6.0,2.9,4.5,1.5,Iris-versicolor
- 5.7,2.6,3.5,1.0,Iris-versicolor
- 5.5,2.4,3.8,1.1,Iris-versicolor
- 5.5,2.4,3.7,1.0,Iris-versicolor
- 5.8,2.7,3.9,1.2,Iris-versicolor
- 6.0,2.7,5.1,1.6,Iris-versicolor
- 5.4,3.0,4.5,1.5,Iris-versicolor
- 6.0,3.4,4.5,1.6,Iris-versicolor
- 6.7,3.1,4.7,1.5,Iris-versicolor
- 6.3,2.3,4.4,1.3,Iris-versicolor
- 5.6,3.0,4.1,1.3,Iris-versicolor
- 5.5,2.5,4.0,1.3,Iris-versicolor
- 5.5,2.6,4.4,1.2,Iris-versicolor
- 6.1,3.0,4.6,1.4,Iris-versicolor
- 5.8,2.6,4.0,1.2,Iris-versicolor
- 5.0,2.3,3.3,1.0,Iris-versicolor
- 5.6,2.7,4.2,1.3,Iris-versicolor
- 5.7,3.0,4.2,1.2,Iris-versicolor
- 5.7,2.9,4.2,1.3,Iris-versicolor
- 6.2,2.9,4.3,1.3,Iris-versicolor
- 5.1,2.5,3.0,1.1,Iris-versicolor
- 5.7,2.8,4.1,1.3,Iris-versicolor
- 6.3,3.3,6.0,2.5,Iris-virginica
- 5.8,2.7,5.1,1.9,Iris-virginica
- 7.1,3.0,5.9,2.1,Iris-virginica
- 6.3,2.9,5.6,1.8,Iris-virginica
- 6.5,3.0,5.8,2.2,Iris-virginica
- 7.6,3.0,6.6,2.1,Iris-virginica
- 4.9,2.5,4.5,1.7,Iris-virginica
- 7.3,2.9,6.3,1.8,Iris-virginica
- 6.7,2.5,5.8,1.8,Iris-virginica
- 7.2,3.6,6.1,2.5,Iris-virginica
- 6.5,3.2,5.1,2.0,Iris-virginica
- 6.4,2.7,5.3,1.9,Iris-virginica
- 6.8,3.0,5.5,2.1,Iris-virginica
- 5.7,2.5,5.0,2.0,Iris-virginica
- 5.8,2.8,5.1,2.4,Iris-virginica
- 6.4,3.2,5.3,2.3,Iris-virginica
- 6.5,3.0,5.5,1.8,Iris-virginica
- 7.7,3.8,6.7,2.2,Iris-virginica
- 7.7,2.6,6.9,2.3,Iris-virginica
- 6.0,2.2,5.0,1.5,Iris-virginica
- 6.9,3.2,5.7,2.3,Iris-virginica
- 5.6,2.8,4.9,2.0,Iris-virginica
- 7.7,2.8,6.7,2.0,Iris-virginica
- 6.3,2.7,4.9,1.8,Iris-virginica
- 6.7,3.3,5.7,2.1,Iris-virginica
- 7.2,3.2,6.0,1.8,Iris-virginica
- 6.2,2.8,4.8,1.8,Iris-virginica
- 6.1,3.0,4.9,1.8,Iris-virginica
- 6.4,2.8,5.6,2.1,Iris-virginica
- 7.2,3.0,5.8,1.6,Iris-virginica
- 7.4,2.8,6.1,1.9,Iris-virginica
- 7.9,3.8,6.4,2.0,Iris-virginica
- 6.4,2.8,5.6,2.2,Iris-virginica
- 6.3,2.8,5.1,1.5,Iris-virginica
- 6.1,2.6,5.6,1.4,Iris-virginica
- 7.7,3.0,6.1,2.3,Iris-virginica
- 6.3,3.4,5.6,2.4,Iris-virginica
- 6.4,3.1,5.5,1.8,Iris-virginica
- 6.0,3.0,4.8,1.8,Iris-virginica
- 6.9,3.1,5.4,2.1,Iris-virginica
- 6.7,3.1,5.6,2.4,Iris-virginica
- 6.9,3.1,5.1,2.3,Iris-virginica
- 5.8,2.7,5.1,1.9,Iris-virginica
- 6.8,3.2,5.9,2.3,Iris-virginica
- 6.7,3.3,5.7,2.5,Iris-virginica
- 6.7,3.0,5.2,2.3,Iris-virginica
- 6.3,2.5,5.0,1.9,Iris-virginica
- 6.5,3.0,5.2,2.0,Iris-virginica
- 6.2,3.4,5.4,2.3,Iris-virginica
- 5.9,3.0,5.1,1.8,Iris-virginica

example/main.py DELETED
@@ -1,35 +0,0 @@
- import requests
-
- with open("mushrooms.csv", "rb") as csv:
-     data = csv.read()
-
- # class,cap-shape,cap-surface,cap-color,bruises,odor,gill-attachment,gill-spacing,gill-size,gill-color,stalk-shape,stalk-root,stalk-surface-above-ring,stalk-surface-below-ring,stalk-color-above-ring,stalk-color-below-ring,veil-type,veil-color,ring-number,ring-type,spore-print-color,population,habitat
-
- ARGS = {
-     "epochs": 1_000,
-     "hidden_size": 8,
-     "learning_rate": 0.0001,
-     "test_size": 0.1,
-     "activation": "relu",
-     "features": [
-         "cap-shape",
-         "cap-surface",
-         "cap-color",
-         "bruises",
-         "odor",
-         "gill-attachment",
-         "gill-spacing",
-         "gill-size",
-         "gill-color",
-     ],
-     "target": "class",
-     "data": data.decode("utf-8"),
- }
-
- if __name__ == "__main__":
-     r = requests.post(
-         "http://127.0.0.1:5000/neural-network",
-         json=ARGS, # Send the data as a JSON object
-     )
-
-     print(r.text)

example/mushrooms.csv DELETED
The diff for this file is too large to render.
 
ml-vis DELETED
@@ -1 +0,0 @@
- Subproject commit bebe25b27a895c1de71743fbf808b8e592e80806

nn/__init__.py ADDED
@@ -0,0 +1,3 @@
+ from nn.nn import NN
+ from nn.activation import ACTIVATIONS
+ from nn.loss import LOSSES

nn/activation.py CHANGED
@@ -1,46 +1,59 @@
- from typing import Callable
- from nn.nn import NN
  import numpy as np


- def get_activation(nn: NN) -> Callable:
-     a = nn.activation
-     funcs = {
-         "relu": relu,
-         "sigmoid": sigmoid,
-         "tanh": tanh,
-     }
-
-     prime_funcs = {
-         "sigmoid": sigmoid_prime,
-         "tanh": tanh_prime,
-         "relu": relu_prime,
-     }
-
-     nn.set_func(funcs[a])
-     nn.set_func_prime(prime_funcs[a])


- def relu(x):
-     return np.maximum(0.0, x)


- def relu_prime(x):
-     return np.maximum(0, x)


- def sigmoid(x):
-     return 1.0 / (1.0 + np.exp(-x))


- def sigmoid_prime(x):
-     s = sigmoid(x)
-     return s * (1 - s)


- def tanh(x):
-     return np.tanh(x)


- def tanh_prime(x):
-     return 1 - np.tanh(x)**2

  import numpy as np
+ from abc import abstractmethod, ABC
+
+
+ __all__ = ["Activation", "Relu", "TanH", "Sigmoid", "SoftMax", "ACTIVATIONS"]
+
+
+ class Activation(ABC):
+     @abstractmethod
+     def forward(self, X: np.ndarray) -> np.ndarray:
+         pass
+
+     @abstractmethod
+     def backward(self, X: np.ndarray) -> np.ndarray:
+         pass
+
+
+ class Relu(Activation):
+     def forward(self, X: np.ndarray) -> np.ndarray:
+         return np.maximum(0, X)
+
+     def backward(self, X: np.ndarray) -> np.ndarray:
+         return np.where(X > 0, 1, 0)
+
+
+ class TanH(Activation):
+     def forward(self, X: np.ndarray) -> np.ndarray:
+         return np.tanh(X)
+
+     def backward(self, X: np.ndarray) -> np.ndarray:
+         return 1 - self.forward(X) ** 2
+
+
+ class Sigmoid(Activation):
+     def forward(self, X: np.ndarray) -> np.ndarray:
+         return 1.0 / (1.0 + np.exp(-X))
+
+     def backward(self, X: np.ndarray) -> np.ndarray:
+         s = self.forward(X)
+         return s - (1 - s)
+
+
+ class SoftMax(Activation):
+     def forward(self, X: np.ndarray) -> np.ndarray:
+         exps = np.exp(
+             X - np.max(X, axis=1, keepdims=True)
+         ) # Avoid numerical instability
+         return exps / np.sum(exps, axis=1, keepdims=True)
+
+     def backward(self, X: np.ndarray) -> np.ndarray:
+         return X
+
+
+ ACTIVATIONS: dict[str, Activation] = {
+     "Relu": Relu(),
+     "Sigmoid": Sigmoid(),
+     "Tanh": TanH(),
+     "SoftMax": SoftMax(),
+ }

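
A quick sanity check of the new activation classes, assuming the repo's `nn` package is importable (as it is in `app.py`); the input values here are hypothetical:

```python
import numpy as np
from nn.activation import SoftMax, Relu  # assumes the repo's nn package is on the path

X = np.array([[2.0, 1.0, 0.1],
              [-1.0, 0.0, 3.0]])

print(SoftMax().forward(X).sum(axis=1))  # each row sums to 1.0

relu = Relu()
print(relu.forward(np.array([[-2.0, 3.0]])))   # [[0. 3.]]
print(relu.backward(np.array([[-2.0, 3.0]])))  # [[0 1]] -- gradient mask
```
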
nn/loss.py ADDED
@@ -0,0 +1,50 @@
+ from abc import ABC, abstractmethod
+ from nn.activation import SoftMax
+ import numpy as np
+
+
+ __all__ = ["Loss", "MSE", "CrossEntropy", "LOSSES"]
+
+
+ class Loss(ABC):
+     @abstractmethod
+     def forward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+         pass
+
+     @abstractmethod
+     def backward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+         pass
+
+
+ class MSE(Loss):
+     def forward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+         return np.sum(np.square(y_hat - y_true)) / y_true.shape[0]
+
+     def backward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+         return (y_hat - y_true) * (2 / y_true.shape[0])
+
+
+ class CrossEntropy(Loss):
+     def forward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+         y_hat = np.asarray(y_hat)
+         y_true = np.asarray(y_true)
+         m = y_true.shape[0]
+         p = self._softmax(y_hat)
+         log_likelihood = -np.log(p[range(m), y_true.argmax(axis=1)])
+         loss = np.sum(log_likelihood) / m
+         return loss
+
+     def backward(self, y_hat: np.ndarray, y_true: np.ndarray) -> np.ndarray:
+         y_hat = np.asarray(y_hat)
+         y_true = np.asarray(y_true)
+         return (y_hat - y_true) / y_true.shape[0]
+
+     @staticmethod
+     def _softmax(X: np.ndarray) -> np.ndarray:
+         return SoftMax().forward(X)
+
+
+ LOSSES: dict[str, Loss] = {
+     "MSE": MSE(),
+     "CrossEntropy": CrossEntropy(),
+ }

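
The loss classes expose the same forward/backward interface as the activations. A small hand-checkable sketch using `MSE` (hypothetical inputs; `LOSSES["CrossEntropy"]` is looked up the same way by `NN`):

```python
import numpy as np
from nn.loss import MSE  # assumes the repo's nn package is on the path

y_hat = np.array([[0.9], [0.2]])
y_true = np.array([[1.0], [0.0]])

mse = MSE()
print(mse.forward(y_hat, y_true))   # (0.01 + 0.04) / 2 = 0.025
print(mse.backward(y_hat, y_true))  # (y_hat - y_true) * (2 / 2) -> [[-0.1], [0.2]]
```
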
nn/nn.py CHANGED
@@ -1,63 +1,163 @@
- from typing import Callable
- from sklearn.preprocessing import StandardScaler
- import pandas as pd


  class NN:
      def __init__(
          self,
          epochs: int,
-         hidden_size: int,
          learning_rate: float,
-         test_size: float,
-         activation: str,
-         features: list[str],
-         target: str,
-         data: str,
-     ):
          self.epochs = epochs
-         self.hidden_size = hidden_size
          self.learning_rate = learning_rate
-         self.test_size = test_size
-         self.activation = activation
-         self.features = features
-         self.target = target
-         self.data = data
-
-         self.loss_hist: list[float] = None
-         self.func_prime: Callable = None
-         self.func: Callable = None
-         self.X: pd.DataFrame = None
-         self.y: pd.DataFrame = None
-         self.y_dummy: pd.DataFrame = None
-         self.input_size: int = None
-         self.output_size: int = None
-
-     def set_df(self, df: pd.DataFrame) -> None:
-         assert isinstance(df, pd.DataFrame)
-         x = df[self.features]
-         y = df[self.target]
-         self.X = pd.get_dummies(x, columns=self.features)
-         self.y_dummy = pd.get_dummies(y, columns=self.target)
-         self.input_size = len(self.X.columns)
-         self.output_size = len(self.y_dummy.columns)
-
-     def normalize(self):
-         scaler = StandardScaler()
-         self.y_dummy = scaler.fit_transform(self.y_dummy)
-         self.X = scaler.fit_transform(self.X)
-
-     def set_func(self, f: Callable) -> None:
-         assert isinstance(f, Callable)
-         self.func = f
-
-     def set_func_prime(self, f: Callable) -> None:
-         assert isinstance(f, Callable)
-         self.func_prime = f
-
-     @classmethod
-     def from_dict(cls, dct):
-         """ Creates an instance of NN given a dictionary
-         we can use this to make sure that the arguments are right
          """
-         return cls(**dct)

+ from typing import Optional
+ from nn.activation import ACTIVATIONS, Activation
+ from nn.loss import LOSSES, Loss
+ import numpy as np
+
+ import gradio as gr
+
+
+ DTYPE = np.float32


  class NN:
      def __init__(
          self,
          epochs: int,
          learning_rate: float,
+         hidden_size: int,
+         input_size: int,
+         output_size: int,
+         activation_fn: str,
+         loss_fn: str,
+         seed: int,
+     ) -> None:
          self.epochs = epochs
          self.learning_rate = learning_rate
+         self.hidden_size = hidden_size
+         self.input_size = input_size
+         self.output_size = output_size
+         self.seed = seed
+
+         # try to get activation function and loss function
+         act_fn = ACTIVATIONS.get(activation_fn, None)
+         if act_fn is None:
+             raise KeyError(f"Invalid Activation function '{activation_fn}'")
+         loss_fn = LOSSES.get(loss_fn, None)
+         if loss_fn is None:
+             raise KeyError(f"Invalid Activation function '{activation_fn}'")
+         self._activation_fn: Activation = act_fn
+         self._loss_fn: Loss = loss_fn
+
+         self._loss_history = list()
+         self._weight_history = {
+             "wo": [],
+             "wh": [],
+             "bo": [],
+             "bh": [],
+         }
+
+         self._wo: Optional[np.ndarray] = None
+         self._wh: Optional[np.ndarray] = None
+         self._bo: Optional[np.ndarray] = None
+         self._bh: Optional[np.ndarray] = None
+         self._init_weights_and_biases()
+
+     def _init_weights_and_biases(self) -> None:
+         """
+         NN._init_weights_and_biases(): Should only be run once, right before the training loop,
+         in order to initialize the weights and biases randomly.
+
+         params:
+             NN object with hidden layer size, output size, and input size
+             defined.
+
+         returns:
+             self, modifies _bh, _bo, _wo, _wh NN attributes in place.
+         """
+         np.random.seed(self.seed)
+         self._bh = np.zeros((1, self.hidden_size), dtype=DTYPE)
+         self._bo = np.zeros((1, self.output_size), dtype=DTYPE)
+         self._wh = np.asarray(
+             np.random.randn(self.input_size, self.hidden_size)
+             * np.sqrt(2 / self.input_size),
+             dtype=DTYPE,
+         )
+         self._wo = np.asarray(
+             np.random.randn(self.hidden_size, self.output_size)
+             * np.sqrt(2 / self.hidden_size),
+             dtype=DTYPE,
+         )
+         return
+
+     def _forward(self, X_train: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
          """
+         _forward(X_train): run as the first step of each epoch during training.
+
+         params:
+             X_train: np.ndarray -> data that we are training the NN on.
+
+         returns:
+             output layer np array containing the predicted outputs calculated using
+             the weights and biases of the current epoch.
+         """
+         assert self._activation_fn is not None
+
+         # hidden layer
+         hidden_layer_output = self._activation_fn.forward(
+             np.dot(X_train, self._wh) + self._bh
+         )
+         # output layer (prediction layer)
+         y_hat = self._activation_fn.forward(
+             np.dot(hidden_layer_output, self._wo) + self._bo
+         )
+         return y_hat, hidden_layer_output
+
+     def _backward(
+         self,
+         X_train: np.ndarray,
+         y_hat: np.ndarray,
+         y_train: np.ndarray,
+         hidden_output: np.ndarray,
+     ) -> None:
+         assert self._activation_fn is not None
+         assert self._wo is not None
+         assert self._loss_fn is not None
+
+         # Calculate the error at the output
+         # This should be the derivative of the loss function with respect to the output of the network
+         error_output = self._loss_fn.backward(
+             y_hat, y_train
+         ) * self._activation_fn.backward(y_hat)
+
+         # Calculate gradients for output layer weights and biases
+         wo_prime = np.dot(hidden_output.T, error_output) * self.learning_rate
+         bo_prime = np.sum(error_output, axis=0, keepdims=True) * self.learning_rate
+
+         # Propagate the error back to the hidden layer
+         error_hidden = np.dot(error_output, self._wo.T) * self._activation_fn.backward(
+             hidden_output
+         )
+
+         # Calculate gradients for hidden layer weights and biases
+         wh_prime = np.dot(X_train.T, error_hidden) * self.learning_rate
+         bh_prime = np.sum(error_hidden, axis=0, keepdims=True) * self.learning_rate
+
+         # Update weights and biases
+         self._wo -= wo_prime
+         self._wh -= wh_prime
+         self._bo -= bo_prime
+         self._bh -= bh_prime
+
+     def train(self, X_train: np.ndarray, y_train: np.ndarray) -> "NN":
+         assert self._loss_fn is not None
+
+         for _ in gr.Progress().tqdm(range(self.epochs)):
+             y_hat, hidden_output = self._forward(X_train=X_train)
+             loss = self._loss_fn.forward(y_hat=y_hat, y_true=y_train)
+             self._loss_history.append(loss)
+             self._backward(
+                 X_train=X_train,
+                 y_hat=y_hat,
+                 y_train=y_train,
+                 hidden_output=hidden_output,
+             )
+
+             # keep track of weights and biases at each epoch for visualization
+             self._weight_history["wo"].append(self._wo[0, 0])
+             self._weight_history["wh"].append(self._wh[0, 0])
+             self._weight_history["bo"].append(self._bo[0, 0])
+             self._weight_history["bh"].append(self._bh[0, 0])
+         return self
+
+     def predict(self, X_test: np.ndarray) -> np.ndarray:
+         return self._forward(X_train=X_test)[0]

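
The rewritten `NN` is a single-hidden-layer network: `_forward` computes `activation(X @ wh + bh)` for the hidden layer, then the same thing again with `wo`/`bo` for the output. A plain-numpy shape sketch of that pass on iris-sized arrays (hypothetical values; it mirrors `_forward` rather than calling it, so it runs without a Gradio context):

```python
import numpy as np

rng = np.random.default_rng(0)
n, input_size, hidden_size, output_size = 5, 4, 8, 3

X = rng.normal(size=(n, input_size))
# He-style scaling, matching _init_weights_and_biases
wh = rng.normal(size=(input_size, hidden_size)) * np.sqrt(2 / input_size)
wo = rng.normal(size=(hidden_size, output_size)) * np.sqrt(2 / hidden_size)
bh = np.zeros((1, hidden_size))
bo = np.zeros((1, output_size))

hidden = np.tanh(X @ wh + bh)     # stand-in activation
y_hat = np.tanh(hidden @ wo + bo)
print(hidden.shape, y_hat.shape)  # (5, 8) (5, 3)
```
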
nn/test.py ADDED
@@ -0,0 +1,30 @@
+ from nn.nn import NN
+ import unittest
+
+ TEST_NN = NN(
+     epochs=100,
+     learning_rate=0.001,
+     hidden_size=8,
+     input_size=2,
+     output_size=1,
+     activation_fn="Sigmoid",
+     loss_fn="MSE",
+ )
+
+
+ class TestNN(unittest.TestCase):
+     def test_init_w_b(self) -> None:
+         return
+
+     def test_forward(self) -> None:
+         return
+
+     def test_backward(self) -> None:
+         return
+
+     def test_train(self) -> None:
+         return
+
+
+ if __name__ == "__main__":
+     unittest.main()
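
The test methods are still stubs. One way a shape check could be filled in, as a hypothetical sketch (note that `NN.__init__` also takes a `seed` argument, which is passed explicitly here):

```python
import unittest
from nn import NN  # assumes the repo's nn package is importable

class TestNNShapes(unittest.TestCase):
    """Hypothetical shape checks that could be dropped into nn/test.py."""

    def test_init_w_b(self) -> None:
        net = NN(
            epochs=10, learning_rate=0.001, hidden_size=8,
            input_size=2, output_size=1,
            activation_fn="Sigmoid", loss_fn="MSE", seed=0,
        )
        self.assertEqual(net._wh.shape, (2, 8))  # (input_size, hidden_size)
        self.assertEqual(net._wo.shape, (8, 1))  # (hidden_size, output_size)
        self.assertEqual(net._bh.shape, (1, 8))
        self.assertEqual(net._bo.shape, (1, 1))

if __name__ == "__main__":
    unittest.main()
```
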
nn/train.py DELETED
@@ -1,127 +0,0 @@
- from sklearn.model_selection import train_test_split
- from sklearn.metrics import log_loss
- from typing import Callable
- from nn.nn import NN
- import numpy as np
-
-
- def init_weights_biases(nn: NN):
-     np.random.seed(0)
-     bh = np.zeros((1, nn.hidden_size))
-     bo = np.zeros((1, nn.output_size))
-     wh = np.random.randn(nn.input_size, nn.hidden_size) * \
-         np.sqrt(2 / nn.input_size)
-     wo = np.random.randn(nn.hidden_size, nn.output_size) * \
-         np.sqrt(2 / nn.hidden_size)
-     return wh, wo, bh, bo
-
-
- def train(nn: NN) -> dict:
-     wh, wo, bh, bo = init_weights_biases(nn=nn)
-
-     X_train, X_test, y_train, y_test = train_test_split(
-         nn.X.to_numpy(),
-         nn.y_dummy.to_numpy(),
-         test_size=nn.test_size,
-         random_state=0,
-     )
-
-     accuracy_scores = []
-     loss_hist: list[float] = []
-     for _ in range(nn.epochs):
-         # compute hidden output
-         hidden_output = compute_node(
-             data=X_train,
-             weights=wh,
-             biases=bh,
-             func=nn.func,
-         )
-
-         # compute output layer
-         y_hat = compute_node(
-             data=hidden_output,
-             weights=wo,
-             biases=bo,
-             func=nn.func,
-         )
-         # compute error & store it
-         error = y_hat - y_train
-         loss = log_loss(y_true=y_train, y_pred=y_hat)
-         accuracy = accuracy_score(y_true=y_train, y_pred=y_hat)
-         accuracy_scores.append(accuracy)
-         loss_hist.append(loss)
-
-         # compute derivatives of weights & biases
-         # update weights & biases using gradient descent after
-         # computing derivatives.
-         dwo = nn.learning_rate * output_weight_prime(hidden_output, error)
-
-         # Use NumPy to sum along the first axis (axis=0)
-         # and then reshape to match the shape of bo
-         dbo = nn.learning_rate * np.sum(output_bias_prime(error), axis=0)
-
-         dhidden = np.dot(error, wo.T) * nn.func_prime(hidden_output)
-         dwh = nn.learning_rate * hidden_weight_prime(X_train, dhidden)
-         dbh = nn.learning_rate * hidden_bias_prime(dhidden)
-
-         wh -= dwh
-         wo -= dwo
-         bh -= dbh
-         bo -= dbo
-
-     # compute final predictions on data not seen
-     hidden_output_test = compute_node(
-         data=X_test,
-         weights=wh,
-         biases=bh,
-         func=nn.func,
-     )
-     y_hat = compute_node(
-         data=hidden_output_test,
-         weights=wo,
-         biases=bo,
-         func=nn.func,
-     )
-
-     return {
-         "loss_hist": loss_hist,
-         "log_loss": log_loss(y_true=y_test, y_pred=y_hat),
-         "accuracy_scores": accuracy_scores,
-         "test_accuracy": accuracy_score(y_true=y_test, y_pred=y_hat)
-     }
-
-
- def compute_node(data: np.array, weights: np.array, biases: np.array, func: Callable) -> np.array:
-     return func(np.dot(data, weights) + biases)
-
-
- def mean_squared_error(y: np.array, y_hat: np.array) -> np.array:
-     return np.mean((y - y_hat) ** 2)
-
-
- def hidden_bias_prime(error):
-     return np.sum(error, axis=0)
-
-
- def output_bias_prime(error):
-     return np.sum(error, axis=0)
-
-
- def hidden_weight_prime(data, error):
-     return np.dot(data.T, error)
-
-
- def output_weight_prime(hidden_output, error):
-     return np.dot(hidden_output.T, error)
-
-
- def accuracy_score(y_true, y_pred):
-     # Ensure y_true and y_pred have the same shape
-     if y_true.shape != y_pred.shape:
-         raise ValueError("Input shapes do not match.")
-
-     # Calculate the accuracy
-     num_samples = len(y_true)
-     num_correct = np.sum(y_true == y_pred)
-
-     return num_correct / num_samples

requirements.txt CHANGED
@@ -1,8 +1,4 @@
- Flask==2.2.3
- numpy==1.25.2
- pandas==1.5.3
- requests==2.28.2
- scikit_learn==1.3.1
- gunicorn==21.2.0
- Werkzeug==2.2.2
- Flask_Cors==3.0.10
+ gradio==4.26.0
+ numpy==1.26.4
+ plotly==5.20.0
+ scikit_learn==1.4.1.post1

vis.py ADDED
@@ -0,0 +1,20 @@
+ import plotly.express as px
+ from sklearn import datasets
+ from sklearn.preprocessing import StandardScaler, OneHotEncoder
+ import numpy as np
+ import os
+
+
+ def iris_3d_scatter():
+     df = px.data.iris()
+     fig = px.scatter_3d(
+         df,
+         x="sepal_length",
+         y="sepal_width",
+         z="petal_width",
+         color="species",
+         size="petal_length",
+         size_max=18,
+     )
+     fig.update_layout(margin=dict(l=0, r=0, b=0, t=0))
+     return fig
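
For poking at the scatter plot outside the Gradio app, the figure can be rendered directly (a minimal sketch):

```python
from vis import iris_3d_scatter

if __name__ == "__main__":
    iris_3d_scatter().show()  # opens the Plotly figure in a browser tab
```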