diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md
new file mode 100644
index 0000000..dd84ea7
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,38 @@
+---
+name: Bug report
+about: Create a report to help us improve
+title: ''
+labels: ''
+assignees: ''
+
+---
+
+**Describe the bug**
+A clear and concise description of what the bug is.
+
+**To Reproduce**
+Steps to reproduce the behavior:
+1. Go to '...'
+2. Click on '....'
+3. Scroll down to '....'
+4. See error
+
+**Expected behavior**
+A clear and concise description of what you expected to happen.
+
+**Screenshots**
+If applicable, add screenshots to help explain your problem.
+
+**Desktop (please complete the following information):**
+ - OS: [e.g. iOS]
+ - Browser [e.g. chrome, safari]
+ - Version [e.g. 22]
+
+**Smartphone (please complete the following information):**
+ - Device: [e.g. iPhone6]
+ - OS: [e.g. iOS8.1]
+ - Browser [e.g. stock browser, safari]
+ - Version [e.g. 22]
+
+**Additional context**
+Add any other context about the problem here.
diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md
new file mode 100644
index 0000000..bbcbbe7
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,20 @@
+---
+name: Feature request
+about: Suggest an idea for this project
+title: ''
+labels: ''
+assignees: ''
+
+---
+
+**Is your feature request related to a problem? Please describe.**
+A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
+
+**Describe the solution you'd like**
+A clear and concise description of what you want to happen.
+
+**Describe alternatives you've considered**
+A clear and concise description of any alternative solutions or features you've considered.
+
+**Additional context**
+Add any other context or screenshots about the feature request here.
diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md
new file mode 100644
index 0000000..5f2a828
--- /dev/null
+++ b/CODE_OF_CONDUCT.md
@@ -0,0 +1,76 @@
+# Contributor Covenant Code of Conduct
+
+## Our Pledge
+
+In the interest of fostering an open and welcoming environment, we as
+contributors and maintainers pledge to making participation in our project and
+our community a harassment-free experience for everyone, regardless of age, body
+size, disability, ethnicity, sex characteristics, gender identity and expression,
+level of experience, education, socio-economic status, nationality, personal
+appearance, race, religion, or sexual identity and orientation.
+
+## Our Standards
+
+Examples of behavior that contributes to creating a positive environment
+include:
+
+* Using welcoming and inclusive language
+* Being respectful of differing viewpoints and experiences
+* Gracefully accepting constructive criticism
+* Focusing on what is best for the community
+* Showing empathy towards other community members
+
+Examples of unacceptable behavior by participants include:
+
+* The use of sexualized language or imagery and unwelcome sexual attention or
+  advances
+* Trolling, insulting/derogatory comments, and personal or political attacks
+* Public or private harassment
+* Publishing others' private information, such as a physical or electronic
+  address, without explicit permission
+* Other conduct which could reasonably be considered inappropriate in a
+  professional setting
+
+## Our Responsibilities
+
+Project maintainers are responsible for clarifying the standards of acceptable
+behavior and are expected to take appropriate and fair corrective action in
+response to any instances of unacceptable behavior.
+
+Project maintainers have the right and responsibility to remove, edit, or
+reject comments, commits, code, wiki edits, issues, and other contributions
+that are not aligned to this Code of Conduct, or to ban temporarily or
+permanently any contributor for other behaviors that they deem inappropriate,
+threatening, offensive, or harmful.
+
+## Scope
+
+This Code of Conduct applies both within project spaces and in public spaces
+when an individual is representing the project or its community. Examples of
+representing a project or community include using an official project e-mail
+address, posting via an official social media account, or acting as an appointed
+representative at an online or offline event. Representation of a project may be
+further defined and clarified by project maintainers.
+
+## Enforcement
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may be
+reported by contacting the project team at ryan.dsilva.98@gmail.com. All
+complaints will be reviewed and investigated and will result in a response that
+is deemed necessary and appropriate to the circumstances. The project team is
+obligated to maintain confidentiality with regard to the reporter of an incident.
+Further details of specific enforcement policies may be posted separately.
+
+Project maintainers who do not follow or enforce the Code of Conduct in good
+faith may face temporary or permanent repercussions as determined by other
+members of the project's leadership.
+
+## Attribution
+
+This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
+available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
+
+[homepage]: https://www.contributor-covenant.org
+
+For answers to common questions about this code of conduct, see
+https://www.contributor-covenant.org/faq
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..888972d
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,8 @@
+# Contributing
+
+Keeping this simple. To contribute:
+1. Create an issue with the feature request/bug, or ask to be assigned one of the existing issues.
+2. Create a new branch from develop and make all your changes in that branch.
+3. Open a pull request into develop, adding either [@RyanDsilva](https://github.com/RyanDsilva) or [@sanfernoronha](https://github.com/sanfernoronha) as a reviewer.
+
+Thanks for contributing! Let's make this project help thousands of people get started with Neural Networks.
diff --git a/README.md b/README.md
index 6d1f878..5430d92 100644
--- a/README.md
+++ b/README.md
@@ -66,20 +66,31 @@ True Values:
 
 ## Roadmap 📑
 
-- [x] Basic Activation Functions
-- [x] Basic Loss Functions
-- [x] Gradient Descent
+- [ ] Activation Functions
+  - [x] Linear
+  - [x] Sigmoid
+  - [x] Tanh
+  - [x] ReLu
+  - [ ] LeakyReLu
+  - [ ] SoftMax
+  - [ ] GeLu
+- [ ] Loss Functions
+  - [x] MAE
+  - [x] MSE
+  - [ ] CrossEntropy
+- [ ] Optimizers
+  - [x] Gradient Descent
+  - [x] Gradient Descent w/ Momentum
+  - [ ] Nesterov Accelerated Gradient
+  - [ ] RMSProp
+  - [ ] Adam
+- [ ] Regularization
+  - [ ] L1
+  - [ ] L2
+  - [ ] Dropout
 - [x] Layer Architecture
 - [x] Wrapper Classes
 - [x] Hyperparameters Configuration
-- [ ] Exotic Functions
-  - [ ] SoftMax Activation
-  - [ ] Gradient Descent w/ Momentum
-  - [ ] RMSProp Optimizer
-  - [ ] Adam Optimizer
-  - [ ] CrossEntropy Loss Function
-  - [ ] GeLu Activation
-- [ ] Regularization
 - [ ] Clean Architecture
 - [ ] UI (Similar to Tensorflow Playground)
 
@@ -87,6 +98,15 @@ True Values:
 
 ###### Collaborations in implementing and maintaining this project are welcome. Kindly reach out to me if interested.
 
+## Contributors 🌟
+
+
+
+
+
+
+
+
 ## References 📚
 - Deep Learning Specialization, Andrew NG - Coursera
 
diff --git a/core/dense.py b/core/dense.py
index 1762e7d..5dde8c9 100644
--- a/core/dense.py
+++ b/core/dense.py
@@ -9,6 +9,8 @@ class Dense(Layer):
     def __init__(self, input_size, output_size):
         self.weights = np.random.rand(input_size, output_size) - 0.5
         self.bias = np.random.rand(1, output_size) - 0.5
+        self.vW = np.zeros([input_size, output_size])
+        self.vB = np.zeros([1, output_size])
 
     def forward_propagation(self, input_data):
         self.input = input_data
@@ -19,8 +21,12 @@ def backward_propagation(self, output_error, optimizer_fn, learning_rate):
         input_error = np.dot(output_error, self.weights.T)
         dW = np.dot(self.input.T, output_error)
         dB = output_error
-        w_updated, b_updated = optimizer_fn(
-            self.weights, self.bias, dW, dB, learning_rate)
+
+        w_updated, b_updated, vW_updated, vB_updated = optimizer_fn.minimize(
+            self.weights, self.bias, dW, dB, self.vW, self.vB, learning_rate
+        )
         self.weights = w_updated
         self.bias = b_updated
+        self.vW = vW_updated
+        self.vB = vB_updated
         return input_error
diff --git a/main.py b/main.py
index 82e975e..59c408a 100644
--- a/main.py
+++ b/main.py
@@ -1,4 +1,5 @@
 import numpy as np
+import time
 
 import config
 from core.network import Network
@@ -6,26 +7,26 @@
 from core.activation_layer import Activation
 from activations.activation_functions import Tanh, dTanh
 from loss.loss_functions import MSE, dMSE
-from optimizers.optimizer_functions import GradientDescent
+from optimizers.optimizer_functions import Momentum
 
 from keras.datasets import mnist
 from keras.utils import np_utils
 
 # Load MNIST
 (x_train, y_train), (x_test, y_test) = mnist.load_data()
-x_train = x_train.reshape(x_train.shape[0], 1, 28*28)
-x_train = x_train.astype('float32')
+x_train = x_train.reshape(x_train.shape[0], 1, 28 * 28)
+x_train = x_train.astype("float32")
 x_train /= 255
 y_train = np_utils.to_categorical(y_train)
 
-x_test = x_test.reshape(x_test.shape[0], 1, 28*28)
-x_test = x_test.astype('float32')
+x_test = x_test.reshape(x_test.shape[0], 1, 28 * 28)
+x_test = x_test.astype("float32")
 x_test /= 255
 y_test = np_utils.to_categorical(y_test)
 
 # Model
 nn = Network()
-nn.add(Dense(28*28, 100))
+nn.add(Dense(28 * 28, 100))
 nn.add(Activation(Tanh, dTanh))
 nn.add(Dense(100, 50))
 nn.add(Activation(Tanh, dTanh))
@@ -33,10 +34,12 @@ nn.add(Activation(Tanh, dTanh))
 
 # Training
+
 nn.useLoss(MSE, dMSE)
-nn.useOptimizer(GradientDescent, learning_rate=config.learning_rate)
+nn.useOptimizer(Momentum(), learning_rate=config.learning_rate)
 
 nn.fit(x_train[0:2000], y_train[0:2000], epochs=config.epochs)
 
+
 # Prediction
 out = nn.predict(x_test[0:2])
 print("\nPredicted Values: ")
diff --git a/optimizers/README.md b/optimizers/README.md
index a1bf819..1e12546 100644
--- a/optimizers/README.md
+++ b/optimizers/README.md
@@ -18,6 +18,8 @@ Optimizer Functions help us update the parameters in the most efficient way poss
 
 
 
+`vdW: accumulator for weight parameter | beta: momentum term (dampening factor) | dJ/dW: weights gradient (obtained from loss function)`
+
 
 - RMSProp
 
diff --git a/optimizers/optimizer_functions.py b/optimizers/optimizer_functions.py
index ecc2efb..3936221 100644
--- a/optimizers/optimizer_functions.py
+++ b/optimizers/optimizer_functions.py
@@ -1,8 +1,9 @@
 import numpy as np
 
 
-def GradientDescent(w, b, dW, dB, learning_rate=0.01):
-    """Implements Gradient Descent to find minima of cost function
+class GradientDescent:
+    def minimize(self, w, b, dW, dB, vW, vB, learning_rate=0.01):
+        """Implements Gradient Descent to find minima of cost function
 
     Parameters:
     - w (numpy array): weights matrix
@@ -16,32 +17,66 @@ def GradientDescent(w, b, dW, dB, learning_rate=0.01):
     - b_updated (numpy array): updated bias
 
     """
-    w_updated = w - learning_rate*dW
-    b_updated = b - learning_rate*dB
-    return w_updated, b_updated
+        w_updated = w - learning_rate * dW
+        b_updated = b - learning_rate * dB
+        return w_updated, b_updated, vW, vB
 
 
-def Momentum(w, b, dW, dB, learning_rate=0.01, beta=0.9):
-    """Implements Gradient Descent with Momentum to find minima of cost function
+class Momentum:
+    def minimize(self, w, b, dW, dB, vW, vB, learning_rate=0.01, beta=0.9):
+        """Implements Gradient Descent with Momentum to find minima of cost function
 
-    Parameters:
-    - w (numpy array): weights matrix
-    - b (numpy array): bias matrix
-    - dW (numpy array): gradient of weights matrix wrt cost function
-    - dB (numpy array): gradient of bias matrix wrt cost function
-    - learning_rate (double): learning rate used to update weights
-    - beta (double):
+        Parameters:
+        - w (numpy array): weights matrix
+        - b (numpy array): bias matrix
+        - dW (numpy array): gradient of weights matrix wrt cost function
+        - dB (numpy array): gradient of bias matrix wrt cost function
+        - learning_rate (double): learning rate used to update weights
+        - beta (double): Momentum term for smoothing
+        - vW (numpy array): holds the state of the optimizer for previous iteration (weights)
+        - vB (numpy array): holds the state of the optimizer for previous iteration (biases)
 
-    Returns:
-    - w_updated (numpy array): updated weights
-    - b_updated (numpy array): updated bias
+        Returns:
+        - w_updated (numpy array): updated weights
+        - b_updated (numpy array): updated bias
+        - vW (numpy array): updated state of the optimizer for current iteration (weights)
+        - vB (numpy array): updated state of the optimizer for current iteration (biases)
 
-    """
-    pass
+        """
+        vW = beta * vW + (1 - beta) * dW
+        vB = beta * vB + (1 - beta) * dB
+        w_updated = w - learning_rate * vW
+        b_updated = b - learning_rate * vB
+        return w_updated, b_updated, vW, vB
 
 
-def RMSProp(w, b, dW, dB, learning_rate, beta, epsilon):
-    pass
+class RMSProp:
+    def minimize(self, w, b, dW, dB, sW, sB, learning_rate=0.01, beta=0.9, epsilon=1e-07):
+        """Implements Gradient Descent with RMSProp to find minima of cost function
+        Parameters:
+        - w (numpy array): weights matrix
+        - b (numpy array): bias matrix
+        - dW (numpy array): gradient of weights matrix wrt cost function
+        - dB (numpy array): gradient of bias matrix wrt cost function
+        - learning_rate (double): learning rate used to update weights
+        - beta (double): Momentum term for smoothing
+        - sW (numpy array): holds the state of the optimizer for previous iteration (weights)
+        - sB (numpy array): holds the state of the optimizer for previous iteration (biases)
+        - epsilon (double): a small constant for numerical stability
+
+        Returns:
+        - w_updated (numpy array): updated weights
+        - b_updated (numpy array): updated bias
+        - sW (numpy array): updated state of the optimizer for current iteration (weights)
+        - sB (numpy array): updated state of the optimizer for current iteration (biases)
+        """
+        sW = beta * sW + (1 - beta) * np.square(dW)
+        sB = beta * sB + (1 - beta) * np.square(dB)
+        w_updated = w - (learning_rate * dW) / (np.sqrt(sW) + epsilon)
+        b_updated = b - (learning_rate * dB) / (np.sqrt(sB) + epsilon)
+
+        return w_updated, b_updated, sW, sB
 
 
 def Adam(w, b, dW, dB, learning_rate, beta1, beta2, epsilon):
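For reviewers who want to sanity-check the new `minimize` interface without pulling the whole repo, here is a minimal standalone sketch of the Momentum update introduced above. It assumes only NumPy; the shapes and learning rate are arbitrary illustration values, not taken from the project config.

```python
import numpy as np


class Momentum:
    """Gradient descent with momentum, mirroring the minimize() signature in this diff."""

    def minimize(self, w, b, dW, dB, vW, vB, learning_rate=0.01, beta=0.9):
        # Exponentially weighted moving average of past gradients.
        vW = beta * vW + (1 - beta) * dW
        vB = beta * vB + (1 - beta) * dB
        # Step in the direction of the smoothed gradient.
        w_updated = w - learning_rate * vW
        b_updated = b - learning_rate * vB
        return w_updated, b_updated, vW, vB


# Tiny hand-rolled example: zero parameters, all-ones gradients.
w, b = np.zeros((2, 2)), np.zeros((1, 2))
dW, dB = np.ones((2, 2)), np.ones((1, 2))
vW, vB = np.zeros((2, 2)), np.zeros((1, 2))

opt = Momentum()
w, b, vW, vB = opt.minimize(w, b, dW, dB, vW, vB, learning_rate=0.1)
# First step: vW = 0.1 * dW, so w moves by -0.1 * 0.1 ≈ -0.01 per entry.
print(w[0, 0])
```

Note that because `Dense` stores `vW`/`vB` itself and threads them through `minimize`, the optimizer objects stay stateless, which is why `GradientDescent.minimize` can simply pass the velocity buffers back unchanged.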