@AtharvaPatange
Contributor

Overview
This implementation defines a Convolutional Neural Network (CNN) in R using the Keras library. The CNN is designed to automatically extract spatial and hierarchical features from image data for tasks such as image classification, object detection, and visual recognition. The model consists of convolutional, pooling, flatten, and dense layers, with ReLU and Softmax activations.

Features

  • Automatically extracts local spatial patterns from images using convolutional layers.
  • Reduces spatial dimensions and retains important features with max-pooling layers.
  • Adds non-linearity using ReLU activations to capture complex patterns.
  • Converts 2D feature maps into 1D vectors for fully connected processing.
  • Dense layers combine extracted features for final classification.
  • Output layer provides class probabilities using Softmax activation.
  • Compatible with image datasets such as MNIST and CIFAR-10.
  • Fully modular architecture that can be expanded with additional layers.

Complexity

Time Complexity: O(E × N × H × W × C × F × K²) per convolutional layer, where E = epochs, N = number of samples, H × W = output feature-map size, C = input channels, F = number of filters, K = kernel size

Space Complexity: O(parameters + feature maps)
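
As a rough worked example of these bounds, the sketch below counts multiply-accumulate operations and trainable parameters for a single convolutional layer. The helper name `conv2d_cost` and its arguments are hypothetical (not part of the PR), and it assumes stride 1 with "same" padding so the output keeps the input's height and width.

```r
# Hypothetical helper (not part of the PR): cost of one 2D convolution layer,
# assuming stride 1 and "same" padding (output is height x width x filters).
conv2d_cost <- function(height, width, channels, filters, kernel_size) {
  # Each output value needs kernel_size^2 * channels multiply-accumulates
  macs_per_output <- kernel_size^2 * channels
  list(
    macs   = height * width * filters * macs_per_output,  # per sample, forward pass
    params = (macs_per_output + 1) * filters              # weights plus one bias per filter
  )
}

# Example: 32 filters with a 3x3 kernel on a 28x28 grayscale image
cost <- conv2d_cost(28, 28, 1, 32, 3)
print(cost$macs)    # 225792
print(cost$params)  # 320
```

Training cost scales this per-sample figure by the number of samples and epochs, plus the backward pass, which is why the per-layer count is only a lower-order guide.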

Demonstration

The included R script defines the CNN model and displays its summary.
To train the model, use model %>% compile() followed by model %>% fit() on an image dataset.
Example datasets: MNIST (28×28 grayscale images) or CIFAR-10 (32×32 RGB images).

Summary
This implementation provides a fully documented CNN architecture in R for image-based tasks. It is suitable for automatic feature extraction and classification and serves as a foundation for building more complex deep learning models in computer vision.

Copilot AI review requested due to automatic review settings October 18, 2025 07:08

Copilot AI left a comment

Pull Request Overview

This PR adds new machine learning and dynamic programming algorithms in R, with a focus on introducing a CNN implementation using Keras. It also includes a from-scratch k-NN and Kadane’s algorithm with examples.

  • Add a CNN model definition using Keras with summary output
  • Implement a from-scratch k-NN (classification and regression) with normalization, weighting, and examples
  • Implement Kadane’s algorithm (including circular variant) with examples and timing

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.

File | Description
machine_learning/cnn.r | Introduces a CNN architecture and prints model summary; provides comments on complexity and usage.
machine_learning/k-NN.r | Adds a complete k-NN implementation (train/predict), utilities, and example runs for classification and regression.
dynamic_programming/kadane's_algo.r | Adds Kadane’s algorithm (max subarray) plus a circular variant and multiple example tests.

Comment on lines +35 to +36
filters = 32, kernel_size = c(3, 3), activation = "relu",
input_shape = c(28, 28, 1), padding = "same"

Copilot AI Oct 18, 2025

The code hard-codes input_shape = c(28, 28, 1), which supports MNIST-like grayscale, but the PR description claims compatibility with CIFAR-10 (32×32×3). Either generalize input_shape via parameters (e.g., img_height, img_width, channels) or update the comments to reflect the currently supported input.
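
One way the parameterization could look is sketched below. This is an illustration only: the function name `build_cnn`, its arguments, and every layer after the first convolution are assumptions rather than the PR's actual code, and it assumes the keras R package with the 2.x-style API in which the first layer accepts input_shape.

```r
# Sketch only: assumes the keras R package (2.x-style API, where the first
# layer accepts input_shape). Layers after the first convolution are
# illustrative and may differ from the PR's actual architecture.
build_cnn <- function(img_height = 28, img_width = 28, channels = 1,
                      num_classes = 10) {
  model <- keras::keras_model_sequential()
  model <- keras::layer_conv_2d(
    model,
    filters = 32, kernel_size = c(3, 3), activation = "relu",
    input_shape = c(img_height, img_width, channels), padding = "same"
  )
  model <- keras::layer_max_pooling_2d(model, pool_size = c(2, 2))
  model <- keras::layer_flatten(model)
  model <- keras::layer_dense(model, units = num_classes, activation = "softmax")
  model
}

# Usage (requires keras and a working backend, so it is left commented out):
# mnist_model <- build_cnn()                                              # 28x28x1
# cifar_model <- build_cnn(img_height = 32, img_width = 32, channels = 3) # 32x32x3
```

With the defaults left at 28×28×1, existing MNIST usage would be unchanged while CIFAR-10 becomes a matter of passing different arguments.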

@@ -0,0 +1,272 @@
# k-Nearest Neighbors implementation in R

Copilot AI Oct 18, 2025

File naming is inconsistent with repository conventions; avoid uppercase and punctuation in filenames. Please rename machine_learning/k-NN.r to a lowercase, underscore-based name such as machine_learning/k_nearest_neighbors.r or machine_learning/knn.r.

Copilot generated this review using guidance from repository custom instructions.
@@ -0,0 +1,162 @@
# Kadane's Algorithm in R

Copilot AI Oct 18, 2025

The filename contains an apostrophe, which is problematic for tooling and inconsistent with repository naming patterns. Please rename dynamic_programming/kadane's_algo.r to dynamic_programming/kadane_algorithm.r or dynamic_programming/kadane.r (lowercase, underscores, no special characters).

Copilot generated this review using guidance from repository custom instructions.
Member

Please remove the apostrophe (') from the filename.

# ---------------------------
knn_accuracy <- function(y_true, y_pred) {
  if (length(y_true) != length(y_pred)) stop("Lengths mismatch.")
  mean(y_true == y_pred)

Copilot AI Oct 18, 2025

Comparing factors with differing level sets breaks the accuracy computation: in R, == on two such factors stops with a "level sets of factors are different" error instead of comparing the labels. Coerce to character (or to the same factor levels) before comparison, e.g., mean(as.character(y_true) == as.character(y_pred), na.rm = TRUE).

Suggested change
-   mean(y_true == y_pred)
+   mean(as.character(y_true) == as.character(y_pred), na.rm = TRUE)
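
A small sketch of the situation this suggestion guards against, using made-up labels in which the predictions never include one of the true classes. The data are hypothetical; the function body follows the suggested change. The direct comparison is wrapped in tryCatch so the script keeps running whatever condition R raises.

```r
# Hypothetical data: predictions never include class "c",
# so the two factors end up with different level sets.
y_true <- factor(c("a", "b", "c"))
y_pred <- factor(c("a", "b", "b"))

# Direct comparison: capture whatever condition R raises instead of aborting
direct <- tryCatch(mean(y_true == y_pred),
                   error = function(e) NA_real_,
                   warning = function(w) NA_real_)

# Comparing the labels as character strings works regardless of levels
knn_accuracy <- function(y_true, y_pred) {
  if (length(y_true) != length(y_pred)) stop("Lengths mismatch.")
  mean(as.character(y_true) == as.character(y_pred), na.rm = TRUE)
}

acc <- knn_accuracy(y_true, y_pred)
print(acc)  # 2 of 3 labels match
```

One design point to weigh: na.rm = TRUE drops any pair where a label is NA, so missing predictions are excluded rather than counted as errors, which can make the reported accuracy look better than it is.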

Comment on lines +101 to +102
# We build matrix: outer(A_sq, rep(1,n)) - 2*cross + outer(rep(1,m), B_sq)
outer(A_sq, rep(1, length(B_sq))) - 2 * cross + outer(rep(1, length(A_sq)), B_sq)

Copilot AI Oct 18, 2025

Using outer with rep(1, ...) allocates intermediate matrices and is slower for large inputs. Consider constructing the two additive matrices via recycling to reduce overhead: matrix(A_sq, nrow = length(A_sq), ncol = length(B_sq)) - 2 * cross + matrix(B_sq, nrow = length(A_sq), ncol = length(B_sq), byrow = TRUE).

Suggested change
-   # We build matrix: outer(A_sq, rep(1,n)) - 2*cross + outer(rep(1,m), B_sq)
-   outer(A_sq, rep(1, length(B_sq))) - 2 * cross + outer(rep(1, length(A_sq)), B_sq)
+   # We build matrix: matrix(A_sq, nrow = length(A_sq), ncol = length(B_sq)) - 2*cross + matrix(B_sq, nrow = length(A_sq), ncol = length(B_sq), byrow = TRUE)
+   matrix(A_sq, nrow = length(A_sq), ncol = length(B_sq)) - 2 * cross + matrix(B_sq, nrow = length(A_sq), ncol = length(B_sq), byrow = TRUE)
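
To check that the two constructions agree, the sketch below builds both on small random matrices and compares them against squared distances from base R's dist(). The definitions of A_sq, B_sq, and cross are assumptions inferred from their names (row-wise squared norms and the cross-product A %*% t(B)); the PR's own definitions are not shown in this thread. A third variant using outer(A_sq, B_sq, "+") is included as another option, not as something the review proposed.

```r
set.seed(1)
A <- matrix(rnorm(12), nrow = 4)  # 4 query points, 3 features (made-up data)
B <- matrix(rnorm(15), nrow = 5)  # 5 training points, 3 features

# Assumed definitions of the names used in the review
A_sq  <- rowSums(A^2)        # squared norm of each row of A
B_sq  <- rowSums(B^2)        # squared norm of each row of B
cross <- tcrossprod(A, B)    # same as A %*% t(B), a 4 x 5 matrix

# Original construction from the PR
d_outer <- outer(A_sq, rep(1, length(B_sq))) - 2 * cross +
  outer(rep(1, length(A_sq)), B_sq)

# Construction from the suggested change
d_matrix <- matrix(A_sq, nrow = length(A_sq), ncol = length(B_sq)) - 2 * cross +
  matrix(B_sq, nrow = length(A_sq), ncol = length(B_sq), byrow = TRUE)

# Alternative: let outer() add the two norm vectors directly
d_plus <- outer(A_sq, B_sq, "+") - 2 * cross

# Reference: squared Euclidean distances between rows of A and rows of B
d_ref <- unname(as.matrix(dist(rbind(A, B)))[1:4, 5:9]^2)

print(all.equal(d_outer, d_matrix))  # TRUE
print(all.equal(d_matrix, d_ref))    # TRUE
```

All three variants still allocate full m × n matrices, so any speed difference is best confirmed by timing on realistic input sizes.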

Comment on lines +155 to +162
  # weights: 1 / (dist + eps)
  w <- 1 / (neighbor_dists + eps)
  # if any dist==0, set weight large for exact matches
  if (any(neighbor_dists == 0)) {
    w <- as.numeric(neighbor_dists == 0) * 1e12 # very large weight for exact matches
  }
  tab <- tapply(w, neighbor_labels, sum)
} else {

Copilot AI Oct 18, 2025

Instead of substituting a very large weight (1e12) for exact matches, short-circuit and return the exact-match class when any neighbor_dists == 0; this avoids extreme weights and simplifies logic. Example: if (any(neighbor_dists == 0)) { preds[i] <- neighbor_labels[which(neighbor_dists == 0)[1]]; probs[i, ] <- as.numeric(model$classes == preds[i]); next }.

Suggested change
-   # weights: 1 / (dist + eps)
-   w <- 1 / (neighbor_dists + eps)
-   # if any dist==0, set weight large for exact matches
-   if (any(neighbor_dists == 0)) {
-     w <- as.numeric(neighbor_dists == 0) * 1e12 # very large weight for exact matches
-   }
-   tab <- tapply(w, neighbor_labels, sum)
- } else {
+   # Short-circuit: if any neighbor has zero distance, assign its label directly
+   if (any(neighbor_dists == 0)) {
+     # Pick the first exact match
+     preds[i] <- neighbor_labels[which(neighbor_dists == 0)[1]]
+     probs[i, ] <- as.numeric(model$classes == preds[i])
+     next
+   }
+   # weights: 1 / (dist + eps)
+   w <- 1 / (neighbor_dists + eps)
+   tab <- tapply(w, neighbor_labels, sum)
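
A minimal sketch of the short-circuit for a single query point, written as a standalone function so it can be tried outside the prediction loop. The name `weighted_vote` and its signature are hypothetical. It differs from the suggestion in two labeled ways: all exact matches vote (instead of taking only the first), and labels are converted to character up front so that factor labels cannot leak through as integer codes.

```r
# Hypothetical helper (not the PR's code): distance-weighted vote for one query.
weighted_vote <- function(neighbor_dists, neighbor_labels, classes, eps = 1e-8) {
  labels <- as.character(neighbor_labels)  # work with label strings, not factor codes
  exact <- neighbor_dists == 0
  if (any(exact)) {
    # Short-circuit: only exact matches vote, each with equal weight
    w <- as.numeric(exact)
  } else {
    # Otherwise use inverse-distance weights
    w <- 1 / (neighbor_dists + eps)
  }
  # Sum the weights per class; classes without neighbors get a score of 0
  scores <- vapply(classes, function(cl) sum(w[labels == cl]), numeric(1))
  list(pred = classes[which.max(scores)], probs = scores / sum(scores))
}

# An exact match decides the vote outright
exact_hit <- weighted_vote(c(0, 0.5, 1), c("a", "b", "b"), c("a", "b"))
print(exact_hit$pred)  # "a"

# Without exact matches, one close neighbor can outweigh two distant ones
close_wins <- weighted_vote(c(1, 3, 3), c("a", "b", "b"), c("a", "b"))
print(close_wins$pred)  # "a"
```

Inside the PR's loop the same logic would assign into preds[i] and probs[i, ] and then call next, as in the suggested change.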

Comment on lines +191 to +195
w <- 1 / (neighbor_dists + eps)
if (any(neighbor_dists == 0)) {
  w <- as.numeric(neighbor_dists == 0) * 1e12
}
preds_reg[i] <- sum(w * neighbor_vals) / sum(w)

Copilot AI Oct 18, 2025

For regression, if any neighbor has zero distance, directly return the corresponding neighbor value (or the mean if there are multiple zero-distance neighbors) to avoid using extreme weights. Example: if (any(neighbor_dists == 0)) { z <- neighbor_vals[neighbor_dists == 0]; preds_reg[i] <- mean(z); next }.

Suggested change
-   w <- 1 / (neighbor_dists + eps)
-   if (any(neighbor_dists == 0)) {
-     w <- as.numeric(neighbor_dists == 0) * 1e12
-   }
-   preds_reg[i] <- sum(w * neighbor_vals) / sum(w)
+   if (any(neighbor_dists == 0)) {
+     # If any neighbor has zero distance, use the mean of their values
+     z <- neighbor_vals[neighbor_dists == 0]
+     preds_reg[i] <- mean(z)
+   } else {
+     w <- 1 / (neighbor_dists + eps)
+     preds_reg[i] <- sum(w * neighbor_vals) / sum(w)
+   }
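
The same idea for regression, pulled out into a standalone function for a single query point. The name `weighted_mean_pred` is hypothetical; the logic mirrors the suggested change.

```r
# Hypothetical helper (not the PR's code): weighted k-NN regression for one query.
weighted_mean_pred <- function(neighbor_dists, neighbor_vals, eps = 1e-8) {
  exact <- neighbor_dists == 0
  if (any(exact)) {
    # Zero-distance neighbors decide the prediction: average their values
    return(mean(neighbor_vals[exact]))
  }
  # Otherwise fall back to the inverse-distance weighted mean
  w <- 1 / (neighbor_dists + eps)
  sum(w * neighbor_vals) / sum(w)
}

# Two exact matches (10 and 20) are averaged; the third neighbor is ignored
print(weighted_mean_pred(c(0, 0, 2), c(10, 20, 99)))  # 15

# Equal distances give equal weights, so this is the plain mean
print(weighted_mean_pred(c(1, 1), c(2, 4)))           # 3
```

Note that with the 1e12 weights in the original code, exact matches also receive equal weight and all others receive zero, so the result is the same; the rewrite mainly makes that intent explicit.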

if (max_wrap > normal) {
  return(list(max_sum = max_wrap, start = NA, end = NA, subarray = NA)) # indices for wrap-around not computed here
} else {
  return(list(max_sum = normal, start = kadane(arr)$start, end = kadane(arr)$end, subarray = kadane(arr)$subarray))

Copilot AI Oct 18, 2025

Avoid repeated calls to kadane(arr) in the else-branch; call it once, store the result, and reuse it. This prevents redundant O(n) passes.

Suggested change
-   return(list(max_sum = normal, start = kadane(arr)$start, end = kadane(arr)$end, subarray = kadane(arr)$subarray))
+   normal_res <- kadane(arr)
+   return(list(max_sum = normal_res$max_sum, start = normal_res$start, end = normal_res$end, subarray = normal_res$subarray))
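
For context, a self-contained sketch of how the circular variant could call kadane() once and reuse the result. Both function bodies here are reconstructions under assumptions, not the PR's code: only the return fields (max_sum, start, end, subarray), the "total sum" trick, and the all-negative behavior are taken from the snippets in this review. Input is assumed to be a non-empty numeric vector.

```r
# Sketch of Kadane's algorithm returning 1-based start/end indices.
kadane <- function(arr) {
  best <- arr[1]; cur <- arr[1]
  start <- 1; end <- 1; cur_start <- 1
  for (i in seq_along(arr)[-1]) {
    if (cur + arr[i] < arr[i]) {
      cur <- arr[i]        # starting fresh at i beats extending the run
      cur_start <- i
    } else {
      cur <- cur + arr[i]  # extend the current run
    }
    if (cur > best) {
      best <- cur; start <- cur_start; end <- i
    }
  }
  list(max_sum = best, start = start, end = end, subarray = arr[start:end])
}

# Circular variant: wrap-around maximum = total sum - minimum subarray sum.
kadane_circular <- function(arr) {
  normal_res <- kadane(arr)  # single pass, reused for every returned field
  # All elements negative: the wrap-around formula would give 0 (empty array)
  if (normal_res$max_sum < 0) return(normal_res)
  min_sum <- -kadane(-arr)$max_sum
  max_wrap <- sum(arr) - min_sum
  if (max_wrap > normal_res$max_sum) {
    # indices for wrap-around not computed here
    list(max_sum = max_wrap, start = NA, end = NA, subarray = NA)
  } else {
    normal_res
  }
}

print(kadane(c(-2, 1, -3, 4, -1, 2, 1, -5, 4))$max_sum)  # 6
print(kadane_circular(c(5, -3, 5))$max_sum)              # 10
print(kadane_circular(c(-3, -2, -5))$max_sum)            # -2
```

Returning normal_res directly avoids rebuilding the list field by field and keeps the non-wrap result identical to what kadane() reports.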

# Optional: function to get maximum circular subarray (Kadane + total sum trick)
kadane_circular <- function(arr) {
  #' Finds max subarray sum for circular arrays (wrap-around allowed)
  #' If all elements are negative, returns max element (non-wrap).

Copilot AI Oct 18, 2025

Document that when the wrap-around case wins, start/end indices and subarray are returned as NA (not computed). This clarifies API behavior for consumers.

Suggested change
-   #' If all elements are negative, returns max element (non-wrap).
+   #' If all elements are negative, returns max element (non-wrap).
+   #' When the wrap-around case wins, start/end indices and subarray are returned as NA (not computed).

@github-actions

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Nov 23, 2025
@github-actions

github-actions bot commented Dec 6, 2025

This PR was closed because it has been stalled for 7 days with no activity.

@github-actions github-actions bot closed this Dec 6, 2025