Commit 4a09b4f

Merge pull request #1228 from ramana2074/main
object detection using YOLO
2 parents 15a9c7d + cb5ea49 commit 4a09b4f

4 files changed: 114 additions & 0 deletions
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
# Object Detection using YOLO
## Introduction
Object detection means identifying and localizing multiple objects within an image or video. This project uses YOLO (You Only Look Once), a widely used single-stage object detection model, to detect and classify the objects in an image. YOLO's combination of speed and accuracy makes it well suited to real-time detection applications.

## Algorithms Used
**YOLO (You Only Look Once):**
YOLO is a deep learning-based object detection algorithm that frames detection as a single regression problem, mapping image pixels directly to bounding box coordinates and class probabilities. It divides the image into a grid, and each grid cell predicts bounding boxes, a confidence score for each box, and class probabilities. Because the whole image is processed in one forward pass, YOLO is fast enough for real-time detection tasks.
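To make the grid idea concrete, here is a minimal sketch of how an object's center maps to the grid cell that is responsible for it. The function name `responsible_cell` is hypothetical, and the default `grid_size=13` is an assumption that corresponds to the coarsest YOLOv3 output for a 416x416 input (416 / 32 = 13); YOLOv3 also predicts on finer grids.

```python
def responsible_cell(center_x, center_y, img_w, img_h, grid_size=13):
    """Return the (row, col) of the grid cell containing a box center given in pixels.

    grid_size=13 is an assumed value (coarsest YOLOv3 scale at 416x416 input).
    """
    # Normalize the center to [0, 1], scale to grid units, and truncate to a cell index.
    col = int(center_x / img_w * grid_size)
    row = int(center_y / img_h * grid_size)
    # A center lying exactly on the right or bottom edge belongs to the last cell.
    col = min(col, grid_size - 1)
    row = min(row, grid_size - 1)
    return row, col


# An object centered at (208, 104) in a 416x416 image
cell = responsible_cell(208, 104, 416, 416)
```

Here `cell` is `(3, 6)`: the center sits a quarter of the way down and halfway across the image, so it falls in row 3, column 6 of the 13x13 grid.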

**Convolutional Neural Networks (CNNs):**
YOLO uses a CNN backbone to extract features from the image; those features drive both box prediction and classification. The weights used here are pre-trained on a large dataset (COCO) and can be fine-tuned on new datasets for specific tasks.

**Non-Maximum Suppression (NMS):**
NMS filters overlapping bounding boxes. When several boxes cover the same object, it keeps the highest-scoring box and suppresses lower-scoring boxes that overlap it beyond a threshold, which removes duplicate detections.
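The suppression step can be sketched in plain Python. This is an illustration of greedy NMS in general, not the internals of `cv2.dnn.NMSBoxes`; the helper names `iou` and `nms` are hypothetical, and boxes are assumed to use the `[x, y, w, h]` layout (top-left corner plus size).

```python
def iou(a, b):
    """Intersection over union of two boxes given as [x, y, w, h]."""
    inter_w = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.4):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes and one separate box (illustrative values)
boxes = [[10, 10, 100, 100], [15, 15, 100, 100], [200, 200, 50, 50]]
scores = [0.9, 0.8, 0.95]
kept = nms(boxes, scores)
```

With these values `kept` is `[2, 0]`: box 1 overlaps box 0 with an IoU of about 0.82, so the lower-scoring of the two is dropped.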

## Performance Analysis

**Accuracy:** Detection accuracy depends on factors such as image quality, resolution, and object size. This code applies a confidence threshold of 0.7, so only detections whose top class score exceeds 0.7 are kept and displayed.
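The effect of the threshold can be shown with a small sketch. The detections below are invented purely for illustration and are not results from this project; `filter_by_confidence` is a hypothetical helper.

```python
def filter_by_confidence(detections, threshold=0.7):
    """Keep (label, score) pairs whose score is strictly above the threshold."""
    return [(label, score) for label, score in detections if score > threshold]


# Hypothetical detections, made up for this example
sample = [("person", 0.92), ("dog", 0.71), ("bicycle", 0.55), ("car", 0.30)]

kept_strict = filter_by_confidence(sample, 0.7)  # the threshold used in this project
kept_loose = filter_by_confidence(sample, 0.5)   # a lower threshold keeps more boxes
```

A higher threshold shows fewer, more reliable boxes; a lower one recovers more objects at the cost of more false positives.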

**Speed:** YOLO is designed for real-time detection. This project uses YOLOv3 with a 416×416 input; inference generally takes under a second per image, though the exact time depends on the hardware. That makes YOLO suitable for video streams or high-throughput image detection tasks.
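One way to check the speed on a given machine is to time the forward pass directly. The sketch below is an assumption about how one might measure it: `mean_latency` is a hypothetical helper, and `fake_forward` is a stand-in workload that would be replaced by something like `lambda: yolo.forward(output_layers_names)` in the actual script.

```python
import time


def mean_latency(fn, runs=10, warmup=2):
    """Return the average wall-clock seconds per call of fn over `runs` calls."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


def fake_forward():
    """Stand-in for the model's forward pass (hypothetical workload)."""
    sum(i * i for i in range(10_000))


latency = mean_latency(fake_forward)
fps = 1.0 / latency if latency > 0 else float("inf")
```

Warm-up calls are excluded because the first inference often includes one-time setup cost that would skew the average.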

**Challenges:** False positives or missed detections may occur when objects are small or partially occluded. Results also vary with the confidence threshold and the non-maximum suppression settings.

## Result
The model processes an input image and detects the objects in it. Each detected object is highlighted with a bounding box and labeled with its class name and confidence score. The result is displayed with matplotlib, which shows the image with the detections that remain after non-maximum suppression.

## Future Work

**Custom Dataset Fine-tuning:**

Fine-tuning the YOLO model on a custom dataset for a specific use case (e.g., bird species detection) could improve accuracy in that specialized domain.

**Integration with Real-Time Systems:**
Running the YOLO model on live video streams would make it useful for applications such as surveillance, traffic monitoring, or wildlife observation.

**Improved Data Augmentation:**
Data augmentation techniques such as rotation, flipping, and cropping, along with photometric changes such as brightness adjustment, could be applied to the training set to make the model more robust to variations in lighting, angle, and object position.
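For detection data, a geometric augmentation has to transform the bounding boxes along with the pixels. The following is a minimal sketch of a horizontal flip, assuming boxes in `[x, y, w, h]` pixel format and an image represented as a list of pixel rows; the helper names are hypothetical and no training pipeline from this project is implied.

```python
def hflip_image(rows):
    """Mirror an image, given as a list of pixel rows, left to right."""
    return [row[::-1] for row in rows]


def hflip_box(box, img_w):
    """Mirror an [x, y, w, h] box horizontally inside an image of width img_w."""
    x, y, w, h = box
    # The old right edge (x + w) becomes the new left edge, measured from the right side.
    return [img_w - (x + w), y, w, h]


image = [[1, 2, 3], [4, 5, 6]]          # tiny 3x2 "image" for illustration
flipped_image = hflip_image(image)
flipped_box = hflip_box([10, 20, 30, 40], img_w=100)
```

Flipping twice returns the original box, which is a quick sanity check when writing augmentation code.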
Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load YOLO model
# The YOLOv3 weights and config can be downloaded from the official YOLO website
yolo = cv2.dnn.readNet("C:\\Users\\billa\\OneDrive\\Desktop\\Programs\\ML_DL\\yolov3.weights",
                       "C:\\Users\\billa\\OneDrive\\Desktop\\Programs\\ML_DL\\yolov3.cfg")

# Load class names (one label per line)
with open("C:\\Users\\billa\\OneDrive\\Desktop\\Programs\\ML_DL\\coco (1).names", 'r') as f:
    classes = f.read().splitlines()

# Load image; cv2.imread returns None instead of raising when the file cannot be read
img = cv2.imread("C:\\Users\\billa\\OneDrive\\Desktop\\Programs\\ML_DL\\ggg.jpg")
if img is None:
    raise SystemExit("Error loading image.")
height, width = img.shape[:2]  # Get image height and width

# Prepare the image for YOLO: scale pixels to [0, 1], resize to 416x416, swap BGR to RGB
blob = cv2.dnn.blobFromImage(img, 1/255, (416, 416), (0, 0, 0), swapRB=True, crop=False)
yolo.setInput(blob)

# Get output layer names and run forward pass
output_layers_names = yolo.getUnconnectedOutLayersNames()
layer_output = yolo.forward(output_layers_names)

# Initialize lists
boxes = []
confidences = []
class_ids = []

# Process each detection
# Each row holds a normalized box (center x, center y, width, height),
# an objectness value, and then one score per class
for output in layer_output:
    for detection in output:
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]
        if confidence > 0.7:  # Keep only detections above the confidence threshold
            center_x = int(detection[0] * width)
            center_y = int(detection[1] * height)
            w = int(detection[2] * width)
            h = int(detection[3] * height)

            # Calculate top-left corner coordinates
            x = int(center_x - w / 2)
            y = int(center_y - h / 2)

            # Append detection information
            boxes.append([x, y, w, h])
            confidences.append(float(confidence))
            class_ids.append(int(class_id))

# Perform Non-Maximum Suppression (score threshold 0.5, overlap threshold 0.4)
indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)

# Font for displaying labels
font = cv2.FONT_HERSHEY_PLAIN
# Random colors for each box
colors = np.random.randint(0, 255, size=(len(boxes), 3), dtype='uint8')

# Check if any boxes are returned
if len(indexes) > 0:
    # Some OpenCV versions return an N x 1 array, so flatten the indexes
    indexes = np.array(indexes).flatten()

    # Draw bounding boxes and labels (class name plus confidence score)
    for i in indexes:
        x, y, w, h = boxes[i]
        label = f"{classes[class_ids[i]]} {confidences[i]:.2f}"
        color = [int(c) for c in colors[i]]
        cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
        cv2.putText(img, label, (x, y - 10), font, 2, (255, 255, 255), 2)

# Display the image with matplotlib (OpenCV stores BGR; matplotlib expects RGB)
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.axis('off')  # Hide axis
plt.show()
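The box conversion in the detection loop can be isolated into a small helper. The sketch below follows the same idea (normalized center and size to a top-left pixel box) and additionally clamps the box to the image bounds, which the script above does not do; `to_pixel_box` is a hypothetical name and the clamping is an assumption about what a caller might want.

```python
def to_pixel_box(cx, cy, bw, bh, img_w, img_h):
    """Convert a normalized center-format box to a clamped [x, y, w, h] pixel box."""
    w = int(bw * img_w)
    h = int(bh * img_h)
    x = int(cx * img_w - w / 2)
    y = int(cy * img_h - h / 2)
    # Clamp both corners so the box never extends past the image edges.
    x1, y1 = max(0, x), max(0, y)
    x2, y2 = min(img_w, x + w), min(img_h, y + h)
    return [x1, y1, x2 - x1, y2 - y1]


# A centered box, and one that would start left of the image without clamping
centered = to_pixel_box(0.5, 0.5, 0.25, 0.5, 640, 480)
clipped = to_pixel_box(0.0625, 0.5, 0.25, 0.25, 160, 160)
```

Here `centered` is `[240, 120, 160, 240]` and `clipped` is `[0, 60, 30, 40]`; without clamping, the second box would start at x = -10.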
