# Object Detection with OpenCV and SSD MobileNet v3

## Description

This Python script leverages OpenCV and a pre-trained SSD MobileNet v3 deep learning model for real-time object detection in video streams. It utilizes the COCO (Common Objects in Context) dataset for object class recognition.

## Requirements

- Python 3.x ([Download](https://www.python.org/downloads/))
- OpenCV library: Install via `pip install opencv-python`
- NumPy library: Install via `pip install numpy` (installed automatically as a dependency of `opencv-python`)
- `coco.names` file: Contains class labels for the COCO dataset (download from a reliable source)
- `ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt` file: Model configuration file (download from a reliable source)
- `frozen_inference_graph.pb` file: Model weights file (download from a reliable source)

## Instructions

1. **Download the necessary files:**
   - Obtain the `coco.names`, `ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt`, and `frozen_inference_graph.pb` files from a trusted source (ensure compatibility with your OpenCV version). Place them in the same directory as your Python script.

2. **Run the script:**
   - Open a terminal or command prompt and navigate to the directory containing your script and the downloaded files.
   - Execute the script using:
     ```sh
     python object_detection.py
     ```

## Code Breakdown

1. **Imports:**
   - `cv2`: Imports the OpenCV library for computer vision tasks.

2. **Threshold and Video Capture:**
   - `thres`: Sets the confidence threshold for object detection (adjust as needed).
   - `cap`: Initializes a video capture object (`VideoCapture(1)`) to access your webcam (or a different video source by providing its index or path).
   - `cap.set()`: Sets video capture properties:
     - `3`: Width (adjust for desired resolution)
     - `4`: Height (adjust for desired resolution)
     - `10`: Brightness (adjust for lighting conditions)
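   - A minimal sketch of this setup (the 0.5 threshold, 640x480 resolution, and brightness value below are illustrative defaults, not values taken from the script):
     ```python
     import cv2

     thres = 0.5                # confidence threshold for detections
     cap = cv2.VideoCapture(1)  # webcam index; try 0 if index 1 is unavailable
     cap.set(3, 640)            # property 3: frame width
     cap.set(4, 480)            # property 4: frame height
     cap.set(10, 70)            # property 10: brightness
     ```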

3. **Load Class Names:**
   - `classNames`: Creates an empty list to store object class names.
   - `classFile`: Path to the `coco.names` file.
   - Loads class names from the file and splits them into a list.
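   - A rough sketch of this step, assuming `coco.names` lists one class label per line:
     ```python
     classNames = []
     classFile = "coco.names"
     with open(classFile, "rt") as f:
         # One label per line; strip the trailing newline before splitting
         classNames = f.read().rstrip("\n").split("\n")
     ```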

4. **Load Model Configuration and Weights:**
   - `configPath`: Path to the `ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt` file.
   - `weightsPath`: Path to the `frozen_inference_graph.pb` file.
   - `net`: Creates a detection model object using `cv2.dnn_DetectionModel()`.
   - Sets model input size, scale, mean, and color channel swapping parameters for compatibility with the model.
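   - A sketch of the model setup; the 320x320 input size, scale, and mean below are the values commonly used with SSD MobileNet v3, so verify them against your own script:
     ```python
     configPath = "ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt"
     weightsPath = "frozen_inference_graph.pb"

     net = cv2.dnn_DetectionModel(weightsPath, configPath)
     net.setInputSize(320, 320)               # network input resolution
     net.setInputScale(1.0 / 127.5)           # scale pixel values to roughly [-1, 1]
     net.setInputMean((127.5, 127.5, 127.5))  # mean subtracted from each channel
     net.setInputSwapRB(True)                 # OpenCV frames are BGR; the model expects RGB
     ```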

5. **Main Loop:**
   - `while True`: Continuously captures frames from the video stream.
   - `success, img`: Reads a frame and checks for success. Exits if unsuccessful.
   - `classIds`, `confs`, `bbox`: Performs object detection using the model on the current frame.
     - `classIds`: List of detected object class IDs.
     - `confs`: List of corresponding confidence scores (0-1).
     - `bbox`: List of bounding boxes (coordinates) for detected objects.
   - Prints detected object IDs and bounding boxes for debugging (optional).
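   - The loop might look roughly like this (a sketch; `net.detect()` returns empty results when nothing exceeds the threshold):
     ```python
     while True:
         success, img = cap.read()
         if not success:  # stop if a frame could not be read
             break

         # Detect objects, keeping only results above the confidence threshold
         classIds, confs, bbox = net.detect(img, confThreshold=thres)
         print(classIds, bbox)  # optional debugging output
     ```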

6. **Draw Bounding Boxes and Labels:**
   - Checks if any objects were detected (`len(classIds) != 0`).
   - Iterates through detected objects using `zip`:
     - `box`: Current bounding box coordinates.
     - `classId`: Current object class ID (minus 1 when indexing `classNames`).
     - `confidence`: Current object confidence score.
   - Draws a green rectangle around the detected object using `cv2.rectangle()`.
   - Displays the corresponding class name (uppercase) and confidence score (rounded to two decimal places) using `cv2.putText()`.
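   - A sketch of the drawing code, continuing inside the main loop; the text positions and font are illustrative choices:
     ```python
         if len(classIds) != 0:
             for classId, confidence, box in zip(classIds.flatten(), confs.flatten(), bbox):
                 x, y, w, h = box
                 cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)  # green box
                 cv2.putText(img, classNames[classId - 1].upper(), (x + 10, y + 30),
                             cv2.FONT_HERSHEY_COMPLEX, 1, (0, 255, 0), 2)    # class name
                 cv2.putText(img, str(round(float(confidence), 2)), (x + 10, y + 60),
                             cv2.FONT_HERSHEY_COMPLEX, 1, (0, 255, 0), 2)    # confidence
     ```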

7. **Display and Exit:**
   - `cv2.imshow()`: Displays the processed frame with bounding boxes and labels in a window titled "Output".
   - `cv2.waitKey(1)`: Waits for a key press for 1 millisecond.
   - Exits the loop and releases resources if the 'q' key is pressed.
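   - For example, still inside the loop:
     ```python
         cv2.imshow("Output", img)              # show the annotated frame
         if cv2.waitKey(1) & 0xFF == ord("q"):  # quit when 'q' is pressed
             break
     ```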

8. **Cleanup:**
   - Releases the video capture object and closes all OpenCV windows.
   ```python
   cap.release()
   cv2.destroyAllWindows()
   ```

## License

This project is licensed under the MIT License.