@yiyujin yiyujin commented Jun 26, 2025

Bring back object detection in ml5.js

To start, I added the legacy object detection code and p5.js example code. The example code is from the GitHub reference doc.

History

Make code compatible with ml5 1.0
  • Load the model in preload() for the basic example (this will change later with p5.js 2.0 integration). -> 7deb4c8
  • New detectStart() and detectStop() functions (instead of calling detect() recursively over and over). -> ce82f51
  • Swap the order of the error and results arguments for the results callback. This way the error is optional and handling it is not required in the basic example. -> 4db442f
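
As a rough illustration of the last two items, here is a minimal sketch of a start/stop detection loop whose callback takes results first and the error second. It is not the ml5.js implementation: createDetectionLoop, detectOnce and schedule are hypothetical names, detection is treated as synchronous for brevity, and in a browser schedule would typically be requestAnimationFrame.

```javascript
// Hypothetical sketch of a detectStart()/detectStop() loop (not ml5.js source).
// detectOnce(media) returns an array of detections (synchronous here for brevity).
// schedule(fn) queues the next iteration, e.g. requestAnimationFrame in a browser.
function createDetectionLoop(detectOnce, schedule) {
  let running = false;
  let media = null;
  let callback = null;

  function step() {
    // detectStop() was called: end the loop without rescheduling
    if (!running) return;

    let results;
    let error;
    try {
      results = detectOnce(media);
    } catch (e) {
      error = e;
    }

    // Results come first, so a basic callback can ignore the error argument
    callback(results, error);
    schedule(step);
  }

  return {
    detectStart(newMedia, newCallback) {
      media = newMedia;
      callback = newCallback;
      // Only queue a new loop if one is not already running
      if (!running) {
        running = true;
        schedule(step);
      }
    },
    detectStop() {
      running = false;
    },
  };
}
```

With this argument order, a basic callback can be as short as function gotDetections(results) { detections = results; }, because the error parameter can simply be left out.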

Make code scalable to support more models

  • refactor ObjectDetector Class (ObjectDetector wraps model -> ObjectDetector as model) -> 9375570

Next Steps

- p5.js 2.0 examples: getting this coco-ssd example to work with p5.js 2.0 takes priority over supporting Transformers.js.
(Screenshot: p5.js 2.0 error, although it doesn't halt the detection)

@shiffman
Member

Yay! Great job @yiyujin! As discussed, if it's helpful here's a quick list of things that we changed with ml5.js 1.0.

  • load the model in preload() for basic example. (this will change later with p5.js 2.0 integration)
  • new detectStart() and detectStop() functions (instead of calling detect() recursively over and over).
  • Swap order of error and results argument for results callback. This way the error is optional and handling is not required in basic example.

yiyujin added 2 commits July 1, 2025 22:28
…ct (instance.model), added mediaReady utils so that detection waits for video to be ready before detecting, updated tf.backend to webgl to avoid webgpu warning
@yiyujin
Author

yiyujin commented Jul 10, 2025

Thank you @shiffman for the neat instructions!

  • load the model in preload() for basic example. (this will change later with p5.js 2.0 integration) -> 7deb4c8

  • updated ObjectDetector Class to return object (instance.model) so it can be loaded in preload()

  • added mediaReady utils so that detection waits for video to be ready before detecting.

  • added tf.backend(webgl) to avoid webgpu warning

  • new detectStart() and detectStop() functions (instead of calling detect() recursively over and over). -> ce82f51

  • Swap order of error and results argument for results callback. This way the error is optional and handling is not required in basic example. -> 4db442f
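
For readers unfamiliar with the "wait for the video to be ready" step, a helper along these lines is one way to do it. This is a hedged sketch, not the mediaReady utility from the commit: the function body is an assumption, built only on the standard HTMLMediaElement readyState property and loadeddata event.

```javascript
// Hypothetical sketch of a "media ready" helper; the PR's actual utility may differ.
// Resolves once the media element has at least one decodable frame.
function mediaReady(media) {
  // HTMLMediaElement.HAVE_CURRENT_DATA is 2: data for the current frame is available
  const HAVE_CURRENT_DATA = 2;

  return new Promise((resolve) => {
    if (media.readyState >= HAVE_CURRENT_DATA) {
      // Already ready: no need to wait for an event
      resolve(media);
      return;
    }
    // Otherwise wait for the first frame to finish loading
    media.addEventListener("loadeddata", () => resolve(media), { once: true });
  });
}
```

A detector could then await mediaReady(video) before running its first detection, so the model never sees an empty frame.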

@yiyujin
Author

yiyujin commented Jul 10, 2025

Based on today's meetup with @shiffman (cc @nasif-co):

Ultimately, the final output of the project will be the p5.js example code.

Example code (in order of priority) : bcd1788

  • object detection on image
  • object detection on webcam
  • object detection on video (would be interesting for curation purposes, e.g. BlazePose was easy to demo with a short video)

I guess next will be the docs, which I can draft out in parallel with the code !

@yiyujin
Author

yiyujin commented Jul 11, 2025

I made the three changes to the code so it works with ml5.js 1.0.

💪 This week, I want to digest the code better and draft a doc that explains the common pattern (the functions and their roles) used across models: start, stop, etc. I need it for myself anyway, and it could be a useful onboarding doc for the future dev team.

Some notes on naming conventions:
In the last meeting (with Dan and Nasif) I recall we briefly talked about naming conventions: how TensorFlow.js uses the general word "predict" while ml5.js uses more model-specific words like classify, detect, etc. I guess it also differs by context: user code vs. source code (such as results vs. detections / classifications).
Anyhow, consistency matters, so maybe a central doc can include some naming convention info too. I don't have much of an opinion on this so far, but I wanted to mention it before I forget :-)

(speaking of which.. I think my branch should have been object-detector, not object-detection..? 😭)

  • Object Detection - Dev Notes (meant for onboarding)

@yiyujin
Author

yiyujin commented Jul 11, 2025

After a brief code review from @pearmini 🙏 (wow I mindlessly copied the code to begin with then didn't think through!)

  • ObjectDetector is a redundant layer now because it just has a constructor.

  • Class is meant for methods or shared behavior across models.

  • refactor ObjectDetector Class (ObjectDetector wraps model -> ObjectDetector as model) -> 9375570

@pearmini can you take a glance at this please?

@yiyujin
Author

yiyujin commented Jul 15, 2025

Just dropping notes for future tasks :

  • converting JSDoc to Typescript ?
  • object segmentation model

@yiyujin
Author

yiyujin commented Jul 15, 2025

Member

@shiffman shiffman left a comment

Hi @yiyujin fantastic work! I left a few small comments in the basic example!

ml5 Example
Object Detection using COCOSSD
This example uses a callback pattern to create the classifier
=== */
Member

This is a tiny thing, but sometime this year we adopted a "friendlier" pattern for comments at the top of an example sketch. I don't think we need the copyright and technically the license should be the ml5.js one. Here is what it looks like from a hand pose example:

/*
 * 👋 Hello! This is an ml5.js example made and shared with ❤️.
 * Learn more about the ml5.js project: https://ml5js.org/
 * ml5.js license and Code of Conduct: https://github.com/ml5js/ml5-next-gen/blob/main/LICENSE.md
 *
 * This example demonstrates face tracking on live video through ml5.faceMesh.
 */

(This also reminds me that we discussed moving the Code of Conduct to the website, I forgot where we are with that though maybe @MOQN remembers?)

Author

I'll include it for sure!

image(video, 0, 0);

for (let i = 0; i < detections.length; i += 1) {
  const detection = detections[i];
Member

Suggested change
const detection = detections[i];
let detection = detections[i];

We're adopting p5.js style of using let even where const would be more typical JS.


function gotDetections(results) {
  detections = results;
}
Member

I wouldn't overdo it, but a few concise explanatory comments might be good to add.

Author

@shiffman thank you for the comments! I will update it for review.

Comment on lines 50 to 60
switch (this.modelName) {
  case "cocossd":
    this.modelToUse = cocoSsd;
    break;
  case "yolo":
    this.modelToUse = yolo;
    break;
  // more models... currently only cocossd is supported
  default:
    console.warn(`Unknown model: ${this.modelName}, defaulting to CocoSsd`);
    this.modelToUse = cocoSsd;
Member

  1. Where are the variables yolo and cocoSsd defined?
  2. You can move the switch logic into the wrapper function, so you pass the model directly instead of a string.
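
One way to read the second suggestion is sketched below: the name-to-model lookup lives in the wrapper (factory) function, and the model itself is what gets constructed and returned. The CocoSsd class here is a hypothetical stand-in, and MODELS, resolveModel and the objectDetection signature are assumptions for illustration rather than the code in this PR.

```javascript
// Hypothetical stand-in for the real model class (not the PR's implementation).
class CocoSsd {
  constructor(options) {
    this.options = options;
  }
}

// Lookup table of supported models; a Map avoids accidental prototype-key matches.
const MODELS = new Map([["cocossd", CocoSsd]]);

// Resolve a model name to a model class, falling back to CocoSsd for unknown names.
function resolveModel(modelName = "cocossd") {
  let key = String(modelName).toLowerCase();
  if (!MODELS.has(key)) {
    console.warn(`Unknown model: ${modelName}, defaulting to CocoSsd`);
    return CocoSsd;
  }
  return MODELS.get(key);
}

// Wrapper function: the string is handled here, so the rest of the
// code only ever deals with a concrete model instance.
function objectDetection(modelName, options = {}) {
  let Model = resolveModel(modelName);
  return new Model(options);
}
```

Adding another model would then mean adding one entry to the lookup table, without touching the model classes themselves.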

@yiyujin
Author

yiyujin commented Jul 23, 2025

I updated the three examples -> bcd1788

- webcam (examples/objectDetection-webcam/sketch.js)
- single image file (examples/objectDetection-sigle-image/sketch.js)
- video file (examples/objectDetection-video/sketch.js)

I'm especially unsure whether the video file example is in its best shape, since I'm adding a bunch of event listeners on the video to track loadedmetadata/play/pause/ended. If I'm correct, p5 doesn't have a method on the video for that.

Comment on lines 20 to 44
canvasElement = createCanvas(640, 480);
canvasElement.position(0, 0);

// Create video element (paused by default)
video = createVideo('test.mov');
video.position(0, 0);
video.volume(0);
video.showControls();

// Make canvas transparent and on top
canvasElement.style('z-index', '1');
canvasElement.style('pointer-events', 'none'); // Allow clicks to pass through to video
video.style('z-index', '-1');

// Set up video event listeners
video.elt.addEventListener('loadedmetadata', () => {
  console.log('Video metadata loaded');

  // Resize canvas to match video size
  resizeCanvas(video.elt.videoWidth, video.elt.videoHeight);
});

video.elt.addEventListener('play', () => {
  console.log('Video started playing');

Member

Yes, I would avoid all this extra code if you can! You can treat the video just like the webcam, have it auto-start and loop. Here's an example from BlazePose you can use as a model:

https://editor.p5js.org/codingtrain/sketches/ftALPDieT

But yours can be simpler b/c you can hide the video element and draw it on the canvas.

Author

Oh okay! I will forget about showControls() and try video.hide() + video.loop(). Thank you 🙏
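
The simplification discussed above might look roughly like the sketch below, which hides the video element, loops it, and draws it on the canvas. It is only a sketch under assumptions: ml5.objectDetection("cocossd") and the detection.label field are assumed from this PR's examples and the legacy API, 'test.mov' is the placeholder file name from the excerpt, and the canvas is assumed to match the size the video is drawn at.

```javascript
// Sketch for the p5.js editor (p5.js 1.x with ml5.js loaded); API names are assumptions.
let video;
let detector;
let detections = [];

function preload() {
  // Assumed API from this PR: load the model before setup() runs
  detector = ml5.objectDetection("cocossd");
}

function setup() {
  createCanvas(640, 480);

  // Load the video, hide the HTML element, and loop it like a webcam feed
  video = createVideo("test.mov");
  video.size(width, height);
  video.volume(0); // muted, so autoplay is less likely to be blocked
  video.hide();
  video.loop();

  // Start continuous detection; results arrive in gotDetections()
  detector.detectStart(video, gotDetections);
}

function gotDetections(results) {
  detections = results;
}

function draw() {
  // Draw the current video frame, then one box and label per detection
  image(video, 0, 0, width, height);

  for (let i = 0; i < detections.length; i++) {
    let detection = detections[i];
    noFill();
    stroke(0, 255, 0);
    rect(detection.x, detection.y, detection.width, detection.height);
    noStroke();
    fill(255);
    text(detection.label, detection.x + 4, detection.y + 16);
  }
}
```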

Member

@shiffman shiffman left a comment

A couple last small comments on the video example and then some discoveries about video resizing! Great work @yiyujin!


// Load and loop the video for object detection
video = createVideo('test.mov');
video.size(width, height);
Member

Just noting our discussion to use a video we have permission for. Also, I would suggest sizing the video to 640x480 so it doesn't appear squashed or skewed. Or just make the canvas size identical to the video. You can add a code comment that explains that if the video does not match the canvas size, you may need to adjust the code to scale.

Comment on lines 40 to 49
let scaleX = width / video.elt.videoWidth;
let scaleY = height / video.elt.videoHeight;

for (let i = 0; i < detections.length; i++) {
  let detection = detections[i];

  let x = detection.x * scaleX;
  let y = detection.y * scaleY;
  let w = detection.width * scaleX;
  let h = detection.height * scaleY;
Member

Noting that if the video is sized the same as the canvas, this scaling is unnecessary, which is preferable for a basic example. However, if we do have to use it, we should use the p5.js video.width and video.height properties directly.

I would have thought if you call video.size(640, 480) then the x, y, width, height returned would match the new size. Could this be related to the bug @nasif-co is fixing in #264??
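
If scaling does turn out to be needed, the arithmetic from the excerpt can be isolated in a small helper, sketched here. scaleDetection is a hypothetical name, and which source size to pass depends on the coordinate space the detections are actually reported in (the excerpt uses the video's intrinsic size).

```javascript
// Hypothetical helper: map a detection's box from source coordinates to target coordinates.
function scaleDetection(detection, sourceWidth, sourceHeight, targetWidth, targetHeight) {
  let scaleX = targetWidth / sourceWidth;
  let scaleY = targetHeight / sourceHeight;

  // Copy the other fields (e.g. label) and overwrite only the box geometry
  return {
    ...detection,
    x: detection.x * scaleX,
    y: detection.y * scaleY,
    width: detection.width * scaleX,
    height: detection.height * scaleY,
  };
}
```

When the source and target sizes are equal, both scale factors are 1 and the box is returned unchanged, which matches the point that a same-sized video and canvas need no scaling.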

Comment on lines +18 to +22
createCanvas(640, 480);

// Using webcam feed as video input, hiding html element to avoid duplicate with canvas
video = createCapture(VIDEO);
video.size(width, height);
Member

Ah, I just tested and can confirm that #264 applies here too! If I change the webcam size to 320x240, the bounding boxes are all off. @yiyujin I don't think this has to be fixed in this PR; it can be done separately after the fact or along with #264. @nasif-co maybe you can help track this in the issue thread or a new issue so that we can merge at least the SelfieSegmentation fix?

@yiyujin yiyujin marked this pull request as ready for review July 29, 2025 20:45
@yiyujin
Author

yiyujin commented Jul 30, 2025

  • rename test.mp4 to something more meaningful (thank you @gohai)

946b084
Not too creative, changed it to ball_lifting.mp4

@yiyujin
Author

yiyujin commented Jul 31, 2025

  • create p5 2.0 examples

https://github.com/ml5js/ml5-next-gen/tree/main/examples#p5js-20-examples

-> will be handled in another PR
