@gbeane
Collaborator

@gbeane gbeane commented Jan 10, 2026

Optimize Classification Performance (~2x Faster)

Summary

This PR optimizes classification performance by eliminating redundant classifier calls during prediction. Both GUI and command-line classification are now approximately 2x faster with identical results.

Performance Improvement

Problem

During classification, the code was calling both predict() and predict_proba() separately on the same data:

predictions = classifier.predict(data, frame_indexes)
probabilities = classifier.predict_proba(data, frame_indexes)

This ran the classifier twice per identity: predict() internally computes the class probabilities and then thresholds them at 0.5, so the separate predict_proba() call repeated work that had already been done. For large pose files with multiple identities, this made classification unnecessarily slow.
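The duplicated work can be illustrated with a small stand-in classifier that counts how often the model is evaluated. This is only a sketch: CountingClassifier is hypothetical, its predict() is assumed to threshold the behavior probability at 0.5 as described above, and the frame_indexes argument of the real JABS classifier is omitted for brevity.

```python
import numpy as np


class CountingClassifier:
    """Hypothetical stand-in for a classifier; counts model evaluations."""

    def __init__(self):
        self.evaluations = 0

    def predict_proba(self, data):
        # Stands in for the expensive model evaluation.
        self.evaluations += 1
        p_behavior = 1.0 / (1.0 + np.exp(-data.sum(axis=1)))
        return np.column_stack([1.0 - p_behavior, p_behavior])

    def predict(self, data):
        # Assumption: predict() is built on predict_proba() plus a 0.5 threshold.
        return (self.predict_proba(data)[:, 1] > 0.5).astype(np.int8)


data = np.array([[0.2, -1.5], [1.0, 0.7], [-0.3, 0.1]])

# Old pattern: two calls, two model evaluations.
old = CountingClassifier()
old_predictions = old.predict(data)
old_probabilities = old.predict_proba(data)

# New pattern: one call, predictions derived from the probabilities.
new = CountingClassifier()
new_probabilities = new.predict_proba(data)
new_predictions = np.argmax(new_probabilities, axis=1).astype(np.int8)

print(old.evaluations, new.evaluations)  # 2 1
```

In this toy setup both patterns yield the same predictions and probabilities, while the second evaluates the model half as often.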

Solution

Call predict_proba() once and derive predictions using np.argmax():

# Get probabilities for all classes
probabilities = classifier.predict_proba(data, frame_indexes)

# Derive predictions by taking argmax (class with highest probability)
# This is equivalent to predict() but avoids duplicate computation
predictions = np.argmax(probabilities, axis=1).astype(np.int8)

np.argmax(probabilities, axis=1) returns the index of the highest-probability class for each frame. For binary classification that index is 0 (not behavior) or 1 (behavior), which is exactly what predict() would return.
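A quick way to see the equivalence is to compare argmax against an explicit 0.5 threshold on a few made-up probability rows. The values below are illustrative, not JABS output, and the strict `>` comparison is an assumption about how the threshold is applied.

```python
import numpy as np

# Made-up per-frame probabilities: column 0 = not behavior, column 1 = behavior.
probabilities = np.array([
    [0.90, 0.10],
    [0.35, 0.65],
    [0.50, 0.50],  # exact tie
    [0.02, 0.98],
])

# np.argmax returns a platform integer type; astype narrows it to int8.
by_argmax = np.argmax(probabilities, axis=1).astype(np.int8)

# Explicit thresholding of the behavior column (assumed strict comparison).
by_threshold = (probabilities[:, 1] > 0.5).astype(np.int8)

print(by_argmax)     # [0 1 0 1]
print(by_threshold)  # [0 1 0 1]
```

On an exact tie np.argmax returns the first maximal index, so the frame is labeled 0, which matches a strict `> 0.5` threshold. A classifier that used `>= 0.5` instead would differ on such frames only.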

Files Changed

  • src/jabs/ui/classification_thread.py - GUI classification in the main JABS application
  • src/jabs/scripts/classify.py - Command-line jabs-classify tool

Correctness

Results are identical: np.argmax(predict_proba(...), axis=1) produces exactly the same predictions as predict().

No breaking changes: all existing code continues to work.

Tested: classification produces identical results with the optimization.

@gbeane gbeane self-assigned this Jan 10, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR optimizes classification performance by eliminating redundant classifier calls. Instead of calling both predict() and predict_proba() separately, it now calls predict_proba() once and derives predictions using np.argmax(), resulting in approximately 2x faster classification with identical results.

Changes:

  • Replaced separate predict() and predict_proba() calls with a single predict_proba() call
  • Derived predictions using np.argmax() on probability outputs
  • Updated comments to reflect the optimization approach

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File - Description

  • src/jabs/ui/classification_thread.py - Optimized GUI classification thread to eliminate duplicate classifier calls
  • src/jabs/scripts/classify.py - Optimized command-line classification script to eliminate duplicate classifier calls

@gbeane gbeane merged commit bbf64ed into main Jan 12, 2026
2 checks passed
@gbeane gbeane deleted the avoid-duplicate-computation branch January 12, 2026 15:45
4 participants