Skip to content

[ENH] Feature as Predictor #6852

Merged
markotoplak merged 2 commits into biolab:master from janezd:classify-by-column
Jun 13, 2025
Conversation

@janezd
Contributor

@janezd janezd commented Jul 13, 2024

Issue

Closes #6813.

Description of changes

A widget that "predicts" classes from a single column. I have several questions.

  • The widget offers
    • discrete columns whose values are the same as, or a subset of, the class values,
    • and numeric columns if the class is binary.
  • For a numeric column, the values are either used directly as probabilities of the class with index 1 (in which case they must be between 0 and 1) or mapped through a logistic function with a user-specified offset and coefficient.
  • The widget also outputs a model, so it can be fed into Test Learner and compared with other models.
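
The handling of numeric columns described above can be sketched roughly as follows. This is a minimal illustration, not the code in Orange/modelling/column.py: the function name `column_to_probabilities` and the parameter names `use_logistic`, `offset` and `coefficient` are hypothetical, and the exact parametrization of the logistic function is an assumption, since the description only mentions a "user-specified offset and coefficient".

```python
import numpy as np


def column_to_probabilities(values, use_logistic=False, offset=0.0, coefficient=1.0):
    """Return an (n, 2) array of class probabilities for a binary class.

    Hypothetical helper; names and the logistic form are assumptions.
    """
    x = np.asarray(values, dtype=float)
    if use_logistic:
        # Assumed parametrization: p1 = 1 / (1 + exp(-(offset + coefficient * x)))
        p1 = 1.0 / (1.0 + np.exp(-(offset + coefficient * x)))
    else:
        # Raw values are read as P(class with index 1), so they must lie in [0, 1].
        # NaN fails the comparison as well and is rejected here.
        if not np.all((x >= 0) & (x <= 1)):
            raise ValueError("values must be between 0 and 1 to be used as probabilities")
        p1 = x
    # Column 0 holds P(class 0), column 1 holds P(class 1).
    return np.column_stack((1.0 - p1, p1))
```

With `use_logistic=False`, the column `[0.0, 0.25, 1.0]` is passed through unchanged as the probability of class 1; with `use_logistic=True` and the default parameters, a value of 0 maps to a probability of 0.5.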

I've put the widget into the Evaluate category. It is not a model but rather a trick to turn a Table into Evaluation Results, so any user interested in this transformation would look for it there.

Ideas from the discussion:

  • Replace the radio buttons with a check box that applies logistic regression. The check box is checked and disabled when the column contains values outside the range 0 to 1.
  • Remove the line edits and always compute logistic regression.
  • Support numeric outcomes; offer fitting with linear regression
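
The check-box idea from the list above could be sketched like this. It only illustrates the proposed rule (forced on and locked when values fall outside 0 to 1) and is not the widget's code; both function names are hypothetical, and ignoring missing values is an assumption.

```python
import numpy as np


def logistic_is_mandatory(values):
    """True if any non-missing value lies outside [0, 1]."""
    x = np.asarray(values, dtype=float)
    x = x[~np.isnan(x)]  # assumption: missing values are ignored
    return bool(np.any((x < 0) | (x > 1)))


def checkbox_state(values, user_choice=False):
    """Return (checked, enabled) for the hypothetical check box."""
    forced = logistic_is_mandatory(values)
    # When the transformation is forced, the box is checked and locked;
    # otherwise it stays enabled and reflects the user's choice.
    return (forced or user_choice), not forced
```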

Includes

  • Code changes
  • Tests
  • Documentation

@codecov

codecov bot commented Jul 13, 2024

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.76%. Comparing base (32f4d6d) to head (3f68b3b).
⚠️ Report is 65 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6852      +/-   ##
==========================================
+ Coverage   88.73%   88.76%   +0.03%     
==========================================
  Files         332      334       +2     
  Lines       73451    73676     +225     
==========================================
+ Hits        65178    65402     +224     
- Misses       8273     8274       +1     

@janezd janezd changed the title from Add Classify by Column to Add "Column as Model" Jul 14, 2024
@janezd janezd force-pushed the classify-by-column branch 2 times, most recently from e6c0c49 to 62d47f2 Compare July 16, 2024 20:24
@janezd janezd added the "needs discussion" label Nov 28, 2024
@janezd janezd force-pushed the classify-by-column branch from 62d47f2 to 2c09a68 Compare November 30, 2024 21:06
@janezd janezd removed the "needs discussion" label Nov 30, 2024
@janezd janezd force-pushed the classify-by-column branch 2 times, most recently from ec278bd to bccbd34 Compare November 30, 2024 21:55
@markotoplak markotoplak changed the title from Add "Column as Model" to [ENH] Feature as Predictor Dec 1, 2024
@janezd janezd force-pushed the classify-by-column branch 2 times, most recently from 91afd14 to 06841a9 Compare May 22, 2025 09:37
@janezd janezd requested a review from Copilot May 22, 2025 09:43
Copilot AI left a comment

Pull Request Overview

This PR introduces a new widget, Feature as Predictor, which repurposes a table column (numeric or discrete) to generate evaluation results and outputs a corresponding model. Key changes include new i18n message entries, widget UI and behavior modifications in owfeatureaspredictor.py along with extensive test coverage, and updates to modelling and classification modules to support the new widget.

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| i18n/si/msgs.jaml | Added new translation messages for column learner/model errors. |
| Orange/widgets/evaluate/owfeatureaspredictor.py | Implemented the new widget with control updates and commit logic. |
| Orange/widgets/evaluate/tests/test_owfeatureaspredictor.py | Added tests to verify behavior and UI interaction of the widget. |
| Orange/modelling/column.py | Introduced ColumnLearner/ColumnModel with logistic and linear paths. |
| Orange/tests/test_classification.py | Updated tests to account for ColumnLearner behavior changes. |
| Orange/modelling/tests/test_column.py | Added tests validating column modelling functionality. |
| Orange/classification/tests/test_column.py | Added tests ensuring ColumnClassifier handles mapping and predictions. |
| Orange/modelling/__init__.py and classification/__init__.py | Updated exports to include new column modules. |

Comments suppressed due to low confidence (1)

Orange/modelling/column.py:99

  • The criteria for setting the 'value_mapping' in ColumnModel relies on an implicit slice comparison between 'class_var.values' and 'column.values', which may be fragile if the ordering or lengths differ. Consider adding a clarifying comment or refactoring this logic to explicitly document the intended mapping behavior.
if (column.is_discrete and class_var.values[:len(column.values)] != column.values):
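
One way to make the mapping explicit, as the comment suggests, is sketched below in plain Python. It does not use Orange's API and is not the code from the PR; the function name `build_value_mapping` is hypothetical. It keeps the shortcut from the quoted condition (no mapping is needed when the column's values are a prefix of the class values in the same order) and otherwise maps each column value to the index of the same label in the class variable.

```python
import numpy as np


def build_value_mapping(column_values, class_values):
    """Map each column value index to the index of the same label in the class.

    Returns None when the column's values are a prefix of the class values in
    the same order, because the indices then already coincide.
    """
    column_values = tuple(column_values)
    class_values = tuple(class_values)
    if class_values[:len(column_values)] == column_values:
        return None
    class_index = {value: i for i, value in enumerate(class_values)}
    missing = [v for v in column_values if v not in class_index]
    if missing:
        # The column is expected to contain only (a subset of) class values.
        raise ValueError(f"column values not among class values: {missing}")
    return np.array([class_index[v] for v in column_values], dtype=int)
```

For a column with values `("c", "a")` and a class with values `("a", "b", "c")`, the mapping is `[2, 0]`, so indexing it with the column's value indices yields the corresponding class indices.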

@janezd janezd force-pushed the classify-by-column branch 2 times, most recently from 85569fa to 0c36cf7 Compare May 22, 2025 12:15
@janezd janezd self-assigned this Jun 13, 2025
@janezd janezd force-pushed the classify-by-column branch 2 times, most recently from 7c61b51 to 4badd5f Compare June 13, 2025 07:50
@markotoplak markotoplak marked this pull request as ready for review June 13, 2025 08:15
@markotoplak markotoplak merged commit ffaa776 into biolab:master Jun 13, 2025
21 of 28 checks passed
Development

Successfully merging this pull request may close these issues.

How to create 'Evaluation Result' directly from 'Table' data?

4 participants