For multi-targets, SMOTE is not recommended even if there is a bias #36

@mariko-sugawara

Describe the bug
In _get_target_imbalance_score(), if the target column is multiclass (more than two classes), the imbalance score is always calculated as 0, so SMOTE is never recommended as a preprocessing step, even when the classes are imbalanced.

def _get_target_imbalance_score(Y):

To Reproduce
Steps to reproduce the behavior:

  1. Show your code calling generate_code().
script
# Paste your code here. The following is an example.
from sapientml import SapientMLGenerator
sml = SapientMLGenerator()
sml.generate_code('your arguments')
  2. Attach the datasets or dataframes input to generate_code() if possible.
  3. Show the generated code such as 1_default.py when it was generated.
generated code
# Paste the generated code here.
  4. Show the messages of SapientML and/or generated code.

Expected behavior
A clear and concise description of what you expected to happen.

Environment (please complete the following information):

  • OS: [e.g. Ubuntu 20.04]
  • Docker Version (if applicable): [Docker version 20.10.17, build 100c701]
  • Python Version: [e.g. 3.9.12]
  • SapientML Version: 0.5.4

Additional context

  • In the following code (line 1020), either change the condition to vc.shape[0] > 10 so that it matches the comment, or delete the check.
        # if there are more than 10 categories, probably it is a regression problem
        if vc.shape[0] > 2:
            return 0
  • In addition, offline learning needs to be run again after this change.
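
The effect of the threshold can be illustrated with a minimal sketch. This is not the SapientML implementation: `imbalance_score`, its `max_classes` parameter, and the scoring formula (1 minus the ratio of the rarest to the most frequent class) are assumptions made up for illustration, and `Counter` stands in for the `value_counts()` result `vc`. Only the early return on the number of classes mirrors the snippet above.

```python
from collections import Counter


def imbalance_score(labels, max_classes=2):
    """Hypothetical stand-in for _get_target_imbalance_score()."""
    counts = Counter(labels)  # plays the role of vc (value counts of the target)
    # Early return as in the issue: too many distinct values -> score 0.
    if len(counts) > max_classes:
        return 0
    # Assumed score: 0 when balanced, close to 1 when the rarest class is tiny.
    return 1 - min(counts.values()) / max(counts.values())


# A clearly imbalanced three-class target.
labels = ["a"] * 90 + ["b"] * 8 + ["c"] * 2

print(imbalance_score(labels, max_classes=2))   # threshold > 2: early return fires
print(imbalance_score(labels, max_classes=10))  # threshold > 10: skew is scored
```

With the threshold at 2, the three-class target is scored 0 despite the 90/8/2 split. With the threshold at 10, as the comment in the original code suggests, the skew is reflected in the score.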

Labels: bug (Something isn't working)
