Skip to content

Remove deprecated parameters #519

@amontanez24

Description

@amontanez24

Problem Description

We have a number of deprecated parameters in the benchmark_single_table function that still remain in the code. There are a lot of code artifacts tied to these parameters that clutter the codebase and make it difficult to read. We should clean this code up by removing the parameters and any functions or conditionals that are related to them. The deprecated parameters are:

  • run_on_ec2 - this is not needed since we have a separate function for it now
  • output_filepath - this was replaced by output_destination
  • detailed_results_folder - Also replaced by output_destination
  • multi_processing_config - We never actually use the multi-processing feature in runs.

Expected behavior

  • The parameters above should be completely removed from the function
  • Any code that handles these parameters or exists because of them should be deleted

Additional context

  • The multi_processing_config parameter has a lot of code tied to it that we don't really use or need. This can be seen here

    SDGym/sdgym/benchmark.py

    Lines 835 to 852 in b8b7b13

    workers = 1
    if multi_processing_config:
    if multi_processing_config['package_name'] == 'dask':
    workers = 'dask'
    scores = _run_on_dask(job_args_list, show_progress)
    else:
    num_gpus = get_num_gpus()
    if num_gpus > 0:
    workers = num_gpus
    else:
    workers = multiprocessing.cpu_count()
    job_args_list = [job_args + (result_writer,) for job_args in job_args_list]
    if workers in (0, 1):
    scores = map(_run_job, job_args_list)
    elif workers != 'dask':
    pool = concurrent.futures.ProcessPoolExecutor(workers)
    scores = pool.map(_run_job, job_args_list)
  • We don't advertise it and it defaults to None, so in practice our runs just map the run to the job_arg_list directly
    scores = map(_run_job, job_args_list)
  • This defeats the purpose of all these nested function. We could instead have the benchmark_single_table method just call the _run_job function directly.

Metadata

Metadata

Assignees

Labels

internalThe issue doesn't change the API or functionality

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions