Skip to content

Conversation

@sven-oly
Copy link
Collaborator

This removes excessive debug output from conformance test logs.

from schema_files import ALL_TEST_TYPES

logger = logging.Logger("Checking Test Data vs. Schemas LOGGER")
logger.setLevel(logging.INFO)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: often these libraries have a way to set the logging level from an environment variable or from the CLI, so that you can do something like DEBUG=* ./genData or ./genData -vv

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's possible. I'll take a look.

Also, I've removed lots of excessive logging in the latest commits.

One more thing: when executing in parallel, the individual threads cannot have a logger in their objects because a logger "cannot be pickled". That's a small complication / limit to logging.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think don't pickle the logger; just make a new one for each thread


# The following gets information from all the tests
summary_report = SummaryReport(self.file_base)
summary_report.logger = logger
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you assigning the logger into a field of a SummaryReport instance? So far, we've made loggers as class/module-level global objects because they're just about capturing and printing information from noteworthy events in your code. They're not data that has inherent semantic meaning that needs to be stored, represented, passed around, etc. Also, the SummaryReport class doesn't have a logger field.

if self.options.run_limit:
self.run_limit = int(self.options.run_limit)
logging.debug('!!! RUN LIMIT SET: %d', self.run_limit)
if self.debug:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To preface, I'm okay with using self.debug to conditionally run code because you regularly want to test and debug things differently locally vs. run things efficiently during a proper production run.

But using that debug field for conditional printing is an anti-pattern that we already replaced by using a logger, setting the logging level for each logging statement, and configuring the log levels appropriately for both the console output & what gets stored to a file.

There are a lot of instances throughout the PR with the pattern "if debug then print (via logging)". We should get rid of the "if debug" part in those places. That's not a loss because viewing the full debug level log output can be done via the file. It's easiest to view the log file in VS Code, which we've seen updates it's view of a file's contents in real time and nicely colorizes log file lines based on the log level of the line.


logger = logging.Logger("TEST_GENERATE LOGGER")
logger.setLevel(logging.INFO)
logger.setLevel(logging.WARNING)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By hard-coding the logger level for this file to be WARNING, and given that most of the logging statements here are either DEBUG or INFO, you're effectively turning off all of the logging happening in this module. Is that your intention?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I want to get rid of most of the non-urgent log results.

from testplan import TestPlan

logger = logging.Logger("TEST DRIVER LOGGER")
logger.setLevel(logging.WARNING)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless there's a good reason, we should not manually (hardcode) the logger level in code. Instead, we should allow the logger level to be set by our existing config file. So here & other places in the PR, where you see logger.setLevel(...), we should remove those statements.

Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine

return result

# OK, there's no CLDR-based JSON data available.
run_list = [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: is this dead code?

exec_list.append('-run_limit')
exec_list.append(str(self.run_limit))

run_list = [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Is this dead code?

@sffc
Copy link
Member

sffc commented Sep 29, 2025

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively reduces the debug log noise across several scripts by introducing named loggers and adjusting log levels. However, there are several critical issues that need to be addressed. The new loggers are created incorrectly using logging.Logger() instead of logging.getLogger(), which will prevent any log messages from being displayed. There are also a couple of logic errors introduced, such as making a run_limit feature dependent on a debug flag, and removing a necessary sys.exit() call. I've provided detailed comments and suggestions to fix these issues.

Comment on lines 449 to +450
logger = logging.Logger("TEST_GENERATE LOGGER")
logger.setLevel(logging.INFO)
logger.setLevel(logging.WARNING)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

You should use logging.getLogger() instead of logging.Logger() to create or retrieve a logger. logging.getLogger() ensures that the logger is part of the logging hierarchy, allowing it to inherit configurations from parent loggers. Using logging.Logger() directly creates a standalone logger that won't have any handlers by default, so your log messages will not be processed or displayed.

Suggested change
logger = logging.Logger("TEST_GENERATE LOGGER")
logger.setLevel(logging.INFO)
logger.setLevel(logging.WARNING)
logger = logging.getLogger("TEST_GENERATE LOGGER")
logger.setLevel(logging.WARNING)

Comment on lines +21 to +22
logger = logging.Logger("TEST DRIVER LOGGER")
logger.setLevel(logging.WARNING)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

You should use logging.getLogger() instead of logging.Logger() to create or retrieve a logger. logging.getLogger() ensures that the logger is part of the logging hierarchy, allowing it to inherit configurations from parent loggers. Using logging.Logger() directly creates a standalone logger that won't have any handlers by default, so your log messages will not be processed or displayed.

Suggested change
logger = logging.Logger("TEST DRIVER LOGGER")
logger.setLevel(logging.WARNING)
logger = logging.getLogger("TEST DRIVER LOGGER")
logger.setLevel(logging.WARNING)

Comment on lines 78 to +79
logger = logging.Logger("TEST SCHEMAS LOGGER")
logger.setLevel(logging.INFO)
logger.setLevel(logging.WARNING)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

You should use logging.getLogger() instead of logging.Logger() to create or retrieve a logger. logging.getLogger() ensures that the logger is part of the logging hierarchy, allowing it to inherit configurations from parent loggers. Using logging.Logger() directly creates a standalone logger that won't have any handlers by default, so your log messages will not be processed or displayed.

Suggested change
logger = logging.Logger("TEST SCHEMAS LOGGER")
logger.setLevel(logging.INFO)
logger.setLevel(logging.WARNING)
logger = logging.getLogger("TEST SCHEMAS LOGGER")
logger.setLevel(logging.WARNING)

Comment on lines 28 to +29

logger = logging.Logger("Checking Test Data vs. Schemas LOGGER")
logger.setLevel(logging.INFO)
logger.setLevel(logging.WARNING)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

You should use logging.getLogger() instead of logging.Logger() to create or retrieve a logger. logging.getLogger() ensures that the logger is part of the logging hierarchy, allowing it to inherit configurations from parent loggers. Using logging.Logger() directly creates a standalone logger that won't have any handlers by default, so your log messages will not be processed or displayed.

Suggested change
logger = logging.Logger("Checking Test Data vs. Schemas LOGGER")
logger.setLevel(logging.INFO)
logger.setLevel(logging.WARNING)
logger = logging.getLogger("Checking Test Data vs. Schemas LOGGER")
logger.setLevel(logging.WARNING)

Comment on lines 113 to 114
logging.fatal('Cannot create summary_json %s', error)
sys.exit(1)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This exception handler is missing a sys.exit(1) call that was present before. Removing it will cause the program to continue after a fatal error, likely leading to a NameError on the next line since summary_json would not have been created. Also, for consistency, logging.fatal should be changed to logger.fatal.

Suggested change
logging.fatal('Cannot create summary_json %s', error)
sys.exit(1)
logger.fatal('Cannot create summary_json %s', error)
sys.exit(1)

Comment on lines 127 to +133
logger = logging.Logger("TEST DRIVER LOGGER")
logger.setLevel(logging.INFO)
logger.setLevel(logging.WARNING)

driver = TestDriver(logger)

logger.debug('ARGS = %s', args)
driver.parse_args(args[1:])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The --log argument added in ddtargs.py is not being used here to configure the logger's level; it's hardcoded to WARNING. To make the --log argument functional, you should parse the arguments before initializing the logger.

A possible refactoring of the main function:

import ddtargs

def main(args):
    arg_parser = ddtargs.DdtArgs(args[1:])
    options = arg_parser.getOptions()

    logger = logging.getLogger("TEST DRIVER LOGGER")
    log_level = getattr(logging, options.log.upper(), logging.WARNING)
    logger.setLevel(log_level)

    driver = TestDriver()
    driver.set_args(options) # Pass parsed options directly

    # ... rest of the function

This would require parse_args to not re-parse arguments.


class TestDriver:
def __init__(self):
def __init__(self, logger=None):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logger parameter in the constructor is not used. The class uses the module-level logger instance instead. You should either use the passed logger by assigning it to an instance variable (e.g., self.logger = logger) and using it throughout the class, or remove the parameter for clarity. Since main is also modified, I suggest removing the parameter here and from the call site in main.

Suggested change
def __init__(self, logger=None):
def __init__(self):

# Set up calls for version data --> results
if not self.request_executor_info():
# TODO: Report problem with executor (somehow).
logging.error('Cannot get executor info');

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This line ends with a semicolon, which is not standard Python style. Please remove it.

Suggested change
logging.error('Cannot get executor info');
logging.error('Cannot get executor info')

logger.critical(' %s', failure)
sys.exit(1)
else:
logging.info("All %d generated test data files match with schema", schema_count)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For consistency, you should use the logger instance defined at the module level instead of the root logging object. This ensures all logging in this module goes through the same configured logger.

Suggested change
logging.info("All %d generated test data files match with schema", schema_count)
logger.info("All %d generated test data files match with schema", schema_count)

Comment on lines 21 to +22
ch = logging.StreamHandler()
ch.setLevel(logging.INFO)
ch.setLevel(logging.WARNING)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The StreamHandler instance ch is created but never used. This appears to be dead code and should be removed.

Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems about right. Gemini has some feedback.

@sven-oly sven-oly merged commit f9d08ac into unicode-org:main Oct 10, 2025
10 checks passed
@sven-oly sven-oly deleted the reduce_noise_336 branch October 10, 2025 00:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants