log errors also in the Fact process step #1456

NichtJens · 2025-09-22T08:43:41Z

Currently, errors in the command step of a Fact are logged like this:

--> Preparing operation files...
    Loading: test.py
    [localhost] cat: doesnotexist: No such file or directory
    [localhost] Error: could not load fact: files.FileContents path=doesnotexist

Errors in the process step are raised instead*:

--> Preparing operation files...
    Loading: test.py

--> Disconnecting from hosts...
--> An exception occurred in: test.py:

Traceback (most recent call last):
  File ".../site-packages/pyinfra_cli/util.py", line 65, in exec_file
    exec(PYTHON_CODES[filename], data)
  File "test.py", line 16, in <module>
    j = host.get_fact(JSONFileContents, "broken.json")
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../site-packages/pyinfra/api/host.py", line 367, in get_fact
    return get_fact(self.state, self, name_or_cls, args=args, kwargs=kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../site-packages/pyinfra/api/facts.py", line 185, in get_fact
    return _get_fact(
           ^^^^^^^^^^
  File ".../site-packages/pyinfra/api/facts.py", line 270, in _get_fact
    data = fact.process(stdout_lines)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../jsonfiles.py", line 15, in process
    output = json.loads(output)
             ^^^^^^^^^^^^^^^^^^
  File ".../json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

This PR proposes to treat these errors the same as errors in the command step:

--> Preparing operation files...
    Loading: test.py
    [localhost] JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    [localhost] Error: could not process fact: jsonfiles.JSONFileContents path=broken.json

I tried to be consistent with the styling of existing command error logging. It might be desirable to also log the full traceback, though. This would need some changes to the current logger setup since passing in exc_info to logger.error via log_error_or_warning does not seem to work ATM. I assume this is because of the custom LogHandler/LogFormatter. Either way, I think the proposed change is an improvement over the current behavior.

*the code for JSONFileContents is:

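import json

from pyinfra.facts import files


class JSONFileContents(files.FileContents):

    def process(self, output):
        output = super().process(output)  # let FileContents process the raw output first
        output = "\n".join(output)        # join the lines back into a single string
        output = json.loads(output)       # raises JSONDecodeError on invalid JSON
        return output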

Pull request is based on the default branch (3.x at this time)
Pull request includes tests for any new/updated operations/facts
Pull request includes documentation for any new/updated operations/facts
Tests pass (see scripts/dev-test.sh)
Type checking & code style passes (see scripts/dev-lint.sh)

Tests will be added if the proposed changes are actually desired behavior. If not, writing tests now is probably not worth it.
I don't think the documentation needs to change.

maisim · 2025-09-22T10:01:02Z

Hi @NichtJens

Thanks, it would be nice to have more consistent and compact output, but, unless I'm missing something, we lose useful information for debugging the problem:

I broke the AptSources fact for this test.

3.x

With this PR's changes:

gh pr checkout 1456
M	src/pyinfra/facts/apt.py
Switched to branch 'NichtJens/3.x'
simon@pyinfra-dev:~/workspace/reviews/pyinfra$ uv run pyinfra localhost fact apt.AptSources
--> Loading config...
--> Loading inventory...
--> Connecting to hosts...
    [localhost] Connected

--> Gathering facts...
    [localhost] TypeError: can only concatenate list (not "str") to list
    [localhost] Error: could not process fact: apt.AptSources 
    [localhost] Loaded fact apt.AptSources

--> Fact data for: apt.AptSources
{
    "localhost": []
}

--> Disconnecting from hosts...

It now seems difficult to find the problem.
We have lost the file/line information and the pyinfra-debug.log.
We should keep the log file, and perhaps make it possible to activate a verbose mode. What do you think?
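One possible shape for this, as a minimal sketch using only the standard logging module: the compact line is logged at error level, and the traceback is logged separately at debug level, so it only reaches handlers that accept debug records. The logger name and the two in-memory streams are made up for illustration (they stand in for the terminal and a debug log file) and are not pyinfra's actual logging setup.

```python
import io
import logging

logger = logging.getLogger("fact_process_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False

console_stream = io.StringIO()  # stands in for the terminal
debug_stream = io.StringIO()    # stands in for a debug log file

console = logging.StreamHandler(console_stream)
console.setLevel(logging.INFO)  # a verbose flag could lower this to DEBUG

debug_log = logging.StreamHandler(debug_stream)
debug_log.setLevel(logging.DEBUG)

logger.addHandler(console)
logger.addHandler(debug_log)

try:
    [] + "deb"  # raises the same TypeError as in the run above
except TypeError as e:
    # Compact one-liner for every handler.
    logger.error("%s: %s", type(e).__name__, e)
    # Full traceback, only for handlers that accept DEBUG records.
    logger.debug("traceback for the error above", exc_info=True)
```

With this split the console stays compact by default, while the file/line information is still written somewhere and could be shown on the console by lowering its level.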

NichtJens · 2025-09-22T12:15:13Z

@maisim Mhmm... Yeah, that's not good.

I guess this brings us back to my remark:

It might be desirable to also log the full traceback, though. This would need some changes to the current logger setup since passing in exc_info to logger.error via log_error_or_warning does not seem to work ATM. I assume this is because of the custom LogHandler/LogFormatter.

Do you by chance understand where the exc_info gets lost? If we could (re-)enable this, we would get the best of both worlds.
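For reference, here is one way exc_info can get lost with a custom formatter, as a standalone sketch. I have not checked whether pyinfra's LogFormatter actually works like this; the two formatter classes below are invented for illustration. A Formatter subclass that overrides format() without calling the base implementation never renders the traceback, and calling formatException() explicitly brings it back.

```python
import io
import logging


class MessageOnlyFormatter(logging.Formatter):
    """Overrides format() without calling the base class, so exc_info is ignored."""

    def format(self, record):
        return record.getMessage()


class TracebackAwareFormatter(MessageOnlyFormatter):
    """Same compact message, plus the traceback when exc_info is present."""

    def format(self, record):
        text = super().format(record)
        if record.exc_info:
            text += "\n" + self.formatException(record.exc_info)
        return text


def render(formatter):
    """Log one error with exc_info=True through the given formatter."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(formatter)
    # One throwaway logger per formatter class so handlers do not accumulate.
    logger = logging.getLogger("exc_info_demo." + type(formatter).__name__)
    logger.propagate = False
    logger.addHandler(handler)
    try:
        int("not a number")
    except ValueError:
        logger.error("could not process fact", exc_info=True)
    return stream.getvalue()


without_tb = render(MessageOnlyFormatter())
with_tb = render(TracebackAwareFormatter())
```

Here without_tb contains only the message, while with_tb also contains the ValueError traceback, even though both calls passed exc_info=True.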

Fizzadar · 2025-10-01T14:41:17Z

I think the exception here is correct, following the logic that:

commands are executed on the target machine, and thus fail outside of pyinfra's own context, thus should be treated as a server, not local, failure
processing is executed locally within the pyinfra process, failing here is just a bug

However, this basically dictates that if the command executes OK the process MUST always work. But given this example:

run command to get json output, gives blank or some non-json, but returns ok (exit 0)
processing fails to parse the JSON

How do we indicate failure? We can a) blow up with an exception, which leads to this problem, or b) return None or an empty dictionary, but that makes the response ambiguous: "did we get a null back or invalid JSON?".

I don't think catching all exceptions here is going to work since we'll miss bugs. But an explicit FactProcessError exception would work, I think?
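To make the ambiguity of option b) concrete, here is a small sketch using only the standard json module. The helper name parse_or_none is made up for illustration.

```python
import json


def parse_or_none(text):
    """Option b): swallow the parse error and return None."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


valid_null = parse_or_none("null")  # valid JSON whose value is null
blank_output = parse_or_none("")    # command exited 0 but printed nothing
```

Both calls return None, so the caller cannot tell a legitimate null from output that failed to parse.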

NichtJens · 2025-10-08T18:02:19Z

Let's see if I got this right :)

I define:

class FactProcessError(RuntimeError):
    pass

... and change from the current PR to this:

try:
    data = fact.process(stdout_lines)
except FactProcessError as e:
    log_error_or_warning(
    ...

Then, I'd use it like this:

class JSONFileContents(files.FileContents):

    def process(self, output):
        output = super().process(output)
        output = "\n".join(output)
        try:
            output = json.loads(output)
        except Exception as e:
            raise FactProcessError from e
        return output

We probably still want to log the output, which would mean changing the current PR to this:

def log_error_or_warning(
    ...

    if exception:
        cause = exception.__cause__
        exc_text = "{0}: {1}".format(type(cause).__name__, cause)
        log_func(
        ...

Now only FactProcessError will be logged; everything else still crashes.
The idea is to use FactProcessError to "mark" the places in the processing where we expect things to go wrong and want to handle the problem, while unexpected bugs still surface with the full traceback.
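Put together, the idea could look roughly like the following self-contained sketch. Everything except FactProcessError is a stand-in: JSONLines and BuggyFact are not real pyinfra facts, process_fact is a hypothetical wrapper rather than the actual code in facts.py, and a plain list replaces log_error_or_warning. Two deliberate deviations from the snippets above: the sketch catches json.JSONDecodeError instead of Exception, and it falls back to the FactProcessError itself when __cause__ is None (i.e. when it was raised without "from").

```python
import json


class FactProcessError(RuntimeError):
    """Raised by process() for failures the fact author anticipated."""


class JSONLines:
    """Stand-in for a fact class; not pyinfra's FactBase."""

    def process(self, output):
        try:
            return json.loads("\n".join(output))
        except json.JSONDecodeError as e:
            raise FactProcessError("invalid JSON") from e


class BuggyFact:
    """Stand-in fact with a genuine bug in process()."""

    def process(self, output):
        return output + "oops"  # TypeError: list + str


def process_fact(fact, stdout_lines, log):
    """Hypothetical wrapper: log expected failures, let real bugs propagate."""
    try:
        return fact.process(stdout_lines)
    except FactProcessError as e:
        cause = e.__cause__ or e  # __cause__ is None without "raise ... from"
        log.append("{0}: {1}".format(type(cause).__name__, cause))
        log.append("Error: could not process fact: {0}".format(type(fact).__name__))
        return None


messages = []
result = process_fact(JSONLines(), ["not json"], messages)
```

After this runs, result is None and messages holds the two log lines, whereas process_fact(BuggyFact(), ["a"], messages) would still raise the TypeError with its full traceback.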

Is that correct?

Sven Augustin added 2 commits September 22, 2025 09:57

log errors also in the fact process step

bdf51fb

black

b1d8760
