NichtJens
Contributor

Currently, errors in the command step of a Fact are logged like this:

--> Preparing operation files...
    Loading: test.py
    [localhost] cat: doesnotexist: No such file or directory
    [localhost] Error: could not load fact: files.FileContents path=doesnotexist

Errors in the process step are raised instead*:

--> Preparing operation files...
    Loading: test.py

--> Disconnecting from hosts...
--> An exception occurred in: test.py:

Traceback (most recent call last):
  File ".../site-packages/pyinfra_cli/util.py", line 65, in exec_file
    exec(PYTHON_CODES[filename], data)
  File "test.py", line 16, in <module>
    j = host.get_fact(JSONFileContents, "broken.json")
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../site-packages/pyinfra/api/host.py", line 367, in get_fact
    return get_fact(self.state, self, name_or_cls, args=args, kwargs=kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../site-packages/pyinfra/api/facts.py", line 185, in get_fact
    return _get_fact(
           ^^^^^^^^^^
  File ".../site-packages/pyinfra/api/facts.py", line 270, in _get_fact
    data = fact.process(stdout_lines)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../jsonfiles.py", line 15, in process
    output = json.loads(output)
             ^^^^^^^^^^^^^^^^^^
  File ".../json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

This PR proposes to treat these errors the same as errors in the command step:

--> Preparing operation files...
    Loading: test.py
    [localhost] JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    [localhost] Error: could not process fact: jsonfiles.JSONFileContents path=broken.json

I tried to be consistent with the styling of existing command error logging. It might be desirable to also log the full traceback, though. This would need some changes to the current logger setup since passing in exc_info to logger.error via log_error_or_warning does not seem to work ATM. I assume this is because of the custom LogHandler/LogFormatter. Either way, I think the proposed change is an improvement over the current behavior.


*the code for JSONFileContents is:

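import json

from pyinfra.facts import files

class JSONFileContents(files.FileContents):

    def process(self, output):
        # Reuse the parent's processing, then parse the joined lines as JSON
        output = super().process(output)
        output = "\n".join(output)
        output = json.loads(output)
        return output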

  • Pull request is based on the default branch (3.x at this time)
  • Pull request includes tests for any new/updated operations/facts
  • Pull request includes documentation for any new/updated operations/facts
  • Tests pass (see scripts/dev-test.sh)
  • Type checking & code style passes (see scripts/dev-lint.sh)

Tests will be added if the proposed changes are actually desired behavior. If not, writing tests now is probably not worth it.
I don't think the documentation needs to change.

@maisim
Contributor

maisim commented Sep 22, 2025

Hi @NichtJens

Thanks, it would be nice to have more consistent and compact output, but unless I'm missing something, we lose information that is useful for debugging the problem:

I deliberately broke the AptSources fact for the test.

On 3.x: [screenshot of the output]

With this PR changes:

gh pr checkout 1456
M	src/pyinfra/facts/apt.py
Switched to branch 'NichtJens/3.x'
simon@pyinfra-dev:~/workspace/reviews/pyinfra$ uv run pyinfra localhost fact apt.AptSources
--> Loading config...
--> Loading inventory...
--> Connecting to hosts...
    [localhost] Connected

--> Gathering facts...
    [localhost] TypeError: can only concatenate list (not "str") to list
    [localhost] Error: could not process fact: apt.AptSources 
    [localhost] Loaded fact apt.AptSources

--> Fact data for: apt.AptSources
{
    "localhost": []
}

--> Disconnecting from hosts...

It now seems difficult to find the problem.
We lose the file/line information and the pyinfra-debug.log.
We need to keep the log file, and perhaps be able to enable a verbose mode. What do you think?
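
One way to get both a compact console line and a full traceback in a debug log could look like the sketch below. It uses only the standard `logging` module and is not pyinfra's actual logger setup: the logger name, the two in-memory streams standing in for the console and for pyinfra-debug.log, and the `process_fact`/`gather` helpers are all hypothetical. It also catches every exception purely for the sake of the demo.

```python
import io
import logging

logger = logging.getLogger("fact_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False

# Stand-in for the console: compact messages, ERROR and above only.
console_stream = io.StringIO()
console = logging.StreamHandler(console_stream)
console.setLevel(logging.ERROR)
console.setFormatter(logging.Formatter("%(message)s"))
logger.addHandler(console)

# Stand-in for a debug log file: everything, including tracebacks.
debug_stream = io.StringIO()
debug_log = logging.StreamHandler(debug_stream)
debug_log.setLevel(logging.DEBUG)
debug_log.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(debug_log)


def process_fact(lines):
    # Deliberately broken processing step (list + str raises TypeError).
    return [] + "".join(lines)


def gather(lines):
    try:
        return process_fact(lines)
    except Exception as e:
        # Compact summary goes to both handlers.
        logger.error("%s: %s", type(e).__name__, e)
        # The traceback is a separate DEBUG record, so only the debug log gets it.
        logger.debug("Traceback for the failed process step", exc_info=True)
        return None


gather(["some fact output"])
console_output = console_stream.getvalue()
debug_output = debug_stream.getvalue()
```

Logging the traceback as a separate DEBUG record keeps the console line short, while the file/line information stays available in the debug log. A verbose flag could then simply lower the console handler's level.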

@NichtJens
Contributor Author

@maisim Mhmm... Yeah, that's not good.

I guess this brings us back to my remark:

It might be desirable to also log the full traceback, though. This would need some changes to the current logger setup since passing in exc_info to logger.error via log_error_or_warning does not seem to work ATM. I assume this is because of the custom LogHandler/LogFormatter.

Do you by chance understand where the exc_info gets lost? If we could (re-)enable this, we would get the best of both worlds.
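
I have not verified that this is what happens in pyinfra, but one common way for exc_info to get lost is a custom `logging.Formatter` that overrides `format()` and builds its own string without calling the base implementation, which is the part that appends the traceback. A minimal sketch with only the standard library; `CompactFormatter`, `TracebackFormatter` and `render` are hypothetical names, not pyinfra's classes:

```python
import io
import logging


class CompactFormatter(logging.Formatter):
    # Builds the output itself and never calls logging.Formatter.format(),
    # so record.exc_info is silently ignored.
    def format(self, record):
        return "    " + record.getMessage()


class TracebackFormatter(CompactFormatter):
    # Same compact line, but appends the traceback when exc_info is set.
    def format(self, record):
        text = super().format(record)
        if record.exc_info:
            text += "\n" + self.formatException(record.exc_info)
        return text


def render(formatter):
    # Log one error with exc_info through the given formatter and
    # return whatever the handler wrote.
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(formatter)
    log = logging.getLogger("exc_info_demo." + type(formatter).__name__)
    log.propagate = False
    log.addHandler(handler)
    try:
        int("not a number")
    except ValueError:
        log.error("could not process fact", exc_info=True)
    return stream.getvalue()


compact = render(CompactFormatter())
full = render(TracebackFormatter())
```

If the custom formatter turns out to follow the first pattern, calling `formatException` as in the second class would be one way to re-enable the traceback without giving up the compact styling.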

@Fizzadar
Member

Fizzadar commented Oct 1, 2025

I think the exception here is correct, following the logic that:

  • commands are executed on the target machine, and thus fail outside of pyinfra's own context, thus should be treated as a server, not local, failure
  • processing is executed locally within the pyinfra process, failing here is just a bug

However, this basically dictates that if the command executes OK the process MUST always work. But given this example:

  • run command to get json output, gives blank or some non-json, but returns ok (exit 0)
  • processing fails to parse the JSON

How do we indicate failure? We can either a) blow up with an exception, which is the problem this PR addresses, or b) return None or an empty dictionary, but that makes the response ambiguous: did we get a null back, or invalid JSON?
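
The ambiguity in option b) can be shown with the standard `json` module alone. In this small sketch, `process_lenient` is a hypothetical helper, not part of pyinfra:

```python
import json


def process_lenient(output):
    # Option b): swallow the parse error and return None instead.
    try:
        return json.loads(output)
    except json.JSONDecodeError:
        return None


valid_null = process_lenient("null")  # valid JSON whose value is null
invalid = process_lenient("")         # blank output, not valid JSON
```

Both calls return None, so the caller cannot distinguish a legitimate null from a failed parse.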


I don't think catching all exceptions here is going to work since we'll miss bugs. But an explicit FactProcessError exception would work, I think?

@NichtJens
Contributor Author

NichtJens commented Oct 8, 2025

Let's see if I got this right :)

I define:

class FactProcessError(RuntimeError):
    pass

... and change from the current PR to this:

try:
    data = fact.process(stdout_lines)
except FactProcessError as e:
    log_error_or_warning(
    ...

Then, I'd use it like this:

class JSONFileContents(files.FileContents):

    def process(self, output):
        output = super().process(output)
        output = "\n".join(output)
        try:
            output = json.loads(output)
        except Exception as e:
            raise FactProcessError from e
        return output

Probably we still want to log the output. This would mean changing the current PR to this:

def log_error_or_warning(
    ...

    if exception:
        cause = exception.__cause__
        exc_text = "{0}: {1}".format(type(cause).__name__, cause)
        log_func(
        ...

Now only FactProcessError will be logged; everything else still crashes.
The idea is to use FactProcessError to "mark" the places in the processing where we expect things to go wrong and want to handle the problem, while unexpected bugs still show up with the full traceback.

Is that correct?
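
To make the intended behavior concrete, here is a self-contained sketch of the pattern that runs without pyinfra. `JSONFact`, `BuggyFact` and `get_fact_sketch` are hypothetical stand-ins for a fact class and for the `_get_fact` code path, and a plain list takes the place of `log_error_or_warning`:

```python
import json


class FactProcessError(RuntimeError):
    pass


class JSONFact:
    # Expected failure: the command succeeded but did not return JSON.
    def process(self, output):
        text = "\n".join(output)
        try:
            return json.loads(text)
        except json.JSONDecodeError as e:
            raise FactProcessError from e


class BuggyFact:
    # Unexpected failure: a plain bug in the processing code.
    def process(self, output):
        return [] + "oops"


def get_fact_sketch(fact, stdout_lines, log):
    try:
        return fact.process(stdout_lines)
    except FactProcessError as e:
        # Report the underlying cause; fall back to the error itself
        # if it was raised without "from".
        cause = e.__cause__ or e
        log.append("{0}: {1}".format(type(cause).__name__, cause))
        return None


messages = []
result = get_fact_sketch(JSONFact(), [""], messages)

try:
    get_fact_sketch(BuggyFact(), [], messages)
    bug_propagated = False
except TypeError:
    bug_propagated = True
```

The marked failure ends up as a single log line, while the TypeError from the buggy fact still propagates with its full traceback.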
