Merged
32 changes: 21 additions & 11 deletions Doc/library/importlib.rst
@@ -1584,20 +1584,30 @@ Note that if ``name`` is a submodule (contains a dot),
Importing a source file directly
''''''''''''''''''''''''''''''''

.. note:: ``SourceFileLoader.load_module()`` has been deprecated -- this recipe should be used instead.
Member

Why the extra note? The method isn't used here and it's documented as deprecated, so why call it out here and add a visible callout in the rendered docs that some might find distracting?

Contributor Author

@ChrisBarker-NOAA Jul 10, 2024

I did that because the advice to use SourceFileLoader.load_module() is found all over the internet (e.g. https://krbnite.github.io/How-to-Import-a-Python-Module-from-an-Arbitrary-Path/). It looks a lot simpler than this recipe, and it doesn't raise a deprecation error, so I thought it would be helpful to make it clear why the more complex recipe is preferred.

I removed the note directive to make it less distracting.

That being said, while it might be helpful now, it'll be clutter in the future -- so your call if you want to simply remove it.

Contributor

@ncoghlan Jul 10, 2024

I think having a lead-in paragraph makes sense, but I think it's pointing at the wrong alternative: the reason importing an arbitrary path is a bad idea is because it's a good way to confuse the import system, and that's why the stdlib's only shorthand API for doing that is deprecated without a straightforward replacement.

The example given uses a top level module/package, so it's potentially OK (although I'm not 100% certain without checking the code for module_from_spec: if that bypasses sys.modules, you might end up with two different versions of JSONDecodeError kicking around, which can easily land you in exception handling hell), but if the file is part of an already importable package then you're far more likely to get some weird state duplication going on (it's a good way to get yourself caught in the "double import trap").
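
A minimal sketch of the duplication described above, assuming a throwaway file `shapes.py` and the module names `shapes` and `shapes_copy` (all hypothetical, chosen only for illustration). It loads the same source file twice through the recipe's steps and checks whether an exception class from one copy can be caught via the other:

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def import_from_path(module_name, file_path):
    # Same steps as the recipe in this diff.
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module file defining a single exception class.
    source = Path(tmp) / "shapes.py"
    source.write_text("class ShapeError(Exception):\n    pass\n")

    # Load the same file under two different module names.
    first = import_from_path("shapes", str(source))
    second = import_from_path("shapes_copy", str(source))

# Each load executed the file again, so each module has its own class object.
same_class = first.ShapeError is second.ShapeError

caught = False
try:
    raise first.ShapeError("boom")
except second.ShapeError:
    caught = True  # not reached: the two classes are unrelated
except first.ShapeError:
    pass

# Remove the illustrative entries so the sketch leaves no state behind.
sys.modules.pop("shapes", None)
sys.modules.pop("shapes_copy", None)

print(same_class, caught)
```

Both values come out false here: the file was executed twice, so `except second.ShapeError` does not match an exception raised from `first`. The same thing can happen when a file that is already importable as part of a package is loaded again by path.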

The much safer alternative for executing a Python file and getting access to its top-level module state is to use runpy.run_path (which returns the global namespace that results from executing the file rather than creating a module object).

In my experience, folks wanting to import from arbitrary paths is usually a classic case of the "XY problem": "X" is "I want to run a Python file from a Python program and get access to the resulting global variables", which becomes the "Y" question of "How do I import a Python module given only its path name?". With import being a statement, and the execfile builtin ancient history, it's an understandable leap for people to make, but that doesn't mean it's correct when there's no actual need to create a module object at all (let alone register it with the import system).
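
A short sketch of the `runpy.run_path` alternative mentioned above, assuming a throwaway file `settings.py` with two made-up variables (the file name and its contents are hypothetical). It runs the file and reads the resulting globals without creating or registering a module object:

```python
import runpy
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file whose top-level variables we want to read.
    script = Path(tmp) / "settings.py"
    script.write_text("TIMEOUT = 30\nHOSTS = ['a', 'b']\n")

    # run_path executes the file and returns its global namespace as a dict.
    namespace = runpy.run_path(str(script))

timeout = namespace["TIMEOUT"]
hosts = namespace["HOSTS"]

# Nothing named "settings" was added to the import system.
registered = "settings" in sys.modules

print(timeout, hosts, registered)
```

Since the result is a plain dictionary, there is no second module object that could drift out of sync with one imported the normal way.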

Member

> it doesn't raise a deprecation error

... yet. 😉 #121604

> I did that because the advice to use SourceFileLoader.load_module() is found all over the internet

So is probably outdated Python 2 advice, but that doesn't mean we need to call it out. 😉

> I think having a lead-in paragraph makes sense, but I think it's pointing at the wrong alternative: the reason importing an arbitrary path is a bad idea is because it's a good way to confuse the import system

I agree. I think it would be better to have a sentence or paragraph pointing out that the recipe is an approximation of an import statement where you specify the file path. It should also point out that modifying sys.path may be better than circumventing the import system or using runpy.run_path() if you just need the global namespace and not a module object.
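
For comparison, a rough sketch of the `sys.path` approach, assuming a throwaway directory containing a module named `plugin_example` (a hypothetical name used only here). The directory is added to `sys.path` and the module is then imported through the regular import machinery:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module living outside the normal search path.
    (Path(tmp) / "plugin_example.py").write_text("GREETING = 'hello'\n")

    sys.path.insert(0, tmp)
    try:
        # Make sure the finders notice the newly created file.
        importlib.invalidate_caches()
        plugin = importlib.import_module("plugin_example")
    finally:
        # Restore the search path even if the import fails.
        sys.path.remove(tmp)

greeting = plugin.GREETING
# The import system registered the module under its real name.
registered = sys.modules["plugin_example"] is plugin

print(greeting, registered)
```

Because the import goes through the normal machinery, a later `import plugin_example` elsewhere in the program gets the same module object rather than a second copy.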

Member

FWIW, this is already in the docs! All this PR does is clean up the snippet a little bit.


To import a Python source file directly, use the following recipe::

-  import importlib.util
-  import sys
+    import importlib.util
+    import sys

-  # For illustrative purposes.
-  import tokenize
-  file_path = tokenize.__file__
-  module_name = tokenize.__name__

-  spec = importlib.util.spec_from_file_location(module_name, file_path)
-  module = importlib.util.module_from_spec(spec)
-  sys.modules[module_name] = module
-  spec.loader.exec_module(module)

+    def import_from_path(module_name, file_path):
+        spec = importlib.util.spec_from_file_location(module_name, file_path)
+        module = importlib.util.module_from_spec(spec)
+        sys.modules[module_name] = module
+        spec.loader.exec_module(module)
+        return module

+    # For illustrative purposes a name and file path of an
+    # existing module is needed -- use the json module
+    # as an example
+    import json
+    file_path = json.__file__
+    module_name = json.__name__

+    # equivalent of ``import json``
+    json = import_from_path(module_name, file_path)


Implementing lazy imports