diff --git a/peps/pep-0489.rst b/peps/pep-0489.rst index 2ad02fbe362..d6f33e242b7 100644 --- a/peps/pep-0489.rst +++ b/peps/pep-0489.rst @@ -12,6 +12,11 @@ Python-Version: 3.5 Post-History: 23-Aug-2013, 20-Feb-2015, 16-Apr-2015, 07-May-2015, 18-May-2015 Resolution: https://mail.python.org/pipermail/python-dev/2015-May/140108.html +.. canonical-doc:: :ref:`python:initializing-modules`. + For Python 3.14+, see :ref:`py3.14:extension-modules` + and :ref:`py3.14:pymoduledef` + +.. highlight:: c Abstract ======== @@ -23,12 +28,12 @@ import-related problems by bringing extension modules closer to the way Python modules behave; specifically to hook into the ModuleSpec-based loading mechanism introduced in :pep:`451`. -This proposal draws inspiration from PyType_Spec of :pep:`384` to allow extension +This proposal draws inspiration from ``PyType_Spec`` of :pep:`384` to allow extension authors to only define features they need, and to allow future additions to extension module declarations. Extensions modules are created in a two-step process, fitting better into -the ModuleSpec architecture, with parallels to __new__ and __init__ of classes. +the ModuleSpec architecture, with parallels to ``__new__`` and ``__init__`` of classes. Extension modules can safely store arbitrary C-level per-module state in the module that is covered by normal garbage collection and supports @@ -39,7 +44,7 @@ when using the new API. The proposal also allows extension modules with non-ASCII names. Not all problems tackled in :pep:`3121` are solved in this proposal. -In particular, problems with run-time module lookup (PyState_FindModule) +In particular, problems with run-time module lookup (``PyState_FindModule``) are left to a future PEP. @@ -55,7 +60,7 @@ and passed to the relevant hooks. For extensions (i.e. shared libraries) and built-in modules, the module init function is executed straight away and does both the creation and initialization. 
The initialization function is not passed the ModuleSpec, -or any information it contains, such as the __file__ or fully-qualified +or any information it contains, such as the ``__file__`` or fully-qualified name. This hinders relative imports and resource loading. In Py3, modules are also not being added to sys.modules, which means that a @@ -65,8 +70,8 @@ again. Without access to the fully-qualified module name, it is not trivial to correctly add the module to sys.modules either. This is specifically a problem for Cython generated modules, for which it's not uncommon that the module init code has the same level of complexity as -that of any 'regular' Python module. Also, the lack of __file__ and __name__ -information hinders the compilation of "__init__.py" modules, i.e. packages, +that of any 'regular' Python module. Also, the lack of ``__file__`` and ``__name__`` +information hinders the compilation of "``__init__.py``" modules, i.e. packages, especially when relative imports are being used at module init time. Furthermore, the majority of currently existing extension modules has @@ -84,14 +89,14 @@ The current process =================== Currently, extension and built-in modules export an initialization function -named "PyInit_modulename", named after the file name of the shared library. +named "``PyInit_modulename``", named after the file name of the shared library. This function is executed by the import machinery and must return a fully initialized module object. The function receives no arguments, so it has no way of knowing about its import context. During its execution, the module init function creates a module object -based on a PyModuleDef object. It then continues to initialize it by adding +based on a ``PyModuleDef`` object. It then continues to initialize it by adding attributes to the module dict, creating types, etc. 
In the back, the shared library loader keeps a note of the fully qualified @@ -105,9 +110,9 @@ but this assumption usually holds in practice. The proposal ============ -The initialization function (PyInit_modulename) will be allowed to return -a pointer to a PyModuleDef object. The import machinery will be in charge -of constructing the module object, calling hooks provided in the PyModuleDef +The initialization function (``PyInit_modulename``) will be allowed to return +a pointer to a ``PyModuleDef`` object. The import machinery will be in charge +of constructing the module object, calling hooks provided in the ``PyModuleDef`` in the relevant phases of initialization (as described below). This multi-phase initialization is an additional possibility. Single-phase @@ -115,11 +120,11 @@ initialization, the current practice of returning a fully initialized module object, will still be accepted, so existing code will work unchanged, including binary compatibility. -The PyModuleDef structure will be changed to contain a list of slots, -similarly to :pep:`384`'s PyType_Spec for types. +The ``PyModuleDef`` structure will be changed to contain a list of slots, +similarly to :pep:`384`'s ``PyType_Spec`` for types. To keep binary compatibility, and avoid needing to introduce a new structure (which would introduce additional supporting functions and per-module storage), -the currently unused m_reload pointer of PyModuleDef will be changed to +the currently unused *m_reload* pointer of ``PyModuleDef`` will be changed to hold the slots. The structures are defined as:: typedef struct { @@ -140,7 +145,7 @@ hold the slots. The structures are defined as:: } PyModuleDef; The *m_slots* member must be either NULL, or point to an array of -PyModuleDef_Slot structures, terminated by a slot with id set to 0 +``PyModuleDef_Slot`` structures, terminated by a slot with id set to 0 (i.e. ``{0, NULL}``). To specify a slot, a unique slot ID must be provided. 
@@ -153,18 +158,18 @@ slot's documentation. The following slots are currently available, and described later: -* Py_mod_create -* Py_mod_exec +* ``Py_mod_create`` +* ``Py_mod_exec`` Unknown slot IDs will cause the import to fail with SystemError. -When using multi-phase initialization, the *m_name* field of PyModuleDef will +When using multi-phase initialization, the *m_name* field of ``PyModuleDef`` will not be used during importing; the module name will be taken from the ModuleSpec. -Before it is returned from PyInit_*, the PyModuleDef object must be initialized -using the newly added PyModuleDef_Init function. This sets the object type +Before it is returned from PyInit_*, the ``PyModuleDef`` object must be initialized +using the newly added ``PyModuleDef_Init`` function. This sets the object type (which cannot be done statically on certain compilers), refcount, and internal -bookkeeping data (m_index). +bookkeeping data (*m_index*). For example, an extension module "example" would be exported as:: static PyModuleDef example_def = {...} @@ -175,7 +180,7 @@ For example, an extension module "example" would be exported as:: return PyModuleDef_Init(&example_def); } -The PyModuleDef object must be available for the lifetime of the module created +The ``PyModuleDef`` object must be available for the lifetime of the module created from it – usually, it will be declared statically. Pseudo-code Overview @@ -188,9 +193,9 @@ are left out, and C code is presented with a concise Python-like syntax. The framework that calls the importers is explained in :pep:`451#how-loading-will-work`. -importlib/_bootstrap.py: +``importlib/_bootstrap.py``: - :: +.. code-block:: python class BuiltinImporter: def create_module(self, spec): @@ -203,9 +208,9 @@ importlib/_bootstrap.py: # use a backwards compatibility shim _load_module_shim(self, name) -importlib/_bootstrap_external.py: +``importlib/_bootstrap_external.py``: - :: +.. 
code-block:: python class ExtensionFileLoader: def create_module(self, spec): @@ -218,9 +223,9 @@ importlib/_bootstrap_external.py: # use a backwards compatibility shim _load_module_shim(self, name) -Python/import.c (the _imp module): +``Python/import.c`` (the ``_imp`` module): - :: +.. code-block:: python def create_dynamic(spec): name = spec.name @@ -267,9 +272,9 @@ Python/import.c (the _imp module): _PyImport_FixupExtensionObject(module, name, name) return module -Python/importdl.c: +``Python/importdl.c``: - :: +.. code-block:: python def _PyImport_LoadDynamicModuleWithSpec(spec): path = spec.origin @@ -292,9 +297,9 @@ Python/importdl.c: # fall back to single-phase initialization .... -Objects/moduleobject.c: +``Objects/moduleobject.c``: - :: +.. code-block:: python def PyModule_FromDefAndSpec(def, spec): name = spec.name @@ -334,18 +339,18 @@ Module Creation Phase --------------------- Creation of the module object – that is, the implementation of -ExecutionLoader.create_module – is governed by the Py_mod_create slot. +``ExecutionLoader.create_module`` – is governed by the ``Py_mod_create`` slot. The Py_mod_create slot ...................... -The Py_mod_create slot is used to support custom module subclasses. +The ``Py_mod_create`` slot is used to support custom module subclasses. The value pointer must point to a function with the following signature:: PyObject* (*PyModuleCreateFunction)(PyObject *spec, PyModuleDef *def) The function receives a ModuleSpec instance, as defined in :pep:`451`, -and the PyModuleDef structure. +and the ``PyModuleDef`` structure. It should return a new module object, or set an error and return NULL. @@ -354,64 +359,64 @@ specified in :pep:`451#attributes` (such as ``__name__`` or ``__loader__``) on the new module. There is no requirement for the returned object to be an instance of -types.ModuleType. Any type can be used, as long as it supports setting and +``types.ModuleType``. 
Any type can be used, as long as it supports setting and getting attributes, including at least the import-related attributes. -However, only ModuleType instances support module-specific functionality +However, only ``ModuleType`` instances support module-specific functionality such as per-module state and processing of execution slots. -If something other than a ModuleType subclass is returned, no execution slots -may be defined; if any are, a SystemError is raised. +If something other than a ``ModuleType`` subclass is returned, no execution slots +may be defined; if any are, a ``SystemError`` is raised. -Note that when this function is called, the module's entry in sys.modules +Note that when this function is called, the module's entry in ``sys.modules`` is not populated yet. Attempting to import the same module again (possibly transitively), may lead to an infinite loop. -Extension authors are advised to keep Py_mod_create minimal, an in particular +Extension authors are advised to keep ``Py_mod_create`` minimal, an in particular to not call user code from it. -Multiple Py_mod_create slots may not be specified. If they are, import -will fail with SystemError. +Multiple ``Py_mod_create`` slots may not be specified. If they are, import +will fail with ``SystemError``. -If Py_mod_create is not specified, the import machinery will create a normal -module object using PyModule_New. The name is taken from *spec*. +If ``Py_mod_create`` is not specified, the import machinery will create a normal +module object using ``PyModule_New``. The name is taken from *spec*. Post-creation steps ................... -If the Py_mod_create function returns an instance of types.ModuleType -or a subclass (or if a Py_mod_create slot is not present), the import -machinery will associate the PyModuleDef with the module. 
-This also makes the PyModuleDef accessible to execution phase, the -PyModule_GetDef function, and garbage collection routines (traverse, +If the ``Py_mod_create`` function returns an instance of ``types.ModuleType`` +or a subclass (or if a ``Py_mod_create`` slot is not present), the import +machinery will associate the ``PyModuleDef`` with the module. +This also makes the ``PyModuleDef`` accessible to execution phase, the +``PyModule_GetDef`` function, and garbage collection routines (traverse, clear, free). -If the Py_mod_create function does not return a module subclass, then m_size -must be 0, and m_traverse, m_clear and m_free must all be NULL. -Otherwise, SystemError is raised. +If the ``Py_mod_create`` function does not return a module subclass, then *m_size* +must be 0, and *m_traverse*, *m_clear* and *m_free* must all be NULL. +Otherwise, ``SystemError`` is raised. -Additionally, initial attributes specified in the PyModuleDef are set on the +Additionally, initial attributes specified in the ``PyModuleDef`` are set on the module object, regardless of its type: -* The docstring is set from m_doc, if non-NULL. -* The module's functions are initialized from m_methods, if any. +* The docstring is set from *m_doc*, if non-NULL. +* The module's functions are initialized from *m_methods*, if any. Module Execution Phase ---------------------- Module execution -- that is, the implementation of -ExecutionLoader.exec_module -- is governed by "execution slots". -This PEP only adds one, Py_mod_exec, but others may be added in the future. +``ExecutionLoader.exec_module`` -- is governed by "execution slots". +This PEP only adds one, ``Py_mod_exec``, but others may be added in the future. -The execution phase is done on the PyModuleDef associated with the module -object. For objects that are not a subclass of PyModule_Type (for which -PyModule_GetDef would fail), the execution phase is skipped. 
+The execution phase is done on the ``PyModuleDef`` associated with the module +object. For objects that are not a subclass of ``PyModule_Type`` (for which +``PyModule_GetDef`` would fail), the execution phase is skipped. Execution slots may be specified multiple times, and are processed in the order they appear in the slots array. When using the default import machinery, they are processed after import-related attributes specified in :pep:`451#attributes` (such as ``__name__`` or ``__loader__``) are set and the module is added -to sys.modules. +to ``sys.modules``. Pre-Execution steps @@ -419,7 +424,7 @@ Pre-Execution steps Before processing the execution slots, per-module state is allocated for the module. From this point on, per-module state is accessible through -PyModule_GetState. +``PyModule_GetState``. The Py_mod_exec slot @@ -436,12 +441,12 @@ The "module" argument receives the module object to initialize. The function must return ``0`` on success, or, on error, set an exception and return ``-1``. -If PyModuleExec replaces the module's entry in sys.modules, the new object +If ``PyModuleExec`` replaces the module's entry in ``sys.modules``, the new object will be used and returned by importlib machinery after all execution slots are processed. This is a feature of the import machinery itself. The slots themselves are all processed using the module returned from the -creation phase; sys.modules is not consulted during the execution phase. -(Note that for extension modules, implementing Py_mod_create is usually +creation phase; ``sys.modules`` is not consulted during the execution phase. +(Note that for extension modules, implementing ``Py_mod_create`` is usually a better solution for using custom module objects.) @@ -449,9 +454,9 @@ Legacy Init ----------- The backwards-compatible single-phase initialization continues to be supported. -In this scheme, the PyInit function returns a fully initialized module rather -than a PyModuleDef object. 
-In this case, the PyInit hook implements the creation phase, and the execution +In this scheme, the ``PyInit`` function returns a fully initialized module rather +than a ``PyModuleDef`` object. +In this case, the ``PyInit`` hook implements the creation phase, and the execution phase is a no-op. Modules that need to work unchanged on older versions of Python should stick to @@ -514,7 +519,7 @@ Built-In modules Any extension module can be used as a built-in module by linking it into the executable, and including it in the inittab (either at runtime with -PyImport_AppendInittab, or at configuration time, using tools like *freeze*). +``PyImport_AppendInittab``, or at configuration time, using tools like *freeze*). To keep this possibility, all changes to extension module loading introduced in this PEP will also apply to built-in modules. @@ -525,14 +530,14 @@ Subinterpreters and Interpreter Reloading ----------------------------------------- Extensions using the new initialization scheme are expected to support -subinterpreters and multiple Py_Initialize/Py_Finalize cycles correctly, +subinterpreters and multiple ``Py_Initialize``/``Py_Finalize`` cycles correctly, avoiding the issues mentioned in Python documentation [#subinterpreter-docs]_. The mechanism is designed to make this easy, but care is still required on the part of the extension author. No user-defined functions, methods, or instances may leak to different interpreters. To achieve this, all module-level state should be kept in either the module -dict, or in the module object's storage reachable by PyModule_GetState. +dict, or in the module object's storage reachable by ``PyModule_GetState``. A simple rule of thumb is: Do not define any static data, except built-in types with no mutable or user-settable class attributes. @@ -540,21 +545,21 @@ with no mutable or user-settable class attributes. 
Functions incompatible with multi-phase initialization ------------------------------------------------------ -The PyModule_Create function will fail when used on a PyModuleDef structure +The ``PyModule_Create`` function will fail when used on a ``PyModuleDef`` structure with a non-NULL *m_slots* pointer. The function doesn't have access to the ModuleSpec object necessary for multi-phase initialization. -The PyState_FindModule function will return NULL, and PyState_AddModule -and PyState_RemoveModule will also fail on modules with non-NULL *m_slots*. +The ``PyState_FindModule`` function will return NULL, and ``PyState_AddModule`` +and ``PyState_RemoveModule`` will also fail on modules with non-NULL *m_slots*. PyState registration is disabled because multiple module objects may be created -from the same PyModuleDef. +from the same ``PyModuleDef``. Module state and C-level callbacks ---------------------------------- -Due to the unavailability of PyState_FindModule, any function that needs access +Due to the unavailability of ``PyState_FindModule``, any function that needs access to module-level state (including functions, classes or exceptions defined at the module level) must receive a reference to the module object (or the particular object it needs), either directly or indirectly. @@ -569,7 +574,7 @@ Fixing these cases is outside of the scope of this PEP, but will be needed for the new mechanism to be useful to all modules. Proper fixes have been discussed on the import-sig mailing list [#findmodule-discussion]_. -As a rule of thumb, modules that rely on PyState_FindModule are, at the moment, +As a rule of thumb, modules that rely on ``PyState_FindModule`` are, at the moment, not good candidates for porting to the new mechanism. @@ -577,7 +582,7 @@ New Functions ------------- A new function and macro implementing the module creation phase will be added. 
-These are similar to PyModule_Create and PyModule_Create2, except they +These are similar to ``PyModule_Create`` and ``PyModule_Create2``, except they take an additional ModuleSpec argument, and handle module definitions with non-NULL slots:: @@ -592,10 +597,10 @@ a module is executed, unless the module is being reloaded:: PyAPI_FUNC(int) PyModule_ExecDef(PyObject *module, PyModuleDef *def) -Another function will be introduced to initialize a PyModuleDef object. +Another function will be introduced to initialize a ``PyModuleDef`` object. This idempotent function fills in the type, refcount, and module index. -It returns its argument cast to PyObject*, so it can be returned directly -from a PyInit function:: +It returns its argument cast to ``PyObject*``, so it can be returned directly +from a ``PyInit`` function:: PyObject * PyModuleDef_Init(PyModuleDef *); @@ -613,32 +618,34 @@ As portable C identifiers are limited to ASCII, module names must be encoded to form the PyInit hook name. For ASCII module names, the import hook is named -PyInit_<modulename>, where <modulename> is the name of the module. +``PyInit_<modulename>``, where ``<modulename>`` is the name of the module. For module names containing non-ASCII characters, the import hook is named -PyInitU_<encodedname>, where the name is encoded using CPython's +``PyInitU_<encodedname>``, where the name is encoded using CPython's "punycode" encoding (:rfc:`Punycode <3492>` with a lowercase suffix), with hyphens ("-") replaced by underscores ("_"). -In Python:: +In Python: + +.. 
code-block:: python - def export_hook_name(name): - try: - suffix = b'_' + name.encode('ascii') - except UnicodeEncodeError: - suffix = b'U_' + name.encode('punycode').replace(b'-', b'_') - return b'PyInit' + suffix + def export_hook_name(name): + try: + suffix = b'_' + name.encode('ascii') + except UnicodeEncodeError: + suffix = b'U_' + name.encode('punycode').replace(b'-', b'_') + return b'PyInit' + suffix Examples: -============= =================== +============= ======================= Module name Init hook name -============= =================== -spam PyInit_spam -lančmít PyInitU_lanmt_2sa6t -スパム PyInitU_zck5b2b -============= =================== +============= ======================= +spam ``PyInit_spam`` +lančmít ``PyInitU_lanmt_2sa6t`` +スパム ``PyInitU_zck5b2b`` +============= ======================= For modules with non-ASCII names, single-phase initialization is not supported. @@ -649,11 +656,11 @@ names will not be supported. Module Reloading ---------------- -Reloading an extension module using importlib.reload() will continue to +Reloading an extension module using ``importlib.reload()`` will continue to have no effect, except re-setting import-related attributes. Due to limitations in shared library loading (both dlopen on POSIX and -LoadModuleEx on Windows), it is not generally possible to load +``LoadModuleEx`` on Windows), it is not generally possible to load a modified library after it has changed on disk. Use cases for reloading other than trying out a new version of the module @@ -673,7 +680,9 @@ Note that this mechanism can currently only be used to *load* extra modules, but not to *find* them. (This is a limitation of the loader mechanism, which this PEP does not try to modify.) To work around the lack of a suitable finder, code like the following -can be used:: +can be used: + +.. 
code-block:: python import importlib.machinery import importlib.util @@ -707,29 +716,29 @@ Summary of API Changes and Additions New functions: -* PyModule_FromDefAndSpec (macro) -* PyModule_FromDefAndSpec2 -* PyModule_ExecDef -* PyModule_SetDocString -* PyModule_AddFunctions -* PyModuleDef_Init +* ``PyModule_FromDefAndSpec`` (macro) +* ``PyModule_FromDefAndSpec2`` +* ``PyModule_ExecDef`` +* ``PyModule_SetDocString`` +* ``PyModule_AddFunctions`` +* ``PyModuleDef_Init`` New macros: -* Py_mod_create -* Py_mod_exec +* ``Py_mod_create`` +* ``Py_mod_exec`` New types: -* PyModuleDef_Type will be exposed +* ``PyModuleDef_Type`` will be exposed New structures: -* PyModuleDef_Slot +* ``PyModuleDef_Slot`` Other changes: -PyModuleDef.m_reload changes to PyModuleDef.m_slots. +``PyModuleDef.m_reload`` changes to ``PyModuleDef.m_slots``. ``BuiltinImporter`` and ``ExtensionFileLoader`` will now implement ``create_module`` and ``exec_module``. @@ -765,28 +774,28 @@ Internal functions of Python/import.c and Python/importdl.c will be removed. Possible Future Extensions ========================== -The slots mechanism, inspired by PyType_Slot from :pep:`384`, +The slots mechanism, inspired by ``PyType_Slot`` from :pep:`384`, allows later extensions. -Some extension modules exports many constants; for example _ssl has +Some extension modules exports many constants; for example ``_ssl`` has a long list of calls in the form:: PyModule_AddIntConstant(m, "SSL_ERROR_ZERO_RETURN", PY_SSL_ERROR_ZERO_RETURN); -Converting this to a declarative list, similar to PyMethodDef, +Converting this to a declarative list, similar to ``PyMethodDef``, would reduce boilerplate, and provide free error-checking which is often missing. String constants and types can be handled similarly. 
(Note that non-default bases for types cannot be portably specified -statically; this case would need a Py_mod_exec function that runs +statically; this case would need a ``Py_mod_exec`` function that runs before the slots are added. The free error-checking would still be beneficial, though.) -Another possibility is providing a "main" function that would be run -when the module is given to Python's -m switch. -For this to work, the runpy module will need to be modified to take +Another possibility is providing a "``main``" function that would be run +when the module is given to Python's :program:`-m` switch. +For this to work, the ``runpy`` module will need to be modified to take advantage of ModuleSpec-based loading introduced in :pep:`451`. Also, it will be necessary to add a mechanism for setting up a module according to slots it wasn't originally defined with. @@ -795,7 +804,7 @@ according to slots it wasn't originally defined with. Implementation ============== -Work-in-progress implementation is available in a Github repository [#gh-repo]_; +Work-in-progress implementation is available in a GitHub repository [#gh-repo]_; a patchset is at [#gh-patch]_. @@ -803,28 +812,28 @@ Previous Approaches =================== Stefan Behnel's initial proto-PEP [#stefans_protopep]_ -had a "PyInit_modulename" hook that would create a module class, +had a "``PyInit_modulename``" hook that would create a module class, whose ``__init__`` would be then called to create the module. This proposal did not correspond to the (then nonexistent) :pep:`451`, where module creation and initialization is broken into distinct steps. It also did not support loading an extension into pre-existing module objects. -Alyssa (Nick) Coghlan proposed "Create" and "Exec" hooks, and wrote a prototype +Alyssa (Nick) Coghlan proposed "``Create``" and "``Exec``" hooks, and wrote a prototype implementation [#alyssas-prototype]_. 
At this time :pep:`451` was still not implemented, so the prototype does not use ModuleSpec. -The original version of this PEP used Create and Exec hooks, and allowed -loading into arbitrary pre-constructed objects with Exec hook. +The original version of this PEP used ``Create`` and ``Exec`` hooks, and allowed +loading into arbitrary pre-constructed objects with ``Exec`` hook. The proposal made extension module initialization closer to how Python modules are initialized, but it was later recognized that this isn't an important goal. The current PEP describes a simpler solution. -A further iteration used a "PyModuleExport" hook as an alternative to PyInit, -where PyInit was used for existing scheme, and PyModuleExport for multi-phase. +A further iteration used a "``PyModuleExport``" hook as an alternative to ``PyInit``, +where ``PyInit`` was used for existing scheme, and ``PyModuleExport`` for multi-phase. However, not being able to determine the hook name based on module name -complicated automatic generation of PyImport_Inittab by tools like freeze. -Keeping only the PyInit hook name, even if it's not entirely appropriate for +complicated automatic generation of ``PyImport_Inittab`` by tools like freeze. +Keeping only the ``PyInit`` hook name, even if it's not entirely appropriate for exporting a definition, yielded a much simpler solution.
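
The hook-name rule touched by the "Export Hook Name" hunk can be checked independently of the patch. The following is a minimal sketch that restates the ``export_hook_name`` function quoted in the diff and prints the hook name for each module name in the Examples table. The type hints, docstring, and ``__main__`` loop are additions for illustration and are not part of the PEP text.

```python
def export_hook_name(name: str) -> bytes:
    """Return the init hook symbol name for a module name.

    ASCII names map to PyInit_<name>; other names map to PyInitU_<encoded>,
    where <encoded> is the punycode form with '-' replaced by '_'.
    """
    try:
        suffix = b"_" + name.encode("ascii")
    except UnicodeEncodeError:
        # Non-ASCII name: punycode-encode, then make it a valid C identifier.
        suffix = b"U_" + name.encode("punycode").replace(b"-", b"_")
    return b"PyInit" + suffix


if __name__ == "__main__":
    # Module names taken from the Examples table in the diff.
    for module_name in ("spam", "lančmít", "スパム"):
        print(module_name, export_hook_name(module_name).decode("ascii"))
```

The printed hook names should agree with the right-hand column of the Examples table; the patch only changes how that column is marked up, not the names themselves.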