@dimitri-yatsenko
Member

Summary

Implement lazy imports to reduce the time taken by `import datajoint`, especially on macOS, where import overhead is significantly higher (~6s vs. ~1s on Linux).

Problem

Heavy dependencies are loaded eagerly at import time even if unused:

  • diagram.py imports networkx and matplotlib (~3s on Mac)
  • admin.py imports pymysql via connection (~2.5s on Mac)
  • cli.py imports click

Solution

Use a module-level __getattr__ (PEP 562) in __init__.py to defer loading until first access:

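import importlib

# Public name -> (relative module path, attribute to fetch from that module)
_lazy_modules = {
    "Diagram": (".diagram", "Diagram"),
    "Di": (".diagram", "Diagram"),
    "ERD": (".diagram", "Diagram"),
    "kill": (".admin", "kill"),
    "cli": (".cli", "cli"),
}

def __getattr__(name: str):
    # Called only when normal attribute lookup on the package fails
    if name in _lazy_modules:
        module_path, attr_name = _lazy_modules[name]
        module = importlib.import_module(module_path, __package__)
        return getattr(module, attr_name)
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")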

Results

Before (estimated from issue):

  • Mac: ~6s for import datajoint
  • Linux: ~1s

After:

  • Basic import: ~0.75s (Mac)
  • Diagram access adds: ~0.25s when needed
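
One rough way to reproduce numbers like these is to time the import in a fresh interpreter, so that modules already cached in the current process do not hide the cost. The helper below is a sketch, not the method used for the figures above; `cold_import_seconds` is a hypothetical name, and a single run is noisy, so several runs should be averaged in practice.

```python
import subprocess
import sys


def cold_import_seconds(module_name: str) -> float:
    """Time `import module_name` in a new Python process and return seconds."""
    code = (
        "import time; t0 = time.perf_counter(); "
        f"import {module_name}; "
        "print(time.perf_counter() - t0)"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    # Use the last line in case the imported module prints something itself.
    return float(result.stdout.strip().splitlines()[-1])


# Example with a standard-library module; substitute "datajoint" if installed.
json_seconds = cold_import_seconds("json")
```

For a per-module breakdown of where the time goes, CPython's `-X importtime` option prints cumulative import times to stderr.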

What's lazy vs eager

Eager (always loaded):

  • Schema, Table, FreeTable
  • Manual, Lookup, Computed, Imported, Part
  • Connection, conn, config
  • Expression classes (Not, AndList, Top, U)
  • errors, DataJointError
  • Codec API

Lazy (loaded on first access):

  • dj.Diagram, dj.Di, dj.ERD → networkx, matplotlib
  • dj.kill → admin module
  • dj.cli → click

Test plan

  • Unit tests for lazy import behavior
  • Verify core functionality available immediately
  • Verify lazy modules load on access
  • Verify Diagram aliases work correctly
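
The checks in this plan could be sketched as follows. This is not the test suite added by the PR: it builds a throwaway package named `lazy_pkg` (hypothetical, as are `heavy.py` and `EAGER_VALUE`) in a temporary directory and inspects `sys.modules` to confirm that the heavy submodule is imported only on first access.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source of a hypothetical package "lazy_pkg"; all names are illustrative.
INIT_SRC = '''
import importlib

EAGER_VALUE = "available immediately"

_lazy = {
    "Diagram": (".heavy", "Diagram"),
    "Di": (".heavy", "Diagram"),
    "ERD": (".heavy", "Diagram"),
}

def __getattr__(name):
    if name in _lazy:
        module_path, attr_name = _lazy[name]
        module = importlib.import_module(module_path, __package__)
        return getattr(module, attr_name)
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''

HEAVY_SRC = '''
class Diagram:
    """Stand-in for a class whose module pulls in slow dependencies."""
'''

# Write the package to disk and make it importable.
_tmp = tempfile.TemporaryDirectory()
pkg_dir = Path(_tmp.name) / "lazy_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text(INIT_SRC)
(pkg_dir / "heavy.py").write_text(HEAVY_SRC)
sys.path.insert(0, _tmp.name)
importlib.invalidate_caches()

lazy_pkg = importlib.import_module("lazy_pkg")

loaded_before = "lazy_pkg.heavy" in sys.modules  # expected: not yet imported
diagram_cls = lazy_pkg.Diagram                   # first access triggers import
loaded_after = "lazy_pkg.heavy" in sys.modules   # expected: now imported
```

Against the real package, the same idea would apply with `datajoint` in place of `lazy_pkg`, ideally in a subprocess so that earlier tests cannot have imported the heavy modules already.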

Closes #1220


🤖 Generated with Claude Code

Defer loading of heavy dependencies (networkx, matplotlib, click, pymysql)
until their associated features are accessed:

- dj.Diagram, dj.Di, dj.ERD -> loads diagram.py (networkx, matplotlib)
- dj.kill -> loads admin.py (pymysql via connection)
- dj.cli -> loads cli.py (click)

This reduces `import datajoint` time significantly, especially on macOS
where import overhead is higher. Core functionality (Schema, Table,
Connection, etc.) remains immediately available.

Closes #1220

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@github-actions (bot) added the "enhancement" (Indicates new improvements) label on Jan 9, 2026
- Cache lazy imports in globals() to override the submodule that
  importlib automatically sets on the parent module
- Add dj.diagram to lazy modules (returns module for diagram_active access)
- Add tests for cli callable and diagram module access
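
The caching fix matters when a lazy name matches its submodule's name, as with `cli`. Importing a submodule binds it as an attribute on the parent package, so without caching the second access to `pkg.cli` returns the module rather than the function. The sketch below reproduces this with two throwaway packages; `shadow_nocache`, `shadow_cache`, and the `CACHE_FLAG` placeholder are hypothetical and exist only for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Package source with a placeholder that switches caching on or off.
INIT_TEMPLATE = '''
import importlib

_lazy = {"cli": (".cli", "cli")}

def __getattr__(name):
    if name in _lazy:
        module_path, attr_name = _lazy[name]
        module = importlib.import_module(module_path, __package__)
        value = getattr(module, attr_name)
        if CACHE_FLAG:
            globals()[name] = value  # replace the submodule bound by the import
        return value
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''

CLI_SRC = "def cli():\n    return 'ran cli'\n"


def build_package(root: Path, name: str, cache: bool) -> None:
    """Write a two-file package whose submodule and function are both `cli`."""
    pkg_dir = root / name
    pkg_dir.mkdir()
    init_src = INIT_TEMPLATE.replace("CACHE_FLAG", str(cache))
    (pkg_dir / "__init__.py").write_text(init_src)
    (pkg_dir / "cli.py").write_text(CLI_SRC)


_tmp = tempfile.TemporaryDirectory()
root = Path(_tmp.name)
build_package(root, "shadow_nocache", cache=False)
build_package(root, "shadow_cache", cache=True)
sys.path.insert(0, _tmp.name)
importlib.invalidate_caches()

nocache = importlib.import_module("shadow_nocache")
cached = importlib.import_module("shadow_cache")

first_nocache = nocache.cli   # hook runs and returns the function
second_nocache = nocache.cli  # normal lookup now finds the submodule instead
first_cached = cached.cli     # hook runs, then overwrites the submodule binding
second_cached = cached.cli    # normal lookup finds the cached function
```

Writing to `globals()` after the import is what makes the order work: the import machinery sets the submodule attribute first, and the cache then replaces it.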

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@dimitri-yatsenko merged commit 299ac0d into pre/v2.0 on Jan 9, 2026
8 checks passed
@dimitri-yatsenko deleted the perf/1220-lazy-imports branch on January 9, 2026 at 21:51

2 participants