CF-1022 pyinstaller + Nuitka support #1140
Adds get_jedi_environment() helper that returns InterpreterEnvironment when running in compiled/bundled binaries to avoid subprocess spawning issues with Jedi analysis.
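A minimal sketch of how such a helper could look. The detection indicators (sys.frozen, sys._MEIPASS, a module-level __compiled__) are the ones the generated tests in this PR manipulate; returning None in the non-compiled case is an assumption made for illustration, not something the PR confirms.

```python
import sys
from functools import lru_cache


@lru_cache(maxsize=1)
def is_compiled_or_bundled_binary() -> bool:
    """Best-effort check for a PyInstaller bundle or a Nuitka-compiled module."""
    # PyInstaller sets sys.frozen and sys._MEIPASS in bundled applications.
    if getattr(sys, "frozen", False) or hasattr(sys, "_MEIPASS"):
        return True
    # Nuitka defines __compiled__ in the globals of compiled modules.
    return "__compiled__" in globals()


def get_jedi_environment():
    """Return an in-process Jedi environment when compiled, else None (assumed)."""
    if not is_compiled_or_bundled_binary():
        return None  # assumption: let Jedi pick its default environment
    # Imported lazily so this sketch also loads where Jedi is not installed.
    from jedi.api.environment import InterpreterEnvironment

    # InterpreterEnvironment inspects the running interpreter directly,
    # so Jedi does not need to spawn sys.executable as a subprocess.
    return InterpreterEnvironment()
```

The result can then be passed along as jedi.Script(code, environment=get_jedi_environment()), since Script accepts environment=None.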
When running as a compiled binary, SAFE_SYS_EXECUTABLE now searches for .venv or venv in the cwd and its parent directories before falling back to the system Python. This lets subprocess calls use the correct Python interpreter instead of the non-existent binary path.
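The fallback order can be sketched as below. The order (venv first, then python3 and python on PATH, then sys.executable) follows what the generated tests in this PR exercise; the function name pick_python_executable and its parameters are hypothetical, and the venv lookup result is passed in rather than computed here.

```python
import shutil
import sys
from typing import Optional


def pick_python_executable(compiled: bool, venv_python: Optional[str]) -> str:
    """Choose an interpreter path for subprocess calls (illustrative only)."""
    if not compiled:
        # A normal interpreter can simply re-invoke itself.
        return sys.executable
    if venv_python:
        # A .venv/venv interpreter found near the cwd takes priority.
        return venv_python
    for name in ("python3", "python"):
        system_python = shutil.which(name)
        if system_python:
            return system_python
    # Last resort: the (possibly unusable) path of the running binary.
    return sys.executable
```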
Fixes NameError where _find_python_executable was called during module import before being defined. Helper functions must be defined before the Compat class that uses them.
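The ordering problem can be reproduced in isolation. In this sketch the two module sources are hypothetical stand-ins for compat.py, and placing SAFE_SYS_EXECUTABLE as a class attribute is an assumption; the point is only that a class body runs at import time, so any helper it calls must already be defined.

```python
BROKEN_MODULE = '''
class Compat:
    # The class body executes during import, before the def below has run.
    SAFE_SYS_EXECUTABLE = _find_python_executable()

def _find_python_executable():
    return "python3"
'''

FIXED_MODULE = '''
def _find_python_executable():
    return "python3"

class Compat:
    SAFE_SYS_EXECUTABLE = _find_python_executable()
'''


def try_import(source: str) -> str:
    """Execute module source the way an import would and report the outcome."""
    namespace = {}
    try:
        exec(source, namespace)
    except NameError as exc:
        return f"NameError: {exc}"
    return namespace["Compat"].SAFE_SYS_EXECUTABLE
```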
…binaries
Instead of embedding the script as a string constant, use importlib.resources to read pytest_new_process_discovery.py from package resources. This allows the file to be included in compiled binaries while keeping the source clean. When running as a compiled binary, the script is extracted from resources and written to a temp directory before being executed.
Replace importlib.resources approach with direct string embedding of pytest_new_process_discovery.py content. This ensures the script is compiled into the binary and can be written to a temp file at runtime without needing to access package resources.
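A rough sketch of the string-embedding approach, assuming the script is written to a fresh temp directory on demand. DISCOVERY_SCRIPT holds a placeholder body, not the real file content, and materialize_script is a hypothetical helper name.

```python
import tempfile
from pathlib import Path

# Placeholder; the real pytest_new_process_discovery.py content is not shown here.
DISCOVERY_SCRIPT = 'print("discovery placeholder")\n'


def materialize_script(content: str, filename: str = "pytest_new_process_discovery.py") -> Path:
    """Write an embedded script to a new temp directory and return its path."""
    tmp_dir = Path(tempfile.mkdtemp(prefix="embedded_script_"))
    script_path = tmp_dir / filename
    script_path.write_text(content, encoding="utf-8")
    return script_path
```

Because the constant is ordinary module data, it is compiled into the binary along with the module, and the returned path can be handed to a subprocess.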
Add detailed warning logs and error handling around all Jedi operations to diagnose typeshed access issues in Nuitka onefile binaries:
- Log InterpreterEnvironment creation details and typeshed paths
- Check if typeshed/stdlib directories exist and list contents
- Wrap all name.type and definition.type accesses in try-except
- Log Jedi environment being used in all Script operations
- Convert logger.exception to logger.warning with exc_info for visibility
This will help identify why Jedi cannot find typeshed files in the temporary extraction directory of onefile binaries.
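The try-except wrapping and the logger.warning change might look roughly like this. safe_type and the logger name are hypothetical; the only library behavior relied on is that logger.warning(..., exc_info=True) attaches the active traceback at WARNING level, whereas logger.exception logs at ERROR level.

```python
import logging

logger = logging.getLogger("jedi_diagnostics")  # hypothetical logger name


def safe_type(name):
    """Return name.type, or None if reading the attribute raises."""
    try:
        return name.type
    except Exception:
        # exc_info=True keeps the traceback while logging at WARNING level.
        logger.warning("Could not read .type from %r", name, exc_info=True)
        return None
```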
Centralize executable finding logic in compat.py with new find_executable_in_venv() function. This enables formatters like black to be found in venv directories when running as a Nuitka compiled binary, preventing "formatter not found" errors.
Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Embed plugin as standalone module to avoid conflicts with installed codeflash packages in target venv.
A couple of questions
locally and linux / windows servers
I'm not sure what that will look like yet, I need to see exactly how omni is doing things before moving on, one step at a time.
for parent in [current_dir, *current_dir.parents]:
    for venv_name in venv_names:
        venv_dir = parent / venv_name
        if venv_dir.is_dir():
            bin_dir = venv_dir / ("bin" if os.name != "nt" else "Scripts")
            for exe_name in exe_names:
                exe_path = bin_dir / exe_name
                if exe_path.is_file():
                    return str(exe_path)
⚡️Codeflash found 47% (0.47x) speedup for _find_python_executable in codeflash/code_utils/compat.py
⏱️ Runtime : 450 microseconds → 305 microseconds (best of 1 runs)
📝 Explanation and details
The optimized code achieves a 47% speedup by replacing pathlib.Path operations with lower-level os.path functions in the find_executable_in_venv function.
Key Optimization
Path Object Creation → String Operations
The original code creates multiple Path objects during directory traversal:
- parent / venv_name creates a new Path object
- venv_dir / "bin" creates another Path object
- bin_dir / exe_name creates yet another Path object
- Each .is_dir() and .is_file() call involves Path object overhead
The optimized version uses os.path.join(), os.path.isdir(), and os.path.isfile() which operate directly on strings, avoiding the object allocation and method dispatch overhead of Path objects.
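A small, self-contained way to compare the two styles on one's own machine. The helper names are hypothetical, and this micro-benchmark is not how the figures in this comment were produced; actual timings will vary by platform and filesystem.

```python
import os
import timeit
from pathlib import Path


def check_with_pathlib(parent: Path, names) -> list:
    """Build a Path per candidate and test it, as the original loop does."""
    return [(parent / name).is_dir() for name in names]


def check_with_os_path(parent: str, names) -> list:
    """Join plain strings and test them, as the optimized loop does."""
    return [os.path.isdir(os.path.join(parent, name)) for name in names]


if __name__ == "__main__":
    cwd = Path.cwd()
    names = (".venv", "venv")
    t_pathlib = timeit.timeit(lambda: check_with_pathlib(cwd, names), number=10_000)
    t_os_path = timeit.timeit(lambda: check_with_os_path(os.fspath(cwd), names), number=10_000)
    print(f"pathlib: {t_pathlib:.4f}s  os.path: {t_os_path:.4f}s")
```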
Why This Works
Looking at the line profiler data, the hottest lines in the original code are:
- Path concatenations (parent / venv_name, bin_dir / exe_name): ~34% of total time
- Directory/file checks (is_dir(), is_file()): ~34% of total time
By eliminating Path object creation in the loop (which runs ~112-159 times based on profiler hits), we reduce memory allocations and method call overhead significantly.
Iteration Strategy Change
The optimization also changes from for parent in [current_dir, *current_dir.parents] (which pre-allocates the entire parent list) to a while True loop with manual parent traversal using os.path.dirname(). This:
- Avoids creating the full parents list upfront
- Enables early exit once the venv is found (same behavior, but lighter memory footprint)
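The manual parent traversal can be sketched as a generator. iter_ancestors is a hypothetical name; the loop relies on os.path.dirname returning its input unchanged once the filesystem root is reached.

```python
import os
from pathlib import Path
from typing import Iterator


def iter_ancestors(start: str) -> Iterator[str]:
    """Yield start, then each parent directory, stopping at the root."""
    current = start
    while True:
        yield current
        parent = os.path.dirname(current)
        if parent == current:  # dirname of the root is the root itself
            return
        current = parent
```

For an absolute path this visits the same directories, in the same order, as [current_dir, *current_dir.parents], but produces them one at a time as strings.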
Test Case Performance
The optimization shows consistent gains across different scenarios:
- Simple venv in cwd: 48% faster (50.1μs → 33.8μs)
- Deep parent traversal: 72% faster (121μs → 70.7μs) - especially beneficial when venv is in ancestor directories
- Fallback scenarios: 37-44% faster even when no venv is found
Impact on Workloads
Based on function_references, this function is called by get_safe_sys_executable(), which appears to be used during initialization or when determining the Python executable to use. While not in a tight loop, the ~47% speedup reduces startup/initialization overhead, particularly beneficial when:
- Running in compiled/bundled binary mode where venv discovery happens
- Deep directory structures require many parent traversals (test shows 72% speedup for deep hierarchies)
✅ Correctness verification report:
| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 200 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Click to see Generated Regression Tests
import os  # used to inspect/create venv-like dirs and to restore os.name if changed
import shutil  # used and monkeypatched for system python lookup behavior
import stat  # for setting executable bit if desired (not required by function)
import sys  # used for sys.executable and to simulate frozen attributes

# import the real module that contains the function under test
from codeflash.code_utils import compat

# We will test the real _find_python_executable function from the compat module.
# The function relies on is_compiled_or_bundled_binary (which is cached) and on
# find_executable_in_venv, shutil.which, os.name, and the current working directory.
# Tests will manipulate these dependencies through pytest fixtures (monkeypatch, tmp_path)
# and will ALWAYS call the real function under test (no mocking/stubbing of it).


# Helper to ensure the lru_cache on is_compiled_or_bundled_binary is cleared between tests.
def _clear_compiled_cache():
    # The function is decorated with lru_cache(maxsize=1); clear its cache before changing state.
    compat.is_compiled_or_bundled_binary.cache_clear()


def _ensure_no_compiled_flag(monkeypatch):
    """Ensure the compat module does not report compiled/bundled binary.
    This is done by removing any compiled indicators from runtime:
    - remove sys.frozen if present on compat check
    - remove sys._MEIPASS if present
    - remove __compiled__ from compat module globals if present
    Then clear the cached result.
    """
    # Remove attributes if they exist (monkeypatch will restore them at test end).
    monkeypatch.delattr(sys, "frozen", raising=False)
    monkeypatch.delattr(sys, "_MEIPASS", raising=False)
    # Remove module-level __compiled__ if present
    monkeypatch.delattr(compat, "__compiled__", raising=False)
    _clear_compiled_cache()


def _set_compiled_flag(monkeypatch):
    """Make compat.is_compiled_or_bundled_binary() return True by setting one of the indicators.
    We set sys._MEIPASS (non-raising so that attribute can be created).
    Then clear the cached result.
    """
    monkeypatch.setattr(sys, "_MEIPASS", "1", raising=False)
    _clear_compiled_cache()
def test_not_compiled_returns_sys_executable(monkeypatch, tmp_path):
    # Ensure no compiled indicator exists and cache is cleared.
    _ensure_no_compiled_flag(monkeypatch)
    # Change cwd to a temporary directory that does NOT contain a venv.
    monkeypatch.chdir(tmp_path)
    # Call the function; since not compiled it should immediately return sys.executable.
    codeflash_output = compat._find_python_executable()
    result = codeflash_output  # 4.60μs -> 4.60μs (0.022% faster)


def test_compiled_prefers_venv_in_cwd(monkeypatch, tmp_path):
    # Simulate compiled/bundled binary environment.
    _set_compiled_flag(monkeypatch)
    # Create a .venv directory in the current working directory with a bin/python3 file.
    venv_dir = tmp_path / ".venv" / "bin"
    venv_dir.mkdir(parents=True)
    python3_path = venv_dir / "python3"
    python3_path.write_text("#!/usr/bin/env python3\n")  # content doesn't matter
    # Ensure it's a file (is_file() should be True). Executable permission is not required by function.
    python3_path.chmod(python3_path.stat().st_mode | stat.S_IXUSR)
    # Set current working directory to the temp path where .venv lives.
    monkeypatch.chdir(tmp_path)
    # The function should detect the venv python3 and return its path as a string.
    codeflash_output = compat._find_python_executable()
    result = codeflash_output  # 50.1μs -> 33.8μs (48.1% faster)


def test_compiled_finds_venv_in_parent(monkeypatch, tmp_path):
    # Simulate compiled/bundled binary environment.
    _set_compiled_flag(monkeypatch)
    # Create a parent directory that holds "venv/bin/python3"
    parent = tmp_path / "ancestor"
    venv_dir = parent / "venv" / "bin"
    venv_dir.mkdir(parents=True)
    venv_python = venv_dir / "python3"
    venv_python.write_text("print('hello')")
    # Create a nested child path and set cwd to it so parent traversal is required.
    child = parent / "child" / "grandchild"
    child.mkdir(parents=True)
    monkeypatch.chdir(child)
    # The function should search up parents and find the venv/python3 in the ancestor.
    codeflash_output = compat._find_python_executable()
    result = codeflash_output  # 121μs -> 70.7μs (72.3% faster)


def test_compiled_venv_with_python_not_python3(monkeypatch, tmp_path):
    # Simulate compiled environment.
    _set_compiled_flag(monkeypatch)
    # Create a venv containing only 'python' (but not 'python3') to test fallback name order.
    venv_dir = tmp_path / "venv" / "bin"
    venv_dir.mkdir(parents=True)
    python_path = venv_dir / "python"
    python_path.write_text("# python fallback executable")
    # Set cwd to the directory containing the venv.
    monkeypatch.chdir(tmp_path)
    # Ensure that the result is the venv's 'python' (since 'python3' is absent but 'python' exists).
    codeflash_output = compat._find_python_executable()
    result = codeflash_output  # 76.0μs -> 55.8μs (36.1% faster)
def test_compiled_falls_back_to_shutil_which_sequence(monkeypatch, tmp_path):
    # Simulate compiled environment.
    _set_compiled_flag(monkeypatch)
    # Ensure no venv exists in cwd or parents.
    monkeypatch.chdir(tmp_path)

    # Monkeypatch shutil.which to behave differently for "python3" vs "python".
    def which_mock(name):
        # Simulate that "python3" is not present but "python" is present.
        if name == "python3":
            return None
        if name == "python":
            return "/usr/bin/fake-python"
        return None

    monkeypatch.setattr(shutil, "which", which_mock)
    # The function should call find_executable_in_venv (none found), then call shutil.which
    # first for "python3" (None) and then for "python" and return that value.
    codeflash_output = compat._find_python_executable()
    result = codeflash_output  # 101μs -> 70.8μs (44.1% faster)


def test_compiled_falls_back_to_sys_executable_when_no_system_python(monkeypatch, tmp_path):
    # Simulate compiled environment.
    _set_compiled_flag(monkeypatch)
    # No venv found (cwd is empty) and shutil.which will return None for all checks.
    monkeypatch.chdir(tmp_path)
    monkeypatch.setattr(shutil, "which", lambda name: None)
    # With no venv and no system python detected by shutil.which, function returns sys.executable.
    codeflash_output = compat._find_python_executable()
    result = codeflash_output  # 95.7μs -> 69.6μs (37.5% faster)


def test_windows_named_executable_and_scripts_dir(monkeypatch, tmp_path):
    # Simulate Windows by patching os.name to 'nt' for this test. Use raising=False to allow change.
    monkeypatch.setattr(os, "name", "nt", raising=False)
    # Simulate compiled environment.
    _set_compiled_flag(monkeypatch)
    # On Windows, the function looks for venv/Scripts/python.exe
    venv_scripts = tmp_path / "venv" / "Scripts"
    venv_scripts.mkdir(parents=True)
    win_python = venv_scripts / "python.exe"
    win_python.write_text("windows python exe content")
    # Mark as executable (not required by function, but realistic).
    win_python.chmod(win_python.stat().st_mode | stat.S_IXUSR)
    # Change cwd to tmp_path so the venv is visible.
    monkeypatch.chdir(tmp_path)
    # The function should detect the windows-style python executable and return it.
    codeflash_output = compat._find_python_executable()
    result = codeflash_output
def test_deep_parent_search_large_scale(monkeypatch, tmp_path):
    # Large scale scenario: create a deep directory tree and place venv near top to ensure traversal scales.
    # Depth chosen to be large but under 1000 as requested (we pick 150).
    depth = 150
    base = tmp_path
    current = base
    # Build a deep nested path
    for i in range(depth):
        current = current / f"level_{i}"
        current.mkdir()
    # Place venv in a higher-level ancestor (not immediate parent) to ensure many parent iterations.
    # Choose ancestor at level 10 so traversal will ascend many steps.
    ancestor = (
        base
        / "level_0"
        / "level_1"
        / "level_2"
        / "level_3"
        / "level_4"
        / "level_5"
        / "level_6"
        / "level_7"
        / "level_8"
        / "level_9"
    )
    venv_bin = ancestor / "venv" / "bin"
    venv_bin.mkdir(parents=True)
    deep_venv_python = venv_bin / "python3"
    deep_venv_python.write_text("deep venv python3")
    # Simulate compiled environment and set cwd to the deepest nested path.
    _set_compiled_flag(monkeypatch)
    monkeypatch.chdir(current)
    # The function should walk up the directory tree and find the venv placed in the ancestor.
    codeflash_output = compat._find_python_executable()
    result = codeflash_output
    # codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

from unittest.mock import patch
from codeflash.code_utils.compat import _find_python_executable


class TestBasicFunctionality:
    """Basic test cases for _find_python_executable under normal conditions."""

    def test_returns_string(self):
        """Verify that _find_python_executable returns a string."""
        codeflash_output = _find_python_executable()
        result = codeflash_output

    def test_returns_non_empty_string(self):
        """Verify that _find_python_executable returns a non-empty string."""
        codeflash_output = _find_python_executable()
        result = codeflash_output

    def test_normal_execution_returns_sys_executable(self):
        """When not compiled/bundled, should return sys.executable."""
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=False):
            codeflash_output = _find_python_executable()
            result = codeflash_output

    def test_returns_path_like_string(self):
        """Verify that the returned string looks like a valid file path."""
        codeflash_output = _find_python_executable()
        result = codeflash_output

    def test_executable_contains_python_reference(self):
        """Verify that returned executable name references Python."""
        codeflash_output = _find_python_executable()
        result = codeflash_output
        result_lower = result.lower()
class TestEdgeCases:
    """Edge case test cases for _find_python_executable."""

    def test_when_sys_executable_is_sys_executable(self):
        """Verify behavior when sys.executable points to current interpreter."""
        codeflash_output = _find_python_executable()
        original_result = codeflash_output
        # Under normal conditions (not compiled), should match sys.executable
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=False):
            codeflash_output = _find_python_executable()
            result = codeflash_output

    def test_compiled_binary_no_venv_no_system_python(self):
        """When compiled, no venv found, no system python found."""
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=True):
            with patch("codeflash.code_utils.compat.find_executable_in_venv", return_value=None):
                with patch("shutil.which", return_value=None):
                    codeflash_output = _find_python_executable()
                    result = codeflash_output

    def test_compiled_binary_venv_found(self):
        """When compiled and venv python is found."""
        expected_venv_python = "/path/to/venv/bin/python3"
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=True):
            with patch("codeflash.code_utils.compat.find_executable_in_venv", return_value=expected_venv_python):
                codeflash_output = _find_python_executable()
                result = codeflash_output

    def test_compiled_binary_venv_not_found_system_found(self):
        """When compiled, venv not found, system python found."""
        expected_system_python = "/usr/bin/python3"
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=True):
            with patch("codeflash.code_utils.compat.find_executable_in_venv", return_value=None):
                with patch("shutil.which", return_value=expected_system_python):
                    codeflash_output = _find_python_executable()
                    result = codeflash_output

    def test_windows_python_executable(self):
        """Test behavior on Windows-like paths."""
        windows_python = "C:\\Python39\\python.exe"
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=True):
            with patch("codeflash.code_utils.compat.find_executable_in_venv", return_value=windows_python):
                codeflash_output = _find_python_executable()
                result = codeflash_output

    def test_relative_path_python(self):
        """Test behavior with relative paths."""
        relative_python = "./venv/bin/python"
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=True):
            with patch("codeflash.code_utils.compat.find_executable_in_venv", return_value=relative_python):
                codeflash_output = _find_python_executable()
                result = codeflash_output

    def test_venv_priority_over_system(self):
        """Verify venv python is prioritized over system python."""
        venv_python = "/home/user/.venv/bin/python"
        system_python = "/usr/bin/python3"
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=True):
            with patch("codeflash.code_utils.compat.find_executable_in_venv", return_value=venv_python):
                with patch("shutil.which", return_value=system_python):
                    codeflash_output = _find_python_executable()
                    result = codeflash_output

    def test_empty_string_not_returned(self):
        """Verify that empty string is never returned."""
        codeflash_output = _find_python_executable()
        result = codeflash_output

    def test_none_not_returned(self):
        """Verify that None is never returned."""
        codeflash_output = _find_python_executable()
        result = codeflash_output

    def test_multiple_calls_consistent(self):
        """Verify that multiple calls return consistent results."""
        # Note: This may fail if system state changes, but should be consistent in unit test
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=False):
            codeflash_output = _find_python_executable()
            result1 = codeflash_output
            codeflash_output = _find_python_executable()
            result2 = codeflash_output

    def test_python3_preferred_on_unix(self):
        """On Unix-like systems, python3 should be checked before python."""
        # This tests the order of names tried
        python3_path = "/usr/bin/python3"
        python_path = "/usr/bin/python"
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=True):
            with patch("codeflash.code_utils.compat.find_executable_in_venv", return_value=None):
                # Simulate which() being called multiple times
                call_count = [0]

                def which_side_effect(name):
                    call_count[0] += 1
                    if name == "python3":
                        return python3_path
                    if name == "python":
                        return python_path
                    return None

                with patch("shutil.which", side_effect=which_side_effect), patch("os.name", "posix"):
                    codeflash_output = _find_python_executable()
                    result = codeflash_output
class TestLargeScale:
    """Large scale test cases for performance and scalability."""

    def test_repeated_calls_performance(self):
        """Test that repeated calls maintain performance (no degradation)."""
        import time

        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=False):
            # First call
            start_time = time.perf_counter()
            for _ in range(100):
                _find_python_executable()
            elapsed_time = time.perf_counter() - start_time

    def test_compiled_binary_multiple_calls_consistency(self):
        """Test that compiled binary mode works consistently across multiple calls."""
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=True):
            with patch("codeflash.code_utils.compat.find_executable_in_venv", return_value="/venv/python"):
                results = []
                for _ in range(50):
                    codeflash_output = _find_python_executable()
                    result = codeflash_output
                    results.append(result)

    def test_large_number_of_different_paths(self):
        """Test behavior when system has many possible paths."""
        # Simulate finding different executables in rotation
        paths = [f"/usr/bin/python{i}" for i in range(100)]
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=True):
            with patch("codeflash.code_utils.compat.find_executable_in_venv", return_value=None):
                for path in paths[:10]:  # Test with first 10 paths
                    with patch("shutil.which", return_value=path):
                        codeflash_output = _find_python_executable()
                        result = codeflash_output

    def test_path_string_length_handling(self):
        """Test handling of very long path strings."""
        long_path = "/very/long/path/" + "a" * 500 + "/python"
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=True):
            with patch("codeflash.code_utils.compat.find_executable_in_venv", return_value=long_path):
                codeflash_output = _find_python_executable()
                result = codeflash_output

    def test_unicode_in_paths(self):
        """Test handling of Unicode characters in paths."""
        unicode_path = "/home/user/ПрОвЕрКа/python"
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=True):
            with patch("codeflash.code_utils.compat.find_executable_in_venv", return_value=unicode_path):
                codeflash_output = _find_python_executable()
                result = codeflash_output

    def test_special_characters_in_paths(self):
        """Test handling of special characters in paths."""
        special_paths = [
            "/path/with spaces/python",
            "/path/with-dashes/python",
            "/path/with_underscores/python",
            "/path/with.dots/python",
        ]
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=True):
            for special_path in special_paths:
                with patch("codeflash.code_utils.compat.find_executable_in_venv", return_value=special_path):
                    codeflash_output = _find_python_executable()
                    result = codeflash_output

    def test_sequential_fallback_logic(self):
        """Test that fallback logic is applied sequentially."""
        venv_python = "/venv/python"
        system_python = "/usr/bin/python"
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=True):
            # Case 1: venv found first
            with patch("codeflash.code_utils.compat.find_executable_in_venv", return_value=venv_python):
                with patch("shutil.which", return_value=system_python):
                    codeflash_output = _find_python_executable()
                    result = codeflash_output
            # Case 2: venv not found, system found
            with patch("codeflash.code_utils.compat.find_executable_in_venv", return_value=None):
                with patch("shutil.which", return_value=system_python):
                    codeflash_output = _find_python_executable()
                    result = codeflash_output
            # Case 3: neither found
            with patch("codeflash.code_utils.compat.find_executable_in_venv", return_value=None):
                with patch("shutil.which", return_value=None):
                    codeflash_output = _find_python_executable()
                    result = codeflash_output
class TestIntegration:
    """Integration tests combining multiple aspects."""

    def test_actual_sys_executable_fallback_works(self):
        """Verify actual sys.executable is always a valid fallback."""
        # This is a real test without mocks
        codeflash_output = _find_python_executable()
        result = codeflash_output
        # When not compiled, should return sys.executable
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=False):
            codeflash_output = _find_python_executable()
            result = codeflash_output

    def test_result_contains_executable_name(self):
        """Verify result contains some form of Python executable reference."""
        codeflash_output = _find_python_executable()
        result = codeflash_output
        result_lower = result.lower()

    def test_callable_without_arguments(self):
        """Verify function can be called without any arguments."""
        # Should not raise any exception
        codeflash_output = _find_python_executable()
        result = codeflash_output

    def test_return_type_consistency(self):
        """Verify return type is always string."""
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=True):
            with patch("codeflash.code_utils.compat.find_executable_in_venv", return_value="/venv/python"):
                codeflash_output = _find_python_executable()
                result1 = codeflash_output
        with patch("codeflash.code_utils.compat.is_compiled_or_bundled_binary", return_value=False):
            codeflash_output = _find_python_executable()
            result2 = codeflash_output

from codeflash.code_utils.compat import _find_python_executable


def test__find_python_executable():
    _find_python_executable()

To test or edit this optimization locally git merge codeflash/optimize-pr1140-2026-01-22T23.56.38
Click to see suggested changes
-for parent in [current_dir, *current_dir.parents]:
-    for venv_name in venv_names:
-        venv_dir = parent / venv_name
-        if venv_dir.is_dir():
-            bin_dir = venv_dir / ("bin" if os.name != "nt" else "Scripts")
-            for exe_name in exe_names:
-                exe_path = bin_dir / exe_name
-                if exe_path.is_file():
-                    return str(exe_path)
+parent_path = os.fspath(current_dir)
+while True:
+    for venv_name in venv_names:
+        venv_dir = os.path.join(parent_path, venv_name)
+        if os.path.isdir(venv_dir):
+            bin_dir = os.path.join(venv_dir, ("bin" if os.name != "nt" else "Scripts"))
+            for exe_name in exe_names:
+                exe_path = os.path.join(bin_dir, exe_name)
+                if os.path.isfile(exe_path):
+                    return exe_path
+    new_parent = os.path.dirname(parent_path)
+    if new_parent == parent_path:
+        break
+    parent_path = new_parent