Skip to content

Conversation

@aseembits93
Copy link
Contributor

@aseembits93 aseembits93 commented Oct 9, 2025

PR Type

Enhancement


Description

  • Add caching to config discovery functions

  • Cache pyproject lookup with functools.cache

  • Cache conftest discovery for test paths


Diagram Walkthrough

flowchart LR
  A["config_parser.py"] -- "add @cache" --> B["find_pyproject_toml()"]
  A -- "add @cache" --> C["find_conftest_files()"]
  B -- "memoized results" --> D["faster pyproject lookup"]
  C -- "memoized results" --> E["faster conftest discovery"]
Loading

File Walkthrough

Relevant files
Enhancement
config_parser.py
Memoize config and test discovery functions                           

codeflash/code_utils/config_parser.py

  • Import functools.cache decorator.
  • Add @cache to find_pyproject_toml.
  • Add @cache to find_conftest_files.
  • Improve performance by memoizing results.
+3/-0     

@github-actions
Copy link

github-actions bot commented Oct 9, 2025

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Cache Invalidation

Using a global @cache on filesystem discovery results can become stale if the project structure changes (e.g., pyproject.toml added/moved, new conftest files) during the process lifetime. Confirm that long-lived processes won't need fresh results or consider cache clear hooks or a TTL approach.

@cache
def find_pyproject_toml(config_file: Path | None = None) -> Path:
    # Find the pyproject.toml file on the root of the project

    if config_file is not None:
        config_file = Path(config_file)
        if config_file.suffix.lower() != ".toml":
            msg = f"Config file {config_file} is not a valid toml file. Please recheck the path to pyproject.toml"
            raise ValueError(msg)
        if not config_file.exists():
            msg = f"Config file {config_file} does not exist. Please recheck the path to pyproject.toml"
            raise ValueError(msg)
        return config_file
    dir_path = Path.cwd()

    while dir_path != dir_path.parent:
        config_file = dir_path / "pyproject.toml"
        if config_file.exists():
            return config_file
        # Search for pyproject.toml in the parent directories
        dir_path = dir_path.parent
    msg = f"Could not find pyproject.toml in the current directory {Path.cwd()} or any of the parent directories. Please create it by running `codeflash init`, or pass the path to pyproject.toml with the --config-file argument."

    raise ValueError(msg)


@cache
def find_conftest_files(test_paths: list[Path]) -> list[Path]:
    list_of_conftest_files = set()
    for test_path in test_paths:
        # Find the conftest file on the root of the project
Argument Hashability

@cache requires hashable arguments. The function find_conftest_files takes a list[Path], which is unhashable and will raise a TypeError at first call. Consider changing the signature to accept a tuple[Path, ...] or converting to a tuple within the function or switching to lru_cache with a key transformation.

@cache
def find_conftest_files(test_paths: list[Path]) -> list[Path]:
    list_of_conftest_files = set()
    for test_path in test_paths:
        # Find the conftest file on the root of the project

@github-actions
Copy link

github-actions bot commented Oct 9, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix unhashable list caching

Using @cache on a function with a list argument will raise TypeError because lists
are unhashable. Replace the decorator with an explicit cache keyed by an immutable
representation (e.g., a tuple of resolved paths) to safely memoize results.

codeflash/code_utils/config_parser.py [36-39]

-@cache
+_CONFTEST_CACHE: dict[tuple[Path, ...], list[Path]] = {}
+
 def find_conftest_files(test_paths: list[Path]) -> list[Path]:
+    key = tuple(p.resolve() for p in test_paths)
+    if key in _CONFTEST_CACHE:
+        return _CONFTEST_CACHE[key]
+
     list_of_conftest_files = set()
     for test_path in test_paths:
+        # Find the conftest file on the root of the project
+        ...
 
+    result = list(list_of_conftest_files)
+    _CONFTEST_CACHE[key] = result
+    return result
+
Suggestion importance[1-10]: 9

__

Why: Accurate and important: @cache on a function taking list[Path] will raise TypeError because lists are unhashable. The explicit tuple-based cache is a correct fix with high impact on correctness.

High
Prevent stale or incorrect cache keying

Avoid caching based on unhashable or mutable arguments. functools.cache requires all
args to be hashable; passing a Path is fine, but passing None can cause a single
cached result that ignores CWD changes across calls. Convert the input to a stable
cache key (absolute path or a sentinel) and implement a manual cache to avoid stale
results.

codeflash/code_utils/config_parser.py [10-11]

-@cache
+_PYPROJECT_CACHE: dict[Path | str, Path] = {}
+
 def find_pyproject_toml(config_file: Path | None = None) -> Path:
+    key: Path | str
+    if config_file is None:
+        # Use CWD-resolved absolute path as cache key to avoid stale cache across directories
+        key = Path.cwd().resolve()
+    else:
+        key = config_file.resolve()
 
+    if key in _PYPROJECT_CACHE:
+        return _PYPROJECT_CACHE[key]
+
+    # Find the pyproject.toml file on the root of the project
+    if config_file is not None:
+        if config_file.is_dir():
+            config_file = config_file / "pyproject.toml"
+        if config_file.exists():
+            _PYPROJECT_CACHE[key] = config_file
+            return config_file
+
+    dir_path = Path.cwd()
+    while dir_path != dir_path.parent:
+        config_file = dir_path / "pyproject.toml"
+        if config_file.exists():
+            _PYPROJECT_CACHE[key] = config_file
+            return config_file
+        dir_path = dir_path.parent
+
+    msg = f"Could not find pyproject.toml in the current directory {Path.cwd()} or any of the parent directories. Please create it by running `codeflash init`, or pass the path to pyproject.toml with the --config-file argument."
+    raise ValueError(msg)
+
Suggestion importance[1-10]: 7

__

Why: Correctly flags that blindly using @cache on find_pyproject_toml can lead to stale results when CWD changes; the proposed keying approach is reasonable. Impact is moderate since Path is hashable, but environment-dependent staleness is a valid concern.

Medium

import tomlkit


@cache
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mohammedahmed18 could caching this be a problem wrt the vs code extension, when we are doing the codeflash init flow?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in codeflash/lsp/beta.py there's _find_pyproject_toml that parses for pyproject.toml differently, should be fine

@aseembits93
Copy link
Contributor Author

it's still in progress, there are some edge cases to consider at the moment

import tomlkit

PYPROJECT_TOML_CACHE = {}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how many pyproject.toml's are we expecting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at most 2-3 guessing

@KRRT7
Copy link
Contributor

KRRT7 commented Oct 14, 2025

there are some merge conflicts @aseembits93

@aseembits93 aseembits93 requested a review from KRRT7 October 14, 2025 05:03
@KRRT7
Copy link
Contributor

KRRT7 commented Oct 14, 2025

we're looking for different config types repeatedly as well no? for addopts, I think same sort of caching could be extended to avoid searching for the same stuff repeatedly

@aseembits93
Copy link
Contributor Author

@aseembits93 aseembits93 merged commit 491e616 into main Oct 14, 2025
20 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants