feat: add cp2k support for skipping bad box by SchrodingersCattt · Pull Request #1830 · deepmodeling/dpgen

SchrodingersCattt · 2025-10-26T04:26:30Z

Summary by CodeRabbit

New Features
- Automatically detects and skips configurations with invalid box geometry during processing.
Improvements
- Adds end-of-run summary reporting the number of skipped and remaining configurations.
- Ensures post-processing output files remain linked and available after run.
Bug Fixes
- Removes redundant file-handling operations to improve reliability and resource cleanup.

coderabbitai · 2025-10-26T04:27:14Z

📝 Walkthrough

Walkthrough

Adds an fp_skip_bad_box option to make_fp_cp2k that validates each task POSCAR with check_bad_box, skips and counts bad configurations, logs a skip summary after processing, removes redundant fp.close() calls, and continues linking PP files for processed tasks.

Changes

Cohort / File(s)	Summary
Skip bad box logic in FP generation `dpgen/generator/run.py`	Adds conditional `fp_skip_bad_box` handling in `make_fp_cp2k`: validates task POSCAR via `check_bad_box`, skips and counts bad boxes (`count_bad_box`), logs a summary of skipped vs remaining configurations, removes redundant `fp.close()` calls, and retains/extends PP-file linking via `_link_fp_vasp_pp`.

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant make_fp_cp2k as make_fp_cp2k()
    participant check_bad_box
    participant _link_fp_vasp_pp
    participant Logger

    Caller->>make_fp_cp2k: invoke (may include fp_skip_bad_box)
    make_fp_cp2k->>make_fp_cp2k: iterate tasks
    rect rgba(200,220,255,0.6)
        note right of make_fp_cp2k: Per-task validation when fp_skip_bad_box set
        make_fp_cp2k->>check_bad_box: validate POSCAR
        alt bad box detected
            check_bad_box-->>make_fp_cp2k: True
            make_fp_cp2k->>make_fp_cp2k: increment count_bad_box and skip task
        else valid box
            check_bad_box-->>make_fp_cp2k: False
            make_fp_cp2k->>make_fp_cp2k: generate inputs (input.inp, coord.xyz)
            make_fp_cp2k->>_link_fp_vasp_pp: link PP files for task
        end
    end
    make_fp_cp2k->>Logger: log skip summary (skipped vs remaining)
    make_fp_cp2k-->>Caller: return

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Verify check_bad_box integration and handling of missing/malformed POSCAR files.
Confirm count_bad_box initialization, increments, and final summary correctness.
Ensure redundant fp.close() removals do not leak resources and file writes remain correct.
Validate _link_fp_vasp_pp is invoked appropriately for non-skipped tasks.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The pull request title "feat: add cp2k support for skipping bad box" is clear, specific, and directly aligned with the main changes in the changeset. The summary indicates that the core modification involves adding a skip_bad_box mechanism in the make_fp_cp2k function, which is precisely what the title conveys. The title is concise and avoids vague language or unnecessary noise, and a teammate scanning the repository history would immediately understand that this PR introduces functionality for handling invalid geometries in CP2K calculations.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 54f039c and b228742.

📒 Files selected for processing (1)

dpgen/generator/run.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (2)

dpgen/{generator,auto_test,data,simplify}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Place core logic changes in the appropriate module directories: dpgen/generator, dpgen/auto_test, dpgen/data, dpgen/simplify

Files:

dpgen/generator/run.py

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Adhere to ruff linting for Python code as configured in the repository
Use isort with the Black profile for import ordering in Python files
Write Numpy-style docstrings for Python code
Ensure code remains compatible with Python 3.9–3.12 (as per CI matrix)

Files:

dpgen/generator/run.py

🧬 Code graph analysis (1)

dpgen/generator/run.py (1)

dpgen/generator/lib/cp2k.py (3)

make_cp2k_input (286-322)

make_cp2k_input_from_external (340-369)

make_cp2k_xyz (325-337)

🔇 Additional comments (2)

dpgen/generator/run.py (2)

3774-3791: Previous review concerns have been addressed correctly.

The implementation now properly handles the bad box detection by removing the entire task directory with shutil.rmtree(ii) (line 3790) when a bad configuration is detected. This resolves the previous concern about leaving task directories in an inconsistent state.

The implementation correctly:

Changes back to cwd before removing the directory (line 3789)

Tracks the number of skipped tasks (line 3788)

Continues to the next iteration without creating CP2K input files (line 3791)

3806-3812: PP file linking works correctly with skipped tasks.

The previous concern about _link_fp_vasp_pp linking PP files to skipped tasks is resolved. Since skipped task directories are removed entirely (line 3790), when _link_fp_vasp_pp performs its own glob.glob(os.path.join(work_path, "task.*")) (line 3351), it will only find and link to the remaining valid task directories.

The skip summary logging (lines 3806-3809) correctly reports the number of skipped and remaining tasks.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Signed-off-by: Ming-Yu Guo <gmy721212@163.com>

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

dpgen/generator/run.py (1)
3658-3670: Move bad box check before reading sys_data for better performance.

Currently, sys_data = dpdata.System("POSCAR").data is read for all tasks (line 3660), but if skip_bad_box is set and the box is bad, sys_data is never used. The bad box check happens afterward (lines 3663-3670).

For efficiency, consider checking the bad box first, then only reading sys_data if the task won't be skipped:
 for ii in fp_tasks:
     os.chdir(ii)
+    
+    if skip_bad_box is not None:
+        # Check the box directly from POSCAR
+        skip = check_bad_box("POSCAR", skip_bad_box, fmt="vasp/poscar")
+        if skip:
+            count_bad_box += 1
+            os.chdir(cwd)
+            continue
+    
     sys_data = dpdata.System("POSCAR").data
     # make input for every task
     # if fp_params exits, make keys
-    if skip_bad_box is not None:
-        # Check the box directly from POSCAR
-        skip = check_bad_box("POSCAR", skip_bad_box, fmt="vasp/poscar")
-        if skip:
-            count_bad_box += 1
-            os.chdir(cwd)
-            continue
-
     if fp_params:
This avoids unnecessary POSCAR parsing for tasks that will be skipped.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9e3b4f0 and e9faeca.

📒 Files selected for processing (1)

dpgen/generator/run.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (2)

dpgen/{generator,auto_test,data,simplify}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Place core logic changes in the appropriate module directories: dpgen/generator, dpgen/auto_test, dpgen/data, dpgen/simplify

Files:

dpgen/generator/run.py

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Adhere to ruff linting for Python code as configured in the repository
Use isort with the Black profile for import ordering in Python files
Write Numpy-style docstrings for Python code
Ensure code remains compatible with Python 3.9–3.12 (as per CI matrix)

Files:

dpgen/generator/run.py

🧬 Code graph analysis (1)

dpgen/generator/run.py (1)

dpgen/generator/lib/cp2k.py (3)

make_cp2k_input (286-322)

make_cp2k_input_from_external (340-369)

make_cp2k_xyz (325-337)

🔇 Additional comments (2)

dpgen/generator/run.py (2)

3654-3657: LGTM! Clean initialization of skip_bad_box mechanism.

The initialization is straightforward and uses .get() appropriately for the optional parameter.

3684-3687: LGTM! Summary logging is clear and consistent with existing patterns.

The summary message format matches the pattern used in _make_fp_vasp_inner (lines 2944-2947), maintaining consistency across the codebase. The logging provides useful information about how many configurations were skipped and how many remain for FP calculations.

for more information, see https://pre-commit.ci

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e9faeca and 54f039c.

📒 Files selected for processing (1)

dpgen/generator/run.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (2)

dpgen/{generator,auto_test,data,simplify}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Place core logic changes in the appropriate module directories: dpgen/generator, dpgen/auto_test, dpgen/data, dpgen/simplify

Files:

dpgen/generator/run.py

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Adhere to ruff linting for Python code as configured in the repository
Use isort with the Black profile for import ordering in Python files
Write Numpy-style docstrings for Python code
Ensure code remains compatible with Python 3.9–3.12 (as per CI matrix)

Files:

dpgen/generator/run.py

🧬 Code graph analysis (1)

dpgen/generator/run.py (1)

dpgen/generator/lib/cp2k.py (3)

make_cp2k_input (286-322)

make_cp2k_input_from_external (340-369)

make_cp2k_xyz (325-337)

dpgen/generator/run.py

codecov · 2025-10-26T04:44:16Z

Codecov Report

❌ Patch coverage is 41.66667% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 49.80%. Comparing base (9e3b4f0) to head (b228742).
⚠️ Report is 15 commits behind head on devel.

Files with missing lines	Patch %	Lines
dpgen/generator/run.py	41.66%	7 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##            devel    #1830      +/-   ##
==========================================
- Coverage   49.81%   49.80%   -0.02%     
==========================================
  Files          83       83              
  Lines       14978    14988      +10     
==========================================
+ Hits         7462     7465       +3     
- Misses       7516     7523       +7

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

SchrodingersCattt · 2025-10-26T05:01:21Z

related to #1816

Merge branch 'devel' into feat/cp2k_fp_skip_bad_box

54f039c

Signed-off-by: Ming-Yu Guo <gmy721212@163.com>

coderabbitai bot reviewed Oct 26, 2025

View reviewed changes

SchrodingersCattt and others added 2 commits October 26, 2025 12:34

feat: add cp2k support for skipping bad box

ae06272

[pre-commit.ci] auto fixes from pre-commit.com hooks

e9faeca

for more information, see https://pre-commit.ci

coderabbitai bot reviewed Oct 26, 2025

View reviewed changes

dpgen/generator/run.py Show resolved Hide resolved

dpgen/generator/run.py Show resolved Hide resolved

feat: Clean up inconsistent task directory when bad box skipped

b228742

wanghan-iapcm approved these changes Oct 28, 2025

View reviewed changes

wanghan-iapcm merged commit a3d7ace into deepmodeling:devel Oct 28, 2025
8 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add cp2k support for skipping bad box#1830

feat: add cp2k support for skipping bad box#1830
wanghan-iapcm merged 4 commits intodeepmodeling:develfrom
SchrodingersCattt:feat/cp2k_fp_skip_bad_box

SchrodingersCattt commented Oct 26, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Oct 26, 2025 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

codecov bot commented Oct 26, 2025 •

edited

Loading

Uh oh!

SchrodingersCattt commented Oct 26, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

SchrodingersCattt commented Oct 26, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

codecov bot commented Oct 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

SchrodingersCattt commented Oct 26, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

SchrodingersCattt commented Oct 26, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 26, 2025 •

edited

Loading

codecov bot commented Oct 26, 2025 •

edited

Loading