Skip to content

feat: add cp2k support for skipping bad box#1830

Merged
wanghan-iapcm merged 4 commits intodeepmodeling:develfrom
SchrodingersCattt:feat/cp2k_fp_skip_bad_box
Oct 28, 2025
Merged

feat: add cp2k support for skipping bad box#1830
wanghan-iapcm merged 4 commits intodeepmodeling:develfrom
SchrodingersCattt:feat/cp2k_fp_skip_bad_box

Conversation

@SchrodingersCattt
Copy link
Contributor

@SchrodingersCattt SchrodingersCattt commented Oct 26, 2025

Summary by CodeRabbit

  • New Features

    • Automatically detects and skips configurations with invalid box geometry during processing.
  • Improvements

    • Adds end-of-run summary reporting the number of skipped and remaining configurations.
    • Ensures post-processing output files remain linked and available after run.
  • Bug Fixes

    • Removes redundant file-handling operations to improve reliability and resource cleanup.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 26, 2025

📝 Walkthrough

Walkthrough

Adds an fp_skip_bad_box option to make_fp_cp2k that validates each task POSCAR with check_bad_box, skips and counts bad configurations, logs a skip summary after processing, removes redundant fp.close() calls, and continues linking PP files for processed tasks.

Changes

Cohort / File(s) Summary
Skip bad box logic in FP generation
dpgen/generator/run.py
Adds conditional fp_skip_bad_box handling in make_fp_cp2k: validates task POSCAR via check_bad_box, skips and counts bad boxes (count_bad_box), logs a summary of skipped vs remaining configurations, removes redundant fp.close() calls, and retains/extends PP-file linking via _link_fp_vasp_pp.

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant make_fp_cp2k as make_fp_cp2k()
    participant check_bad_box
    participant _link_fp_vasp_pp
    participant Logger

    Caller->>make_fp_cp2k: invoke (may include fp_skip_bad_box)
    make_fp_cp2k->>make_fp_cp2k: iterate tasks
    rect rgba(200,220,255,0.6)
        note right of make_fp_cp2k: Per-task validation when fp_skip_bad_box set
        make_fp_cp2k->>check_bad_box: validate POSCAR
        alt bad box detected
            check_bad_box-->>make_fp_cp2k: True
            make_fp_cp2k->>make_fp_cp2k: increment count_bad_box and skip task
        else valid box
            check_bad_box-->>make_fp_cp2k: False
            make_fp_cp2k->>make_fp_cp2k: generate inputs (input.inp, coord.xyz)
            make_fp_cp2k->>_link_fp_vasp_pp: link PP files for task
        end
    end
    make_fp_cp2k->>Logger: log skip summary (skipped vs remaining)
    make_fp_cp2k-->>Caller: return
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Verify check_bad_box integration and handling of missing/malformed POSCAR files.
  • Confirm count_bad_box initialization, increments, and final summary correctness.
  • Ensure redundant fp.close() removals do not leak resources and file writes remain correct.
  • Validate _link_fp_vasp_pp is invoked appropriately for non-skipped tasks.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "feat: add cp2k support for skipping bad box" is clear, specific, and directly aligned with the main changes in the changeset. The summary indicates that the core modification involves adding a skip_bad_box mechanism in the make_fp_cp2k function, which is precisely what the title conveys. The title is concise and avoids vague language or unnecessary noise, and a teammate scanning the repository history would immediately understand that this PR introduces functionality for handling invalid geometries in CP2K calculations.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 54f039c and b228742.

📒 Files selected for processing (1)
  • dpgen/generator/run.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
dpgen/{generator,auto_test,data,simplify}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Place core logic changes in the appropriate module directories: dpgen/generator, dpgen/auto_test, dpgen/data, dpgen/simplify

Files:

  • dpgen/generator/run.py
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Adhere to ruff linting for Python code as configured in the repository
Use isort with the Black profile for import ordering in Python files
Write Numpy-style docstrings for Python code
Ensure code remains compatible with Python 3.9–3.12 (as per CI matrix)

Files:

  • dpgen/generator/run.py
🧬 Code graph analysis (1)
dpgen/generator/run.py (1)
dpgen/generator/lib/cp2k.py (3)
  • make_cp2k_input (286-322)
  • make_cp2k_input_from_external (340-369)
  • make_cp2k_xyz (325-337)
🔇 Additional comments (2)
dpgen/generator/run.py (2)

3774-3791: Previous review concerns have been addressed correctly.

The implementation now properly handles the bad box detection by removing the entire task directory with shutil.rmtree(ii) (line 3790) when a bad configuration is detected. This resolves the previous concern about leaving task directories in an inconsistent state.

The implementation correctly:

  1. Changes back to cwd before removing the directory (line 3789)
  2. Tracks the number of skipped tasks (line 3788)
  3. Continues to the next iteration without creating CP2K input files (line 3791)

3806-3812: PP file linking works correctly with skipped tasks.

The previous concern about _link_fp_vasp_pp linking PP files to skipped tasks is resolved. Since skipped task directories are removed entirely (line 3790), when _link_fp_vasp_pp performs its own glob.glob(os.path.join(work_path, "task.*")) (line 3351), it will only find and link to the remaining valid task directories.

The skip summary logging (lines 3806-3809) correctly reports the number of skipped and remaining tasks.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Signed-off-by: Ming-Yu Guo <gmy721212@163.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
dpgen/generator/run.py (1)

3658-3670: Move bad box check before reading sys_data for better performance.

Currently, sys_data = dpdata.System("POSCAR").data is read for all tasks (line 3660), but if skip_bad_box is set and the box is bad, sys_data is never used. The bad box check happens afterward (lines 3663-3670).

For efficiency, consider checking the bad box first, then only reading sys_data if the task won't be skipped:

 for ii in fp_tasks:
     os.chdir(ii)
+    
+    if skip_bad_box is not None:
+        # Check the box directly from POSCAR
+        skip = check_bad_box("POSCAR", skip_bad_box, fmt="vasp/poscar")
+        if skip:
+            count_bad_box += 1
+            os.chdir(cwd)
+            continue
+    
     sys_data = dpdata.System("POSCAR").data
     # make input for every task
     # if fp_params exits, make keys
-    if skip_bad_box is not None:
-        # Check the box directly from POSCAR
-        skip = check_bad_box("POSCAR", skip_bad_box, fmt="vasp/poscar")
-        if skip:
-            count_bad_box += 1
-            os.chdir(cwd)
-            continue
-
     if fp_params:

This avoids unnecessary POSCAR parsing for tasks that will be skipped.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9e3b4f0 and e9faeca.

📒 Files selected for processing (1)
  • dpgen/generator/run.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
dpgen/{generator,auto_test,data,simplify}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Place core logic changes in the appropriate module directories: dpgen/generator, dpgen/auto_test, dpgen/data, dpgen/simplify

Files:

  • dpgen/generator/run.py
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Adhere to ruff linting for Python code as configured in the repository
Use isort with the Black profile for import ordering in Python files
Write Numpy-style docstrings for Python code
Ensure code remains compatible with Python 3.9–3.12 (as per CI matrix)

Files:

  • dpgen/generator/run.py
🧬 Code graph analysis (1)
dpgen/generator/run.py (1)
dpgen/generator/lib/cp2k.py (3)
  • make_cp2k_input (286-322)
  • make_cp2k_input_from_external (340-369)
  • make_cp2k_xyz (325-337)
🔇 Additional comments (2)
dpgen/generator/run.py (2)

3654-3657: LGTM! Clean initialization of skip_bad_box mechanism.

The initialization is straightforward and uses .get() appropriately for the optional parameter.


3684-3687: LGTM! Summary logging is clear and consistent with existing patterns.

The summary message format matches the pattern used in _make_fp_vasp_inner (lines 2944-2947), maintaining consistency across the codebase. The logging provides useful information about how many configurations were skipped and how many remain for FP calculations.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e9faeca and 54f039c.

📒 Files selected for processing (1)
  • dpgen/generator/run.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
dpgen/{generator,auto_test,data,simplify}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Place core logic changes in the appropriate module directories: dpgen/generator, dpgen/auto_test, dpgen/data, dpgen/simplify

Files:

  • dpgen/generator/run.py
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Adhere to ruff linting for Python code as configured in the repository
Use isort with the Black profile for import ordering in Python files
Write Numpy-style docstrings for Python code
Ensure code remains compatible with Python 3.9–3.12 (as per CI matrix)

Files:

  • dpgen/generator/run.py
🧬 Code graph analysis (1)
dpgen/generator/run.py (1)
dpgen/generator/lib/cp2k.py (3)
  • make_cp2k_input (286-322)
  • make_cp2k_input_from_external (340-369)
  • make_cp2k_xyz (325-337)

@codecov
Copy link

codecov bot commented Oct 26, 2025

Codecov Report

❌ Patch coverage is 41.66667% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 49.80%. Comparing base (9e3b4f0) to head (b228742).
⚠️ Report is 15 commits behind head on devel.

Files with missing lines Patch % Lines
dpgen/generator/run.py 41.66% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #1830      +/-   ##
==========================================
- Coverage   49.81%   49.80%   -0.02%     
==========================================
  Files          83       83              
  Lines       14978    14988      +10     
==========================================
+ Hits         7462     7465       +3     
- Misses       7516     7523       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@SchrodingersCattt
Copy link
Contributor Author

related to #1816

@wanghan-iapcm wanghan-iapcm merged commit a3d7ace into deepmodeling:devel Oct 28, 2025
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants