Merged
Commits
85 commits
ebf561e
Merge remote-tracking branch 'origin/feature/1a-zero-vram' into fix/d…
LTan-101104 Jun 26, 2025
532206a
db_connection_custom_condition : intialize code for load_jobs_datafr…
LTan-101104 Jun 26, 2025
2f1e04c
db_connection_custom_condition : Resetup mock test in conftest.py, in…
LTan-101104 Jun 30, 2025
e3693c5
db_connection_custom_condition : Added new parameters of optional fil…
LTan-101104 Jun 30, 2025
3d98f56
db_connection_custom_condition : refactor tests in test_preprocess du…
LTan-101104 Jun 30, 2025
67c15ab
Merge remote-tracking branch 'origin/main' into fix/db_connection_ana…
LTan-101104 Jun 30, 2025
9febbf2
Merge remote-tracking branch 'origin/main' into fix/db_connection_ana…
LTan-101104 Jul 2, 2025
54e89dc
db_connection_custom_condition : fix ruff issue
LTan-101104 Jul 2, 2025
80de772
db_connection_custom_condition: initialize keyword arguments handling…
LTan-101104 Jul 2, 2025
26632ce
db_connection_custom_condition : reintialize simpler functions and ad…
LTan-101104 Jul 3, 2025
b7809d3
REsolve merge conflict from branch origin/feature/1a-zero-vram
LTan-101104 Jul 3, 2025
d0a99b5
db_connection_custom_condition : intialize warning and new filtering …
LTan-101104 Jul 3, 2025
88e41ff
db_connection_custom_condition : add if and try catch in preprocess t…
LTan-101104 Jul 7, 2025
3734154
db_connection_custom_condition : revert changes on vram_usage.py, add…
LTan-101104 Jul 8, 2025
12bcf08
db_connection_custom_condition : Add custom query and check for dates…
LTan-101104 Jul 8, 2025
74a9ec0
db_connection_custom_condition : refactor code in preprocess
LTan-101104 Jul 8, 2025
af28b60
Merge with origin/main
LTan-101104 Jul 8, 2025
31e6d04
Resolve some issues from last merge
LTan-101104 Jul 8, 2025
5894043
db_connection_custom_condition : debug issues of improper handling fo…
LTan-101104 Jul 9, 2025
6ace8ca
db_connection_custom_condition : add warning raised options for load_…
LTan-101104 Jul 9, 2025
187d900
db_connection_custom_condition : Readjust warnings, add tests for Key…
LTan-101104 Jul 9, 2025
359c8fc
db_connection_custom_condition : add docstring for test_load_jobs_duc…
LTan-101104 Jul 9, 2025
db2b898
db_connection_custom_condition : Change constants name, omit TODOs, a…
LTan-101104 Jul 9, 2025
093fe7f
Add QOS options to include/ omit custom QOS values, add new tests for…
LTan-101104 Jul 10, 2025
0f105bd
Update names of helper functions for loading preprocessed
LTan-101104 Jul 30, 2025
56e49ce
Merge remote-tracking branch 'origin/main' into HEAD
LTan-101104 Jul 30, 2025
6ce7962
Fix pytest
LTan-101104 Jul 30, 2025
33734ce
Make changes to error raised test to match error message
LTan-101104 Jul 30, 2025
4700f01
Clean up comments and change test names
LTan-101104 Jul 30, 2025
0dfdb0a
Change location of load_process_dataframe function, update notebooks …
LTan-101104 Jul 30, 2025
e1067b1
Clean up comment in preprocess
LTan-101104 Jul 30, 2025
8bdadcd
Added docstring for one of the new test
LTan-101104 Jul 30, 2025
1caf951
Integrate some comments in terms of docstring, inconsistent handling …
LTan-101104 Aug 5, 2025
e6a3e88
Update mock data to more recent times, fix test for warning checks
LTan-101104 Aug 5, 2025
3f89e90
Change mock data Elapsed time to be more accurate
LTan-101104 Aug 6, 2025
2aa287e
Moved the test to the correct position
LTan-101104 Aug 6, 2025
8696fff
Add recwarn to all preprocess test
LTan-101104 Aug 6, 2025
e9bfea4
Create seperate fixture for dataframe and database path
LTan-101104 Aug 6, 2025
f67aa2a
Reachnge based on main's preprocess to avoid conflicts
LTan-101104 Aug 7, 2025
4a17627
Change name of fixture to avoid conflicts
LTan-101104 Aug 7, 2025
ac32f5b
Merge with origin/main
LTan-101104 Aug 7, 2025
492525e
Merged with remain part from origin/main
LTan-101104 Aug 7, 2025
063a61f
Clean up helper_filter_irrelevant_records and move to conftest
LTan-101104 Aug 7, 2025
c749bb2
Remerge with last changes
LTan-101104 Aug 7, 2025
df7b22e
Uncomment and fix tests in test_load_processed_jobs_dataframe
LTan-101104 Aug 7, 2025
a0db402
Clean up tests, deleting all notna filtering for GPU and GPUType
LTan-101104 Aug 7, 2025
61e8533
Add update_helper_filter and start modifying test_preprocess to use this
LTan-101104 Aug 7, 2025
f5e9b62
Updated to use new filter helper functions
LTan-101104 Aug 9, 2025
6558c4f
Finalize new helper for tests
LTan-101104 Aug 9, 2025
08e19b1
Add notes section back to preprocess
LTan-101104 Aug 11, 2025
2e7f674
Update constant name, add new suggested comment
LTan-101104 Aug 11, 2025
7729e32
Run clean up script for Efficiency Analysis Notebook
LTan-101104 Aug 11, 2025
e0a777b
Update to use original version of Efficiency Analysis
LTan-101104 Aug 11, 2025
3f2262d
Update qos variable names
LTan-101104 Aug 11, 2025
6ea9e45
Changed required and optional column set to enum
LTan-101104 Aug 11, 2025
f5ad403
Add remain columns to optional, update warning statement and test
LTan-101104 Aug 11, 2025
4a7a6fd
Break code in preprocess_data into multiple different functions
LTan-101104 Aug 11, 2025
89f3160
Add new helper for check_infinity_value (overflow) in preprocess
LTan-101104 Aug 11, 2025
ace8434
Update Efficiency Analysis notebookk to use new function
LTan-101104 Aug 11, 2025
53e6380
Resolved mypy
LTan-101104 Aug 11, 2025
3d127b3
Change folder name and file name
LTan-101104 Aug 11, 2025
bcb4035
Make changes to notebook
LTan-101104 Aug 11, 2025
fee8186
Reorder arguments in utility function
LTan-101104 Aug 11, 2025
9fbfecb
Fixed multiple docstring and comments and test file name following su…
LTan-101104 Aug 12, 2025
709183c
Change module name so ruff runs
LTan-101104 Aug 12, 2025
4ba2cc8
Change test names as suggested
LTan-101104 Aug 12, 2025
cf7ae12
Apply remain suggestions for changing test names
LTan-101104 Aug 12, 2025
f2ef0c8
Add warning raised for empty dataframe and test
LTan-101104 Aug 12, 2025
46af286
Readjust parameter name
LTan-101104 Aug 12, 2025
5321f45
Break load_and_preprocessed into 2 functions , one without custom que…
LTan-101104 Aug 12, 2025
9ed207b
Change function call in notebooks
LTan-101104 Aug 12, 2025
bf65522
Update documentation for Preempted columne
LTan-101104 Aug 12, 2025
0d04dfd
Changed to use @pytest.mark.parametrize and omit some unnecessary rec…
LTan-101104 Aug 12, 2025
b2092c2
Complete resolving recwarn problem and add new pytest.mark.parametrize
LTan-101104 Aug 12, 2025
ba99adb
Change function name and convert columns which should be omitted to a…
LTan-101104 Aug 12, 2025
3e274f1
Add apply_filtering options to avoid duplicate filtering in preproces…
LTan-101104 Aug 12, 2025
17888c3
Enforce creating new copy in preprocess
LTan-101104 Aug 13, 2025
428f21f
Change method imports in efficiency analysis notebooks
LTan-101104 Aug 13, 2025
7ae8ec7
Merge parts of codes, awaiting for merge on test preprocess and handl…
LTan-101104 Aug 16, 2025
c28475c
Make changes to test_preprocess_origin to use standard filter helper
LTan-101104 Aug 17, 2025
afe9b3f
Complete test_preprocess and preprocess
LTan-101104 Aug 17, 2025
921e989
Apprehend test_preprocess back
LTan-101104 Aug 17, 2025
b1a853b
Complete fix of test_load_and_preprocess, remove all instances of val…
LTan-101104 Aug 17, 2025
5748955
Fix ruff and mypy
LTan-101104 Aug 17, 2025
6faf44b
Make changes to del statement in conftest __del__() method of databas…
LTan-101104 Aug 21, 2025
4 changes: 3 additions & 1 deletion docs/preprocess.md
@@ -4,14 +4,16 @@
### Attributes Omitted
- **UUID**
- **Nodes**: NodesList have more specific information
-- **Preempted**: Status have more valid information
+- **Preempted**: Contains unreliable data. Use Status column instead (PREEMPT for
+unfinished, COMPLETE/FAILED/etc. for finished preempted jobs).
- **EndTime**: Can be calculated from StartTime and Elapsed

### Options for Including or Omitting Jobs
- **Keeping CPU jobs:**
- If `GPUType` is null, the value will be filled with `["cpu"]`
- If `GPUs` is null or is 0, the value will be 0.
- **Keeping jobs where the status is "Failed" or "Cancelled"**
+- **Keeping jobs where the QOS is customized (not normal, long, or short)**

### Records Omitted If:
- `Elapsed` is less than the minimum threshold
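
The include/omit options documented in this hunk can be illustrated with a small sketch. This is not the PR's implementation: it uses plain dictionaries instead of the project's dataframes, and the function name `preprocess_records`, the three `include_*` parameters, and both constants are hypothetical names chosen for illustration. It also assumes that a "CPU job" is one whose `GPUType` is null or whose `GPUs` is null or 0, and that status values are compared case-insensitively.

```python
from typing import Any

# Hypothetical constants; the project's real names and values may differ.
STANDARD_QOS = {"normal", "long", "short"}
FAILED_OR_CANCELLED = {"FAILED", "CANCELLED"}


def preprocess_records(
    records: list[dict[str, Any]],
    include_cpu_only_jobs: bool = False,
    include_failed_cancelled_jobs: bool = False,
    include_custom_qos_jobs: bool = False,
) -> list[dict[str, Any]]:
    """Return filtered copies of job records following the documented options."""
    kept = []
    for record in records:
        job = dict(record)  # shallow copy so the caller's records are not mutated

        # Assumption: a CPU job has no GPU type or no GPUs allocated.
        is_cpu_job = job.get("GPUType") is None or not job.get("GPUs")
        if is_cpu_job and not include_cpu_only_jobs:
            continue

        # Assumption: status strings are matched case-insensitively.
        status = str(job.get("Status", "")).upper()
        if status in FAILED_OR_CANCELLED and not include_failed_cancelled_jobs:
            continue

        # A QOS outside normal/long/short counts as customized.
        if job.get("QOS") not in STANDARD_QOS and not include_custom_qos_jobs:
            continue

        # Fill rules from the documentation for jobs that are kept.
        if job.get("GPUType") is None:
            job["GPUType"] = ["cpu"]
        if not job.get("GPUs"):
            job["GPUs"] = 0

        kept.append(job)
    return kept


jobs = [
    {"JobID": 1, "GPUType": None, "GPUs": None, "Status": "COMPLETED", "QOS": "normal"},
    {"JobID": 2, "GPUType": ["a100"], "GPUs": 2, "Status": "FAILED", "QOS": "short"},
    {"JobID": 3, "GPUType": ["v100"], "GPUs": 1, "Status": "COMPLETED", "QOS": "custom"},
    {"JobID": 4, "GPUType": ["v100"], "GPUs": 1, "Status": "COMPLETED", "QOS": "long"},
]

default_ids = [job["JobID"] for job in preprocess_records(jobs)]
with_cpu = preprocess_records(jobs, include_cpu_only_jobs=True)
```

With every option left off, only the GPU job with a standard QOS and a non-failed status survives. Turning on `include_cpu_only_jobs` additionally keeps the CPU job, with `GPUType` filled as `["cpu"]` and `GPUs` set to 0.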