HPCC-35694 DFUQuery sort by compressed size#20885
HPCC-35694 DFUQuery sort by compressed size#20885GordonSmith wants to merge 4 commits intohpcc-systems:candidate-10.2.xfrom
Conversation
|
Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-35694 Jirabot Action Result: |
There was a problem hiding this comment.
Pull request overview
This pull request adds sorting capability by compressed file size in the DFUQuery service. The change enables users to sort logical files by their compressed size, with appropriate fallback to uncompressed size for non-compressed files.
Changes:
- Added frontend UI enhancements to display compressed size with visual indicators for compressed vs uncompressed files
- Implemented backend sorting logic for compressed file size in the DFU service
- Added mapping for compressed size field in the sort order configuration
- Updated ESDL service interface documentation with version bumping guidelines
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| esp/src/src-react/components/Files.tsx | Changed CompressedFileSize column to be sortable, added dimmed styling for uncompressed files showing their size in brackets |
| esp/services/ws_dfu/ws_dfuService.hpp | Added declaration for findPositionByCompressedSize function |
| esp/services/ws_dfu/ws_dfuService.cpp | Implemented findPositionByCompressedSize sorting function and added CompressedFileSize to legacy sort mappings |
| dali/base/dadfs.cpp | Added logic to populate compressedsize field for file attributes, with fallback to origsize |
| .github/instructions/esp-service-interface.instructions.md | Added version bumping guidelines section to ESDL instructions |
|
@jakesmith tagging you for review as I think you made the changes in and around DFUSFsize, I tried to make this a minimal change, but IMO really should be refactored to:
|
320b00c to
c689e95
Compare
c689e95 to
cd736df
Compare
jakesmith
left a comment
There was a problem hiding this comment.
@GordonSmith - had a looked into this/history of DFUQResultField::origsize etc. a bit.
See comment.
72afcb2 to
e38741a
Compare
6296923 to
e5d793e
Compare
Signed-off-by: Gordon Smith<GordonJSmith@gmail.com>
Signed-off-by: Gordon Smith<GordonJSmith@gmail.com>
e5d793e to
9b85d09
Compare
|
rebased |
dali/base/dadfs.cpp
Outdated
| if (options.includeField(DFUQResultField::origsize)) | ||
| { | ||
| const char *propName = getDFUQResultFieldName(DFUQResultField::origsize); | ||
| attr->setPropInt64(propName, attr->getPropInt64(getDFUQResultFieldName(DFUQResultField::origsize), -1)); |
There was a problem hiding this comment.
This block sets @origsize to the value of @origsize, which is effectively a no-op when present, but will also create @origsize = -1 when absent. If the goal is to ensure the field exists only when already populated, consider guarding with hasProp(propName) (or omit this block entirely). If the goal is to force materialization, add a clarifying comment and avoid emitting -1 as a sentinel that could leak into clients.
| attr->setPropInt64(propName, attr->getPropInt64(getDFUQResultFieldName(DFUQResultField::origsize), -1)); | |
| if (attr->hasProp(propName)) | |
| attr->setPropInt64(propName, attr->getPropInt64(propName)); |
There was a problem hiding this comment.
I kind of agree I suppose, but not quite what copilot is suggesting. The code is trying to default this prop to -1 if not set, so it would be better to do:
if (!attr->hasProp(propName))
attr->setPropInt64(propName, -1);
But, re. prev comments that were on lines 14206-14208..
and comment at end of line on 14210 "//Sort the files with empty size to front"
Aren't blank entries already at front?
jakesmith
left a comment
There was a problem hiding this comment.
@GordonSmith - please see comments/questions.
| // Size field is notionally the size on disk | ||
| const char *propName = getDFUQResultFieldName(DFUQResultField::size); | ||
| attr->setPropInt64(propName, attr->getPropInt64(getDFUQResultFieldName(DFUQResultField::origsize), -1));//Sort the files with empty size to front | ||
| attr->setPropInt64(propName, isCompressed(*attr) ? attr->getPropInt64(getDFUQResultFieldName(DFUQResultField::compressedsize), -1) : attr->getPropInt64(getDFUQResultFieldName(DFUQResultField::origsize), -1)); |
There was a problem hiding this comment.
DFUQResultField::size is a special field that only this sets (@DFUSFsize)
@GordonSmith - as far I could see last time I looked, Eclwatch (nor nothing else uses it).. I thought Eclwatch used "@SiZe" which is DFUQResultField::origsize .. ? (see comment in code here prior to this change).
If so this code should be deleted (and probably DFUQResultField::size/@DFUSFsize, but that could be part of a separate cleanup PR.
There was a problem hiding this comment.
To elaborate on what Jake's saying here as best as I can see, the @DFUSFsize value is not exposed in the DFUQuery response. The legacyMappings are used to map name mismatches between the SortBy strings submitted in the request and the PTree attributes returned from CDistributedFileDirectory::getLogicalFiles (which uses deserializeFileAttrIter(...) if I'm correct to create the returned PTree).
The missing link is that the WsDFUHelpers::addToLogicalFileList(...) function isn't pulling out the @DFUSFsize attribute from the response. Even if it were, I'm not entirely sure where it should be returned.
Currently it pulls @size aka DFUQResultField::origsize to populate IntSize and TotalSize response fields. And it pulls @compressedSize to populate the CompressedFileSize response field.
The DFUQuery response would probably need to be modified with a new field for something like FileSize to hold @DFUSFsize if you wanted it:
https://github.com/hpcc-systems/HPCC-Platform/blob/master/esp/scm/ws_dfu_common.ecm
dali/base/dadfs.cpp
Outdated
| if (options.includeField(DFUQResultField::origsize)) | ||
| { | ||
| const char *propName = getDFUQResultFieldName(DFUQResultField::origsize); | ||
| attr->setPropInt64(propName, attr->getPropInt64(getDFUQResultFieldName(DFUQResultField::origsize), -1)); |
There was a problem hiding this comment.
I kind of agree I suppose, but not quite what copilot is suggesting. The code is trying to default this prop to -1 if not set, so it would be better to do:
if (!attr->hasProp(propName))
attr->setPropInt64(propName, -1);
But, re. prev comments that were on lines 14206-14208..
and comment at end of line on 14210 "//Sort the files with empty size to front"
Aren't blank entries already at front?
| CPPUNIT_ASSERT_MESSAGE("testGetLogicalFilesSorted: Missing cost attributes", costAttrsPresent); | ||
| bool dirAttrsPresent = attrs.hasProp("@directory"); | ||
| CPPUNIT_ASSERT_MESSAGE("testGetLogicalFilesSorted: directory attribute should NOT be present", !dirAttrsPresent); | ||
| CPPUNIT_ASSERT_MESSAGE("testGetLogicalFilesSorted: size field missing", attrs.hasProp("@DFUSFsize")); |
There was a problem hiding this comment.
I don't know why we have @DFUSFsize at the moment - what if anything uses it?
| // Size field is notionally the size on disk | ||
| const char *propName = getDFUQResultFieldName(DFUQResultField::size); | ||
| attr->setPropInt64(propName, attr->getPropInt64(getDFUQResultFieldName(DFUQResultField::origsize), -1));//Sort the files with empty size to front | ||
| attr->setPropInt64(propName, isCompressed(*attr) ? attr->getPropInt64(getDFUQResultFieldName(DFUQResultField::compressedsize), -1) : attr->getPropInt64(getDFUQResultFieldName(DFUQResultField::origsize), -1)); |
There was a problem hiding this comment.
To elaborate on what Jake's saying here as best as I can see, the @DFUSFsize value is not exposed in the DFUQuery response. The legacyMappings are used to map name mismatches between the SortBy strings submitted in the request and the PTree attributes returned from CDistributedFileDirectory::getLogicalFiles (which uses deserializeFileAttrIter(...) if I'm correct to create the returned PTree).
The missing link is that the WsDFUHelpers::addToLogicalFileList(...) function isn't pulling out the @DFUSFsize attribute from the response. Even if it were, I'm not entirely sure where it should be returned.
Currently it pulls @size aka DFUQResultField::origsize to populate IntSize and TotalSize response fields. And it pulls @compressedSize to populate the CompressedFileSize response field.
The DFUQuery response would probably need to be modified with a new field for something like FileSize to hold @DFUSFsize if you wanted it:
https://github.com/hpcc-systems/HPCC-Platform/blob/master/esp/scm/ws_dfu_common.ecm
Type of change:
Checklist:
Smoketest:
Testing: