Skip to content

Conversation

@mckellyln
Copy link
Contributor

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

Copilot AI review requested due to automatic review settings December 17, 2025 14:59
@github-actions
Copy link

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-35485

Jirabot Action Result:
Assigning user: [email protected]
Workflow Transition To: Merge Pending
Updated PR

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds support for the jemalloc memory allocator as a configurable option via LD_PRELOAD. The feature allows users to enable jemalloc through configuration settings (either in environment.xml or environment.conf), which can provide better memory allocation performance and lower fragmentation compared to the default glibc allocator. The implementation adds the jemalloc library dependency, installation logic, and runtime configuration parsing.

Key changes:

  • Adds jemalloc as a vcpkg dependency for Linux platforms
  • Implements configuration parsing in hpcc_setenv.in to detect useJemalloc setting and conditionally add libjemalloc.so to LD_PRELOAD
  • Installs jemalloc library files during package build for Linux platforms

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File Description
vcpkg.json.in Adds jemalloc as a dependency for Linux platforms via vcpkg package manager
initfiles/sbin/hpcc_setenv.in Implements logic to parse useJemalloc configuration from XML or conf files and prepend libjemalloc.so to LD_PRELOAD when enabled
CMakeLists.txt Adds installation logic to copy jemalloc library files from vcpkg to the lib directory for Linux platform builds

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.

Comment on lines 134 to 138
HPCC_XMLFILE=${HPCC_XMLFILE:-${CONFIG_DIR}/${ENV_XML_FILE}}

usejem=$(sed -n '/\<envsettings/I,/\/envsettings/I{//!p;}' ${HPCC_XMLFILE} 2>/dev/null | awk -F '[<>]' 'tolower($0)~/usejemalloc/{print $3}' 2>/dev/null)
if [[ "$usejem" == "" ]] ; then
usejem=$(grep -i usejemalloc ${HPCC_CONFIG} 2> /dev/null | awk -F= '{print $2}' 2>/dev/null)
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sed command parsing the XML file uses case-insensitive matching and a broad pattern that could match unintended XML elements. If the XML structure contains nested or malformed tags, this could extract incorrect values. Consider validating the XML structure or using a proper XML parser. At minimum, document the expected XML structure for the useJemalloc setting in environment.xml.

Suggested change
HPCC_XMLFILE=${HPCC_XMLFILE:-${CONFIG_DIR}/${ENV_XML_FILE}}
usejem=$(sed -n '/\<envsettings/I,/\/envsettings/I{//!p;}' ${HPCC_XMLFILE} 2>/dev/null | awk -F '[<>]' 'tolower($0)~/usejemalloc/{print $3}' 2>/dev/null)
if [[ "$usejem" == "" ]] ; then
usejem=$(grep -i usejemalloc ${HPCC_CONFIG} 2> /dev/null | awk -F= '{print $2}' 2>/dev/null)
# Expected XML structure in ${HPCC_XMLFILE}:
# <EnvSettings>
# <useJemalloc>true</useJemalloc>
# </EnvSettings>
HPCC_XMLFILE=${HPCC_XMLFILE:-${CONFIG_DIR}/${ENV_XML_FILE}}
usejem=""
if command -v xmlstarlet >/dev/null 2>&1 ; then
# Prefer xmlstarlet for robust XML parsing
usejem=$(xmlstarlet sel -t -v "string(//EnvSettings/useJemalloc)" -n "${HPCC_XMLFILE}" 2>/dev/null | tr 'A-Z' 'a-z')
else
# Fallback to legacy sed/awk parsing within the <EnvSettings> section
usejem=$(sed -n '/\<envsettings/I,/\/envsettings/I{//!p;}' "${HPCC_XMLFILE}" 2>/dev/null | awk -F '[<>]' 'tolower($0)~/usejemalloc/{print tolower($3)}' 2>/dev/null)
fi
if [[ "$usejem" == "" ]] ; then
usejem=$(grep -i usejemalloc "${HPCC_CONFIG}" 2> /dev/null | awk -F= '{print $2}' 2>/dev/null | tr 'A-Z' 'a-z')

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its ok as is and don't want to add a dependency on an xml parsing utility.

@mckellyln mckellyln force-pushed the jemalloc_ldpreload branch 3 times, most recently from cee98f7 to b5cb841 Compare December 17, 2025 21:32
@mckellyln mckellyln requested a review from Copilot December 17, 2025 21:44
@mckellyln
Copy link
Contributor Author

Code seems ok to set LD_PRELOAD ... but some jobs are failing when libjemalloc is used, I will have to look at that more in depth.
@ghalliday fyi

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

mpSoMaxConn=128
mpTraceLevel=0

useJemalloc=true
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is enabled now for testing the PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commented this out now that tests are passing and the correct initial setting for this is false.

@mckellyln
Copy link
Contributor Author

If we build libjemalloc with option: --disable-cxx then everything works ok - which suggests there is a memory issue somewhere.

Notes from jemalloc discuss this -

… passing incorrect sizes to the sized deallocation functions is disallowed by the language spec ("undefined behavior", same as passing a wild pointer or a double-free).

One partial "fix" (which will fix the allocator crashes, although still disallowed) is to give the base class a custom operator delete which calls the global, unsized, operator delete; another is to compile with option:

-fno-sized-deallocation

A way to kinda-sorta fake this is to configure jemalloc with --disable-cxx, which (as a quirk of the current libstdc++ implementation that will likely change fairly soon) will also turn off sized deallocation usage barring some fancy sdallocx-detection trickery in the library that I think is unlikely.

Crash only with Jemalloc, ASan is clean · Issue #2353 · jemalloc/jemalloc

Using valgrind or asan has so far not reported any issues.

I will continue to try and find the memory issue(s)

@ghalliday
Copy link
Member

@mckellyln do you have a stack trace for a failure?
Could it be a delete of a base object that does not have a virtual destructor? I think anything using our link counted objects is ok, but there may be some other classes.
Another idea is to enable to enable
-Werror=delete-non-virtual-dtor
Done in #20788 but it didn't find anything.

@mckellyln
Copy link
Contributor Author

K8s Regression Suite test failure is a false positive (and is getting fixed in 9.14.x soon by James McMullan)
So this means all tests now pass with commit 2 to address the delete size issue from the derived class.

@mckellyln
Copy link
Contributor Author

We could also build libjemalloc with --disable-cxx option for added safety, if desired.

@mckellyln
Copy link
Contributor Author

@jakesmith fyi
@dcamper fyi

@mckellyln mckellyln requested a review from ghalliday January 7, 2026 18:48
@mckellyln mckellyln force-pushed the jemalloc_ldpreload branch 2 times, most recently from 110a22d to ad5e250 Compare January 7, 2026 21:19
Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mckellyln looks good, please squash.
Also, is 9.14.x the correct target? Could the inclusion of the library in 9.14.x cause problems?

@mckellyln
Copy link
Contributor Author

Squashed.
I can re-target to 10.0.x if you think it's a more appropriate branch target for this feature.

@ghalliday ghalliday merged commit ab528e7 into hpcc-systems:candidate-9.14.x Jan 8, 2026
49 of 51 checks passed
@github-actions
Copy link

github-actions bot commented Jan 8, 2026

Jirabot Action Result:
Added fix version: 9.14.48
Added fix version: 10.0.22
Workflow Transition: 'Resolve issue'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants