@punAhuja punAhuja commented Oct 7, 2025

https://issues.apache.org/jira/browse/SOLR-17942

Description

The ramPerThreadHardLimitMB parameter cannot be configured above 2 GB in Lucene. As a consequence, a single thread cannot write a segment larger than 2 GB.
See: https://lucene.apache.org/core/9_9_0/core/org/apache/lucene/index/IndexWriterConfig.html#setRAMPerThreadHardLimitMB(int)

Solution

This PR makes the parameter configurable above the 2 GB limit, so that each thread can write a larger segment. I use reflection to bypass the hard-coded limit in Lucene.
When ramPerThreadHardLimitMB is configured with a value above 2 GB, setPerThreadRAMLimitViaReflection is called and sets the value directly, bypassing the limit check.
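
As a rough illustration of the reflection approach described above, here is a minimal, self-contained sketch. It does not use Lucene: `BaseConfig`, `GuardedConfig`, and `setIntFieldViaReflection` are hypothetical stand-ins invented for this example, and the default value is arbitrary. It only shows the general idea of writing a field directly so that the range check in a setter is skipped.

```java
import java.lang.reflect.Field;

public class ReflectionLimitSketch {

    /** Hypothetical stand-in for a base config class that owns the field. */
    static class BaseConfig {
        protected volatile int perThreadHardLimitMB = 1945; // arbitrary default for the sketch
    }

    /** Hypothetical stand-in for a config class whose setter enforces an upper bound. */
    static class GuardedConfig extends BaseConfig {
        void setRAMPerThreadHardLimitMB(int mb) {
            if (mb <= 0 || mb >= 2048) {
                throw new IllegalArgumentException("value must be > 0 and < 2048");
            }
            this.perThreadHardLimitMB = mb;
        }

        int getRAMPerThreadHardLimitMB() {
            return perThreadHardLimitMB;
        }
    }

    /**
     * Walks up the class hierarchy, finds the named int field, and writes it
     * directly. Returns true on success, false if the field is missing or
     * cannot be made accessible.
     */
    static boolean setIntFieldViaReflection(Object target, String fieldName, int value) {
        for (Class<?> c = target.getClass(); c != null; c = c.getSuperclass()) {
            try {
                Field f = c.getDeclaredField(fieldName);
                f.setAccessible(true); // may throw a RuntimeException across module boundaries
                f.setInt(target, value);
                return true;
            } catch (NoSuchFieldException e) {
                // Not declared in this class, continue with the superclass.
            } catch (IllegalAccessException | RuntimeException e) {
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        GuardedConfig config = new GuardedConfig();
        try {
            config.setRAMPerThreadHardLimitMB(4096);
        } catch (IllegalArgumentException e) {
            System.out.println("Setter rejected the value: " + e.getMessage());
        }
        boolean ok = setIntFieldViaReflection(config, "perThreadHardLimitMB", 4096);
        System.out.println("Reflection succeeded: " + ok
            + ", limit is now " + config.getRAMPerThreadHardLimitMB());
    }
}
```

Writing the field directly skips any validation the setter performs, so the caller is responsible for making sure the rest of the code can cope with the larger value.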

Tests

  1. Disabled merging, auto-commit, and flush in the configs
  2. Wrote a 4.1GB payload using a single thread
  3. Sent a commit
  4. Verified that a single segment was created

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide

@github-actions github-actions bot added the documentation and cat:index labels Oct 7, 2025
@punAhuja punAhuja changed the title from "Puneet/solr 17942 raising ram per thread hard limit" to "SOLR-17942: Raising configurable RAM per thread hard limit using reflection" Oct 7, 2025
@dsmiley dsmiley left a comment

On Java 9 and later, I suggest using the VarHandle API instead:

// A lookup object that can see the private members of this class.
MethodHandles.Lookup lookup = MethodHandles.lookup();
VarHandle fieldHandle = null;
Class<?> currentClass = config.getClass();

// Walk up the class hierarchy until the field is found.
while (currentClass != null && fieldHandle == null) {
    try {
        // Get a special lookup that can access private members of the target class.
        // This requires `--add-opens` if the target is in another module.
        MethodHandles.Lookup privateLookup = MethodHandles.privateLookupIn(currentClass, lookup);
        // Find a handle for the field by name and type.
        fieldHandle = privateLookup.findVarHandle(currentClass, "perThreadHardLimitMB", int.class);
    } catch (IllegalAccessException | NoSuchFieldException e) {
        // Field not declared (or not accessible) in this class, try the superclass.
        currentClass = currentClass.getSuperclass();
    }
}

try {
    if (fieldHandle != null) {
        // Set the value using the handle.
        fieldHandle.set(config, limitMB);
        log.info("Set perThreadHardLimitMB to {} MB via VarHandle", limitMB);
    } else {
        log.error("Could not find VarHandle for perThreadHardLimitMB field");
    }
} catch (RuntimeException e) { // e.g. ClassCastException or WrongMethodTypeException on a type mismatch
    log.error("Failed to set per-thread RAM limit via VarHandle", e);
}
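
For readers who want to try the VarHandle suggestion in isolation, the following is a minimal runnable sketch of the same lookup loop. It assumes nothing about Lucene or Solr: `BaseConfig`, `DerivedConfig`, `findIntField`, and `setLimit` are hypothetical names made up for this example, and everything lives in one module so no `--add-opens` flag is needed here.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class VarHandleLimitSketch {

    /** Hypothetical stand-in for the class that declares the private field. */
    static class BaseConfig {
        private volatile int perThreadHardLimitMB = 1945; // arbitrary default for the sketch

        int limit() {
            return perThreadHardLimitMB;
        }
    }

    /** Hypothetical subclass, so the lookup has to walk up one level. */
    static class DerivedConfig extends BaseConfig {}

    /** Returns a VarHandle for the named int field, searching superclasses, or null. */
    static VarHandle findIntField(Class<?> start, String name) {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        for (Class<?> c = start; c != null; c = c.getSuperclass()) {
            try {
                return MethodHandles.privateLookupIn(c, lookup)
                    .findVarHandle(c, name, int.class);
            } catch (NoSuchFieldException | IllegalAccessException e) {
                // Not declared or not accessible here, continue with the superclass.
            }
        }
        return null;
    }

    /** Writes the limit through the handle; returns false if the handle is missing or the write fails. */
    static boolean setLimit(Object config, int limitMB) {
        VarHandle handle = findIntField(config.getClass(), "perThreadHardLimitMB");
        if (handle == null) {
            return false;
        }
        try {
            // The field in this sketch is volatile, so use a volatile write.
            handle.setVolatile(config, limitMB);
            return true;
        } catch (RuntimeException e) {
            // For example ClassCastException if config is not a BaseConfig.
            return false;
        }
    }

    public static void main(String[] args) {
        DerivedConfig config = new DerivedConfig();
        System.out.println("Set via VarHandle: " + setLimit(config, 4096));
        System.out.println("Limit is now " + config.limit() + " MB");
    }
}
```

Compared with `Field.setAccessible`, the handle can be looked up once and cached in a static field, which keeps the per-call cost low if the value is written more than once.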
