-
-
Notifications
You must be signed in to change notification settings - Fork 98
Change NitriteId to use long internally to avoid parseLong overhead
#1117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThe pull request refactors how Changes
Sequence Diagram(s)sequenceDiagram
participant Serializer
participant JsonGenerator
Serializer->>JsonGenerator: Call serialize(NitriteId)
Note right of Serializer: Converts id using Long.toString(value.getIdValue())
Serializer->>JsonGenerator: Write string output
sequenceDiagram
participant AddRemoveMethod
participant NitriteId
participant SpatialKeyCreator
AddRemoveMethod->>NitriteId: Retrieve idValue
NitriteId-->>AddRemoveMethod: Return idValue (long)
AddRemoveMethod->>SpatialKeyCreator: Create spatial key using idValue
Poem
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🔭 Outside diff range comments (2)
nitrite/src/main/java/org/dizitart/no2/store/memory/InMemoryRTree.java (1)
65-65: 💡 Verification agent🧩 Analysis chain
Verify compatibility with the updated NitriteId implementation.
The code still creates NitriteId objects by converting long to String with
NitriteId.createId(Long.toString(sk.getId())). This suggests that the external API of NitriteId still expects String input, which aligns with the goal of maintaining backward compatibility.However, it's worth verifying that this conversion doesn't reintroduce the performance overhead that this PR aims to eliminate.
Also applies to: 80-80
🏁 Script executed:
#!/bin/bash # Look for how NitriteId.createId is implemented in the updated code echo "Searching for NitriteId.createId implementation:" rg -A 5 "createId" --glob "*.java"Length of output: 34519
Action: Update NitriteId Creation for Improved Performance
Verification shows that the updated NitriteId implementation now offers an overload that accepts a long value directly. In contrast, the current usage in nitrite/src/main/java/org/dizitart/no2/store/memory/InMemoryRTree.java (lines 65 and 80) converts the long ID to a String via
Long.toString(sk.getId()), which may reintroduce unnecessary conversion overhead.Recommendations:
- Replace:
set.add(NitriteId.createId(Long.toString(sk.getId())));- With:
set.add(NitriteId.createId(sk.getId()));This change leverages the new long overload of
NitriteId.createId(long)and should help eliminate the performance penalty associated with the intermediate string conversion.nitrite-mvstore-adapter/src/main/java/org/dizitart/no2/mvstore/NitriteMVRTreeMap.java (1)
109-109: 💡 Verification agent🧩 Analysis chain
Verify string conversion compatibility
This line still converts the long ID to a string before creating the NitriteId. This seems at odds with the PR's goal of using
longinternally to avoid parsing overhead, since here we're doing the conversion anyway.Let's check if this creates a potential performance bottleneck:
🏁 Script executed:
#!/bin/bash # Find NitriteId.createId usages to understand the conversion pattern rg "NitriteId\.createId\(Long\.toString\(" --type java # Find all createId implementations to understand the method signature changes rg -A 3 "createId\(" --type javaLength of output: 20509
Address long-to-string conversion for NitriteId creation
The current implementation in
nitrite-mvstore-adapter/src/main/java/org/dizitart/no2/mvstore/NitriteMVRTreeMap.java:109
still converts the long ID to a string withLong.toString(next.getId())before creating a NitriteId. Since the PR’s intent is to work with long IDs internally (and thereby avoid the overhead of string conversion), you should call the long overload directly—i.e., replace the current line with:return NitriteId.createId(next.getId());This change will eliminate unnecessary conversion overhead and ensure consistency with the PR’s goals.
🧹 Nitpick comments (4)
nitrite/src/main/java/org/dizitart/no2/transaction/TransactionalRTree.java (2)
50-51: Consider optimizing ID creation to avoid string conversion.While the PR optimizes ID handling by using
longinternally, this line still converts the long ID back to a String usingLong.toString(sk.getId())and then creates a NitriteId from it. IfNitriteIdnow useslonginternally, consider adding a factory method that accepts alongdirectly to avoid this conversion overhead.- set.add(NitriteId.createId(Long.toString(sk.getId()))); + set.add(NitriteId.createId(sk.getId()));Note: This would require adding an overloaded
createIdmethod in theNitriteIdclass that accepts a long parameter.
65-66: Consider optimizing ID creation to avoid string conversion.Similar to line 50, this line converts a long value to String and then creates a NitriteId. This introduces unnecessary conversion overhead.
- set.add(NitriteId.createId(Long.toString(sk.getId()))); + set.add(NitriteId.createId(sk.getId()));nitrite/src/main/java/org/dizitart/no2/collection/NitriteDocument.java (1)
122-128: Added backward compatibility for string IDsThis excellent addition ensures backward compatibility by handling the case where the ID is stored as a
Stringin existing databases. If the direct cast tolongfails, it attempts to create an ID from the string value, maintaining compatibility with existing data.This backward compatibility approach is well-designed as it:
- Maintains compatibility with existing databases that stored IDs as strings
- Gradually transitions to the more efficient long representation
- Provides clear error messages when invalid IDs are encountered
This pattern should be considered for other similar type transitions in the codebase.
nitrite/src/main/java/org/dizitart/no2/collection/NitriteId.java (1)
106-116: Consider adding an optimized validId overload for long values.The current
validIdmethod will handle long values by converting them to strings first and then parsing them back to longs, which is inefficient. Consider adding an overloaded method specifically for long values that can perform validation more directly.+ /** + * Validates a long value to be used as {@link NitriteId}. + * + * @param value the long value + * @return `true` if the value is valid; otherwise `false`. + */ + public static boolean validId(long value) { + // Add any necessary validation rules for long values + // For example, checking range limits or reserved values + return true; + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
nitrite-jackson-mapper/src/main/java/org/dizitart/no2/mapper/jackson/modules/NitriteIdSerializer.java(1 hunks)nitrite-mvstore-adapter/src/main/java/org/dizitart/no2/mvstore/NitriteMVRTreeMap.java(2 hunks)nitrite-rocksdb-adapter/src/main/java/org/dizitart/no2/rocksdb/RocksDBRTree.java(1 hunks)nitrite-rocksdb-adapter/src/main/java/org/dizitart/no2/rocksdb/formatter/NitriteSerializers.java(1 hunks)nitrite/src/main/java/org/dizitart/no2/collection/NitriteDocument.java(1 hunks)nitrite/src/main/java/org/dizitart/no2/collection/NitriteId.java(3 hunks)nitrite/src/main/java/org/dizitart/no2/collection/operation/ReadOperations.java(1 hunks)nitrite/src/main/java/org/dizitart/no2/store/memory/InMemoryRTree.java(1 hunks)nitrite/src/main/java/org/dizitart/no2/transaction/TransactionalRTree.java(1 hunks)nitrite/src/test/java/org/dizitart/no2/collection/NitriteDocumentTest.java(1 hunks)nitrite/src/test/java/org/dizitart/no2/collection/NitriteIdTest.java(1 hunks)nitrite/src/test/java/org/dizitart/no2/filters/FilterTest.java(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (14)
- GitHub Check: Verify GraalVM 21 compatibility on macos-latest
- GitHub Check: Verify GraalVM 17 compatibility on macos-latest
- GitHub Check: Verify GraalVM 21 compatibility on windows-latest
- GitHub Check: Verify GraalVM 17 compatibility on windows-latest
- GitHub Check: Build with Java 17 in Ubuntu
- GitHub Check: Verify GraalVM 21 compatibility on ubuntu-latest
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: Build with Java 17 in Windows
- GitHub Check: Build with Java 11 in Ubuntu
- GitHub Check: Verify GraalVM 17 compatibility on ubuntu-latest
- GitHub Check: Build with Java 17 in MacOS
- GitHub Check: Build with Java 11 in Windows
- GitHub Check: Analyze (java)
- GitHub Check: Build with Java 11 in MacOS
🔇 Additional comments (31)
nitrite/src/main/java/org/dizitart/no2/transaction/TransactionalRTree.java (3)
29-30: Simplified null check after switching to primitive long.The code now only checks if
nitriteIdis null without checkingnitriteId.getIdValue(). This simplification is appropriate since the PR changesNitriteIdto use a primitivelonginternally instead of aString, and primitive types cannot be null.
37-38: Simplified null check in remove method.Similar to the add method, the code now only checks if
nitriteIdis null without checkingnitriteId.getIdValue(). This change is consistent with using primitivelongfor the ID value.
102-108: Method signature works well with the new ID representation.The
getKeymethod already accepts alongparameter, which aligns perfectly with the change to uselonginternally for NitriteId. This method doesn't require any changes as it was already designed to work with primitive long values.nitrite-rocksdb-adapter/src/main/java/org/dizitart/no2/rocksdb/RocksDBRTree.java (5)
47-53: Code simplification looks good.The modification simplifies the check by directly using
nitriteId.getIdValue()without checking if the ID value is null. This change is consistent with the PR objective of changing NitriteId to uselonginternally instead of String, which eliminates the need for null checks on the ID value.
55-62: Consistent simplification in remove method.Similar to the add method, this change simplifies the code by removing unnecessary null checks on the ID value. The modification maintains consistency across both methods and aligns with the PR's goal of improving performance by reducing overhead.
86-88: Consistent string conversion pattern.Similar to the findIntersectingKeys method, this code correctly maintains compatibility by converting the long ID back to a string when creating NitriteId objects. This approach ensures consistency throughout the codebase.
118-124: Method already accepts long parameter.The
getKeymethod was already designed to accept a long ID parameter, which makes it compatible with the change in NitriteId implementation. No modifications were needed for this method, confirming good design foresight.
71-73:Details
✅ Verification successful
Verify ID conversion for compatibility.
While not changed in this PR, notice that line 72 still converts the spatial key's ID to a string when creating a NitriteId (
NitriteId.createId(Long.toString(sk.getId()))). This is important for maintaining backward compatibility with code that expects string IDs, but should be validated to ensure it works properly with the changed NitriteId implementation.
🏁 Script executed:
#!/bin/bash # Description: Check how NitriteId.createId is defined to verify it handles the string conversion correctly # Look for NitriteId creation methods ast-grep --pattern $'class NitriteId { $$$ static NitriteId createId($_) { $$$ } $$$ }' # Check for any usage patterns of createId with string arguments rg -A 3 "NitriteId.createId\\(.*toString\\(" --type javaLength of output: 4087
ID Conversion Verified for Backward Compatibility
- The usage of
Long.toString(sk.getId())as an argument toNitriteId.createId()in the RocksDB adapter is consistent with patterns found in other modules (e.g., MVStore and in-memory adapters), ensuring compatibility with string-based IDs.- Unit tests (e.g., in
NitriteIdTest.java) further validate that passing a string correctly upholds the expected behavior of theNitriteIdimplementation.- No changes are required based on the current verification.
nitrite/src/main/java/org/dizitart/no2/store/memory/InMemoryRTree.java (3)
42-43: Appropriate simplification of the null check logic.The change simplifies the condition by removing redundant checks. Since
NitriteIdnow useslonginternally instead ofString, the additional validation forgetIdValue()is no longer necessary as it will always return a valid primitive long value.
51-52: Appropriate simplification in the remove method.Similar to the changes in the
addmethod, this simplification is consistent with the refactoring ofNitriteIdto use a primitivelongtype internally instead of aString.
60-60: Consistent usage of long value in spatial key creation.The use of
0Las the ID parameter when creating spatial keys for search operations is consistent with the updated type system. This maintains the existing logic while adapting to the new internal representation.Also applies to: 75-75
nitrite/src/test/java/org/dizitart/no2/filters/FilterTest.java (1)
28-28: Type change consistent with NitriteId implementation update.The test assertion has been updated to check for
Longtype instead ofString, aligning with the PR objective to change NitriteId to use long internally instead of String.This change correctly validates the implementation change while ensuring test consistency.
nitrite/src/test/java/org/dizitart/no2/collection/NitriteIdTest.java (1)
63-63: Updated assertion to match new internal type representation.This test has been updated to verify that
NitriteId.createId("42").getIdValue()now returns a long value (42L) rather than a string. This correctly reflects the internal type change from String to long.The updated assertion properly validates the converted implementation while maintaining test coverage.
nitrite-rocksdb-adapter/src/main/java/org/dizitart/no2/rocksdb/formatter/NitriteSerializers.java (1)
37-37: Preserves backward compatibility with existing database files.The serializer now explicitly converts the long value to a string before writing, maintaining the serialized format while allowing the internal representation to change to long.
This approach is excellent because:
- It maintains backward compatibility with existing database files
- It supports the PR's objective of reducing parseLong overhead in normal operations
- It only incurs the toString conversion cost during serialization, which happens less frequently than normal operations
nitrite/src/test/java/org/dizitart/no2/collection/NitriteDocumentTest.java (1)
59-60: Test updated to use long representation for NitriteId.Both the document initialization and the assertion have been updated to use long values instead of strings, aligning with the internal type change in NitriteId.
The test now correctly uses the proper type (long) for the
_idfield and verifies the expected value type fromgetId().getIdValue(). This is consistent with the PR's objective.nitrite/src/main/java/org/dizitart/no2/collection/operation/ReadOperations.java (1)
138-138:Details
❓ Verification inconclusive
Type change for NitriteId values to improve performance
This change implements the core PR objective of using
longvalues for NitriteId instead of String, which helps avoid the overhead ofparseLongoperations. The code now casts the value from the filter directly tolonginstead ofString.Let's verify that this cast is safe by examining the byIdFilter implementation:
🏁 Script executed:
#!/bin/bash # Examine the ByIdFilter implementation to confirm it returns long values # Find the ByIdFilter class rg -A 10 "class .*ByIdFilter" --type java # Check what getValue() returns ast-grep --pattern 'class $_ implements $_ { $$$ public $_ getValue() { $$$ } $$$ }'Length of output: 142
Manual Verification Required: Confirm
ByIdFilter.getValue()Returns a LongThe change to cast the filter value directly to
longshould boost performance as expected, but the automated check did not return evidence regarding the return type ofByIdFilter.getValue(). Please verify manually that this method indeed returns along(or a value safely castable to long) by checking its implementation in the ByIdFilter class (for example, in a file likenitrite/src/main/java/org/dizitart/no2/collection/filters/ByIdFilter.java).
- Action Item: Inspect the ByIdFilter class to confirm that the
getValue()method returns a long value.- Impact: Ensure the cast in
ReadOperations.javaat line 138 is valid. If the value is not a long, adjust the casting logic to maintain type safety.nitrite-jackson-mapper/src/main/java/org/dizitart/no2/mapper/jackson/modules/NitriteIdSerializer.java (1)
38-38:Details
✅ Verification successful
Simplified serialization of NitriteId with long values
The serializer has been improved by removing null checks that were no longer needed, simplifying the code to directly convert the long ID value to string. This is a clean change that aligns with the PR objectives.
Let's verify that this simplification is safe by examining the NitriteId implementation:
🏁 Script executed:
#!/bin/bash # Find NitriteId to confirm getIdValue() cannot return null rg -A 3 "getIdValue\(\)" --type javaLength of output: 13901
Simplification Verified: NitriteId Serialization is Safe
The null check removal is safe since
getIdValue()consistently returns a primitive long, which cannot be null. Our grep across the codebase confirms that all usages simply convert this long value to a string without any need for additional null handling.
- File:
nitrite-jackson-mapper/src/main/java/org/dizitart/no2/mapper/jackson/modules/NitriteIdSerializer.java(Line 38)- Summary: The simplification directly converts the long ID to a string with
Long.toString(value.getIdValue()), aligning with the rest of the codebase wheregetIdValue()is always a primitive long.nitrite-mvstore-adapter/src/main/java/org/dizitart/no2/mvstore/NitriteMVRTreeMap.java (2)
46-48: Simplified null check pattern for NitriteIdThe null check for
nitriteId.getIdValue()has been removed since it's now guaranteed to have a value whennitriteIditself is not null. This makes the code cleaner and more maintainable.
59-61: Simplified null check pattern for NitriteIdSimilar to the previous change, the redundant null check has been removed here as well, maintaining consistency throughout the codebase.
nitrite/src/main/java/org/dizitart/no2/collection/NitriteDocument.java (2)
106-107: Changed ID type from String to longThe document ID is now stored as a
longvalue instead of aString, which aligns with the PR objective to improve performance by avoiding string parsing.
115-117: Robust ID retrieval with type castThe code now attempts to cast the retrieved ID directly to
longwhile also saving the original value. This allows for a fallback mechanism in case the cast fails, enhancing the robustness of the code.nitrite/src/main/java/org/dizitart/no2/collection/NitriteId.java (10)
50-51: The field type change aligns with performance objective.Changing from
Stringtolongfor theidValuefield effectively eliminates string parsing overhead, which aligns with the PR's goal of improving performance. This primitive type change is more efficient for operations like comparison and reduces memory consumption.
54-54: Direct primitive assignment effectively reduces overhead.Using the primitive value directly (
generator.getId()) rather than converting it to a String first eliminates unnecessary string creation and parsing operations.
57-59: String constructor appropriately handles conversion to the new type.The constructor now correctly parses the string input to a long value, maintaining compatibility with code that creates IDs from strings while storing them as longs internally.
61-63: New constructor enhances API usability.Adding a constructor that directly accepts a
longvalue is a good addition that complements the internal representation change and provides a more direct way to create IDs from numeric values.
87-96: New static factory method provides a more efficient creation path.The
createId(long value)method is a valuable addition that provides a more efficient way to create IDs when the caller already has a numeric identifier. This avoids unnecessary string conversion overhead.
120-120: Simplified comparison logic enhances performance.Using
Long.comparedirectly on the primitive values is more efficient than the previous string comparison. This change should improve sorting and equality checks performance.
125-125: More efficient string representation.Removing any null checks and directly creating the string representation from the primitive long value simplifies the code and improves performance.
129-129: Serialization maintains backward compatibility.Converting the long value to a string during serialization ensures backward compatibility with existing database files and other systems that expect string identifiers.
133-133: Deserialization properly restores the primitive type.The
readObjectmethod correctly parses the string from the input stream back to a long value, ensuring proper reconstitution of the object during deserialization.
94-94:Details
✅ Verification successful
Verify that validId implementation properly handles long values.
The code calls
validId(value)for the long parameter, but the implementation shown at line 106 only explicitly validates string representations. Ensure there's a corresponding implementation for validating long values directly.
🏁 Script executed:
#!/bin/bash # Search for an overloaded validId method that accepts a long parameter rg -A 5 "validId\(long" --type javaLength of output: 37
I'll run a more general search in the
NitriteId.javafile to list all occurrences ofvalidId(along with their context, so that we can verify if there's an overloaded implementation handling long values.
🏁 Script executed:
#!/bin/bash # List all occurrences of "validId(" in NitriteId.java with context to verify its overloads. rg -C 5 "validId\\s*\\(" nitrite/src/main/java/org/dizitart/no2/collection/NitriteId.javaLength of output: 1031
ValidId Method: Long Value Handling Confirmed
The current implementation uses a singlevalidId(Object value)method to validate identifiers for both string and long inputs. In thecreateId(long)method, the long value is autoboxed and passed tovalidId, where its conversion to a string followed byLong.parseLongeffectively validates it as a 64-bit integer.
- Observation: No separate overload for long values exists, but this design is intentional and functionally correct.
- Recommendation: If clarity or performance improvements are desired, consider adding an explicit
validId(long value)overload and document its behavior; otherwise, the current approach is acceptable.
|
The numbers looks good. I am merging the PR. |
I made some changes in the JMH repo to enable testing specific branches more easily, and then ran a pair of jobs that I think you'll want to have a close look at:
main: https://github.com/leoger/nitrite-jmh/actions/runs/13646173992For some of the benchmark cases, it makes Nitrite nearly 50% faster. This roughly matches what I observed in stack-sampling CPU profiler data collected on production servers where my team is using Nitrite.
I enabled the JMH stack profiler on those branches, so you can see some rough data on where the code is spending its time. The library is doing a lot of comparison of record IDs, which is not surprising at all -- that's a fundamental thing we expect of a DB for it's internal bookkeeping. The problem is that when our id is a string and we want that string to sort numerically rather than lexically, there are extra hoops to jump through. Every one of those comparisons is currently very expensive because it triggers the parsing of both strings, compares the numeric values, and immediately throws the result of those computationally expensive parses away. Meanwhile, comparing two
longvalues is a primitive operation for the CPU so it costs a single clock cycle.If you're truly dedicated to the idea that they must be strings in memory at runtime, there are other tricks we could do to minimize the need for parsing, but I strongly advise against it. By simply adjusting the serialization to read the ID in as a String and immediately parse it when loading from a file, we can avoid problems of portability across languages and not have to worry about breaking compatibility for users with existing Nitrite DB files. I do want to be careful with back-compat, so please advise whether there are any additional DB-file-loading scenarios that are not covered by unit tests that would catch if this were actually a breaking change.
Summary by CodeRabbit