-
Notifications
You must be signed in to change notification settings - Fork 83
feat(ffi): Add new EncodedTextAst implementation using StringBlob.
#1561
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughAdds a templated EncodedTextAst type with decode/to_string, a new EncodedTextAst error enum and category, a concept restricting encoded-variable types, StringBlob::append, new Catch2 tests, and updates CMake to include the new sources. Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant EncodedTextAst
participant StringBlob
participant Handlers
Caller->>EncodedTextAst: create(encoded_vars, string_blob)
activate EncodedTextAst
Caller->>EncodedTextAst: decode<unescape_logtype>(constant,intVar,floatVar,dictVar)
EncodedTextAst->>StringBlob: read logtype & placeholders
loop per segment/placeholder
alt Constant
EncodedTextAst->>Handlers: constant(segment)
else Integer
EncodedTextAst->>Handlers: intVar(encoded_var)
else Float
EncodedTextAst->>Handlers: floatVar(encoded_var)
else Dictionary
EncodedTextAst->>Handlers: dictVar(dict_index)
else Escape
EncodedTextAst->>EncodedTextAst: validate escape (no trailing)
end
end
EncodedTextAst->>EncodedTextAst: validate counts (encoded/dict)
alt OK
EncodedTextAst->>Caller: Result<void>::ok()
else Error
EncodedTextAst->>Caller: Result<void>::err(EncodedTextAstError)
end
deactivate EncodedTextAst
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: ASSERTIVE Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used📓 Path-based instructions (1)**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}⚙️ CodeRabbit configuration file
Files:
🧠 Learnings (8)📓 Common learnings📚 Learning: 2025-08-06T23:57:49.762ZApplied to files:
📚 Learning: 2024-10-18T02:31:18.595ZApplied to files:
📚 Learning: 2024-10-24T14:45:26.265ZApplied to files:
📚 Learning: 2024-10-07T21:16:41.660ZApplied to files:
📚 Learning: 2024-10-13T09:27:43.408ZApplied to files:
📚 Learning: 2025-06-24T20:13:46.758ZApplied to files:
📚 Learning: 2025-08-09T04:07:27.083ZApplied to files:
🧬 Code graph analysis (1)components/core/src/clp/ffi/EncodedTextAst.hpp (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (7)
components/core/CMakeLists.txt(1 hunks)components/core/src/clp/ffi/EncodedTextAst.hpp(1 hunks)components/core/src/clp/ffi/EncodedTextAstError.cpp(1 hunks)components/core/src/clp/ffi/EncodedTextAstError.hpp(1 hunks)components/core/src/clp/ffi/test/test_EncodedTextAst.cpp(1 hunks)components/core/src/clp/ffi/test/test_StringBlob.cpp(3 hunks)components/core/src/clp/ir/types.hpp(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer
false == <expression>rather than!<expression>.
Files:
components/core/src/clp/ir/types.hppcomponents/core/src/clp/ffi/EncodedTextAstError.hppcomponents/core/src/clp/ffi/EncodedTextAstError.cppcomponents/core/src/clp/ffi/test/test_EncodedTextAst.cppcomponents/core/src/clp/ffi/EncodedTextAst.hppcomponents/core/src/clp/ffi/test/test_StringBlob.cpp
🧠 Learnings (17)
📓 Common learnings
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 549
File: components/core/tests/test-ir_encoding_methods.cpp:1180-1186
Timestamp: 2024-10-01T07:59:11.208Z
Learning: In the context of loop constructs, LinZhihao-723 prefers using `while (true)` loops and does not consider alternative loop constructs necessarily more readable.
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 549
File: components/core/tests/test-ir_encoding_methods.cpp:1180-1186
Timestamp: 2024-10-08T15:52:50.753Z
Learning: In the context of loop constructs, LinZhihao-723 prefers using `while (true)` loops and does not consider alternative loop constructs necessarily more readable.
📚 Learning: 2025-08-06T23:57:49.762Z
Learnt from: gibber9809
Repo: y-scope/clp PR: 1163
File: components/core/src/clp_s/ColumnWriter.hpp:120-122
Timestamp: 2025-08-06T23:57:49.762Z
Learning: In the CLP codebase, the `encoded_log_dict_id_t` type alias in `components/core/src/clp_s/ColumnWriter.hpp` is appropriately placed as a class-scoped alias because the ColumnWriter.hpp header is already included in all places where this type is used, eliminating the need to move it to a common header.
Applied to files:
components/core/src/clp/ir/types.hppcomponents/core/src/clp/ffi/EncodedTextAstError.hppcomponents/core/src/clp/ffi/EncodedTextAstError.cppcomponents/core/src/clp/ffi/test/test_EncodedTextAst.cppcomponents/core/src/clp/ffi/EncodedTextAst.hpp
📚 Learning: 2024-10-18T02:31:18.595Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 557
File: components/core/src/clp/ffi/ir_stream/utils.hpp:0-0
Timestamp: 2024-10-18T02:31:18.595Z
Learning: In `components/core/src/clp/ffi/ir_stream/utils.hpp`, the function `size_dependent_encode_and_serialize_schema_tree_node_id` assumes that the caller checks that `node_id` fits within the range of `encoded_node_id_t` before casting.
Applied to files:
components/core/src/clp/ir/types.hppcomponents/core/src/clp/ffi/EncodedTextAstError.cppcomponents/core/src/clp/ffi/test/test_EncodedTextAst.cppcomponents/core/src/clp/ffi/EncodedTextAst.hpp
📚 Learning: 2025-09-11T13:15:39.930Z
Learnt from: anlowee
Repo: y-scope/clp PR: 1176
File: components/core/src/clp_s/ColumnWriter.cpp:67-72
Timestamp: 2025-09-11T13:15:39.930Z
Learning: In components/core/src/clp_s/ColumnWriter.cpp, when using sizeof with types like variable_dictionary_id_t, always use the fully qualified namespace (clp::variable_dictionary_id_t) to match the variable declaration and maintain consistency with other column writers in the codebase.
Applied to files:
components/core/src/clp/ir/types.hppcomponents/core/src/clp/ffi/test/test_StringBlob.cpp
📚 Learning: 2025-08-09T04:07:27.083Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.
Applied to files:
components/core/CMakeLists.txt
📚 Learning: 2024-11-19T17:30:04.970Z
Learnt from: AVMatthews
Repo: y-scope/clp PR: 595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.
Applied to files:
components/core/CMakeLists.txtcomponents/core/src/clp/ffi/test/test_EncodedTextAst.cppcomponents/core/src/clp/ffi/test/test_StringBlob.cpp
📚 Learning: 2024-11-26T19:12:09.102Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 486
File: components/core/src/clp/error_handling/ErrorCode.hpp:131-134
Timestamp: 2024-11-26T19:12:09.102Z
Learning: In the error handling system (`components/core/src/clp/error_handling/ErrorCode.hpp`), the `ErrorCode` class is intentionally designed to wrap the `ErrorCodeEnum`, not to use the enum directly. This wrapper provides a general interface for adding customized error enums.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.hppcomponents/core/src/clp/ffi/EncodedTextAstError.cpp
📚 Learning: 2024-11-26T19:16:19.933Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 486
File: components/core/tests/test-error_handling.cpp:44-91
Timestamp: 2024-11-26T19:16:19.933Z
Learning: In `components/core/tests/test-error_handling.cpp` (C++), when specializing template methods of `ErrorCategory` via type aliases (e.g., `AlwaysSuccessErrorCategory`), it's acceptable to define these specializations in the current namespace, even though the primary template is in the `clp::error_handling` namespace, due to the use of type aliases and template instantiation.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.cppcomponents/core/src/clp/ffi/test/test_EncodedTextAst.cpp
📚 Learning: 2024-10-24T14:45:26.265Z
Learnt from: haiqi96
Repo: y-scope/clp PR: 523
File: components/core/src/clp/BufferedFileReader.cpp:96-106
Timestamp: 2024-10-24T14:45:26.265Z
Learning: In `components/core/src/clp/BufferedFileReader.cpp`, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.cppcomponents/core/src/clp/ffi/test/test_StringBlob.cpp
📚 Learning: 2024-11-26T19:10:05.051Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 486
File: components/core/src/clp/error_handling/ErrorCode.hpp:34-34
Timestamp: 2024-11-26T19:10:05.051Z
Learning: In the `ErrorCategory` template class in `ErrorCode.hpp`, the `name()` and `message(ErrorCodeEnum)` methods should not be declared as pure virtual functions because the class relies on template specializations, and making them pure virtual would make the class abstract, which is not intended.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.cpp
📚 Learning: 2024-11-01T03:26:26.386Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 570
File: components/core/tests/test-ir_encoding_methods.cpp:376-399
Timestamp: 2024-11-01T03:26:26.386Z
Learning: In the test code (`components/core/tests/test-ir_encoding_methods.cpp`), exception handling for `msgpack::unpack` can be omitted because the Catch2 testing framework captures exceptions if they occur.
Applied to files:
components/core/src/clp/ffi/test/test_EncodedTextAst.cpp
📚 Learning: 2024-10-13T09:27:43.408Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 557
File: components/core/tests/test-ir_encoding_methods.cpp:1216-1286
Timestamp: 2024-10-13T09:27:43.408Z
Learning: In the unit test case `ffi_ir_stream_serialize_schema_tree_node_id` in `test-ir_encoding_methods.cpp`, suppressing the `readability-function-cognitive-complexity` warning is acceptable due to the expansion of Catch2 macros in C++ tests, and such test cases may not have readability issues.
Applied to files:
components/core/src/clp/ffi/test/test_EncodedTextAst.cppcomponents/core/src/clp/ffi/test/test_StringBlob.cpp
📚 Learning: 2024-10-14T03:42:10.355Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:14-14
Timestamp: 2024-10-14T03:42:10.355Z
Learning: In the file `components/core/tests/test-ffi_KeyValuePairLogEvent.cpp`, including `<json/single_include/nlohmann/json.hpp>` is consistent with the project's coding standards.
Applied to files:
components/core/src/clp/ffi/test/test_EncodedTextAst.cpp
📚 Learning: 2024-11-29T22:50:17.206Z
Learnt from: AVMatthews
Repo: y-scope/clp PR: 595
File: components/core/tests/test-clp_s-end_to_end.cpp:109-110
Timestamp: 2024-11-29T22:50:17.206Z
Learning: In `components/core/tests/test-clp_s-end_to_end.cpp`, the success of `constructor.store()` is verified through `REQUIRE` statements and subsequent comparisons.
Applied to files:
components/core/src/clp/ffi/test/test_EncodedTextAst.cppcomponents/core/src/clp/ffi/test/test_StringBlob.cpp
📚 Learning: 2025-04-26T02:21:22.021Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 856
File: components/core/src/clp/ffi/ir_stream/search/utils.cpp:258-266
Timestamp: 2025-04-26T02:21:22.021Z
Learning: In the clp::ffi::ir_stream::search namespace, the design principle is that callers are responsible for type checking, not the called functions. If control flow reaches a function, input types should already be validated by the caller.
Applied to files:
components/core/src/clp/ffi/test/test_StringBlob.cpp
📚 Learning: 2024-11-29T22:51:00.355Z
Learnt from: AVMatthews
Repo: y-scope/clp PR: 595
File: components/core/tests/test-clp_s-end_to_end.cpp:141-142
Timestamp: 2024-11-29T22:51:00.355Z
Learning: In 'components/core/tests/test-clp_s-end_to_end.cpp', using `WIFEXITED` and `WEXITSTATUS` with the return value of `std::system` is acceptable for checking the exit status of system commands, as per the C++ documentation.
Applied to files:
components/core/src/clp/ffi/test/test_StringBlob.cpp
📚 Learning: 2024-12-10T16:56:33.545Z
Learnt from: gibber9809
Repo: y-scope/clp PR: 624
File: components/core/src/clp/BoundedReader.hpp:25-27
Timestamp: 2024-12-10T16:56:33.545Z
Learning: In the CLP project, within constructors like `BoundedReader` in `components/core/src/clp/BoundedReader.hpp`, it's acceptable to call methods like `get_pos()` that may throw exceptions without handling potential errors via `try_get_pos()`, since the constructor itself handles errors by throwing.
Applied to files:
components/core/src/clp/ffi/test/test_StringBlob.cpp
🧬 Code graph analysis (2)
components/core/src/clp/ffi/test/test_EncodedTextAst.cpp (1)
components/core/src/clp/ffi/EncodedTextAst.hpp (5)
nodiscard(78-85)nodiscard(97-99)nodiscard(130-145)nodiscard(162-261)encoded_vars(79-80)
components/core/src/clp/ffi/test/test_StringBlob.cpp (1)
components/core/src/clp/ffi/StringBlob.hpp (1)
reader(44-45)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
- GitHub Check: musllinux_1_2-x86_64-static-linked-bins
- GitHub Check: package-image
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
- GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: manylinux_2_28-x86_64-static-linked-bins
- GitHub Check: build-macos (macos-14, true)
- GitHub Check: build-macos (macos-15, true)
- GitHub Check: lint-check (macos-15)
- GitHub Check: build-macos (macos-15, false)
- GitHub Check: build-macos (macos-14, false)
- GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (6)
components/core/src/clp/ffi/test/test_StringBlob.cpp (2)
12-12: LGTM! Good namespace organization.The namespace wrapper properly organizes the test code in the
clp::ffi::testnamespace, following good practices for test code organization.Also applies to: 47-47
14-14: LGTM! Unqualified names resolve correctly.The change to unqualified
StringBlobandBufferReaderis correct. From theclp::ffi::testnamespace, both names properly resolve to their fully qualified versions (clp::ffi::StringBlobandclp::BufferReaderrespectively) through namespace lookup.Also applies to: 26-26
components/core/CMakeLists.txt (1)
484-491: Nice job wiring the new EncodedTextAst sources into the test target.
With these entries inSOURCE_FILES_unitTest, the new implementation and its tests will actually build under the unit test binary. Looks good.components/core/src/clp/ffi/EncodedTextAstError.hpp (1)
12-20: Enum surface looks spot on.
The enum values and error-code alias align cleanly with the decode paths and ystdlib’s error-handling conventions. Nicely done.components/core/src/clp/ffi/test/test_EncodedTextAst.cpp (1)
73-191: Great coverage across success and failure paths.
Exercising both encoded-variable widths, real decoding, and each distinct error branch gives solid confidence in the new API. Thanks for the thorough tests.components/core/src/clp/ffi/EncodedTextAst.hpp (1)
130-145: to_string neatly reuses the generic decode path.
Rebuilding the string via the handler-based decode keeps the code path single-sourced while honouring the unescape semantics. Looks good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
components/core/src/clp/ffi/EncodedTextAst.hpp(1 hunks)components/core/src/clp/ffi/test/test_EncodedTextAst.cpp(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer
false == <expression>rather than!<expression>.
Files:
components/core/src/clp/ffi/EncodedTextAst.hppcomponents/core/src/clp/ffi/test/test_EncodedTextAst.cpp
🧠 Learnings (8)
📓 Common learnings
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 570
File: components/core/tests/test-ir_encoding_methods.cpp:376-399
Timestamp: 2024-11-01T03:26:26.386Z
Learning: In the test code (`components/core/tests/test-ir_encoding_methods.cpp`), exception handling for `msgpack::unpack` can be omitted because the Catch2 testing framework captures exceptions if they occur.
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 549
File: components/core/tests/test-ir_encoding_methods.cpp:1180-1186
Timestamp: 2024-10-01T07:59:11.208Z
Learning: In the context of loop constructs, LinZhihao-723 prefers using `while (true)` loops and does not consider alternative loop constructs necessarily more readable.
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 549
File: components/core/tests/test-ir_encoding_methods.cpp:1180-1186
Timestamp: 2024-10-08T15:52:50.753Z
Learning: In the context of loop constructs, LinZhihao-723 prefers using `while (true)` loops and does not consider alternative loop constructs necessarily more readable.
📚 Learning: 2025-08-06T23:57:49.762Z
Learnt from: gibber9809
Repo: y-scope/clp PR: 1163
File: components/core/src/clp_s/ColumnWriter.hpp:120-122
Timestamp: 2025-08-06T23:57:49.762Z
Learning: In the CLP codebase, the `encoded_log_dict_id_t` type alias in `components/core/src/clp_s/ColumnWriter.hpp` is appropriately placed as a class-scoped alias because the ColumnWriter.hpp header is already included in all places where this type is used, eliminating the need to move it to a common header.
Applied to files:
components/core/src/clp/ffi/EncodedTextAst.hppcomponents/core/src/clp/ffi/test/test_EncodedTextAst.cpp
📚 Learning: 2024-10-18T02:31:18.595Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 557
File: components/core/src/clp/ffi/ir_stream/utils.hpp:0-0
Timestamp: 2024-10-18T02:31:18.595Z
Learning: In `components/core/src/clp/ffi/ir_stream/utils.hpp`, the function `size_dependent_encode_and_serialize_schema_tree_node_id` assumes that the caller checks that `node_id` fits within the range of `encoded_node_id_t` before casting.
Applied to files:
components/core/src/clp/ffi/EncodedTextAst.hppcomponents/core/src/clp/ffi/test/test_EncodedTextAst.cpp
📚 Learning: 2024-11-01T03:26:26.386Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 570
File: components/core/tests/test-ir_encoding_methods.cpp:376-399
Timestamp: 2024-11-01T03:26:26.386Z
Learning: In the test code (`components/core/tests/test-ir_encoding_methods.cpp`), exception handling for `msgpack::unpack` can be omitted because the Catch2 testing framework captures exceptions if they occur.
Applied to files:
components/core/src/clp/ffi/test/test_EncodedTextAst.cpp
📚 Learning: 2024-10-13T09:27:43.408Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 557
File: components/core/tests/test-ir_encoding_methods.cpp:1216-1286
Timestamp: 2024-10-13T09:27:43.408Z
Learning: In the unit test case `ffi_ir_stream_serialize_schema_tree_node_id` in `test-ir_encoding_methods.cpp`, suppressing the `readability-function-cognitive-complexity` warning is acceptable due to the expansion of Catch2 macros in C++ tests, and such test cases may not have readability issues.
Applied to files:
components/core/src/clp/ffi/test/test_EncodedTextAst.cpp
📚 Learning: 2024-11-19T17:30:04.970Z
Learnt from: AVMatthews
Repo: y-scope/clp PR: 595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.
Applied to files:
components/core/src/clp/ffi/test/test_EncodedTextAst.cpp
📚 Learning: 2024-10-14T03:42:10.355Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:14-14
Timestamp: 2024-10-14T03:42:10.355Z
Learning: In the file `components/core/tests/test-ffi_KeyValuePairLogEvent.cpp`, including `<json/single_include/nlohmann/json.hpp>` is consistent with the project's coding standards.
Applied to files:
components/core/src/clp/ffi/test/test_EncodedTextAst.cpp
📚 Learning: 2024-11-29T22:50:17.206Z
Learnt from: AVMatthews
Repo: y-scope/clp PR: 595
File: components/core/tests/test-clp_s-end_to_end.cpp:109-110
Timestamp: 2024-11-29T22:50:17.206Z
Learning: In `components/core/tests/test-clp_s-end_to_end.cpp`, the success of `constructor.store()` is verified through `REQUIRE` statements and subsequent comparisons.
Applied to files:
components/core/src/clp/ffi/test/test_EncodedTextAst.cpp
🧬 Code graph analysis (1)
components/core/src/clp/ffi/test/test_EncodedTextAst.cpp (1)
components/core/src/clp/ffi/EncodedTextAst.hpp (5)
nodiscard(78-85)nodiscard(97-99)nodiscard(130-145)nodiscard(162-261)encoded_vars(79-80)
🔇 Additional comments (10)
components/core/src/clp/ffi/test/test_EncodedTextAst.cpp (4)
1-64: LGTM! Well-structured test helper.The
create_encoded_text_ast_from_stringhelper function correctly encodes the input string, loads dictionary variables and logtype into theStringBlob, and constructs theEncodedTextAst. The logic properly handles the dict var bounds and aligns with the expected structure where dict vars precede the logtype.
73-109: LGTM! Comprehensive round-trip test.This test section thoroughly exercises the encoding and decoding pipeline with a variety of variable types (small/medium/large integers, various float formats, dictionary variables) and embedded placeholder characters. The round-trip verification ensures bit-accurate decoding.
111-118: LGTM! Static text test is correct.The test correctly verifies that static text (no variables) is preserved through encoding and decoding, and that the logtype equals the original text.
120-191: LGTM! Thorough error-handling coverage.The error test sections comprehensively cover all error cases defined in
EncodedTextAstErrEnum:
- Missing logtype when StringBlob is empty
- Missing encoded variables (Integer, Float) and dictionary variables using Catch2 generators
- Trailing escape character without a following character
The assertions correctly verify both the error condition and the specific error enum value. Based on learnings, exception handling for Catch2 tests can be omitted as the framework captures exceptions.
components/core/src/clp/ffi/EncodedTextAst.hpp (6)
21-61: LGTM! Well-defined handler concepts.The concept definitions clearly specify the signature requirements for each handler type (constant, int var, float var, dict var). The use of
std::same_as<void>ensures handlers return void, which is appropriate for side-effect-only callbacks.
78-85: LGTM! Safe factory function with validation.The factory function correctly validates that
string_blobcontains at least one string (the logtype) before constructing the object. This ensuresget_logtype()at Line 98 and the constructor logic at Line 152 will always work safely.
97-99: LGTM! Safe unchecked.value()call.The
.value()call is safe because the factory function ensuresget_num_strings() >= 1, andm_num_dict_varsis computed asget_num_strings() - 1in the constructor, so the index is always valid.
130-145: LGTM! Cleanto_stringimplementation.The method correctly uses the
decodetemplate withunescape_logtype = trueand lambda handlers to build the decoded string. The use ofYSTDLIB_ERROR_HANDLING_TRYVmacro properly propagates errors from the decode method.
176-250: LGTM! Robust decode loop with proper bounds checking.The decode implementation correctly:
- Iterates through the logtype character by character
- Handles all placeholder types (Float, Integer, Dictionary, Escape)
- Validates bounds before accessing
m_encoded_vars(lines 185, 199) andm_string_blob(line 213)- Handles escape characters by checking for trailing escape (line 223) and incrementing
curr_posto skip the escaped character (line 242)- Conditionally emits constant text segments based on the
unescape_logtypetemplate parameter
252-260: LGTM! Proper remainder handling.After the loop, any remaining constant text after the last placeholder is correctly emitted via
constant_handler. The implementation returns success after completing the decode.
gibber9809
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did an initial review, mostly looks good. Besides the comments I left I'm also wondering about the [] usage in the decode method -- I'm personally fine with it since we're doing bounds checking ourselves before every access, but I'm wondering if we should add a comment since I think our coding standard prefers .at().
| EncodedTextAstConstantHandlerReq auto&& constant_handler, | ||
| EncodedTextAstIntVarHandlerReq<encoded_variable_t> auto&& int_var_handler, | ||
| EncodedTextAstFloatVarHandlerReq<encoded_variable_t> auto&& float_var_handler, | ||
| EncodedTextAstDictVarHandlerReq auto&& dict_var_handler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why rvalue references here? I think passing callbacks by value is more common + it's what the previous implementation does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will switch back to use passing-by-value. However, do u think it makes more sense to update our code base to use rvalue ref for callbacks? This makes the callback object more flexible: the callback can be not just a lambda but also an object that holds stateful values. Using rvalue allows us to pass the object into the call and use it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@coderabbitai can u create an issue to track this codebase-level update?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Create a GitHub issue to track the callback parameter passing strategy discussion
gh issue create \
--title "Evaluate using rvalue references for callback parameters codebase-wide" \
--body "## Context
During the review of PR #1561, a discussion arose regarding the parameter passing strategy for callbacks in the codebase.
Currently, most of the codebase passes callbacks by value. The PR initially used rvalue references (\`auto&&\`) for callback parameters, which offers potential benefits:
- **Increased flexibility**: Allows callbacks to be not just lambdas, but also objects that hold stateful values
- **Efficiency**: Using rvalue references allows passing objects into the call and using them later without unnecessary copies
## Proposal
Evaluate whether it would be beneficial to update the codebase to consistently use rvalue references for callback parameters instead of pass-by-value.
## References
- Discussion: https://github.com/y-scope/clp/pull/1561#discussion_r2500321748
- Related PR: https://github.com/y-scope/clp/pull/1561
Requested by: @LinZhihao-723" \
--assignee LinZhihao-723 \
--repo y-scope/clp
Length of output: 42
@LinZhihao-723, I've created issue #1570 to track the evaluation of using rvalue references for callback parameters codebase-wide. The issue includes the context from this discussion, the rationale for the potential change, and references back to this PR and comment thread.
🧠 Learnings used
Learnt from: junhaoliao
Repo: y-scope/clp PR: 939
File: components/package-template/src/etc/clp-config.yml:64-64
Timestamp: 2025-06-24T20:13:46.758Z
Learning: When users ask CodeRabbit to create an issue after providing suggestions, they want a GitHub issue created with the high-level requirements and context, not specific code implementations.
Learnt from: gibber9809
Repo: y-scope/clp PR: 1163
File: components/core/src/clp_s/ColumnWriter.hpp:120-122
Timestamp: 2025-08-06T23:57:49.762Z
Learning: In the CLP codebase, the `encoded_log_dict_id_t` type alias in `components/core/src/clp_s/ColumnWriter.hpp` is appropriately placed as a class-scoped alias because the ColumnWriter.hpp header is already included in all places where this type is used, eliminating the need to move it to a common header.
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 549
File: components/core/tests/test-ir_encoding_methods.cpp:1180-1186
Timestamp: 2024-10-08T15:52:50.753Z
Learning: In the context of loop constructs, LinZhihao-723 prefers using `while (true)` loops and does not consider alternative loop constructs necessarily more readable.
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 549
File: components/core/tests/test-ir_encoding_methods.cpp:1180-1186
Timestamp: 2024-10-01T07:59:11.208Z
Learning: In the context of loop constructs, LinZhihao-723 prefers using `while (true)` loops and does not consider alternative loop constructs necessarily more readable.
| return "An encoded variable is missing from the `EncodedTextAst`"; | ||
| case EncodedTextAstErrEnum::MissingDictVar: | ||
| return "A dictionary variable is missing from the `EncodedTextAst`"; | ||
| case EncodedTextAstErrEnum::MissingLogtype: | ||
| return "The logtype is missing from the `EncodedTextAst`"; | ||
| case EncodedTextAstErrEnum::UnexpectedTrailingEscapeCharacter: | ||
| return "Unexpected escape character without escaped value at the end of the logtype"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| return "An encoded variable is missing from the `EncodedTextAst`"; | |
| case EncodedTextAstErrEnum::MissingDictVar: | |
| return "A dictionary variable is missing from the `EncodedTextAst`"; | |
| case EncodedTextAstErrEnum::MissingLogtype: | |
| return "The logtype is missing from the `EncodedTextAst`"; | |
| case EncodedTextAstErrEnum::UnexpectedTrailingEscapeCharacter: | |
| return "Unexpected escape character without escaped value at the end of the logtype"; | |
| return "An encoded variable is missing from the `EncodedTextAst`."; | |
| case EncodedTextAstErrEnum::MissingDictVar: | |
| return "A dictionary variable is missing from the `EncodedTextAst`."; | |
| case EncodedTextAstErrEnum::MissingLogtype: | |
| return "The logtype is missing from the `EncodedTextAst`."; | |
| case EncodedTextAstErrEnum::UnexpectedTrailingEscapeCharacter: | |
| return "Unexpected escape character without escaped value at the end of the logtype."; |
We end the error messages with periods for our other error categories, so editing this for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose that we try to avoid adding periods unless it's a logging statement. Two benefits of doing so:
- In a logging statement we usually do:
LOG(fmt("{}: {}", err.category().name(), err.message()));
While in other logging statement we do:
LOG(fmt("{} is not valid.", addr))
In the first statement we're assuming the error message will end with a period. But this becomes hard to manage as some error codes don't end with . while others do (e.g., std::errc doesn't end with .: https://en.cppreference.com/w/cpp/error/errc.html)
- Not including the period in the error message allows us to format the message in the middle of a sentence if needed.
| auto const length{static_cast<size_t>(dict_var_bounds[i + 1]) - begin}; | ||
| auto const dict_var{input.substr(begin, length)}; | ||
| BufferReader reader{dict_var.data(), dict_var.size()}; | ||
| REQUIRE_FALSE(string_blob.read_from(reader, length).has_value()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it makes sense to also add an emplace_back(std::string_view) signature to StringBlob? Besides tests I think it might also end up being useful for the encode path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but I doubt we're gonna need this in the encode path: I prefer to preserve a list of string views and encode them directly, since the lifetime of the input message should outlive the encode method and we can avoid a memory copy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, leave it as is then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
| text += " and a medium float " + var_strs[var_ix++]; | ||
| text += " and a weird float " + var_strs[var_ix++]; | ||
| text += " and a string with numbers " + var_strs[var_ix++]; | ||
| text += " and another string with numbers " + var_strs[var_ix++]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this sort of mimics the long test case from test-EncodedVariableInterpreter.cpp could we also add the case with a variable that contains a backslash? Figure we might as well include that edge case since we're doing all of these other cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added. Also modernize the text string construction.
Co-authored-by: Devin Gibson <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
components/core/src/clp/ffi/EncodedTextAst.hpp(1 hunks)components/core/src/clp/ffi/EncodedTextAstError.cpp(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer
false == <expression>rather than!<expression>.
Files:
components/core/src/clp/ffi/EncodedTextAstError.cppcomponents/core/src/clp/ffi/EncodedTextAst.hpp
🧠 Learnings (9)
📓 Common learnings
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 570
File: components/core/tests/test-ir_encoding_methods.cpp:376-399
Timestamp: 2024-11-01T03:26:26.386Z
Learning: In the test code (`components/core/tests/test-ir_encoding_methods.cpp`), exception handling for `msgpack::unpack` can be omitted because the Catch2 testing framework captures exceptions if they occur.
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 549
File: components/core/tests/test-ir_encoding_methods.cpp:1180-1186
Timestamp: 2024-10-01T07:59:11.208Z
Learning: In the context of loop constructs, LinZhihao-723 prefers using `while (true)` loops and does not consider alternative loop constructs necessarily more readable.
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 549
File: components/core/tests/test-ir_encoding_methods.cpp:1180-1186
Timestamp: 2024-10-08T15:52:50.753Z
Learning: In the context of loop constructs, LinZhihao-723 prefers using `while (true)` loops and does not consider alternative loop constructs necessarily more readable.
📚 Learning: 2024-11-26T19:16:19.933Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 486
File: components/core/tests/test-error_handling.cpp:44-91
Timestamp: 2024-11-26T19:16:19.933Z
Learning: In `components/core/tests/test-error_handling.cpp` (C++), when specializing template methods of `ErrorCategory` via type aliases (e.g., `AlwaysSuccessErrorCategory`), it's acceptable to define these specializations in the current namespace, even though the primary template is in the `clp::error_handling` namespace, due to the use of type aliases and template instantiation.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.cpp
📚 Learning: 2025-08-06T23:57:49.762Z
Learnt from: gibber9809
Repo: y-scope/clp PR: 1163
File: components/core/src/clp_s/ColumnWriter.hpp:120-122
Timestamp: 2025-08-06T23:57:49.762Z
Learning: In the CLP codebase, the `encoded_log_dict_id_t` type alias in `components/core/src/clp_s/ColumnWriter.hpp` is appropriately placed as a class-scoped alias because the ColumnWriter.hpp header is already included in all places where this type is used, eliminating the need to move it to a common header.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.cppcomponents/core/src/clp/ffi/EncodedTextAst.hpp
📚 Learning: 2024-11-26T19:10:05.051Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 486
File: components/core/src/clp/error_handling/ErrorCode.hpp:34-34
Timestamp: 2024-11-26T19:10:05.051Z
Learning: In the `ErrorCategory` template class in `ErrorCode.hpp`, the `name()` and `message(ErrorCodeEnum)` methods should not be declared as pure virtual functions because the class relies on template specializations, and making them pure virtual would make the class abstract, which is not intended.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.cpp
📚 Learning: 2024-10-24T14:45:26.265Z
Learnt from: haiqi96
Repo: y-scope/clp PR: 523
File: components/core/src/clp/BufferedFileReader.cpp:96-106
Timestamp: 2024-10-24T14:45:26.265Z
Learning: In `components/core/src/clp/BufferedFileReader.cpp`, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.cppcomponents/core/src/clp/ffi/EncodedTextAst.hpp
📚 Learning: 2024-11-26T19:12:09.102Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 486
File: components/core/src/clp/error_handling/ErrorCode.hpp:131-134
Timestamp: 2024-11-26T19:12:09.102Z
Learning: In the error handling system (`components/core/src/clp/error_handling/ErrorCode.hpp`), the `ErrorCode` class is intentionally designed to wrap the `ErrorCodeEnum`, not to use the enum directly. This wrapper provides a general interface for adding customized error enums.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.cpp
📚 Learning: 2024-10-08T15:52:50.753Z
Learnt from: AVMatthews
Repo: y-scope/clp PR: 543
File: components/core/src/clp_s/JsonParser.cpp:581-627
Timestamp: 2024-10-08T15:52:50.753Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, within the `get_archive_node_id` method, throwing a string literal as an exception is acceptable practice.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.cpp
📚 Learning: 2024-12-10T16:03:08.691Z
Learnt from: gibber9809
Repo: y-scope/clp PR: 630
File: components/core/src/clp_s/JsonParser.cpp:702-703
Timestamp: 2024-12-10T16:03:08.691Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, within the `get_archive_node_id` function, validation and exception throwing for UTF-8 compliance of `curr_node.get_key_name()` are unnecessary and should be omitted.
Applied to files:
components/core/src/clp/ffi/EncodedTextAst.hpp
📚 Learning: 2024-10-18T02:31:18.595Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 557
File: components/core/src/clp/ffi/ir_stream/utils.hpp:0-0
Timestamp: 2024-10-18T02:31:18.595Z
Learning: In `components/core/src/clp/ffi/ir_stream/utils.hpp`, the function `size_dependent_encode_and_serialize_schema_tree_node_id` assumes that the caller checks that `node_id` fits within the range of `encoded_node_id_t` before casting.
Applied to files:
components/core/src/clp/ffi/EncodedTextAst.hpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: package-image
- GitHub Check: musllinux_1_2-x86_64-static-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: manylinux_2_28-x86_64-static-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: lint-check (macos-15)
- GitHub Check: build-macos (macos-15, false)
- GitHub Check: build-macos (macos-15, true)
- GitHub Check: build-macos (macos-14, true)
- GitHub Check: build-macos (macos-14, false)
🔇 Additional comments (11)
components/core/src/clp/ffi/EncodedTextAstError.cpp (3)
1-8: LGTM!The includes and type alias setup follow the established error handling patterns in the codebase.
10-13: LGTM!The error category name correctly identifies this as an EncodedTextAst error domain, distinguishing it from other error categories.
15-29: LGTM!The error messages are clear, consistent, and cover all enum cases. The decision to omit trailing periods aligns with the team's preference for composable error messages.
components/core/src/clp/ffi/EncodedTextAst.hpp (8)
1-19: LGTM!Header guards and includes are appropriate for this template class definition.
21-61: LGTM!The handler concepts clearly define the required signatures and leverage C++20 concepts effectively to constrain the decode method's template parameters.
71-85: LGTM!The factory function provides essential validation that the string blob contains at least the required logtype, preventing construction of invalid instances.
88-94: LGTM!Explicitly defaulted special member functions with appropriate noexcept specifications follow modern C++ best practices.
97-99: LGTM!The direct
.value()call is safe because the factory function ensures at least one string exists in the blob. The logtype is stored at indexm_num_dict_varswhich is always valid.
101-122: LGTM!The method signature uses
auto&&forwarding references with concepts, which is the idiomatic C++20 approach for generic callable parameters. This provides perfect forwarding and works efficiently with both lvalue and rvalue handlers without forcing copies.
130-145: LGTM!The
to_string()implementation cleanly composes the decode method with lambda handlers to reconstruct the original text, with proper error propagation.
160-261: LGTM!The decode implementation correctly handles all placeholder types and escape sequences with appropriate bounds checking and error handling. The logic for skipping escaped characters (line 242) and processing the remainder (lines 252-258) is sound.
Switched to use |
…23/clp into new-encoded-text-ast
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
components/core/src/clp/ffi/EncodedTextAst.hpp(1 hunks)components/core/src/clp/ffi/EncodedTextAstError.cpp(1 hunks)components/core/src/clp/ffi/EncodedTextAstError.hpp(1 hunks)components/core/src/clp/ffi/test/test_EncodedTextAst.cpp(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer
false == <expression>rather than!<expression>.
Files:
components/core/src/clp/ffi/EncodedTextAstError.cppcomponents/core/src/clp/ffi/test/test_EncodedTextAst.cppcomponents/core/src/clp/ffi/EncodedTextAstError.hppcomponents/core/src/clp/ffi/EncodedTextAst.hpp
🧠 Learnings (15)
📓 Common learnings
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 570
File: components/core/tests/test-ir_encoding_methods.cpp:376-399
Timestamp: 2024-11-01T03:26:26.386Z
Learning: In the test code (`components/core/tests/test-ir_encoding_methods.cpp`), exception handling for `msgpack::unpack` can be omitted because the Catch2 testing framework captures exceptions if they occur.
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 549
File: components/core/tests/test-ir_encoding_methods.cpp:1180-1186
Timestamp: 2024-10-08T15:52:50.753Z
Learning: In the context of loop constructs, LinZhihao-723 prefers using `while (true)` loops and does not consider alternative loop constructs necessarily more readable.
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 549
File: components/core/tests/test-ir_encoding_methods.cpp:1180-1186
Timestamp: 2024-10-01T07:59:11.208Z
Learning: In the context of loop constructs, LinZhihao-723 prefers using `while (true)` loops and does not consider alternative loop constructs necessarily more readable.
📚 Learning: 2024-11-26T19:16:19.933Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 486
File: components/core/tests/test-error_handling.cpp:44-91
Timestamp: 2024-11-26T19:16:19.933Z
Learning: In `components/core/tests/test-error_handling.cpp` (C++), when specializing template methods of `ErrorCategory` via type aliases (e.g., `AlwaysSuccessErrorCategory`), it's acceptable to define these specializations in the current namespace, even though the primary template is in the `clp::error_handling` namespace, due to the use of type aliases and template instantiation.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.cppcomponents/core/src/clp/ffi/EncodedTextAstError.hpp
📚 Learning: 2025-08-06T23:57:49.762Z
Learnt from: gibber9809
Repo: y-scope/clp PR: 1163
File: components/core/src/clp_s/ColumnWriter.hpp:120-122
Timestamp: 2025-08-06T23:57:49.762Z
Learning: In the CLP codebase, the `encoded_log_dict_id_t` type alias in `components/core/src/clp_s/ColumnWriter.hpp` is appropriately placed as a class-scoped alias because the ColumnWriter.hpp header is already included in all places where this type is used, eliminating the need to move it to a common header.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.cppcomponents/core/src/clp/ffi/test/test_EncodedTextAst.cppcomponents/core/src/clp/ffi/EncodedTextAstError.hppcomponents/core/src/clp/ffi/EncodedTextAst.hpp
📚 Learning: 2024-11-26T19:10:05.051Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 486
File: components/core/src/clp/error_handling/ErrorCode.hpp:34-34
Timestamp: 2024-11-26T19:10:05.051Z
Learning: In the `ErrorCategory` template class in `ErrorCode.hpp`, the `name()` and `message(ErrorCodeEnum)` methods should not be declared as pure virtual functions because the class relies on template specializations, and making them pure virtual would make the class abstract, which is not intended.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.cppcomponents/core/src/clp/ffi/EncodedTextAstError.hpp
📚 Learning: 2024-10-24T14:45:26.265Z
Learnt from: haiqi96
Repo: y-scope/clp PR: 523
File: components/core/src/clp/BufferedFileReader.cpp:96-106
Timestamp: 2024-10-24T14:45:26.265Z
Learning: In `components/core/src/clp/BufferedFileReader.cpp`, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.cppcomponents/core/src/clp/ffi/EncodedTextAst.hpp
📚 Learning: 2024-10-07T21:38:35.979Z
Learnt from: AVMatthews
Repo: y-scope/clp PR: 543
File: components/core/src/clp_s/JsonParser.cpp:581-627
Timestamp: 2024-10-07T21:38:35.979Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, within the `get_archive_node_id` method, throwing a string literal as an exception is acceptable practice.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.cpp
📚 Learning: 2024-11-26T19:12:09.102Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 486
File: components/core/src/clp/error_handling/ErrorCode.hpp:131-134
Timestamp: 2024-11-26T19:12:09.102Z
Learning: In the error handling system (`components/core/src/clp/error_handling/ErrorCode.hpp`), the `ErrorCode` class is intentionally designed to wrap the `ErrorCodeEnum`, not to use the enum directly. This wrapper provides a general interface for adding customized error enums.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.cppcomponents/core/src/clp/ffi/EncodedTextAstError.hpp
📚 Learning: 2024-11-01T03:26:26.386Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 570
File: components/core/tests/test-ir_encoding_methods.cpp:376-399
Timestamp: 2024-11-01T03:26:26.386Z
Learning: In the test code (`components/core/tests/test-ir_encoding_methods.cpp`), exception handling for `msgpack::unpack` can be omitted because the Catch2 testing framework captures exceptions if they occur.
Applied to files:
components/core/src/clp/ffi/test/test_EncodedTextAst.cpp
📚 Learning: 2024-10-13T09:27:43.408Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 557
File: components/core/tests/test-ir_encoding_methods.cpp:1216-1286
Timestamp: 2024-10-13T09:27:43.408Z
Learning: In the unit test case `ffi_ir_stream_serialize_schema_tree_node_id` in `test-ir_encoding_methods.cpp`, suppressing the `readability-function-cognitive-complexity` warning is acceptable due to the expansion of Catch2 macros in C++ tests, and such test cases may not have readability issues.
Applied to files:
components/core/src/clp/ffi/test/test_EncodedTextAst.cppcomponents/core/src/clp/ffi/EncodedTextAst.hpp
📚 Learning: 2024-11-19T17:30:04.970Z
Learnt from: AVMatthews
Repo: y-scope/clp PR: 595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.
Applied to files:
components/core/src/clp/ffi/test/test_EncodedTextAst.cpp
📚 Learning: 2024-10-14T03:42:10.355Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:14-14
Timestamp: 2024-10-14T03:42:10.355Z
Learning: In the file `components/core/tests/test-ffi_KeyValuePairLogEvent.cpp`, including `<json/single_include/nlohmann/json.hpp>` is consistent with the project's coding standards.
Applied to files:
components/core/src/clp/ffi/test/test_EncodedTextAst.cpp
📚 Learning: 2024-10-18T02:31:18.595Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 557
File: components/core/src/clp/ffi/ir_stream/utils.hpp:0-0
Timestamp: 2024-10-18T02:31:18.595Z
Learning: In `components/core/src/clp/ffi/ir_stream/utils.hpp`, the function `size_dependent_encode_and_serialize_schema_tree_node_id` assumes that the caller checks that `node_id` fits within the range of `encoded_node_id_t` before casting.
Applied to files:
components/core/src/clp/ffi/test/test_EncodedTextAst.cppcomponents/core/src/clp/ffi/EncodedTextAstError.hppcomponents/core/src/clp/ffi/EncodedTextAst.hpp
📚 Learning: 2024-11-29T22:50:17.206Z
Learnt from: AVMatthews
Repo: y-scope/clp PR: 595
File: components/core/tests/test-clp_s-end_to_end.cpp:109-110
Timestamp: 2024-11-29T22:50:17.206Z
Learning: In `components/core/tests/test-clp_s-end_to_end.cpp`, the success of `constructor.store()` is verified through `REQUIRE` statements and subsequent comparisons.
Applied to files:
components/core/src/clp/ffi/test/test_EncodedTextAst.cpp
📚 Learning: 2024-10-08T15:52:50.753Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 549
File: components/core/src/clp/ffi/ir_stream/Deserializer.hpp:209-210
Timestamp: 2024-10-08T15:52:50.753Z
Learning: Until #486 is merged with customized error code support, introducing a layer from `IRErrorCode` to `std::errc` is unnecessary.
Applied to files:
components/core/src/clp/ffi/EncodedTextAstError.hpp
📚 Learning: 2024-10-07T21:16:41.660Z
Learnt from: AVMatthews
Repo: y-scope/clp PR: 543
File: components/core/src/clp_s/JsonParser.cpp:769-779
Timestamp: 2024-10-07T21:16:41.660Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, when handling errors in `parse_from_ir`, prefer to maintain the current mix of try-catch and if-statements because specific messages are returned back up in some cases.
Applied to files:
components/core/src/clp/ffi/EncodedTextAst.hpp
🧬 Code graph analysis (1)
components/core/src/clp/ffi/test/test_EncodedTextAst.cpp (1)
components/core/src/clp/ffi/EncodedTextAst.hpp (5)
nodiscard(78-85)nodiscard(97-99)nodiscard(130-145)nodiscard(162-261)encoded_vars(79-80)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: musllinux_1_2-x86_64-static-linked-bins
- GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
- GitHub Check: manylinux_2_28-x86_64-static-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (3)
components/core/src/clp/ffi/EncodedTextAstError.hpp (1)
12-20: Nice consistency with the error handling stack.Enum sizing plus the
YSTDLIB_ERROR_HANDLING_MARK_AS_ERROR_CODE_ENUMregistration keeps this aligned with the existing error plumbing—looks good to me.components/core/src/clp/ffi/EncodedTextAstError.cpp (1)
10-28: Clear, user-facing messages.Distinct category name and concise per-enum text keep the diagnostics easy to trace—nicely done.
components/core/src/clp/ffi/test/test_EncodedTextAst.cpp (1)
70-195: Great coverage breadth.The template test spanning both encoded variable widths plus the targeted error sections (missing pieces, trailing escape) gives strong confidence in the new decode path.
Description
This is one of PR series to address #1541.
This PR adds a new
EncodedTextAstimplementation, usingStringBlobto store dictionary variables and logtypes. Besides storage, it implements a generic decoding method similar to this as a member method to allow text decoding. Similarly, it implements ato_stringmethod similar to this to decode and unparse the encoded text AST to its string form.Checklist
breaking change.
Validation performed
Summary by CodeRabbit
New Features
Tests
Bug Fixes