feat(clp-s): Add get_metadata_for_log_event function to clp_s::ArchiveReader (resolves #2012).#2033
feat(clp-s): Add get_metadata_for_log_event function to clp_s::ArchiveReader (resolves #2012).#2033gibber9809 wants to merge 6 commits intoy-scope:mainfrom
get_metadata_for_log_event function to clp_s::ArchiveReader (resolves #2012).#2033Conversation
WalkthroughAdds ArchiveReader::get_metadata_for_log_event and matching ArchiveReaderAdaptor support: adaptor stores per-range metadata, populates it when reading range index, and exposes lookup by log-event index. Tests extended to validate per-log-event metadata and boundary conditions; Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip CodeRabbit can use OpenGrep to find security vulnerabilities and bugs across 17+ programming languages.OpenGrep is compatible with Semgrep configurations. Add an |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/core/src/clp_s/ArchiveReaderAdaptor.cpp`:
- Around line 339-346: In get_metadata_for_log_event, move the negative-index
guard (log_event_idx < 0) to before the
m_non_empty_range_metadata_map.upper_bound call to avoid unnecessary lookups,
and after obtaining the iterator from upper_bound verify that the found range
actually contains log_event_idx (e.g., check log_event_idx >= corresponding
range's start_index or whatever field marks range start) before returning
it->second; if the iterator is end() or the index falls outside the range, throw
OperationFailed(ErrorCodeBadParam, __FILENAME__, __LINE__).
- Around line 153-155: m_non_empty_range_metadata_map is currently copying
m_range_index.back().fields (potentially large nlohmann::json); change the map
value to store a pointer (nlohmann::json const*) or std::reference_wrapper<const
nlohmann::json> instead of a copy, update the map type in the header, modify the
emplace in try_read_range_index to insert the address/reference of
m_range_index.back().fields, and update callers such as
get_metadata_for_log_event to dereference the stored pointer/reference when
accessing metadata; keep the invariant that m_range_index is not
mutated/reallocated after population.
In `@components/core/src/clp_s/ArchiveReaderAdaptor.hpp`:
- Line 5: Replace the unused `#include` <functional> with `#include` <map> in
ArchiveReaderAdaptor.hpp so std::map is properly declared for
m_non_empty_range_metadata_map; locate the declaration of
m_non_empty_range_metadata_map (and the surrounding class/struct in
ArchiveReaderAdaptor) and update the header includes to remove <functional> and
add <map> consistent with ArchiveReader.hpp.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
components/core/src/clp_s/ArchiveReader.hppcomponents/core/src/clp_s/ArchiveReaderAdaptor.cppcomponents/core/src/clp_s/ArchiveReaderAdaptor.hppcomponents/core/tests/test-clp_s-range_index.cpp
There was a problem hiding this comment.
♻️ Duplicate comments (3)
components/core/src/clp_s/ArchiveReaderAdaptor.hpp (1)
1-10:⚠️ Potential issue | 🟡 MinorMissing
<map>include forstd::mapusage.The file uses
std::mapat line 198 form_non_empty_range_metadata_map, but<map>is not included in the header includes.Proposed fix
`#include` <cstddef> +#include <map> `#include` <memory> `#include` <optional>🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/core/src/clp_s/ArchiveReaderAdaptor.hpp` around lines 1 - 10, The header is missing the <map> include required for std::map usage; add `#include` <map> to the include block at the top of ArchiveReaderAdaptor.hpp so the declaration of m_non_empty_range_metadata_map (and any other std::map uses) compiles; look for the include list near the top of the file and insert <map> alongside the other standard headers.components/core/src/clp_s/ArchiveReaderAdaptor.cpp (2)
369-376:⚠️ Potential issue | 🟡 MinorMove negative index validation before the map lookup.
The
log_event_idx < 0check is evaluated afterupper_bound, performing an unnecessary lookup when the index is already known to be invalid. Reordering improves clarity and avoids wasted work.Proposed fix
auto ArchiveReaderAdaptor::get_metadata_for_log_event(int64_t log_event_idx) -> nlohmann::json const& { + if (log_event_idx < 0) { + throw OperationFailed(ErrorCodeBadParam, __FILENAME__, __LINE__); + } auto const it{m_non_empty_range_metadata_map.upper_bound(log_event_idx)}; - if (m_non_empty_range_metadata_map.end() == it || log_event_idx < 0) { + if (m_non_empty_range_metadata_map.end() == it) { throw OperationFailed(ErrorCodeBadParam, __FILENAME__, __LINE__); } return it->second; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/core/src/clp_s/ArchiveReaderAdaptor.cpp` around lines 369 - 376, Check for a negative log_event_idx before doing the map lookup in ArchiveReaderAdaptor::get_metadata_for_log_event: move the log_event_idx < 0 validation to the top of the function and throw OperationFailed(ErrorCodeBadParam, __FILENAME__, __LINE__) immediately if negative, then call m_non_empty_range_metadata_map.upper_bound(log_event_idx) and proceed as before to avoid an unnecessary lookup when the index is invalid.
171-173: 🧹 Nitpick | 🔵 TrivialMap stores a copy of
fields— consider storing a pointer instead.Since
m_range_indexowns theRangeIndexEntryobjects and lives for the lifetime of the adaptor, storing anlohmann::json const*instead of copying potentially large JSON objects would reduce memory overhead.Sketch using pointer
In the header, change the map value type:
- std::map<int64_t, nlohmann::json> m_non_empty_range_metadata_map; + std::map<int64_t, nlohmann::json const*> m_non_empty_range_metadata_map;In
try_read_range_index:if (start_index != end_index) { - m_non_empty_range_metadata_map.emplace(end_index, m_range_index.back().fields); + m_non_empty_range_metadata_map.emplace(end_index, &m_range_index.back().fields); }In
get_metadata_for_log_event:- return it->second; + return *(it->second);Note: This is safe as long as
m_range_indexis not mutated after population, which appears to be the case.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/core/src/clp_s/ArchiveReaderAdaptor.cpp` around lines 171 - 173, The code currently copies large nlohmann::json objects into m_non_empty_range_metadata_map; change the map to store nlohmann::json const* (pointer) instead of a copy, update the header/type of m_non_empty_range_metadata_map accordingly, set the pointer when populating the map in try_read_range_index by using &m_range_index.back().fields (or the appropriate RangeIndexEntry instance), and update get_metadata_for_log_event and any other access sites to dereference the pointer (handle nulls if needed) while preserving const correctness; ensure comments note that m_range_index must not be mutated after population so the pointer remains valid.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@components/core/src/clp_s/ArchiveReaderAdaptor.cpp`:
- Around line 369-376: Check for a negative log_event_idx before doing the map
lookup in ArchiveReaderAdaptor::get_metadata_for_log_event: move the
log_event_idx < 0 validation to the top of the function and throw
OperationFailed(ErrorCodeBadParam, __FILENAME__, __LINE__) immediately if
negative, then call m_non_empty_range_metadata_map.upper_bound(log_event_idx)
and proceed as before to avoid an unnecessary lookup when the index is invalid.
- Around line 171-173: The code currently copies large nlohmann::json objects
into m_non_empty_range_metadata_map; change the map to store nlohmann::json
const* (pointer) instead of a copy, update the header/type of
m_non_empty_range_metadata_map accordingly, set the pointer when populating the
map in try_read_range_index by using &m_range_index.back().fields (or the
appropriate RangeIndexEntry instance), and update get_metadata_for_log_event and
any other access sites to dereference the pointer (handle nulls if needed) while
preserving const correctness; ensure comments note that m_range_index must not
be mutated after population so the pointer remains valid.
In `@components/core/src/clp_s/ArchiveReaderAdaptor.hpp`:
- Around line 1-10: The header is missing the <map> include required for
std::map usage; add `#include` <map> to the include block at the top of
ArchiveReaderAdaptor.hpp so the declaration of m_non_empty_range_metadata_map
(and any other std::map uses) compiles; look for the include list near the top
of the file and insert <map> alongside the other standard headers.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 813fdf19-d406-4950-879c-30aaf9ee91b7
📒 Files selected for processing (3)
components/core/src/clp_s/ArchiveReader.hppcomponents/core/src/clp_s/ArchiveReaderAdaptor.cppcomponents/core/src/clp_s/ArchiveReaderAdaptor.hpp
| * @throws ArchiveReaderAdaptor::OperationFailed when `log_event_idx` cannot be mapped to | ||
| * any metadata. | ||
| */ | ||
| [[nodiscard]] auto get_metadata_for_log_event(int64_t log_event_idx) -> nlohmann::json const& { |
There was a problem hiding this comment.
Should log event simply take an unsigned type like uint64_t?
| auto const it{m_non_empty_range_metadata_map.upper_bound(log_event_idx)}; | ||
| if (m_non_empty_range_metadata_map.end() == it || log_event_idx < 0) { |
There was a problem hiding this comment.
Logic seems like you don't really need to look up anything if the log even idx is negative, which makes me feel more strongly about using uint64_t
| if (start_index != end_index) { | ||
| m_non_empty_range_metadata_map.emplace(end_index, m_range_index.back().fields); | ||
| } |
There was a problem hiding this comment.
I guess I've always had the question if the range index accepts files with empty ranges, or gaps between adjacent ranges.
Description
This PR implements the
get_metadata_for_log_eventfunction as requested in #2012. The implementation is simply to create a map ofend_index -> metadatacontaining entries for each range with at least one record, and look up the metadata for a given index using theupper_bound()function.This PR also adds some assertions in
test-clp_s-range-indexto confirm thatget_metadata_for_log_eventworks as expected.Checklist
breaking change.
Validation performed
get_metadata_for_log_eventretrieves the correct metadata for a given index, and throws for invalid indexes.Summary by CodeRabbit
New Features
Tests