Commit 3477619

bdash authored and CouleeApps committed
[SharedCache] Split state into initial, loaded, and modified
The initial state is initialized during `PerformInitialLoad` and is immutable after that point. This required some slight restructuring of how information about memory regions is tracked, as that was previously modified as regions were loaded. Memory regions are now stored in a map from their address range to the `MemoryRegion` object. This makes it cheap to look them up by address, which is a common operation.

The modified state consists of changes since the last save to the `DSCView` / `ViewSpecificState`. This means it is no longer necessary to copy any state when mutating a `SharedCache` instance for the first time. Instead, its data structures start off empty and are populated as images, sections, or symbol information are loaded.

The loaded state consists of all modified state that has since been saved. It lives on the `ViewSpecificState`. Saving modified state merges it into the existing loaded state.

This pattern is carried over to the `Metadata` stored on the `DSCView`. The initial state is stored under its own metadata key, and each modified state is stored under a key with an incrementing number. This means each save of the state only needs to serialize the state that changed, rather than reserializing all of the state all of the time.

There are two huge benefits from these changes:

1. At no point does `SharedCache` have to copy its in-memory state. The basic copy-on-write approach introduced in #6129 reduced how often these copies are made, but they're still frequent and very expensive.
2. At no point does `SharedCache` have to re-serialize state to JSON that it has already serialized. JSON serialization previously added hundreds of milliseconds to any mutating operation on `SharedCache`.

As a result, this speeds up the initial load of the shared cache by around 2x, and loading of subsequent images improves by about the same.

One trade-off is that the serialization / deserialization logic is more complicated. There are two reasons for this:

1. The state is now split across multiple metadata keys and needs to be merged when it is loaded.
2. The in-memory representation uses pointers to identify memory regions. These relationships have to be re-established after the JSON is deserialized.

As a future direction, it is worth considering whether the logic owned by `SharedCache` could be split in a similar manner to the data. The initial loading of the cache header, loading of images, and handling of symbol information are all mostly independent and work on separate data. If the logic were split into separate classes, it would be easier to reason about which data is valid when, and it would easily permit concurrent loading of multiple images from the shared library in a thread-safe manner.
1 parent 98b0fe0 commit 3477619

7 files changed: +1120 -870 lines changed

view/sharedcache/api/sharedcache.cpp

Lines changed: 1 addition & 0 deletions
@@ -161,6 +161,7 @@ namespace SharedCacheAPI {
         }

         std::vector<DSCSymbol> result;
+        result.reserve(count);
         for (size_t i = 0; i < count; i++)
         {
             DSCSymbol sym;
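The added `reserve` call sizes the vector once, before the loop fills it. As a rough illustration (the helper `CountReallocations` is hypothetical and not part of the commit), the sketch below counts how often `push_back` changes the vector's capacity with and without reserving up front.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times the vector's capacity changes while appending
// `count` elements, optionally reserving the full size first.
inline std::size_t CountReallocations(std::size_t count, bool reserveFirst) {
    std::vector<int> result;
    if (reserveFirst)
        result.reserve(count);

    std::size_t reallocations = 0;
    std::size_t lastCapacity = result.capacity();
    for (std::size_t i = 0; i < count; i++) {
        result.push_back(static_cast<int>(i));
        if (result.capacity() != lastCapacity) {
            ++reallocations;
            lastCapacity = result.capacity();
        }
    }
    assert(result.size() == count);
    return reallocations;
}
```

With the reservation in place the capacity never changes during the loop. Without it, the exact number of growth steps depends on the standard library's growth policy.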

view/sharedcache/core/DSCView.cpp

Lines changed: 7 additions & 65 deletions
@@ -603,79 +603,21 @@ bool DSCView::Init()
     Ref<Type> filesetEntryCommandType = Type::StructureType(filesetEntryCommandStruct);
     DefineType(filesetEntryCommandTypeId, filesetEntryCommandName, filesetEntryCommandType);

-    std::vector<SharedCacheCore::MemoryRegion> regionsMappedIntoMemory;
-    if (auto meta = GetParentView()->QueryMetadata(SharedCacheCore::SharedCacheMetadataTag))
+    if (auto metadata = SharedCacheCore::SharedCacheMetadata::LoadFromView(GetParentView()))
     {
-        std::string data = GetParentView()->GetStringMetadata(SharedCacheCore::SharedCacheMetadataTag);
-        std::stringstream ss;
-        ss.str(data);
-        rapidjson::Document result(rapidjson::kObjectType);
-
-        result.Parse(data.c_str());
-
-        if (result.HasMember("metadataVersion"))
-        {
-            if (result["metadataVersion"].GetInt() != METADATA_VERSION)
-            {
-                LogError("Shared cache metadata version mismatch: expected %d, got %d", METADATA_VERSION,
-                    result["metadataVersion"].GetInt());
-                return false;
-            }
-        }
-        else
-        {
-            LogError("Shared cache metadata version not found");
-            return false;
-        }
-        for (auto& imgV : result["regionsMappedIntoMemory"].GetArray())
-        {
-            SharedCacheCore::MemoryRegion region;
-            region.LoadFromValue(imgV);
-            regionsMappedIntoMemory.push_back(region);
-        }
-
-        std::unordered_map<uint64_t, std::string> imageStartToInstallName;
-        // key "m_imageStarts"
-        for (auto& imgV : result["m_imageStarts"].GetArray())
-        {
-            std::string name = imgV.GetArray()[0].GetString();
-            uint64_t addr = imgV.GetArray()[1].GetUint64();
-            imageStartToInstallName[addr] = name;
-        }
-
-        std::vector<std::pair<uint64_t, std::vector<std::pair<uint64_t, std::pair<BNSymbolType, std::string>>>>> exportInfos;
-
-        for (const auto& obj1 : result["exportInfos"].GetArray())
-        {
-            std::vector<std::pair<uint64_t, std::pair<BNSymbolType, std::string>>> innerVec;
-            for (const auto& obj2 : obj1["value"].GetArray())
-            {
-                std::pair<BNSymbolType, std::string> innerPair = { (BNSymbolType)obj2["val1"].GetUint64(), obj2["val2"].GetString() };
-                innerVec.push_back({ obj2["key"].GetUint64(), innerPair });
-            }
-
-            exportInfos.push_back({obj1["key"].GetUint64(), innerVec});
-        }
-
         BeginBulkModifySymbols();
-        for (const auto & [imageBaseAddr, exportList] : exportInfos)
+        for (const auto& [imageBaseAddr, exportMap] : metadata->ExportInfos())
         {
-            std::vector<Ref<Symbol>> symbolsList;
-            for (const auto & [exportAddr, exportTypeAndName] : exportList)
-            {
-                symbolsList.push_back(new Symbol(exportTypeAndName.first, exportTypeAndName.second, exportAddr));
-            }
-
-            auto typelib = GetTypeLibrary(imageStartToInstallName[imageBaseAddr]);
+            auto typelib = GetTypeLibrary(metadata->InstallNameForImageBaseAddress(imageBaseAddr));

-            for (const auto& symbol : symbolsList)
+            for (const auto& [address, symbol] : *exportMap)
             {
-                if (!IsValidOffset(symbol->GetAddress()))
+                if (!IsValidOffset(address))
                     continue;
+
                 if (typelib)
                 {
-                    auto type = typelib->GetNamedObject(symbol->GetRawName());
-                    if (type)
+                    if (auto type = typelib->GetNamedObject(symbol->GetFullName()))
                     {
                         DefineAutoSymbolAndVariableOrFunction(GetDefaultPlatform(), symbol, type);
                         continue;
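The rewritten loop iterates the already-deserialized export maps directly instead of first building an intermediate list of symbols. A minimal sketch of that iteration shape follows. The types `SketchSymbol`, `ExportMap`, and `ExportInfos`, and the function `CollectValidExports`, are hypothetical: the diff only shows that each entry pairs an image base address with something that dereferences to a map from address to a pointer-like symbol.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins; the real types live in the shared cache plugin.
struct SketchSymbol {
    std::string fullName;
};
using ExportMap = std::map<uint64_t, std::shared_ptr<SketchSymbol>>;
using ExportInfos = std::map<uint64_t, std::shared_ptr<ExportMap>>;

// Walks every image's export map and collects the names of symbols whose
// address falls inside [validStart, validEnd), skipping the rest.
inline std::vector<std::string> CollectValidExports(
    const ExportInfos& infos, uint64_t validStart, uint64_t validEnd) {
    std::vector<std::string> names;
    for (const auto& [imageBaseAddr, exportMap] : infos) {
        assert(exportMap != nullptr);
        for (const auto& [address, symbol] : *exportMap) {
            if (address < validStart || address >= validEnd)
                continue;  // plays the role of the IsValidOffset check
            names.push_back(symbol->fullName);
        }
    }
    return names;
}
```

Because the address is the map key, the validity check in this sketch needs only the key and never touches the symbol object, which matches the change from `symbol->GetAddress()` to `address` in the diff.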

view/sharedcache/core/MetadataSerializable.cpp

Lines changed: 5 additions & 0 deletions
@@ -47,6 +47,11 @@ void Serialize(SerializationContext& context, uint64_t value) {
     context.writer.Uint64(value);
 }

+void Serialize(SerializationContext& context, unsigned long value)
+{
+    context.writer.Uint64(value);
+}
+
 void Deserialize(DeserializationContext& context, std::string_view name, bool& b) {
     b = context.doc[name.data()].GetBool();
 }
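The commit does not say why the `unsigned long` overload was added. One plausible reason, stated here as an assumption, is that on platforms where `uint64_t` is an alias for `unsigned long long`, `unsigned long` is a distinct type, and a call that passes one would otherwise be ambiguous between the existing integer overloads. The sketch below reproduces that situation with a hypothetical `Describe` overload set written against the fundamental types directly, so it behaves the same on every platform.

```cpp
#include <cassert>
#include <string>

// Hypothetical overload set keyed on fundamental types rather than
// fixed-width aliases such as uint64_t.
inline std::string Describe(int) { return "int"; }
inline std::string Describe(long long) { return "long long"; }
inline std::string Describe(unsigned long long) { return "unsigned long long"; }

// Without this overload, Describe(5ul) fails to compile: unsigned long
// converts equally well to int, long long, and unsigned long long, so
// overload resolution cannot pick a best match.
inline std::string Describe(unsigned long) { return "unsigned long"; }

inline void CheckOverloads() {
    assert(Describe(5ul) == "unsigned long");
    assert(Describe(5ull) == "unsigned long long");
}
```

Removing the last overload and recompiling is a quick way to see the ambiguity error for the `5ul` call.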

view/sharedcache/core/MetadataSerializable.hpp

Lines changed: 20 additions & 8 deletions
@@ -86,9 +86,10 @@ struct DeserializationContext {
 template <typename Derived, typename LoadResult = Derived>
 class MetadataSerializable {
 public:
-    std::string AsString() const {
+    template <typename... Args>
+    std::string AsString(Args&&... args) const {
         SerializationContext context;
-        Store(context);
+        Store(context, std::forward<Args>(args)...);

         return context.buffer.GetString();
     }
@@ -106,14 +107,15 @@ class MetadataSerializable {
         return Derived::Load(context);
     }

-    Ref<Metadata> AsMetadata() const {
-        return new Metadata(AsString());
+    template <typename... Args>
+    Ref<Metadata> AsMetadata(Args&&... args) const {
+        return new Metadata(AsString(std::forward<Args>(args)...));
     }

     template <typename... Args>
-    void Store(SerializationContext& context) const {
+    void Store(SerializationContext& context, Args&&... args) const {
         context.writer.StartObject();
-        AsDerived().Store(context);
+        AsDerived().Store(context, std::forward<Args>(args)...);
         context.writer.EndObject();
     }

@@ -148,8 +150,8 @@ void Serialize(SerializationContext& context, const std::pair<First, Second>& va
     context.writer.EndArray();
 }

-template <typename K, typename V>
-void Serialize(SerializationContext& context, const std::map<K, V>& value)
+template <typename K, typename V, typename L>
+void Serialize(SerializationContext& context, const std::map<K, V, L>& value)
 {
     context.writer.StartArray();
     for (auto& pair : value)
@@ -181,6 +183,15 @@ void Serialize(SerializationContext& context, const std::vector<T>& values)
     context.writer.EndArray();
 }

+template <typename T>
+void Serialize(SerializationContext& context, const std::optional<T>& value)
+{
+    if (value.has_value())
+        Serialize(context, *value);
+    else
+        context.writer.Null();
+}
+
 SHAREDCACHE_FFI_API void Serialize(SerializationContext& context, const char*);
 SHAREDCACHE_FFI_API void Serialize(SerializationContext& context, bool b);
 SHAREDCACHE_FFI_API void Deserialize(DeserializationContext& context, std::string_view name, bool& b);
@@ -200,6 +211,7 @@ SHAREDCACHE_FFI_API void Serialize(SerializationContext& context, int32_t b);
 SHAREDCACHE_FFI_API void Deserialize(DeserializationContext& context, std::string_view name, int32_t& b);
 SHAREDCACHE_FFI_API void Serialize(SerializationContext& context, int64_t b);
 SHAREDCACHE_FFI_API void Deserialize(DeserializationContext& context, std::string_view name, int64_t& b);
+SHAREDCACHE_FFI_API void Serialize(SerializationContext& context, unsigned long b);
 SHAREDCACHE_FFI_API void Serialize(SerializationContext& context, std::string_view b);
 SHAREDCACHE_FFI_API void Serialize(SerializationContext& context, const std::pair<uint64_t, std::pair<uint64_t, uint64_t>>& value);
 SHAREDCACHE_FFI_API void Deserialize(DeserializationContext& context, std::string_view name, std::string& b);
