Skip to content

MemLens Database Specific Changes#170

Merged
cjcchen merged 28 commits intoapache:masterfrom
harish876:master
Mar 20, 2025
Merged

MemLens Database Specific Changes#170
cjcchen merged 28 commits intoapache:masterfrom
harish876:master

Conversation

@harish876
Copy link
Copy Markdown
Contributor

@harish876 harish876 commented Feb 12, 2025

Summary:

  • Integrated a Stats Class with LevelDB to export internal statistics, including LRU cache hit-and-miss metrics for the Memory Engine.
  • Added an LRU Cache controlled via a switch to track and report cache hit-and-miss ratios as part of the storage engine metrics on the front end.
  • Implemented the generation of compile_commands.json to enable autocompletion and IntelliSense during development.
  • Introduced debug flags during build steps for better profiling and performance tracking.
  • Fixed a bug related to passing the LevelDB configuration to the KV store.
  • Created a new endpoint /transaction_data to export specific statistics to the frontend.

Todo:

Bootstrap script like ./INSTALL.sh to install and setup all monitoring tools

Copy link
Copy Markdown
Contributor

@apratimshukla6 apratimshukla6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjcchen can you review and approve this?

name = "leveldb",
srcs = ["leveldb.cpp"],
hdrs = ["leveldb.h"],
linkopts = ["-pg","-g","-ggdb"], # Enable profiling during linking
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only open this debug opts if needed.
Most of time, we dont to debug the code.

}
}
if ((*config).enable_block_cache()) {
LRUCache<std::string, std::string>* block_cache =
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When to destroy this pointer?
It looks like there is a memory leak.

You can use unique_pointer instead.

return ValueType();
}

auto it = std::find(dq_.begin(), dq_.end(), key);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

performance issue?

}

auto it = std::find(dq_.begin(), dq_.end(), key);
dq_.erase(it);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
dq_.erase(it);
if ( it != dq_.end()){
dq_.erase(it);
}

int cache_hits_; // Cache hits count
int cache_misses_; // Cache misses count

deque<KeyType> dq_; // To maintain most and least recently used items
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -0,0 +1,101 @@
#include "lru_cache.h"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some UTs.

@cjcchen
Copy link
Copy Markdown
Contributor

cjcchen commented Feb 13, 2025

Hi @harish876,

Thanks for the contribution and it is a nice accomplishment.
I left some comments. Please take a look.

Thanks.

@harish876
Copy link
Copy Markdown
Contributor Author

Hi @cjcchen
Thanks for the comments. I will go through them and resolve them.

}
}

long getRSS() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
long getRSS() {
long GetRSS() {

return 0;
}

unsigned long size, resident, share, text, lib, data, dt;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's use int64_t and uint64_t other than long and unsigned long.

std::chrono::system_clock::time_point execution_time;

// Storage Engine Stats
double ext_cache_hit_ratio;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some naming issues. using "_" as the subfix.

@cjcchen
Copy link
Copy Markdown
Contributor

cjcchen commented Feb 21, 2025

Hi @cjcchen , I have made the necessary changes pointed out by you. Can you please review them?

Very nice job.
Left a few comments.

@harish876
Copy link
Copy Markdown
Contributor Author

Thanks @cjcchen , will resolve the additional comments

@harish876
Copy link
Copy Markdown
Contributor Author

Hi @cjcchen tried to resolve the comments. please do review once and let me know if further changes need to be made. thanks

@cjcchen
Copy link
Copy Markdown
Contributor

cjcchen commented Mar 7, 2025

Hi @cjcchen tried to resolve the comments. please do review once and let me know if further changes need to be made. thanks

Very good job.

One question here, what is the motivation to remove the O3 option?
Do we need to always open this feature?

If this is optional to just for the profile tracing, we can add an introduction on how to enable the feature.

@harish876
Copy link
Copy Markdown
Contributor Author

yes @cjcchen . it was a mistake. corrected it now

service/kv/BUILD Outdated
"//chain/storage/setting:enable_leveldb_setting": ["-DENABLE_LEVELDB"],
"//conditions:default": [],
"//chain/storage/setting:enable_leveldb_setting": ["-DENABLE_LEVELDB","-g","-ggdb"],
"//conditions:default": ["-pg", "-g", "-ggdb"],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this -pg -ggdb?

Comment on lines +30 to +31
copts = ["-pg"],
linkopts = ["-pg"]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this?

@cjcchen
Copy link
Copy Markdown
Contributor

cjcchen commented Mar 8, 2025

yes @cjcchen . it was a mistake. corrected it now

Great.
How about the "pg" in the BUILD files?

@harish876
Copy link
Copy Markdown
Contributor Author

Hi @cjcchen. I have removed the debug flags. Our profiler only works when the build has debug information at the top level. I am unsure if profiling would work without the -pg flag. We do get kernel level metrics without the debug flag but userspace metrics are not available without that.

Please do let us know what could be done here?

@cjcchen
Copy link
Copy Markdown
Contributor

cjcchen commented Mar 14, 2025

Hi @cjcchen. I have removed the debug flags. Our profiler only works when the build has debug information at the top level. I am unsure if profiling would work without the -pg flag. We do get kernel level metrics without the debug flag but userspace metrics are not available without that.

Please do let us know what could be done here?

Very cool!
Merged.

@cjcchen
Copy link
Copy Markdown
Contributor

cjcchen commented Mar 14, 2025

Hi @cjcchen. I have removed the debug flags. Our profiler only works when the build has debug information at the top level. I am unsure if profiling would work without the -pg flag. We do get kernel level metrics without the debug flag but userspace metrics are not available without that.

Please do let us know what could be done here?

@harish876 did it build successfully on your end?

https://github.com/apache/incubator-resilientdb/actions/runs/13852115218/job/38763569382

@harish876
Copy link
Copy Markdown
Contributor Author

Yes @cjcchen , it did build with warnings albeit, but all TC's were a go. I will go through the failed checks.

@harish876
Copy link
Copy Markdown
Contributor Author

harish876 commented Mar 15, 2025

Hi @cjcchen , verified the build and unit tests again. I removed our profiling command, which needs to be installed. I will convert our setup to a sidecar. Please find my deployment results. I also have enabled LevelDB here, please do let me know if that is fine by you

Screenshot 2025-03-15 at 1 43 49 AM Screenshot 2025-03-15 at 1 52 22 AM

@cjcchen
Copy link
Copy Markdown
Contributor

cjcchen commented Mar 15, 2025

Hi @cjcchen , verified the build and unit tests again. I removed our profiling command, which needs to be installed. I will convert our setup to a sidecar. Please find my deployment results. I also have enabled LevelDB here, please do let me know if that is fine by you

Screenshot 2025-03-15 at 1 43 49 AM Screenshot 2025-03-15 at 1 52 22 AM

"common/lru/lru_cache.cpp:21:1: note: 'std::string' is defined in header ''; did you forget to '#include '?
"

it reports this. I think you just need to include to the header.

@harish876
Copy link
Copy Markdown
Contributor Author

Done @cjcchen

@cjcchen cjcchen merged commit d34e9f5 into apache:master Mar 20, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants