-
Notifications
You must be signed in to change notification settings - Fork 73
(7/5) [nexus-db-queries] Benchmark for VMM reservation #7498
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
c9fb7a6
4020517
8f1d37c
772e64f
d8cff32
161f9d6
df119b6
83a26a4
8dc0825
e3113ff
789bc97
5e21f34
fa9461b
4e9cebc
6cfca2d
a1c97d4
050b4c5
4b08032
8bc8f0c
900f09c
195e167
f2ebe31
62c38ec
85985c1
1326116
aba9596
a271f1d
1ad0101
4d26262
6ae1910
58ebd65
ab817bc
9daf923
1b39c51
e21125f
bd95b03
76b0af9
1179919
a07c46e
fdccd6b
db49c67
6e44392
8e6fe1a
d5b951f
fdfb78d
f57a96a
fb8976d
17dc409
97b9b2e
d949890
fa2550e
79b4252
04a4b98
db40d05
b68239b
ca8f890
5406dd4
bb0f349
6704be1
127285c
acf3a64
43f4437
9a3ef20
f028351
45b993a
05c8b60
eee1d6f
10aaed9
f8b9b5d
1032128
859f099
3edc5ed
8566395
393a925
7a99a96
ee657cc
931801d
5f75a2b
b1689c8
fd09de7
c18449c
4bf0e48
1a7e7e0
01b8f03
a35a33c
eb9682f
0e6eaf2
45a97a3
bd1f2cc
7ba08b0
1add3c1
4a00ddd
7a8da16
081e8b4
7733054
e9cc19c
01914c3
e6634dc
565ffc8
e15460c
9190d80
8d90161
9f6ad7b
49196d2
7193c88
16b3df3
f5bbe28
beaedfc
afbc4ad
25a6e5e
7c90072
701106d
0419d37
2ad7bb7
da625b9
36ddafc
b8c5024
bba32a6
e893250
f5ec80d
fba0614
e79fa5f
faed1c4
01029f1
3c0cc80
462afc7
93d4ac3
75acd32
a3e885a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| :showtitle: | ||
| :toc: left | ||
| :icons: font | ||
|
|
||
| = Benchmarks | ||
|
|
||
| This directory contains benchmarks for database queries. | ||
|
|
||
| These queries can be run with: | ||
|
|
||
| [source,bash] | ||
| ---- | ||
| cargo bench -p nexus-db-queries | ||
| ---- | ||
|
|
||
| Additionally, the "SHOW_CONTENTION" environment variable can be set to display | ||
| extra data from CockroachDB tables about contention statistics, if they | ||
| are available. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,108 @@ | ||
| // This Source Code Form is subject to the terms of the Mozilla Public | ||
| // License, v. 2.0. If a copy of the MPL was not distributed with this | ||
| // file, You can obtain one at https://mozilla.org/MPL/2.0/. | ||
|
|
||
| //! Database test helpers | ||
| //! | ||
| //! These are largely ripped out of "nexus/db-queries/src/db/datastore". | ||
| //! | ||
| //! Benchmarks are compiled as external binaries from library crates, so we | ||
| //! can only access `pub` code. | ||
| //! | ||
| //! It may be worth refactoring some of these functions to a test utility | ||
| //! crate to avoid the de-duplication. | ||
|
|
||
| use anyhow::Context; | ||
| use anyhow::Result; | ||
| use nexus_db_model::Sled; | ||
| use nexus_db_model::SledReservationConstraintBuilder; | ||
| use nexus_db_model::SledUpdate; | ||
| use nexus_db_queries::context::OpContext; | ||
| use nexus_db_queries::db::DataStore; | ||
| use nexus_db_queries::db::pub_test_utils::helpers::SledUpdateBuilder; | ||
| use nexus_db_queries::db::pub_test_utils::helpers::small_resource_request; | ||
| use omicron_uuid_kinds::InstanceUuid; | ||
| use omicron_uuid_kinds::PropolisUuid; | ||
| use uuid::Uuid; | ||
|
|
||
| pub fn rack_id() -> Uuid { | ||
| Uuid::parse_str(nexus_test_utils::RACK_UUID).unwrap() | ||
| } | ||
|
|
||
| const USABLE_HARDWARE_THREADS: u32 = 32; | ||
|
|
||
| pub fn test_new_sled_update() -> SledUpdate { | ||
| let mut sled = SledUpdateBuilder::new(); | ||
| sled.rack_id(rack_id()) | ||
| .hardware() | ||
| .usable_hardware_threads(USABLE_HARDWARE_THREADS); | ||
| sled.build() | ||
| } | ||
|
|
||
| pub async fn create_sleds(datastore: &DataStore, count: usize) -> Vec<Sled> { | ||
| let mut sleds = vec![]; | ||
| for _ in 0..count { | ||
| let (sled, _) = | ||
| datastore.sled_upsert(test_new_sled_update()).await.unwrap(); | ||
| sleds.push(sled); | ||
| } | ||
| sleds | ||
| } | ||
|
|
||
| /// Given a `sled_count`, returns the number of times a call to | ||
| /// `create_reservation` should succeed. | ||
| /// | ||
| /// This can be used to validate parameters before running benchmarks. | ||
| pub fn max_resource_request_count(sled_count: usize) -> usize { | ||
| let threads_per_request: usize = | ||
| small_resource_request().hardware_threads.0.try_into().unwrap(); | ||
| let threads_per_sled: usize = USABLE_HARDWARE_THREADS.try_into().unwrap(); | ||
|
|
||
| threads_per_sled * sled_count / threads_per_request | ||
| } | ||
|
|
||
| pub async fn create_reservation( | ||
| opctx: &OpContext, | ||
| db: &DataStore, | ||
| instance_id: InstanceUuid, | ||
| ) -> Result<PropolisUuid> { | ||
| let vmm_id = PropolisUuid::new_v4(); | ||
|
|
||
| loop { | ||
| match db | ||
| .sled_reservation_create( | ||
| &opctx, | ||
| instance_id, | ||
| vmm_id, | ||
| small_resource_request(), | ||
| SledReservationConstraintBuilder::new().build(), | ||
| ) | ||
| .await | ||
| { | ||
| Ok(_) => break, | ||
| Err(err) => { | ||
| // This condition is bad - it would result in a user-visible | ||
| // error, in most cases - but it's also an indication of failure | ||
| // due to contention. We normally bubble this out to users, | ||
| // rather than stalling the request, but in this particular | ||
| // case, we choose to retry immediately. | ||
|
Comment on lines
+86
to
+88
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is it possible/worthwhile to report the number of retries as a lower-is-better benchmarked metric, or does criterion just deal with durations? Extra retries will hurt the average duration, too, but I can see it being useful to have an explicit metric for this, especially if "real" retries produce a user error.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It looks like the answer may be "no" per this bit of the docs ("Note that as of version 0.3.0, only timing measurements are supported, and only a single measurement can be used for one benchmark. These restrictions may be lifted in future versions."). Alas. I'm leaving the comment here, though, in case there's some other clever way to display this that I haven't thought of yet.
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, I think it's okay right now that we're tracking time -- if it helps, it's a decent proxy for retries, even though, as you say, we don't actually acknowledge the number of user-visible retries here. I'm mostly doing this so I can force the system into contention, which is where benchmarking is most insightful anyway. |
||
| if err.to_string().contains("restart transaction") { | ||
| continue; | ||
| } | ||
| return Err(err).context("Failed to create reservation"); | ||
| } | ||
| } | ||
| } | ||
| Ok(vmm_id) | ||
| } | ||
|
|
||
| pub async fn delete_reservation( | ||
| opctx: &OpContext, | ||
| db: &DataStore, | ||
| vmm_id: PropolisUuid, | ||
| ) -> Result<()> { | ||
| db.sled_reservation_delete(&opctx, vmm_id) | ||
| .await | ||
| .context("Failed to delete reservation")?; | ||
| Ok(()) | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively ,we could perhaps just publicly export them under a "test utils" feature, or something? That might be a simpler solution to de-duplicate this without having to go make a whole new crate for them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did this refactor in 36ddafc