Commit 47e98f2

Remove sequential simulations and add continuous auth to Polaris benchmarks (#6)
* Remove sequential simulations and add continuous auth
* Remove sequential simulations now that the concurrent simulation throughput is configurable (and can be set to 1 until SQL implementation is able to keep up with throughput)
* Configure every workload using the configuration file
* Change authentication logic so that OAuth token is refreshed every minute. This makes it possible to run a benchmark for longer than the default OAuth validity period (1h). It is useful for use cases like creating very large data sets or running longevity tests.
* Code review: use block instead of argument for Gatling actions
1 parent bdda19f commit 47e98f2

13 files changed: +306 -197 lines changed

benchmarks/README.md

Lines changed: 27 additions & 14 deletions
@@ -6,9 +6,9 @@
 to you under the Apache License, Version 2.0 (the
 "License"); you may not use this file except in compliance
 with the License. You may obtain a copy of the License at
-
+
 http://www.apache.org/licenses/LICENSE-2.0
-
+
 Unless required by applicable law or agreed to in writing,
 software distributed under the License is distributed on an
 "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
@@ -25,21 +25,28 @@ Benchmarks for the Polaris service using Gatling.

 ### Dataset Creation Benchmark

-The CreateTreeDataset benchmark creates a test dataset with a specific structure. It exists in two variants:
+The CreateTreeDataset benchmark creates a test dataset with a specific structure:

-- `org.apache.polaris.benchmarks.simulations.CreateTreeDatasetSequential`: Creates entities one at a time
-- `org.apache.polaris.benchmarks.simulations.CreateTreeDatasetConcurrent`: Creates up to 50 entities simultaneously
+- `org.apache.polaris.benchmarks.simulations.CreateTreeDataset`: Creates up to 50 entities simultaneously

-These are write-only workloads designed to populate the system for subsequent benchmarks.
+This is a write-only workload designed to populate the system for subsequent benchmarks.

 ### Read/Update Benchmark

-The ReadUpdateTreeDataset benchmark tests read and update operations on an existing dataset. It exists in two variants:
+The ReadUpdateTreeDataset benchmark tests read and update operations on an existing dataset:
+
+- `org.apache.polaris.benchmarks.simulations.ReadUpdateTreeDataset`: Performs up to 20 read/update operations simultaneously
+
+This benchmark can only be run after using CreateTreeDataset to populate the system.
+
+### Read-Only Benchmark
+
+The ReadTreeDataset benchmark is a 100% read workload that fetches a tree dataset in Polaris:

-- `org.apache.polaris.benchmarks.simulations.ReadUpdateTreeDatasetSequential`: Performs read/update operations one at a time
-- `org.apache.polaris.benchmarks.simulations.ReadUpdateTreeDatasetConcurrent`: Performs up to 20 read/update operations simultaneously
+- `org.apache.polaris.benchmarks.simulations.ReadTreeDataset`: Performs read-only operations to verify namespaces, tables, and views
+
+This benchmark is intended to be used against a Polaris instance with a pre-existing tree dataset. It has no side effects on the dataset and can be executed multiple times without any issues.

-These benchmarks can only be run after using CreateTreeDataset to populate the system.

 ## Parameters

@@ -117,13 +124,19 @@ workload {
 Run benchmarks with your configuration:

 ```bash
-# Sequential dataset creation
-./gradlew gatlingRun --simulation org.apache.polaris.benchmarks.simulations.CreateTreeDatasetSequential \
+# Dataset creation
+./gradlew gatlingRun --simulation org.apache.polaris.benchmarks.simulations.CreateTreeDataset \
+  -Dconfig.file=./application.conf
+
+# Read/Update operations
+./gradlew gatlingRun --simulation org.apache.polaris.benchmarks.simulations.ReadUpdateTreeDataset \
   -Dconfig.file=./application.conf

-# Concurrent dataset creation
-./gradlew gatlingRun --simulation org.apache.polaris.benchmarks.simulations.CreateTreeDatasetConcurrent \
+# Read-only operations
+./gradlew gatlingRun --simulation org.apache.polaris.benchmarks.simulations.ReadTreeDataset \
   -Dconfig.file=./application.conf
+
+
 ```

 A message will show the location of the Gatling report:

benchmarks/src/gatling/resources/benchmark-defaults.conf

Lines changed: 34 additions & 0 deletions
@@ -140,4 +140,38 @@ workload {
   # Number of property updates to perform per individual view
   # Default: 10
   updates-per-view = 10
+
+
+  # Configuration for the ReadTreeDataset simulation
+  read-tree-dataset {
+    # Number of table operations to perform per second
+    # Default: 20
+    table-throughput = 20
+
+    # Number of view operations to perform per second
+    # Default: 10
+    view-throughput = 10
+  }
+
+  # Configuration for the CreateTreeDataset simulation
+  create-tree-dataset {
+    # Number of table operations to perform per second
+    # Default: 20
+    table-throughput = 20
+
+    # Number of view operations to perform per second
+    # Default: 10
+    view-throughput = 10
+  }
+
+  # Configuration for the ReadUpdateTreeDataset simulation
+  read-update-tree-dataset {
+    # Number of operations to perform per second
+    # Default: 100
+    throughput = 100
+
+    # Duration of the simulation in minutes
+    # Default: 5
+    duration-in-minutes = 5
+  }
 }

benchmarks/src/gatling/scala/org/apache/polaris/benchmarks/NAryTreeBuilder.scala

Lines changed: 1 addition & 2 deletions
@@ -67,15 +67,14 @@ case class NAryTreeBuilder(nsWidth: Int, nsDepth: Int) {
    *
    * @return The total number of nodes in the tree.
    */
-  val numberOfNodes: Int = {
+  val numberOfNodes: Int =
     // The sum of nodes from level 0 to level d-1 is (n^(d+1) - 1) / (n - 1) if n > 1
     // Else, the sum of nodes from level 0 to level d-1 is d
     if (nsWidth == 1) {
       nsDepth
     } else {
       ((math.pow(nsWidth, nsDepth) - 1) / (nsWidth - 1)).toInt
     }
-  }

   /**
    * Returns a range of ordinals for the nodes on the last level of a complete n-ary tree.
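
The node count in the hunk above is a geometric sum. For a complete n-ary tree with levels 0 to d-1, the total is 1 + n + ... + n^(d-1), which equals (n^d - 1) / (n - 1) when n > 1. That matches the expression in the code (`math.pow(nsWidth, nsDepth) - 1`), whereas the inline comment writes the exponent as d+1. The following is a minimal sketch, not the project's implementation: `NAryTreeSketch` is a hypothetical name, and it sums the level sizes with integer arithmetic instead of calling `math.pow`.

```scala
object NAryTreeSketch {

  /** Number of nodes in a complete n-ary tree with `depth` levels (levels 0 to depth - 1). */
  def numberOfNodes(width: Int, depth: Int): Int = {
    require(width >= 1, "Width must be at least 1")
    require(depth >= 0, "Depth cannot be negative")
    // Level sizes are 1, n, n^2, ...; keep the first `depth` of them and add them up.
    // For width == 1 every level has one node, so the sum is simply `depth`.
    Iterator.iterate(1)(_ * width).take(depth).sum
  }

  def main(args: Array[String]): Unit = {
    println(numberOfNodes(2, 3)) // 1 + 2 + 4
    println(numberOfNodes(1, 4)) // 1 + 1 + 1 + 1
  }
}
```

Integer summation avoids the `Double` round trip, at the cost of a loop over the levels; like the original, it overflows `Int` for very large trees.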

benchmarks/src/gatling/scala/org/apache/polaris/benchmarks/parameters/BenchmarkConfig.scala

Lines changed: 25 additions & 7 deletions
@@ -38,13 +38,31 @@ object BenchmarkConfig {
       http.getString("base-url")
     )

-    val workloadParams = WorkloadParameters(
-      workload.getDouble("read-write-ratio"),
-      workload.getInt("updates-per-namespace"),
-      workload.getInt("updates-per-table"),
-      workload.getInt("updates-per-view"),
-      workload.getLong("seed")
-    )
+    val workloadParams = {
+      val rtdConfig = workload.getConfig("read-tree-dataset")
+      val ctdConfig = workload.getConfig("create-tree-dataset")
+      val rutdConfig = workload.getConfig("read-update-tree-dataset")
+
+      WorkloadParameters(
+        workload.getDouble("read-write-ratio"),
+        workload.getInt("updates-per-namespace"),
+        workload.getInt("updates-per-table"),
+        workload.getInt("updates-per-view"),
+        workload.getLong("seed"),
+        ReadTreeDatasetParameters(
+          rtdConfig.getInt("table-throughput"),
+          rtdConfig.getInt("view-throughput")
+        ),
+        CreateTreeDatasetParameters(
+          ctdConfig.getInt("table-throughput"),
+          ctdConfig.getInt("view-throughput")
+        ),
+        ReadUpdateTreeDatasetParameters(
+          rutdConfig.getInt("throughput"),
+          rutdConfig.getInt("duration-in-minutes")
+        )
+      )
+    }

     val datasetParams = DatasetParameters(
       dataset.getInt("num-catalogs"),
Lines changed: 13 additions & 20 deletions
@@ -17,25 +17,18 @@
  * under the License.
  */

-package org.apache.polaris.benchmarks.simulations
+package org.apache.polaris.benchmarks.parameters

-import io.gatling.core.Predef._
-import io.gatling.http.Predef._
-
-import scala.concurrent.duration.DurationInt
-
-class ReadUpdateTreeDatasetSequential extends ReadUpdateTreeDataset {
-  // --------------------------------------------------------------------------------
-  // Build up the HTTP protocol configuration and set up the simulation
-  // --------------------------------------------------------------------------------
-  private val httpProtocol = http
-    .baseUrl(cp.baseUrl)
-    .acceptHeader("application/json")
-    .contentTypeHeader("application/json")
-
-  setUp(
-    authenticate
-      .inject(atOnceUsers(1))
-      .andThen(readWriteScenario.inject(constantUsersPerSec(1).during(5.minutes)))
-  ).protocols(httpProtocol)
+/**
+ * Case class to hold the parameters for the CreateTreeDataset simulation.
+ *
+ * @param tableThroughput The number of table operations to perform per second.
+ * @param viewThroughput The number of view operations to perform per second.
+ */
+case class CreateTreeDatasetParameters(
+    tableThroughput: Int,
+    viewThroughput: Int
+) {
+  require(tableThroughput >= 0, "Table throughput cannot be negative")
+  require(viewThroughput >= 0, "View throughput cannot be negative")
 }
Lines changed: 13 additions & 20 deletions
@@ -17,25 +17,18 @@
  * under the License.
  */

-package org.apache.polaris.benchmarks.simulations
+package org.apache.polaris.benchmarks.parameters

-import io.gatling.core.Predef._
-import io.gatling.http.Predef._
-
-import scala.concurrent.duration.DurationInt
-
-class ReadUpdateTreeDatasetConcurrent extends ReadUpdateTreeDataset {
-  // --------------------------------------------------------------------------------
-  // Build up the HTTP protocol configuration and set up the simulation
-  // --------------------------------------------------------------------------------
-  private val httpProtocol = http
-    .baseUrl(cp.baseUrl)
-    .acceptHeader("application/json")
-    .contentTypeHeader("application/json")
-
-  setUp(
-    authenticate
-      .inject(atOnceUsers(1))
-      .andThen(readWriteScenario.inject(constantUsersPerSec(100).during(5.minutes).randomized))
-  ).protocols(httpProtocol)
+/**
+ * Case class to hold the parameters for the ReadTreeDataset simulation.
+ *
+ * @param tableThroughput The number of table operations to perform per second.
+ * @param viewThroughput The number of view operations to perform per second.
+ */
+case class ReadTreeDatasetParameters(
+    tableThroughput: Int,
+    viewThroughput: Int
+) {
+  require(tableThroughput >= 0, "Table throughput cannot be negative")
+  require(viewThroughput >= 0, "View throughput cannot be negative")
 }
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.polaris.benchmarks.parameters
+
+/**
+ * Case class to hold the parameters for the ReadUpdateTreeDataset simulation.
+ *
+ * @param throughput The number of operations to perform per second.
+ * @param durationInMinutes The duration of the simulation in minutes.
+ */
+case class ReadUpdateTreeDatasetParameters(
+    throughput: Int,
+    durationInMinutes: Int
+) {
+  require(throughput >= 0, "Throughput cannot be negative")
+  require(durationInMinutes > 0, "Duration in minutes must be positive")
+}
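
The `require` calls in these parameter classes run in the constructor, so an invalid configuration should fail when the parameters are built rather than partway through a run. The sketch below repeats the case class so that it is self-contained. The wrapper object `ParameterValidationSketch` and the helper `expectedOperations` are hypothetical additions for illustration, and the operation count assumes the simulation starts exactly `throughput` operations per second for the full duration, which this diff does not show.

```scala
import scala.util.Try

object ParameterValidationSketch {

  // Copy of the case class from the diff, plus one hypothetical helper.
  final case class ReadUpdateTreeDatasetParameters(throughput: Int, durationInMinutes: Int) {
    require(throughput >= 0, "Throughput cannot be negative")
    require(durationInMinutes > 0, "Duration in minutes must be positive")

    // Hypothetical: number of operations started if `throughput` operations
    // begin every second for the whole duration.
    def expectedOperations: Long = throughput.toLong * durationInMinutes * 60L
  }

  def main(args: Array[String]): Unit = {
    // Defaults from benchmark-defaults.conf: 100 operations per second for 5 minutes.
    val defaults = ReadUpdateTreeDatasetParameters(throughput = 100, durationInMinutes = 5)
    println(defaults.expectedOperations)

    // A zero duration violates the second requirement and throws IllegalArgumentException.
    val invalid = Try(ReadUpdateTreeDatasetParameters(throughput = 100, durationInMinutes = 0))
    println(invalid.isFailure)
  }
}
```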

benchmarks/src/gatling/scala/org/apache/polaris/benchmarks/parameters/WorkloadParameters.scala

Lines changed: 4 additions & 1 deletion
@@ -24,7 +24,10 @@ case class WorkloadParameters(
     updatesPerNamespace: Int,
     updatesPerTable: Int,
     updatesPerView: Int,
-    seed: Long
+    seed: Long,
+    readTreeDataset: ReadTreeDatasetParameters,
+    createTreeDataset: CreateTreeDatasetParameters,
+    readUpdateTreeDataset: ReadUpdateTreeDatasetParameters
 ) {
   require(
     readWriteRatio >= 0.0 && readWriteRatio <= 1.0,

benchmarks/src/gatling/scala/org/apache/polaris/benchmarks/simulations/CreateTreeDataset.scala

Lines changed: 58 additions & 5 deletions
@@ -21,6 +21,7 @@ package org.apache.polaris.benchmarks.simulations

 import io.gatling.core.Predef._
 import io.gatling.core.structure.ScenarioBuilder
+import io.gatling.http.Predef._
 import org.apache.polaris.benchmarks.actions._
 import org.apache.polaris.benchmarks.parameters.BenchmarkConfig.config
 import org.apache.polaris.benchmarks.parameters.{
@@ -30,7 +31,8 @@ import org.apache.polaris.benchmarks.parameters.{
 }
 import org.slf4j.LoggerFactory

-import java.util.concurrent.atomic.{AtomicInteger, AtomicReference}
+import java.util.concurrent.atomic.{AtomicBoolean, AtomicInteger, AtomicReference}
+import scala.concurrent.duration._

 /**
  * This simulation is a 100% write workload that creates a tree dataset in Polaris. It is intended
@@ -51,6 +53,7 @@ class CreateTreeDataset extends Simulation {
   // --------------------------------------------------------------------------------
   private val numNamespaces: Int = dp.nAryTree.numberOfNodes
   private val accessToken: AtomicReference[String] = new AtomicReference()
+  private val shouldRefreshToken: AtomicBoolean = new AtomicBoolean(true)

   private val authenticationActions = AuthenticationActions(cp, accessToken, 5, Set(500))
   private val catalogActions = CatalogActions(dp, accessToken, 0, Set())
@@ -64,11 +67,31 @@ class CreateTreeDataset extends Simulation {
   private val createdViews = new AtomicInteger()

   // --------------------------------------------------------------------------------
-  // Workload: Authenticate and store the access token for later use
+  // Authentication related workloads:
+  // * Authenticate and store the access token for later use every minute
+  // * Wait for an OAuth token to be available
+  // * Stop the token refresh loop
   // --------------------------------------------------------------------------------
-  val authenticate: ScenarioBuilder = scenario("Authenticate using the OAuth2 REST API endpoint")
-    .feed(authenticationActions.feeder())
-    .exec(authenticationActions.authenticateAndSaveAccessToken)
+  val continuouslyRefreshOauthToken: ScenarioBuilder =
+    scenario("Authenticate every minute using the Iceberg REST API")
+      .asLongAs(_ => shouldRefreshToken.get()) {
+        feed(authenticationActions.feeder())
+          .exec(authenticationActions.authenticateAndSaveAccessToken)
+          .pause(1.minute)
+      }
+
+  val waitForAuthentication: ScenarioBuilder =
+    scenario("Wait for the authentication token to be available")
+      .asLongAs(_ => accessToken.get() == null) {
+        pause(1.second)
+      }
+
+  val stopRefreshingToken: ScenarioBuilder =
+    scenario("Stop refreshing the authentication token")
+      .exec { session =>
+        shouldRefreshToken.set(false)
+        session
+      }

   // --------------------------------------------------------------------------------
   // Workload: Create catalogs
@@ -118,4 +141,34 @@ class CreateTreeDataset extends Simulation {
       feed(viewActions.viewCreationFeeder())
         .exec(viewActions.createView)
     )
+
+  // --------------------------------------------------------------------------------
+  // Build up the HTTP protocol configuration and set up the simulation
+  // --------------------------------------------------------------------------------
+  private val httpProtocol = http
+    .baseUrl(cp.baseUrl)
+    .acceptHeader("application/json")
+    .contentTypeHeader("application/json")
+
+  // Get the configured throughput for tables and views
+  private val tableThroughput = wp.createTreeDataset.tableThroughput
+  private val viewThroughput = wp.createTreeDataset.viewThroughput
+
+  setUp(
+    continuouslyRefreshOauthToken.inject(atOnceUsers(1)).protocols(httpProtocol),
+    waitForAuthentication
+      .inject(atOnceUsers(1))
+      .andThen(createCatalogs.inject(atOnceUsers(1)).protocols(httpProtocol))
+      .andThen(
+        createNamespaces
+          .inject(
+            constantUsersPerSec(1).during(1.seconds),
+            constantUsersPerSec(dp.nsWidth - 1).during(dp.nsDepth.seconds)
+          )
+          .protocols(httpProtocol)
+      )
+      .andThen(createTables.inject(atOnceUsers(tableThroughput)).protocols(httpProtocol))
+      .andThen(createViews.inject(atOnceUsers(viewThroughput)).protocols(httpProtocol))
+      .andThen(stopRefreshingToken.inject(atOnceUsers(1)).protocols(httpProtocol))
+  )
 }
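
The three authentication scenarios above coordinate through shared state: a refresher keeps writing a fresh token into an `AtomicReference`, the workload waits until that reference is non-null, and the final step clears an `AtomicBoolean` so the refresher can stop. The sketch below reproduces that pattern with plain JVM threads rather than Gatling, so it can run standalone. `TokenRefreshSketch`, `run`, and `fetchToken` are hypothetical names. The thread interrupt on shutdown is an addition that the Gatling version does not have; there the loop would presumably end once the current pause finishes.

```scala
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.{AtomicBoolean, AtomicReference}

object TokenRefreshSketch {

  /** Runs `work` while a background thread refreshes the shared token at a fixed interval. */
  def run(fetchToken: () => String, refreshEvery: Long, unit: TimeUnit)(
      work: AtomicReference[String] => Unit
  ): Unit = {
    val accessToken = new AtomicReference[String]()
    val shouldRefreshToken = new AtomicBoolean(true)

    // Counterpart of continuouslyRefreshOauthToken: fetch, store, pause, repeat.
    // No error handling: if fetchToken throws, the thread dies and no token arrives.
    val refresher = new Thread(() => {
      while (shouldRefreshToken.get()) {
        accessToken.set(fetchToken())
        try unit.sleep(refreshEvery)
        catch { case _: InterruptedException => () } // woken up early to re-check the flag
      }
    })
    refresher.setDaemon(true)
    refresher.start()

    // Counterpart of waitForAuthentication: block until the first token is stored.
    while (accessToken.get() == null) Thread.sleep(10)

    try work(accessToken)
    finally {
      // Counterpart of stopRefreshingToken, plus an interrupt to cut the pause short.
      shouldRefreshToken.set(false)
      refresher.interrupt()
      refresher.join(1000)
    }
  }
}
```

The workload reads the token through the reference on every use instead of copying it once, which is what lets a run outlive the validity period of any single token.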
