Commit c53e51d

feat(connector): Upgrade AWS Glue to AWS SDK v2 and Migrate to MetricPublisher (#26670)
## Description

This PR upgrades the Glue client to AWS SDK v2 and updates the GlueHiveMetastore used by the Presto Lakehouse connectors accordingly.

### Key Changes

- Migrates Glue client usage from SDK v1 to v2.
- In SDK v1, the async client extended the sync client. In SDK v2, the async client exposes only async methods returning CompletableFuture.
- Added a utility method to safely block on async calls (join() / get()) for synchronous use cases.
- Replaces v1 RequestMetricCollector with v2 MetricPublisher for metric collection.
- Updates metric names and structures where necessary.

## Motivation and Context

#26668 and #25529

## Impact

`hive.metastore.glue.pin-client-to-current-region` is deprecated as AWS Region SDK v2 no longer supports the Regions.getCurrentRegion API. The current region will be inferred automatically if Presto is running on an EC2 machine.

## Test Plan

Sample JMX metrics output for the new metrics published via the MetricPublisher

```
presto:imjalpreet_db> select "awsapicallduration.alltime.avg", "awsbackoffdelayduration.alltime.avg", "awsservicecallduration.alltime.avg", "awsrequestcount.totalcount", "awsretrycount.totalcount", "awsthrottleexceptions.totalcount" from jmx.current."com.facebook.presto.hive.metastore.glue:name=ahana_oss,type=gluehivemetastore";
 awsapicallduration.alltime.avg | awsbackoffdelayduration.alltime.avg | awsservicecallduration.alltime.avg | awsrequestcount.totalcount | awsretrycount.totalcount | awsthrottleexceptions.totalcount
--------------------------------+-------------------------------------+------------------------------------+----------------------------+--------------------------+----------------------------------
              643.0344827586207 |                                 0.0 |                  640.1724137931035 |                         29 |                        0 |                                0
(1 row)

Query 20251210_203809_00008_z6sib, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
[Latency: client-side: 0:01, server-side: 0:01] [1 rows, 48B] [1 rows/s, 69B/s]
```

Ran the TestHiveClientGlueMetastore suite with AWS Glue, and below are the test results

[TestHiveClientGlueMetastore.html](https://github.com/user-attachments/files/24088278/TestHiveClientGlueMetastore.html)

## Contributor checklist

- [x] Please make sure your submission complies with our [contributing guide](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md), in particular [code style](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#code-style) and [commit standards](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#commit-standards).
- [x] PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
- [x] Documented new properties (with its default value), SQL syntax, functions, or other functionality.
- [x] If release notes are required, they follow the [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines).
- [x] Adequate tests were added if applicable.
- [x] CI passed.
- [ ] If adding new dependencies, verified they have an [OpenSSF Scorecard](https://securityscorecards.dev/#the-checks) score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

## Release Notes

Please follow [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines) and fill in the release notes below.

```
== RELEASE NOTES ==

Hive Connector Changes
* Upgrade AWS Glue Client to AWS SDK v2

Iceberg Connector Changes
* Upgrade AWS Glue Client to AWS SDK v2

Delta Connector Changes
* Upgrade AWS Glue Client to AWS SDK v2

Hudi Connector Changes
* Upgrade AWS Glue Client to AWS SDK v2
```
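The description mentions a utility method that blocks on async calls for synchronous use cases, but the method itself is not shown in this excerpt. A minimal sketch of how such a helper could look, using only the JDK, is given here. The class name `AsyncCalls` and the method name `awaitResult` are hypothetical and are not taken from the commit.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical helper for illustration; not the utility method added by this commit.
final class AsyncCalls
{
    private AsyncCalls() {}

    // Blocks until the future completes and returns its value.
    // join() wraps failures in CompletionException, so the underlying
    // unchecked cause is rethrown to keep caller-side error handling simple.
    static <T> T awaitResult(CompletableFuture<T> future)
    {
        try {
            return future.join();
        }
        catch (CompletionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof RuntimeException) {
                throw (RuntimeException) cause;
            }
            if (cause instanceof Error) {
                throw (Error) cause;
            }
            // Checked causes stay wrapped.
            throw e;
        }
    }

    public static void main(String[] args)
    {
        System.out.println(awaitResult(CompletableFuture.completedFuture("ok")));
    }
}
```

Unwrapping the cause is one possible design choice: it lets synchronous callers catch the same exception types they would see from a blocking client, at the cost of losing the `CompletionException` frame.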
1 parent b096c68 commit c53e51d

15 files changed

+1107
-713
lines changed

presto-hive-common/pom.xml

Lines changed: 20 additions & 0 deletions
@@ -36,6 +36,11 @@
             <artifactId>log</artifactId>
         </dependency>

+        <dependency>
+            <groupId>com.facebook.airlift</groupId>
+            <artifactId>stats</artifactId>
+        </dependency>
+
         <dependency>
             <groupId>com.google.errorprone</groupId>
             <artifactId>error_prone_annotations</artifactId>
@@ -97,6 +102,21 @@
             <artifactId>jakarta.inject-api</artifactId>
         </dependency>

+        <dependency>
+            <groupId>software.amazon.awssdk</groupId>
+            <artifactId>metrics-spi</artifactId>
+        </dependency>
+
+        <dependency>
+            <groupId>software.amazon.awssdk</groupId>
+            <artifactId>sdk-core</artifactId>
+        </dependency>
+
+        <dependency>
+            <groupId>org.weakref</groupId>
+            <artifactId>jmxutils</artifactId>
+        </dependency>
+
         <dependency>
             <groupId>com.facebook.presto</groupId>
             <artifactId>presto-hdfs-core</artifactId>
Lines changed: 157 additions & 0 deletions
@@ -0,0 +1,157 @@
+/*
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package com.facebook.presto.hive.aws.metrics;
+
+import com.facebook.airlift.stats.CounterStat;
+import com.facebook.airlift.stats.TimeStat;
+import com.google.errorprone.annotations.ThreadSafe;
+import org.weakref.jmx.Managed;
+import org.weakref.jmx.Nested;
+import software.amazon.awssdk.metrics.MetricCollection;
+import software.amazon.awssdk.metrics.MetricPublisher;
+
+import java.time.Duration;
+
+import static java.time.Duration.ZERO;
+import static java.util.Objects.requireNonNull;
+import static java.util.concurrent.TimeUnit.MILLISECONDS;
+import static software.amazon.awssdk.core.internal.metrics.SdkErrorType.THROTTLING;
+import static software.amazon.awssdk.core.metrics.CoreMetric.API_CALL_DURATION;
+import static software.amazon.awssdk.core.metrics.CoreMetric.BACKOFF_DELAY_DURATION;
+import static software.amazon.awssdk.core.metrics.CoreMetric.ERROR_TYPE;
+import static software.amazon.awssdk.core.metrics.CoreMetric.RETRY_COUNT;
+import static software.amazon.awssdk.core.metrics.CoreMetric.SERVICE_CALL_DURATION;
+
+/**
+ * For reference on AWS SDK v2 Metrics: https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/metrics-list.html
+ * Metrics Publisher: https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/metrics.html
+ */
+@ThreadSafe
+public final class AwsSdkClientStats
+{
+    private final CounterStat awsRequestCount = new CounterStat();
+    private final CounterStat awsRetryCount = new CounterStat();
+    private final CounterStat awsThrottleExceptions = new CounterStat();
+    private final TimeStat awsServiceCallDuration = new TimeStat(MILLISECONDS);
+    private final TimeStat awsApiCallDuration = new TimeStat(MILLISECONDS);
+    private final TimeStat awsBackoffDelayDuration = new TimeStat(MILLISECONDS);
+
+    @Managed
+    @Nested
+    public CounterStat getAwsRequestCount()
+    {
+        return awsRequestCount;
+    }
+
+    @Managed
+    @Nested
+    public CounterStat getAwsRetryCount()
+    {
+        return awsRetryCount;
+    }
+
+    @Managed
+    @Nested
+    public CounterStat getAwsThrottleExceptions()
+    {
+        return awsThrottleExceptions;
+    }
+
+    @Managed
+    @Nested
+    public TimeStat getAwsServiceCallDuration()
+    {
+        return awsServiceCallDuration;
+    }
+
+    @Managed
+    @Nested
+    public TimeStat getAwsApiCallDuration()
+    {
+        return awsApiCallDuration;
+    }
+
+    @Managed
+    @Nested
+    public TimeStat getAwsBackoffDelayDuration()
+    {
+        return awsBackoffDelayDuration;
+    }
+
+    public AwsSdkClientRequestMetricsPublisher newRequestMetricsPublisher()
+    {
+        return new AwsSdkClientRequestMetricsPublisher(this);
+    }
+
+    public static class AwsSdkClientRequestMetricsPublisher
+            implements MetricPublisher
+    {
+        private final AwsSdkClientStats stats;
+
+        protected AwsSdkClientRequestMetricsPublisher(AwsSdkClientStats stats)
+        {
+            this.stats = requireNonNull(stats, "stats is null");
+        }
+
+        @Override
+        public void publish(MetricCollection metricCollection)
+        {
+            long requestCount = metricCollection.metricValues(RETRY_COUNT)
+                    .stream()
+                    .map(i -> i + 1)
+                    .reduce(Integer::sum).orElse(0);
+
+            stats.awsRequestCount.update(requestCount);
+
+            long retryCount = metricCollection.metricValues(RETRY_COUNT)
+                    .stream()
+                    .reduce(Integer::sum).orElse(0);
+
+            stats.awsRetryCount.update(retryCount);
+
+            long throttleExceptions = metricCollection
+                    .childrenWithName("ApiCallAttempt")
+                    .flatMap(mc -> mc.metricValues(ERROR_TYPE).stream())
+                    .filter(s -> s.equals(THROTTLING.toString()))
+                    .count();
+
+            stats.awsThrottleExceptions.update(throttleExceptions);
+
+            Duration serviceCallDuration = metricCollection
+                    .childrenWithName("ApiCallAttempt")
+                    .flatMap(mc -> mc.metricValues(SERVICE_CALL_DURATION).stream())
+                    .reduce(Duration::plus).orElse(ZERO);
+
+            stats.awsServiceCallDuration.add(serviceCallDuration.toMillis(), MILLISECONDS);
+
+            Duration apiCallDuration = metricCollection
+                    .metricValues(API_CALL_DURATION)
+                    .stream().reduce(Duration::plus).orElse(ZERO);
+
+            stats.awsApiCallDuration.add(apiCallDuration.toMillis(), MILLISECONDS);
+
+            Duration backoffDelayDuration = metricCollection
+                    .childrenWithName("ApiCallAttempt")
+                    .flatMap(mc -> mc.metricValues(BACKOFF_DELAY_DURATION).stream())
+                    .reduce(Duration::plus).orElse(ZERO);
+
+            stats.awsBackoffDelayDuration.add(backoffDelayDuration.toMillis(), MILLISECONDS);
+        }
+
+        @Override
+        public void close()
+        {
+        }
+    }
+}
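The `publish` method above derives the request count from the reported retry counts: each API call contributes its retries plus one initial attempt. The arithmetic can be sketched in isolation with plain JDK collections. The class `RetryCountArithmetic` and the sample values are assumptions made for illustration, with a `List<Integer>` standing in for the values read from the SDK's `MetricCollection`.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative stand-in: a List<Integer> replaces the RETRY_COUNT values
// that the real publisher reads from the SDK's MetricCollection.
final class RetryCountArithmetic
{
    private RetryCountArithmetic() {}

    // Every API call makes one initial attempt plus its retries.
    static long totalRequests(List<Integer> retryCounts)
    {
        return retryCounts.stream().mapToLong(retries -> retries + 1L).sum();
    }

    // Retries alone, without the initial attempts.
    static long totalRetries(List<Integer> retryCounts)
    {
        return retryCounts.stream().mapToLong(Integer::longValue).sum();
    }

    public static void main(String[] args)
    {
        // Hypothetical sample: one call with no retries, one call with two retries.
        List<Integer> retryCounts = Arrays.asList(0, 2);
        System.out.println("requests=" + totalRequests(retryCounts)); // requests=4
        System.out.println("retries=" + totalRetries(retryCounts));   // retries=2
    }
}
```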

presto-hive-metastore/pom.xml

Lines changed: 54 additions & 4 deletions
@@ -124,13 +124,63 @@
         </dependency>

         <dependency>
-            <groupId>com.amazonaws</groupId>
-            <artifactId>aws-java-sdk-glue</artifactId>
+            <groupId>software.amazon.awssdk</groupId>
+            <artifactId>auth</artifactId>
         </dependency>

         <dependency>
-            <groupId>com.amazonaws</groupId>
-            <artifactId>aws-java-sdk-sts</artifactId>
+            <groupId>software.amazon.awssdk</groupId>
+            <artifactId>aws-core</artifactId>
+        </dependency>
+
+        <dependency>
+            <groupId>software.amazon.awssdk</groupId>
+            <artifactId>glue</artifactId>
+        </dependency>
+
+        <dependency>
+            <groupId>software.amazon.awssdk</groupId>
+            <artifactId>http-client-spi</artifactId>
+        </dependency>
+
+        <dependency>
+            <groupId>software.amazon.awssdk</groupId>
+            <artifactId>metrics-spi</artifactId>
+        </dependency>
+
+        <dependency>
+            <groupId>software.amazon.awssdk</groupId>
+            <artifactId>netty-nio-client</artifactId>
+        </dependency>
+
+        <dependency>
+            <groupId>software.amazon.awssdk</groupId>
+            <artifactId>regions</artifactId>
+        </dependency>
+
+        <dependency>
+            <groupId>software.amazon.awssdk</groupId>
+            <artifactId>sdk-core</artifactId>
+        </dependency>
+
+        <dependency>
+            <groupId>software.amazon.awssdk</groupId>
+            <artifactId>sts</artifactId>
+        </dependency>
+
+        <dependency>
+            <groupId>software.amazon.awssdk</groupId>
+            <artifactId>utils</artifactId>
+        </dependency>
+
+        <dependency>
+            <groupId>software.amazon.awssdk</groupId>
+            <artifactId>retries-spi</artifactId>
+        </dependency>
+
+        <dependency>
+            <groupId>software.amazon.awssdk</groupId>
+            <artifactId>retries</artifactId>
         </dependency>

         <dependency>

presto-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/glue/GlueCatalogApiStats.java

Lines changed: 6 additions & 18 deletions
@@ -13,8 +13,6 @@
  */
 package com.facebook.presto.hive.metastore.glue;

-import com.amazonaws.AmazonWebServiceRequest;
-import com.amazonaws.handlers.AsyncHandler;
 import com.facebook.airlift.stats.CounterStat;
 import com.facebook.airlift.stats.TimeStat;
 import com.google.errorprone.annotations.ThreadSafe;
@@ -24,6 +22,7 @@
 import java.util.function.Supplier;

 import static java.util.concurrent.TimeUnit.MILLISECONDS;
+import static java.util.concurrent.TimeUnit.NANOSECONDS;

 @ThreadSafe
 public class GlueCatalogApiStats
@@ -53,23 +52,12 @@ public void record(Runnable action)
         }
     }

-    public <R extends AmazonWebServiceRequest, T> AsyncHandler<R, T> metricsAsyncHandler()
+    public void recordAsync(long executionTimeNanos, boolean failed)
     {
-        return new AsyncHandler<R, T>() {
-            private final TimeStat.BlockTimer timer = time.time();
-            @Override
-            public void onError(Exception exception)
-            {
-                timer.close();
-                recordException(exception);
-            }
-
-            @Override
-            public void onSuccess(R request, T result)
-            {
-                timer.close();
-            }
-        };
+        time.add(executionTimeNanos, NANOSECONDS);
+        if (failed) {
+            totalFailures.update(1);
+        }
     }

     @Managed
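The diff replaces the v1 `AsyncHandler` callback with `recordAsync(long executionTimeNanos, boolean failed)`, but the call site is not part of this excerpt. One plausible way to feed such a method from a `CompletableFuture` is sketched here under stated assumptions: `SimpleStats` is a stand-in for `GlueCatalogApiStats`, and the `timed` wrapper is hypothetical. Only the `recordAsync` signature comes from the diff.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: SimpleStats stands in for GlueCatalogApiStats, and timed(...)
// is a hypothetical wrapper, not code from this commit.
final class AsyncTimingSketch
{
    private AsyncTimingSketch() {}

    static final class SimpleStats
    {
        final AtomicLong calls = new AtomicLong();
        final AtomicLong failures = new AtomicLong();
        final AtomicLong totalNanos = new AtomicLong();

        // Same signature as the recordAsync method shown in the diff.
        void recordAsync(long executionTimeNanos, boolean failed)
        {
            calls.incrementAndGet();
            totalNanos.addAndGet(executionTimeNanos);
            if (failed) {
                failures.incrementAndGet();
            }
        }
    }

    // Measures elapsed time from the moment of wrapping until completion
    // and reports whether the future finished exceptionally.
    static <T> CompletableFuture<T> timed(SimpleStats stats, CompletableFuture<T> future)
    {
        long start = System.nanoTime();
        return future.whenComplete((result, error) ->
                stats.recordAsync(System.nanoTime() - start, error != null));
    }

    public static void main(String[] args)
    {
        SimpleStats stats = new SimpleStats();
        timed(stats, CompletableFuture.completedFuture("table")).join();
        System.out.println("calls=" + stats.calls.get() + " failures=" + stats.failures.get());
    }
}
```

Because the timer starts when the future is wrapped, a caller would need to wrap immediately after issuing the request for the measurement to approximate the call duration.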
