articles/data-factory/connector-hive.md
Lines changed: 108 additions & 2 deletions
@@ -6,7 +6,7 @@ author: jianleishen
6
6
ms.subservice: data-movement
7
7
ms.custom: synapse
8
8
ms.topic: conceptual
9
-
ms.date: 09/12/2024
9
+
ms.date: 05/27/2025
10
10
ms.author: jianleishen
11
11
---
12
12
@@ -15,6 +15,9 @@ ms.author: jianleishen
15
15
16
16
This article outlines how to use the Copy Activity in an Azure Data Factory or Synapse Analytics pipeline to copy data from Hive. It builds on the [copy activity overview](copy-activity-overview.md) article that presents a general overview of copy activity.
17
17
18
+
> [!IMPORTANT]
19
+
> The Hive connector version 2.0 (Preview) provides improved native Hive support. If you are using the Hive connector version 1.0 in your solution, please [upgrade your Hive connector](#upgrade-the-hive-connector) before **September 30, 2025**. Refer to this [section](#differences-between-hive-version-20-and-version-10) for details on the differences between version 2.0 (Preview) and version 1.0.
20
+
18
21
## Supported capabilities
19
22
20
23
This Hive connector is supported for the following capabilities:
@@ -71,7 +74,65 @@ The following sections provide details about properties that are used to define
71
74
72
75
## Linked service properties
73
76
74
-
The following properties are supported for Hive linked service:
77
+
The Hive connector now supports version 2.0 (Preview). Refer to this [section](#upgrade-the-hive-connector) to upgrade your Hive connector version from version 1.0. For the property details, see the corresponding sections.
78
+
79
+
- [Version 2.0 (Preview)](#version-20)
80
+
- [Version 1.0](#version-10)
81
+
82
+
### <a name="version-20"></a> Version 2.0 (Preview)
83
+
84
+
The Hive linked service supports the following properties when you apply version 2.0 (Preview):
85
+
86
+
| Property | Description | Required |
87
+
|:--- |:--- |:--- |
88
+
| type | The type property must be set to: **Hive** | Yes |
89
+
| version | The version that you specify. The value is `2.0`. | Yes |
90
+
| host | IP address or host name of the Hive server. | Yes |
91
+
| port | The TCP port that the Hive server uses to listen for client connections. If you connect to Azure HDInsight, specify port as 443. | Yes |
92
+
| serverType | The type of Hive server. <br/>Allowed value is: **HiveServer2** | No |
93
+
| thriftTransportProtocol | The transport protocol to use in the Thrift layer. <br/>Allowed values are: **Binary**, **SASL**, **HTTP** | No |
94
+
| authenticationType | The authentication method used to access the Hive server. <br/>Allowed values are: **Anonymous**, **UsernameAndPassword**, **WindowsAzureHDInsightService**. Kerberos authentication is not currently supported. | Yes |
95
+
| username | The user name that you use to access the Hive server. | No |
96
+
| password | The password corresponding to the user. Mark this field as a SecureString to store it securely, or [reference a secret stored in Azure Key Vault](store-credentials-in-key-vault.md). | No |
97
+
| httpPath | The partial URL corresponding to the Hive server. | No |
98
+
| enableSsl | Specifies whether the connections to the server are encrypted using TLS. The default value is true. | No |
99
+
| enableServerCertificateValidation | Specifies whether to validate the server's SSL certificate when you connect. The System Trust Store is always used. The default value is true. | No |
100
+
| storageReference | A reference to the linked service of the storage account used for staging data in mapping data flow. This is required only when using the Hive linked service in mapping data flow. | No |
101
+
| connectVia | The [Integration Runtime](concepts-integration-runtime.md) to be used to connect to the data store. Learn more from the [Prerequisites](#prerequisites) section. If not specified, it uses the default Azure Integration Runtime. | No |
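
As a minimal sketch of how these properties fit together, the following linked service definition targets an Azure HDInsight cluster on port 443. The linked service name `HiveLinkedService`, the `<...>` placeholders, and the integration runtime reference are illustrative assumptions. The nesting of connection properties under `typeProperties` follows the general linked service JSON shape, and the pairing of `WindowsAzureHDInsightService` with the `HTTP` transport is likewise an assumption rather than something the table states.

```json
{
    "name": "HiveLinkedService",
    "properties": {
        "type": "Hive",
        "version": "2.0",
        "typeProperties": {
            "host": "<host>",
            "port": 443,
            "serverType": "HiveServer2",
            "thriftTransportProtocol": "HTTP",
            "authenticationType": "WindowsAzureHDInsightService",
            "username": "<username>",
            "password": {
                "type": "SecureString",
                "value": "<password>"
            },
            "httpPath": "<httpPath>",
            "enableSsl": true,
            "enableServerCertificateValidation": true
        },
        "connectVia": {
            "referenceName": "<name of Integration Runtime>",
            "type": "IntegrationRuntimeReference"
        }
    }
}
```

Because `enableSsl` and `enableServerCertificateValidation` both default to true, they could be omitted here; they are spelled out only to make the defaults visible.
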
The following properties are supported for the Hive linked service when you apply version 1.0:
75
136
76
137
| Property | Description | Required |
77
138
|:--- |:--- |:--- |
@@ -241,10 +302,55 @@ source(
241
302
a. Check the `hive.resultset.use.unique.column.names` setting on the Hive server side and set it to false.
242
303
b. Use column mapping to rename the column.
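
As a rough sketch of option b, a copy activity can map a source column to a differently named sink column through an explicit `translator` mapping. This assumes the column arrives with a table-qualified name, which is what Hive produces when `hive.resultset.use.unique.column.names` is true. The column names `mytable.id` and `id` are hypothetical, and the fragment is meant to sit inside the copy activity's `typeProperties`.

```json
{
    "translator": {
        "type": "TabularTranslator",
        "mappings": [
            {
                "source": { "name": "mytable.id" },
                "sink": { "name": "id" }
            }
        ]
    }
}
```
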
243
304
305
+
## Data type mapping for Hive
306
+
307
+
When you copy data from and to Hive, the following interim data type mappings are used within the service. To learn about how the copy activity maps the source schema and data type to the sink, see [Schema and data type mappings](copy-activity-schema-and-type-mapping.md).
308
+
309
+
| Hive data type | Interim service data type (for version 2.0 (Preview)) | Interim service data type (for version 1.0) |
310
+
|:--- |:--- |:--- |
311
+
| TINYINT | SByte | Int16 |
312
+
| SMALLINT | Int16 | Int16 |
313
+
| INT | Int32 | Int32 |
314
+
| BIGINT | Int64 | Int64 |
315
+
| BOOLEAN | Boolean | Boolean |
316
+
| FLOAT | Single | Single |
317
+
| DOUBLE | Double | Double |
318
+
| DECIMAL | Decimal | Decimal |
319
+
| STRING | String | String |
320
+
| VARCHAR | String | String |
321
+
| CHAR | String | String |
322
+
| TIMESTAMP | DateTimeOffset | DateTime |
323
+
| DATE | DateTime | DateTime |
324
+
| BINARY | Byte[] | Byte[] |
325
+
| ARRAY | String | String |
326
+
| MAP | String | String |
327
+
| STRUCT | String | String |
328
+
244
329
## Lookup activity properties
245
330
246
331
To learn details about the properties, check [Lookup activity](control-flow-lookup-activity.md).
247
332
333
+
## Upgrade the Hive connector
334
+
335
+
To upgrade the Hive connector, follow these steps:
336
+
337
+
1. On the **Edit linked service** page, select version 2.0 (Preview) and configure the linked service by referring to [Linked service properties version 2.0](#version-20).
338
+
339
+
2. The data type mapping for the Hive linked service version 2.0 (Preview) differs from that of version 1.0. To learn the latest data type mapping, see [Data type mapping for Hive](#data-type-mapping-for-hive).
340
+
341
+
## <a name="differences-between-hive-version-20-and-version-10"></a> Differences between Hive version 2.0 (Preview) and version 1.0
342
+
343
+
The Hive connector version 2.0 (Preview) offers new functionality and is compatible with most features of version 1.0. The following table shows the feature differences between version 2.0 (Preview) and version 1.0.
344
+
345
+
| Version 2.0 (Preview) | Version 1.0 |
346
+
|:--- |:--- |
347
+
| Using ';' to separate multiple hosts (only when `serviceDiscoveryMode` is enabled) is not supported. | Using ';' to separate multiple hosts (only when `serviceDiscoveryMode` is enabled) is supported. |
348
+
| HiveServer1 and HiveThriftServer are not supported for `serverType`. | HiveServer1 and HiveThriftServer are supported for `serverType`. |
349
+
| The Username authentication type is not supported. <br><br>The SASL transport protocol supports only the UsernameAndPassword authentication type. The Binary transport protocol supports only the Anonymous authentication type. | The Username authentication type is supported. <br><br>The SASL and Binary transport protocols support the Anonymous, Username, UsernameAndPassword, and WindowsAzureHDInsightService authentication types. |
350
+
| `serviceDiscoveryMode`, `zooKeeperNameSpace`, and `useNativeQuery` are not supported. | `serviceDiscoveryMode`, `zooKeeperNameSpace`, and `useNativeQuery` are supported. |
351
+
| The default value of `enableSsl` is true. `trustedCertPath`, `useSystemTrustStore`, `allowHostNameCNMismatch`, and `allowSelfSignedServerCert` are not supported.<br><br>`enableServerCertificateValidation` is supported. | The default value of `enableSsl` is false. `trustedCertPath`, `useSystemTrustStore`, `allowHostNameCNMismatch`, and `allowSelfSignedServerCert` are supported.<br><br>`enableServerCertificateValidation` is not supported. |
352
+
| The following mappings are used from Hive data types to interim service data type.<br><br>TINYINT -> SByte<br>TIMESTAMP -> DateTimeOffset | The following mappings are used from Hive data types to interim service data type.<br><br>TINYINT -> Int16 <br>TIMESTAMP -> DateTime |
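
To illustrate the version 2.0 (Preview) constraints in this table, the sketch below pairs the SASL transport protocol with the UsernameAndPassword authentication type, the only authentication type SASL supports in this version. It relies on `enableServerCertificateValidation` in place of the version 1.0 certificate properties. The linked service name, the `<...>` placeholders, and the `typeProperties` nesting are assumptions for illustration.

```json
{
    "name": "HiveLinkedServiceSasl",
    "properties": {
        "type": "Hive",
        "version": "2.0",
        "typeProperties": {
            "host": "<host>",
            "port": "<port>",
            "serverType": "HiveServer2",
            "thriftTransportProtocol": "SASL",
            "authenticationType": "UsernameAndPassword",
            "username": "<username>",
            "password": {
                "type": "SecureString",
                "value": "<password>"
            },
            "enableSsl": true,
            "enableServerCertificateValidation": true
        }
    }
}
```

A definition carried over from version 1.0 would need properties such as `serviceDiscoveryMode`, `useNativeQuery`, `trustedCertPath`, and `useSystemTrustStore` removed, since the table lists them as unsupported in version 2.0 (Preview).
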
353
+
248
354
249
355
## Related content
250
356
For a list of data stores supported as sources and sinks by the copy activity, see [supported data stores](copy-activity-overview.md#supported-data-stores-and-formats).