|
1 | 1 | ---
|
2 | 2 | hidden: true
|
3 |
| -title: Upgrading EMR Clusters |
| 3 | +title: Updating EMR Clusters |
4 | 4 | ---
|
5 | 5 | {% include content/plan-grid.md name="data-lakes" %}
|
6 | 6 |
|
7 |
| -# Upgrading EMR Clusters |
8 |
| -This document contains the instructions to manually update an existing Segment |
9 |
| -Data Lake destination to use a new v5.33.0 EMR cluster. The Segment Data Lake on the new version will continue to use the Glue data catalog you have previously configured. |
| 7 | +# Updating EMR Clusters |
| 8 | +You can manually update an existing Segment Data Lake destination to use a v5.33.0 EMR cluster. |
| 9 | +The Segment Data Lake on the new version will continue to use the Glue data catalog you have previously configured. |
10 | 10 |
|
11 |
| -By updating your EMR cluster from 5.27.0 to 5.33.0, you can participate in [AWS Lake Formation](https://aws.amazon.com/lake-formation/?whats-new-cards.sort-by=item.additionalFields.postDateTime&whats-new-cards.sort-order=desc). Clusters running version 5.33.0 also allow for faster Parquet jobs and dynamic auto-scaling. |
| 11 | +By updating your EMR cluster from 5.27.0 to 5.33.0, you can participate in [AWS Lake Formation](https://aws.amazon.com/lake-formation/?whats-new-cards.sort-by=item.additionalFields.postDateTime&whats-new-cards.sort-order=desc), use dynamic auto-scaling, and experience faster Parquet jobs. |
12 | 12 |
|
13 | 13 | > info""
|
14 |
| -> Your Segment Data Lake does not need to be disabled during the upgrade process, and any ongoing syncs will complete on the old cluster. Any syncs that fail while you are setting up a new EMR cluster will be restarted on the new cluster. |
| 14 | +> Your Segment Data Lake does not need to be disabled during the update process, and any ongoing syncs will complete on the old cluster. Any syncs that fail while you are setting up a new EMR cluster will be restarted on the new cluster. |
15 | 15 |
|
16 | 16 | ## Prerequisites
|
17 | 17 | * S3 bucket with a lifecycle rule of 14 days
|
18 |
| -* An EMR cluster version 5.33.0 (for help creating an v 5.33.0 EMR cluster, please see [Configure the Data Lakes AWS Environment](data-lakes-manual-setup.md)) |
| 18 | +* An EMR v5.33.0 cluster (for instructions on creating an EMR cluster, please see [Configure the Data Lakes AWS Environment](data-lakes-manual-setup.md)) |
19 | 19 |
|
20 | 20 | ## Procedure
|
21 |
| -1. Open your Segment App workspace and select your Data Lakes destination. |
22 |
| -2. On the Settings tab, select EMR Cluster ID field and enter the ID of your new EMR cluster. For more information about your EMR Cluster, please see Amazon's [View cluster status and details](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-manage-view-clusters.html) documentation. <br/> |
23 |
| -**Note:** Your Glue Catalog ID, IAM Role ARN, and Glue database name should remain the same. |
| 21 | +1. Open your Segment App workspace and select the Data Lakes destination. |
| 22 | +2. On the Settings tab, select EMR Cluster ID field and enter the ID of your v5.33.0 EMR cluster. For help finding the cluster ID, please see Amazon's [View cluster status and details](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-manage-view-clusters.html) documentation. <br/> |
| 23 | +**Note:** Your Glue Catalog ID, IAM Role ARN, and Glue database name fields in Segment should remain the same. |
24 | 24 | 3. Select **Save**.
|
25 |
| -4. You can delete your old EMR cluster from AWS when the following conditions have been met: |
| 25 | +4. View the EMR cluster in the AWS EMR Clusters page to verify the cluster is working correctly. |
| 26 | +5. Delete your v5.27.0 EMR cluster from AWS after the following conditions have been met: |
26 | 27 | * You have updated all Data Lakes to use the EMR cluster
|
27 | 28 | * A sync has successfully completed in the new cluster
|
28 | 29 | * Data is synced into the new cluster
|
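
Step 4 of the updated procedure asks the reader to confirm in the AWS console that the new cluster is working. The same check could be scripted. Below is a minimal sketch, assuming boto3 is installed and AWS credentials are configured; the helper names `check_cluster` and `fetch_cluster` are hypothetical and not part of the Segment docs. It relies only on the `ReleaseLabel` and `Status.State` fields of the EMR `describe_cluster` response, and treating `WAITING` or `RUNNING` as healthy is an assumption of this sketch.

```python
from typing import Any, Dict, List

TARGET_RELEASE = "emr-5.33.0"            # release label for an EMR v5.33.0 cluster
HEALTHY_STATES = {"WAITING", "RUNNING"}  # assumption: states treated as "working"


def check_cluster(response: Dict[str, Any],
                  target_release: str = TARGET_RELEASE) -> List[str]:
    """Return a list of problems found in a describe_cluster response.

    An empty list means the cluster has the expected release label
    and is in one of the states this sketch treats as healthy.
    """
    cluster = response.get("Cluster", {})
    problems: List[str] = []

    release = cluster.get("ReleaseLabel")
    if release != target_release:
        problems.append(
            f"release label is {release!r}, expected {target_release!r}"
        )

    state = cluster.get("Status", {}).get("State")
    if state not in HEALTHY_STATES:
        problems.append(
            f"cluster state is {state!r}, expected one of {sorted(HEALTHY_STATES)}"
        )

    return problems


def fetch_cluster(cluster_id: str) -> Dict[str, Any]:
    """Fetch cluster details from AWS (requires boto3 and credentials)."""
    import boto3  # imported lazily so check_cluster can be used without boto3

    return boto3.client("emr").describe_cluster(ClusterId=cluster_id)
```

With a real cluster ID in place of the placeholder, `check_cluster(fetch_cluster("j-XXXXXXXXXXXXX"))` would return an empty list when the cluster reports release `emr-5.33.0` and a healthy state. This check does not cover the other conditions in step 5, such as a completed sync on the new cluster.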