
Commit 2d9103c

Merge pull request #89 from MaterializeInc/swap_follow_up_improvements
Follow ups after enabling swap
2 parents ccb42c8 + 17a63ac commit 2d9103c

10 files changed (+109, -161 lines)

README.md

Lines changed: 29 additions & 41 deletions

@@ -61,41 +61,8 @@ You can also set the `AWS_PROFILE` environment variable to the name of the profi
 export AWS_PROFILE=your-profile-name
 ```
 
-## Disk Support for Materialize
-
-This module supports configuring disk support for Materialize using NVMe instance storage and OpenEBS and lgalloc.
-
-When using disk support, you need to use instance types from the `r7gd` or `r6gd` family or other instance types with NVMe instance storage.
-
-### Enabling Disk Support
-
-To enable disk support with default settings:
-
-```hcl
-enable_disk_support = true
-```
-
-This will:
-1. Install OpenEBS via Helm
-2. Configure NVMe instance store volumes using the bootstrap script
-3. Create appropriate storage classes for Materialize
-
 ### Advanced Configuration
 
-In case that you need more control over the disk setup:
-
-```hcl
-enable_disk_support = true
-
-disk_support_config = {
-  openebs_version = "4.3.3"
-  storage_class_name = "custom-storage-class"
-  storage_class_parameters = {
-    volgroup = "custom-volume-group"
-  }
-}
-```
-
 ## `materialize_instances` variable
 
 The `materialize_instances` variable is a list of objects that define the configuration for each Materialize instance.

@@ -139,11 +106,11 @@ These flags configure default limits for clusters, connections, and tables. You
 | <a name="module_certificates"></a> [certificates](#module\_certificates) | ./modules/certificates | n/a |
 | <a name="module_database"></a> [database](#module\_database) | ./modules/database | n/a |
 | <a name="module_eks"></a> [eks](#module\_eks) | ./modules/eks | n/a |
+| <a name="module_materialize_node_group"></a> [materialize\_node\_group](#module\_materialize\_node\_group) | ./modules/eks-node-group | n/a |
 | <a name="module_networking"></a> [networking](#module\_networking) | ./modules/networking | n/a |
 | <a name="module_nlb"></a> [nlb](#module\_nlb) | ./modules/nlb | n/a |
 | <a name="module_operator"></a> [operator](#module\_operator) | github.com/MaterializeInc/terraform-helm-materialize | v0.1.35 |
 | <a name="module_storage"></a> [storage](#module\_storage) | ./modules/storage | n/a |
-| <a name="module_swap_node_group"></a> [swap\_node\_group](#module\_swap\_node\_group) | ./modules/eks-node-group | n/a |
 
 ## Resources
 

@@ -191,17 +158,15 @@ These flags configure default limits for clusters, connections, and tables. You
 | <a name="input_kubernetes_namespace"></a> [kubernetes\_namespace](#input\_kubernetes\_namespace) | The Kubernetes namespace for the Materialize resources | `string` | `"materialize-environment"` | no |
 | <a name="input_log_group_name_prefix"></a> [log\_group\_name\_prefix](#input\_log\_group\_name\_prefix) | Prefix for the CloudWatch log group name (will be combined with environment name) | `string` | `"materialize"` | no |
 | <a name="input_materialize_instances"></a> [materialize\_instances](#input\_materialize\_instances) | Configuration for Materialize instances. Due to limitations in Terraform, `materialize_instances` cannot be defined on the first `terraform apply`. | <pre>list(object({<br/> name = string<br/> namespace = optional(string)<br/> database_name = string<br/> environmentd_version = optional(string)<br/> cpu_request = optional(string, "1")<br/> memory_request = optional(string, "1Gi")<br/> memory_limit = optional(string, "1Gi")<br/> create_database = optional(bool, true)<br/> create_nlb = optional(bool, true)<br/> internal_nlb = optional(bool, true)<br/> enable_cross_zone_load_balancing = optional(bool, true)<br/> in_place_rollout = optional(bool, false)<br/> request_rollout = optional(string)<br/> force_rollout = optional(string)<br/> balancer_memory_request = optional(string, "256Mi")<br/> balancer_memory_limit = optional(string, "256Mi")<br/> balancer_cpu_request = optional(string, "100m")<br/> license_key = optional(string)<br/> authenticator_kind = optional(string, "None")<br/> external_login_password_mz_system = optional(string)<br/> environmentd_extra_args = optional(list(string), [])<br/> }))</pre> | `[]` | no |
+| <a name="input_materialize_node_group_desired_size"></a> [materialize\_node\_group\_desired\_size](#input\_materialize\_node\_group\_desired\_size) | Desired number of worker nodes | `number` | `2` | no |
+| <a name="input_materialize_node_group_instance_types"></a> [materialize\_node\_group\_instance\_types](#input\_materialize\_node\_group\_instance\_types) | Instance types for worker nodes.<br/><br/>Recommended Configuration for Running Materialize with disk:<br/>- Tested instance types: `r6gd`, `r7gd` families (ARM-based Graviton instances)<br/>- Enable disk setup when using `r7gd`<br/>- Note: Ensure instance store volumes are available and attached to the nodes for optimal performance with disk-based workloads. | `list(string)` | <pre>[<br/> "r7gd.2xlarge"<br/>]</pre> | no |
+| <a name="input_materialize_node_group_max_size"></a> [materialize\_node\_group\_max\_size](#input\_materialize\_node\_group\_max\_size) | Maximum number of worker nodes | `number` | `4` | no |
+| <a name="input_materialize_node_group_min_size"></a> [materialize\_node\_group\_min\_size](#input\_materialize\_node\_group\_min\_size) | Minimum number of worker nodes | `number` | `1` | no |
 | <a name="input_metrics_retention_days"></a> [metrics\_retention\_days](#input\_metrics\_retention\_days) | Number of days to retain CloudWatch metrics | `number` | `7` | no |
 | <a name="input_namespace"></a> [namespace](#input\_namespace) | Namespace for all resources, usually the organization or project name | `string` | n/a | yes |
 | <a name="input_network_id"></a> [network\_id](#input\_network\_id) | The ID of the VPC in which resources will be deployed. Only used if create\_vpc is false. | `string` | `""` | no |
 | <a name="input_network_private_subnet_ids"></a> [network\_private\_subnet\_ids](#input\_network\_private\_subnet\_ids) | A list of private subnet IDs in the VPC. Only used if create\_vpc is false. | `list(string)` | `[]` | no |
 | <a name="input_network_public_subnet_ids"></a> [network\_public\_subnet\_ids](#input\_network\_public\_subnet\_ids) | A list of public subnet IDs in the VPC. Only used if create\_vpc is false. | `list(string)` | `[]` | no |
-| <a name="input_node_group_ami_type"></a> [node\_group\_ami\_type](#input\_node\_group\_ami\_type) | AMI type for the node group | `string` | `"AL2023_ARM_64_STANDARD"` | no |
-| <a name="input_node_group_capacity_type"></a> [node\_group\_capacity\_type](#input\_node\_group\_capacity\_type) | Capacity type for worker nodes (ON\_DEMAND or SPOT) | `string` | `"ON_DEMAND"` | no |
-| <a name="input_node_group_desired_size"></a> [node\_group\_desired\_size](#input\_node\_group\_desired\_size) | Desired number of worker nodes | `number` | `2` | no |
-| <a name="input_node_group_instance_types"></a> [node\_group\_instance\_types](#input\_node\_group\_instance\_types) | Instance types for worker nodes.<br/><br/>Recommended Configuration for Running Materialize with disk:<br/>- Tested instance types: `r6gd`, `r7gd` families (ARM-based Graviton instances)<br/>- Enable disk setup when using `r7gd`<br/>- Note: Ensure instance store volumes are available and attached to the nodes for optimal performance with disk-based workloads. | `list(string)` | <pre>[<br/> "r7gd.2xlarge"<br/>]</pre> | no |
-| <a name="input_node_group_max_size"></a> [node\_group\_max\_size](#input\_node\_group\_max\_size) | Maximum number of worker nodes | `number` | `4` | no |
-| <a name="input_node_group_min_size"></a> [node\_group\_min\_size](#input\_node\_group\_min\_size) | Minimum number of worker nodes | `number` | `1` | no |
 | <a name="input_operator_namespace"></a> [operator\_namespace](#input\_operator\_namespace) | Namespace for the Materialize operator | `string` | `"materialize"` | no |
 | <a name="input_operator_version"></a> [operator\_version](#input\_operator\_version) | Version of the Materialize operator to install | `string` | `null` | no |
 | <a name="input_orchestratord_version"></a> [orchestratord\_version](#input\_orchestratord\_version) | Version of the Materialize orchestrator to install | `string` | `null` | no |

@@ -210,7 +175,10 @@ These flags configure default limits for clusters, connections, and tables. You
 | <a name="input_public_subnet_cidrs"></a> [public\_subnet\_cidrs](#input\_public\_subnet\_cidrs) | CIDR blocks for public subnets | `list(string)` | <pre>[<br/> "10.0.101.0/24",<br/> "10.0.102.0/24",<br/> "10.0.103.0/24"<br/>]</pre> | no |
 | <a name="input_service_account_name"></a> [service\_account\_name](#input\_service\_account\_name) | Name of the service account | `string` | `"12345678-1234-1234-1234-123456789012"` | no |
 | <a name="input_single_nat_gateway"></a> [single\_nat\_gateway](#input\_single\_nat\_gateway) | Use a single NAT Gateway for all private subnets | `bool` | `false` | no |
-| <a name="input_swap_enabled"></a> [swap\_enabled](#input\_swap\_enabled) | Enable swap for Materialize. When enabled, this configures swap on a new nodepool, and adds it to the clusterd node selectors. | `bool` | `false` | no |
+| <a name="input_system_node_group_desired_size"></a> [system\_node\_group\_desired\_size](#input\_system\_node\_group\_desired\_size) | Desired number of worker nodes | `number` | `2` | no |
+| <a name="input_system_node_group_instance_types"></a> [system\_node\_group\_instance\_types](#input\_system\_node\_group\_instance\_types) | Instance types for system nodes. | `list(string)` | <pre>[<br/> "r7g.xlarge"<br/>]</pre> | no |
+| <a name="input_system_node_group_max_size"></a> [system\_node\_group\_max\_size](#input\_system\_node\_group\_max\_size) | Maximum number of worker nodes | `number` | `4` | no |
+| <a name="input_system_node_group_min_size"></a> [system\_node\_group\_min\_size](#input\_system\_node\_group\_min\_size) | Minimum number of worker nodes | `number` | `1` | no |
 | <a name="input_tags"></a> [tags](#input\_tags) | Default tags to apply to all resources | `map(string)` | <pre>{<br/> "Environment": "dev",<br/> "Project": "materialize",<br/> "Terraform": "true"<br/>}</pre> | no |
 | <a name="input_use_local_chart"></a> [use\_local\_chart](#input\_use\_local\_chart) | Whether to use a local chart instead of one from a repository | `bool` | `false` | no |
 | <a name="input_use_self_signed_cluster_issuer"></a> [use\_self\_signed\_cluster\_issuer](#input\_use\_self\_signed\_cluster\_issuer) | Whether to install and use a self-signed ClusterIssuer for TLS. To work around limitations in Terraform, this will be treated as `false` if no materialize instances are defined. | `bool` | `true` | no |

@@ -263,6 +231,26 @@ More advanced TLS support using user-provided CAs or per-Materialize `Issuer`s a
 
 ## Upgrade Notes
 
+#### v0.7.0
+
+This is an intermediate version to handle some changes that must be applied in stages.
+It is recommended to upgrade to v0.8.x after upgrading to this version.
+
+Breaking changes:
+* Swap is enabled by default.
+* Support for lgalloc, our legacy spill to disk mechanism, is deprecated, and will be removed in the next version.
+* We now always use two node groups, one for system workloads and one for Materialize workloads.
+* Variables for configuring these node groups have been renamed, so they may be configured separately.
+
+To avoid downtime when upgrading to future versions, you must perform a rollout at this version.
+1. Ensure your `environmentd_version` is at least `v26.0.0`.
+2. Update your `request_rollout` (and `force_rollout` if already at the correct `environmentd_version`).
+3. Run `terraform apply`.
+
+You must upgrade to at least v0.6.x before upgrading to v0.7.0 of this terraform code.
+
+It is strongly recommended to have enabled swap on v0.6.x before upgrading to v0.7.0 or higher.
+
 #### v0.6.1
 
 We now have some initial support for swap.
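For illustration, the v0.7.0 rollout steps added above amount to bumping a couple of fields on each entry in `materialize_instances`. A minimal sketch, assuming an existing instance — the name, database, and rollout value below are placeholders rather than values from this commit, and `request_rollout` is typically set to a fresh UUID:

```hcl
materialize_instances = [
  {
    name                 = "demo"     # hypothetical instance name
    database_name        = "demo_db"  # hypothetical database
    environmentd_version = "v26.0.0"  # step 1: must be at least v26.0.0
    request_rollout      = "11111111-1111-1111-1111-111111111111" # step 2: bump to trigger a rollout
  }
]
```

Then run `terraform apply` (step 3).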

docs/footer.md

Lines changed: 20 additions & 0 deletions

@@ -25,6 +25,26 @@ More advanced TLS support using user-provided CAs or per-Materialize `Issuer`s a
 
 ## Upgrade Notes
 
+#### v0.7.0
+
+This is an intermediate version to handle some changes that must be applied in stages.
+It is recommended to upgrade to v0.8.x after upgrading to this version.
+
+Breaking changes:
+* Swap is enabled by default.
+* Support for lgalloc, our legacy spill to disk mechanism, is deprecated, and will be removed in the next version.
+* We now always use two node groups, one for system workloads and one for Materialize workloads.
+* Variables for configuring these node groups have been renamed, so they may be configured separately.
+
+To avoid downtime when upgrading to future versions, you must perform a rollout at this version.
+1. Ensure your `environmentd_version` is at least `v26.0.0`.
+2. Update your `request_rollout` (and `force_rollout` if already at the correct `environmentd_version`).
+3. Run `terraform apply`.
+
+You must upgrade to at least v0.6.x before upgrading to v0.7.0 of this terraform code.
+
+It is strongly recommended to have enabled swap on v0.6.x before upgrading to v0.7.0 or higher.
+
 #### v0.6.1
 
 We now have some initial support for swap.

docs/header.md

Lines changed: 0 additions & 33 deletions

@@ -60,41 +60,8 @@ You can also set the `AWS_PROFILE` environment variable to the name of the profi
 export AWS_PROFILE=your-profile-name
 ```
 
-## Disk Support for Materialize
-
-This module supports configuring disk support for Materialize using NVMe instance storage and OpenEBS and lgalloc.
-
-When using disk support, you need to use instance types from the `r7gd` or `r6gd` family or other instance types with NVMe instance storage.
-
-### Enabling Disk Support
-
-To enable disk support with default settings:
-
-```hcl
-enable_disk_support = true
-```
-
-This will:
-1. Install OpenEBS via Helm
-2. Configure NVMe instance store volumes using the bootstrap script
-3. Create appropriate storage classes for Materialize
-
 ### Advanced Configuration
 
-In case that you need more control over the disk setup:
-
-```hcl
-enable_disk_support = true
-
-disk_support_config = {
-  openebs_version = "4.3.3"
-  storage_class_name = "custom-storage-class"
-  storage_class_parameters = {
-    volgroup = "custom-volume-group"
-  }
-}
-```
-
 ## `materialize_instances` variable
 
 The `materialize_instances` variable is a list of objects that define the configuration for each Materialize instance.

docs/operator-setup.md

Lines changed: 0 additions & 36 deletions

@@ -24,42 +24,6 @@ Verify the connection:
 kubectl get nodes
 ```
 
-## (Optional) Storage Configuration
-
-The Materialize Operator requires fast, locally-attached NVMe storage for optimal performance. We'll set up OpenEBS with LVM Local PV for managing local volumes.
-
-1. Install OpenEBS:
-```bash
-# Add the OpenEBS Helm repository
-helm repo add openebs https://openebs.github.io/openebs
-helm repo update
-
-# Install OpenEBS with only Local PV enabled
-helm install openebs --namespace openebs openebs/openebs \
-  --set engines.replicated.mayastor.enabled=false \
-  --create-namespace
-```
-
-2. Verify the installation:
-```bash
-kubectl get pods -n openebs -l role=openebs-lvm
-```
-
-### LVM Configuration for AWS Bottlerocket nodes
-
-TODO: Add more detailed instructions for setting up LVM on Bottlerocket nodes.
-
-If you're using the recommended Bottlerocket AMI with the Terraform module, the LVM configuration needs to be done through the Bottlerocket bootstrap container. This is automatically handled by the EKS module using the provided user data script.
-
-To verify the LVM setup:
-```bash
-kubectl debug -it node/<node-name> --image=amazonlinux:2
-chroot /host
-lvs
-```
-
-You should see a volume group named `instance-store-vg`.
-
 ## Install the Materialize Operator
 
 The Materialize Operator is installed automatically when you set the following in your Terraform configuration:

examples/simple/main.tf

Lines changed: 12 additions & 8 deletions

@@ -50,15 +50,19 @@ module "materialize_infrastructure" {
   single_nat_gateway = true
 
   # EKS Configuration
-  cluster_version = "1.32"
-  node_group_instance_types = ["r7gd.2xlarge"]
-  node_group_desired_size = 1
-  node_group_min_size = 1
-  node_group_max_size = 2
-  node_group_capacity_type = "ON_DEMAND"
-  enable_cluster_creator_admin_permissions = true
+  cluster_version = "1.32"
+
+  system_node_group_instance_types = ["r7gd.2xlarge"]
+  system_node_group_desired_size   = 2
+  system_node_group_min_size       = 1
+  system_node_group_max_size       = 2
 
-  swap_enabled = var.swap_enabled
+  materialize_node_group_instance_types = ["r7gd.2xlarge"]
+  materialize_node_group_desired_size   = 1
+  materialize_node_group_min_size       = 1
+  materialize_node_group_max_size       = 2
+
+  enable_cluster_creator_admin_permissions = true
 
   # Storage Configuration
   bucket_force_destroy = true
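For configurations written against the pre-v0.7.0 variable names, the rename shown above roughly maps the old `node_group_*` sizing onto the new Materialize node group, with the system node group often left at its defaults. A sketch based on the inputs table above — how old values are split across the two groups is a judgment call, not something this commit prescribes:

```hcl
# before (v0.6.x)
# node_group_instance_types = ["r7gd.2xlarge"]
# node_group_desired_size   = 1

# after (v0.7.0): Materialize workloads get their own node group;
# system_node_group_* defaults to ["r7g.xlarge"] unless overridden
materialize_node_group_instance_types = ["r7gd.2xlarge"]
materialize_node_group_desired_size   = 1
```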

examples/simple/variables.tf

Lines changed: 0 additions & 5 deletions
This file was deleted.
