diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/README.md b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/README.md new file mode 100644 index 0000000..8e7663f --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/README.md @@ -0,0 +1,190 @@ +# Azure VNet injection Workspace Setup Guide (with existing VNet) + +## Requirements + +- Terraform is installed on your local machine: [link](https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli#install-terraform) +- Azure CLI is installed on your local machine: [Mac](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-macos?view=azure-cli-latest#install-with-homebrew) or [Windows](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-windows?view=azure-cli-latest&pivots=winget) +- Azure CLI configured with appropriate credentials +- Databricks account created +- Databricks account admin access +- Contributor rights to your Azure subscription (Contributor rights on the resource group level are not sufficient, as Databricks provisioning creates resources in a separate managed resource group, which requires subscription-level access.) + +## Before you begin + +In this deployment, we define key configuration values, such as subscription ID, resource group location, CIDR block, asset naming, and others, as variables. This keeps our code organized and makes it easy to adjust settings without changing the core infrastructure definitions. You can choose to define these variables directly or reference them from a separate configuration file for better modularity. In this document, we will create a configuration file to store them separately (`terraform.tfvars.example`). 
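+
+As a minimal sketch of that split, using the `workspace_name` variable from this example: the variable is declared in `variables.tf`, given a value in `terraform.tfvars`, and read in a resource through `var.<name>`. The three fragments are shown together for illustration only; they live in separate files, and the value is a placeholder rather than a recommendation.
+
+```hcl
+# variables.tf: declares the input and its type
+variable "workspace_name" {
+  description = "The name of the Databricks workspace"
+  type        = string
+}
+
+# terraform.tfvars: supplies the value for this environment (placeholder value)
+# workspace_name = "databricks-workspace"
+
+# databricks.tf: the workspace resource reads the value
+# name = var.workspace_name
+```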
+ +## Authenticate the Azure CLI + +### Option 1: Interactive user login (for users) + +```sh +az login +``` + +This command opens a browser for user authentication, and it is commonly referred to as U2M (User-to-machine) authentication. This command is sufficient for all operations in this document. + +### Option 2: Service principal login (for automation, CI/CD) + +Choose this option if you want to deploy the Terraform script to a Git repository and integrate it into your CI/CD processes after completing this guide. It is the recommended approach for automation in non-interactive environments such as pipelines or scripts. + +Steps to Create a Service Principal via Azure CLI: + +1. Log in to Azure via Azure CLI + +```sh +az login +``` + +This command opens a browser to authenticate your Azure user account. + +2. (Optional) Choose the Target Subscription + +If you have multiple subscriptions, set your target subscription: + +```sh +az account set --subscription "<subscription-id>" +``` + +You can find your subscription ID with: + +```sh +az account show +``` + +3. Create the Service Principal +Use the following command to create a service principal, specifying the name, role, and scope: + +```sh +az ad sp create-for-rbac --name "<service-principal-name>" --role <role> --scopes /subscriptions/<subscription-id> +``` + +- `<service-principal-name>`: Desired service principal name. +- `<role>`: e.g. Contributor, Reader, Owner. +- `<subscription-id>`: Your Azure Subscription ID. + +The command outputs JSON with appId, password, and tenant. + +**Important**: Save the password (client secret) immediately; you cannot retrieve it later. + +4. 
Use the Newly Created SP Credentials + +You can now use the output values: +- `appId` for the username +- `password` as the client secret +- `tenant` as the tenant ID + +For authentication in automation (like CI/CD or scripts), use: + +```sh +az login --service-principal -u <appId> -p <password> --tenant <tenant> +``` + +For more information on creating a Service Principal, visit the [following link](https://learn.microsoft.com/en-us/cli/azure/azure-cli-sp-tutorial-1?view=azure-cli-latest&tabs=bash). + + +## General Requirements for VNet +Before proceeding, ensure your VNet meets the following requirements: + +- The VNet must be in the same Azure region and subscription as the Databricks workspace. +- The address space for the VNet must use a CIDR block between /16 and /24. + +## Variables + +If you want Terraform to automatically load values for variables from a file, the file must be named either `terraform.tfvars`, `terraform.tfvars.json`, or end with `.auto.tfvars` or `.auto.tfvars.json`. If your file has a custom name (like `random_name.tfvars`), you must provide it explicitly using the `-var-file` flag when running Terraform commands. + +You can use the `terraform.tfvars.example` file as a base for your variables. Once you rename this file to `terraform.tfvars`, Terraform loads the variable values from it automatically. + +### List of variables + +- azure_subscription_id + - Your Azure Subscription ID +- resource_group_name + - The name of the resource group where the Databricks Workspace will be deployed +- tags + - A map of tags to assign to the resources +- workspace_name + - The name of the Databricks workspace +- root_storage_name + - The name of the root storage account. Can only consist of lowercase letters and numbers, and must be between 3 and 24 characters long. +- location + - The Azure region to deploy the workspace to. See [supported regions](https://learn.microsoft.com/en-us/azure/databricks/resources/supported-regions). 
+- vnet_name + - The name of the existing virtual network +- cidr + - The CIDR address of the virtual network +- vnet_resource_group_name + - The name of the resource group where the existing VNet is located +- subnet_public_cidr + - The CIDR address of the first subnet +- subnet_private_cidr + - The CIDR address of the second subnet + +Additional variables for Customer Managed Key encryption (commented out in `variables.tf` by default): +- managed_services_cmk_key_vault_id + - The Key Vault ID of the CMK for managed services encryption +- managed_services_cmk_key_vault_key_id + - The CMK ID for managed services encryption +- managed_disk_cmk_key_vault_id + - The Key Vault ID of the CMK for managed disks encryption +- managed_disk_cmk_key_vault_key_id + - The CMK ID for managed disks encryption + + +## Deploy + +```bash +# Initialize Terraform +terraform init + +# Review the execution plan +terraform plan + +# Apply the configuration +terraform apply +``` + +`terraform apply` asks you to confirm before it makes changes; type `yes` when prompted. The deployment typically takes 10-15 minutes. Once the execution finishes, the terminal will output the URL of the created workspace. + +## Access Your Workspace + +After successful deployment: +```bash +# Get the workspace URL +terraform output workspace_url + +# Get the workspace ID +terraform output databricks_workspace_id +``` + +Navigate to the workspace URL and log in with your Databricks credentials. + +## File Structure + +This project uses a flat, organized structure with purpose-specific files instead of a monolithic `main.tf`: + +``` +tf/ +├── azure.tf # Azure resources +├── databricks.tf # Databricks workspace +├── network.tf # VNet, subnets, and networking +├── outputs.tf # All output values +├── providers.tf # Provider configurations (Azure) +├── terraform.tfvars.example # Configuration template +├── variables.tf # All input variable definitions +``` + +**Note:** There is no `main.tf` file in this project. 
Instead, resources are organized into descriptive, purpose-specific files. + +Terraform will automatically load all `.tf` files in the directory, so the absence of `main.tf` doesn't affect functionality. + + +## Terraform template examples and more documentation: + +Keep in mind that the git code is not always up to date. You should use these templates as an example and not directly copy and paste. Please note that the code in the template projects is provided for your exploration only and is not formally supported by Databricks with Service Level Agreements (SLAs). They are provided AS-IS, and we do not make any guarantees of any kind. + +- [Deploy with Private Link](https://github.com/databricks/terraform-databricks-examples/tree/main/examples/adb-with-private-link-standard) +- [Security Reference Architecture Template](https://github.com/databricks/terraform-databricks-sra/tree/main/azure) + - This is a template that adheres to the best security practices we recommend. +- [Terraform Databricks provider documentation](https://registry.terraform.io/providers/databricks/databricks/latest/docs) +- [Configure a workspace with VNet injection](https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/vnet-inject) + diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/.gitignore b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/.gitignore new file mode 100644 index 0000000..78e7733 --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/.gitignore @@ -0,0 +1,44 @@ +# Local .terraform directories +.terraform/ + +# .tfstate files +*.tfstate +*.tfstate.* + +# Crash log files +crash.log +crash.*.log + +# Exclude all .tfvars files, which are likely to contain sensitive data, such as +# password, private keys, and other secrets. 
These should not be part of version +# control as they are data points which are potentially sensitive and subject +# to change depending on the environment. +*.tfvars +*.tfvars.json + +# Ignore override files as they are usually used to override resources locally and so +# are not checked in +override.tf +override.tf.json +*_override.tf +*_override.tf.json + +# Ignore transient lock info files created by terraform apply +.terraform.tfstate.lock.info + +# Include override files you do wish to add to version control using negated pattern +# !example_override.tf + +# Include tfplan files to ignore the plan output of command: terraform plan -out=tfplan +# example: *tfplan* + +# Ignore CLI configuration files +.terraformrc +terraform.rc + +# Optional: ignore graph output files generated by `terraform graph` +# *.dot + +# Optional: ignore plan files saved before destroying Terraform configuration +# Uncomment the line below if you want to ignore planout files. +# planout \ No newline at end of file diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/azure.tf b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/azure.tf new file mode 100644 index 0000000..8b355bb --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/azure.tf @@ -0,0 +1,5 @@ +resource "azurerm_resource_group" "this" { + name = var.resource_group_name + location = var.location + tags = var.tags +} \ No newline at end of file diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/databricks.tf b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/databricks.tf new file mode 100644 index 0000000..4885a37 --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/databricks.tf @@ -0,0 +1,28 @@ +resource "azurerm_databricks_workspace" "this" { + name = var.workspace_name + resource_group_name = 
azurerm_resource_group.this.name + location = azurerm_resource_group.this.location + sku = "premium" + tags = var.tags + +# Customer managed key (CMK) configuration +# customer_managed_key_enabled = true +# managed_services_cmk_key_vault_id = var.managed_services_cmk_key_vault_id +# managed_services_cmk_key_vault_key_id = var.managed_services_cmk_key_vault_key_id +# managed_disk_cmk_key_vault_id = var.managed_disk_cmk_key_vault_id +# managed_disk_cmk_key_vault_key_id = var.managed_disk_cmk_key_vault_key_id + + custom_parameters { + virtual_network_id = data.azurerm_virtual_network.existing.id + private_subnet_name = azurerm_subnet.private.name + public_subnet_name = azurerm_subnet.public.name + public_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.public.id + private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.private.id + storage_account_name = var.root_storage_name + } + + depends_on = [ + azurerm_subnet_network_security_group_association.public, + azurerm_subnet_network_security_group_association.private + ] +} \ No newline at end of file diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/network.tf b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/network.tf new file mode 100644 index 0000000..e7f5c83 --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/network.tf @@ -0,0 +1,66 @@ +locals { + network_prefix = var.workspace_name +} + +data "azurerm_virtual_network" "existing" { + name = var.vnet_name + resource_group_name = var.vnet_resource_group_name +} + +data "azurerm_resource_group" "vnet_resource_group" { + name = var.vnet_resource_group_name +} + + +resource "azurerm_network_security_group" "this" { + name = "${local.network_prefix}-nsg" + location = data.azurerm_resource_group.vnet_resource_group.location + resource_group_name = 
data.azurerm_resource_group.vnet_resource_group.name + tags = var.tags +} + +resource "azurerm_subnet" "public" { + name = "${local.network_prefix}-public-subnet" + resource_group_name = data.azurerm_resource_group.vnet_resource_group.name + virtual_network_name = data.azurerm_virtual_network.existing.name + address_prefixes = [var.subnet_public_cidr] + + delegation { + name = "databricks" + service_delegation { + name = "Microsoft.Databricks/workspaces" + actions = [ + "Microsoft.Network/virtualNetworks/subnets/join/action", + "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action", + "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action"] + } + } +} + +resource "azurerm_subnet_network_security_group_association" "public" { + subnet_id = azurerm_subnet.public.id + network_security_group_id = azurerm_network_security_group.this.id +} + +resource "azurerm_subnet" "private" { + name = "${local.network_prefix}-private-subnet" + resource_group_name = data.azurerm_resource_group.vnet_resource_group.name + virtual_network_name = data.azurerm_virtual_network.existing.name + address_prefixes = [var.subnet_private_cidr] + + delegation { + name = "databricks" + service_delegation { + name = "Microsoft.Databricks/workspaces" + actions = [ + "Microsoft.Network/virtualNetworks/subnets/join/action", + "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action", + "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action"] + } + } +} + +resource "azurerm_subnet_network_security_group_association" "private" { + subnet_id = azurerm_subnet.private.id + network_security_group_id = azurerm_network_security_group.this.id +} \ No newline at end of file diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/outputs.tf b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/outputs.tf new file mode 100644 index 0000000..6a5fd47 --- /dev/null +++ 
b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/outputs.tf @@ -0,0 +1,7 @@ +output "databricks_workspace_id" { + value = azurerm_databricks_workspace.this.id +} + +output "workspace_url" { + value = "https://${azurerm_databricks_workspace.this.workspace_url}/" +} \ No newline at end of file diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/providers.tf b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/providers.tf new file mode 100644 index 0000000..55433cb --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/providers.tf @@ -0,0 +1,12 @@ +terraform { + required_providers { + azurerm = { + source = "hashicorp/azurerm" + } + } +} + +provider "azurerm" { + subscription_id = var.azure_subscription_id + features {} +} diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/terraform.tfvars.example b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/terraform.tfvars.example new file mode 100644 index 0000000..55dc6bd --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/terraform.tfvars.example @@ -0,0 +1,37 @@ +# ============================================================================= +# Azure Configuration +# ============================================================================= + +azure_subscription_id = "xxx" +resource_group_name = "new-resource-group" +tags = { + "Owner": "Name Of Owner" +} + +# ============================================================================= +# Databricks Workspace Configuration +# ============================================================================= + +workspace_name = "databricks-workspace" +root_storage_name = "databricksrootstorage" +location = "West Europe" + +# ============================================================================= +# Network 
Configuration +# ============================================================================= + +vnet_name = "existing-vnet-name" +cidr = "10.0.0.0/16" +vnet_resource_group_name = "existing-vnet-resource-group" +subnet_public_cidr = "10.0.1.0/24" +subnet_private_cidr = "10.0.2.0/24" + +# ============================================================================= +# Customer Managed Keys Configuration +# ============================================================================= + +# managed_services_cmk_key_vault_id = "" +# managed_services_cmk_key_vault_key_id = "" +# managed_disk_cmk_key_vault_id = "" +# managed_disk_cmk_key_vault_key_id = "" + diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/variables.tf b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/variables.tf new file mode 100644 index 0000000..5509449 --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-existing-vnet/tf/variables.tf @@ -0,0 +1,92 @@ +# ============================================================================= +# Azure Configuration +# ============================================================================= + +variable "azure_subscription_id" { + description = "Your Azure Subscription ID" + type = string +} + +variable "resource_group_name" { + description = "The name of the resource group" + type = string +} + +variable "tags" { + description = "A map of tags to assign to the resources" + type = map(string) + default = {} +} + +# ============================================================================= +# Databricks Workspace Configuration +# ============================================================================= + +variable "workspace_name" { + description = "The name of the Databricks workspace" + type = string +} + +variable "root_storage_name" { + description = "The name of the root storage account" + type = string +} + +variable "location" { + description = 
"The Azure region to deploy the workspace to" + type = string +} + +# ============================================================================= +# Network Configuration +# ============================================================================= + +variable "vnet_name" { + description = "The name of the virtual network" + type = string +} + +variable "cidr" { + description = "The CIDR address of the virtual network" + type = string + default = "10.0.0.0/20" +} + +variable "vnet_resource_group_name" { + description = "The name of the resource group where the existing VNet is located" + type = string +} + +variable "subnet_public_cidr" { + description = "The CIDR address of the first subnet" + type = string +} + +variable "subnet_private_cidr" { + description = "The CIDR address of the second subnet" + type = string +} + +# ============================================================================= +# Customer Managed Keys Configuration +# ============================================================================= + +# variable "managed_services_cmk_key_vault_id" { +# description = "The Key Vault ID of the CMK for managed services encryption" +# type = string +# } + +# variable "managed_services_cmk_key_vault_key_id" { +# description = "The CMK ID for managed services encryption" +# type = string +# } + +# variable "managed_disk_cmk_key_vault_id" { +# description = "The Key Vault ID of the CMK for managed disks encryption" +# type = string +# } + +# variable "managed_disk_cmk_key_vault_key_id" { +# description = "The CMK ID for managed disks encryption" +# type = string +# } \ No newline at end of file diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/README.md b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/README.md new file mode 100644 index 0000000..6ca03a4 --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/README.md @@ -0,0 +1,177 @@ +# Azure VNet 
injection Workspace Setup Guide (with VNet deployment) + +## Requirements + +- Terraform is installed on your local machine: [link](https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli#install-terraform) +- Azure CLI is installed on your local machine: [Mac](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-macos?view=azure-cli-latest#install-with-homebrew) or [Windows](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-windows?view=azure-cli-latest&pivots=winget) +- Azure CLI configured with appropriate credentials +- Databricks account created +- Databricks account admin access +- Contributor rights to your Azure subscription (Contributor rights on the resource group level are not sufficient, as Databricks provisioning creates resources in a separate managed resource group, which requires subscription-level access.) + +## Before you begin + +In this deployment, we define key configuration values, such as subscription ID, resource group location, CIDR block, asset naming, and others, as variables. This keeps our code organized and makes it easy to adjust settings without changing the core infrastructure definitions. You can choose to define these variables directly or reference them from a separate configuration file for better modularity. In this document, we will create a configuration file to store them separately (`terraform.tfvars.example`). + +## Authenticate the Azure CLI + +### Option 1: Interactive user login (for users) + +```sh +az login +``` + +This command opens a browser for user authentication, and it is commonly referred to as U2M (User-to-machine) authentication. This command is sufficient for all operations in this document. + +### Option 2: Service principal login (for automation, CI/CD) + +Choose this option if you want to deploy the Terraform script to a Git repository and integrate it into your CI/CD processes after completing this guide. 
It is the recommended approach for automation in non-interactive environments such as pipelines or scripts. + +Steps to Create a Service Principal via Azure CLI: + +1. Log in to Azure via Azure CLI + +```sh +az login +``` + +This command opens a browser to authenticate your Azure user account. + +2. (Optional) Choose the Target Subscription + +If you have multiple subscriptions, set your target subscription: + +```sh +az account set --subscription "<subscription-id>" +``` + +You can find your subscription ID with: + +```sh +az account show +``` + +3. Create the Service Principal +Use the following command to create a service principal, specifying the name, role, and scope: + +```sh +az ad sp create-for-rbac --name "<service-principal-name>" --role <role> --scopes /subscriptions/<subscription-id> +``` + +- `<service-principal-name>`: Desired service principal name. +- `<role>`: e.g. Contributor, Reader, Owner. +- `<subscription-id>`: Your Azure Subscription ID. + +The command outputs JSON with appId, password, and tenant. + +**Important**: Save the password (client secret) immediately; you cannot retrieve it later. + +4. Use the Newly Created SP Credentials + +You can now use the output values: +- `appId` for the username +- `password` as the client secret +- `tenant` as the tenant ID + +For authentication in automation (like CI/CD or scripts), use: + +```sh +az login --service-principal -u <appId> -p <password> --tenant <tenant> +``` + +For more information on creating a Service Principal, visit the [following link](https://learn.microsoft.com/en-us/cli/azure/azure-cli-sp-tutorial-1?view=azure-cli-latest&tabs=bash). + + +## General Requirements for VNet +Before proceeding, ensure your VNet meets the following requirements: + +- The address space for the VNet must use a CIDR block between /16 and /24. + +## Variables + +If you want Terraform to automatically load values for variables from a file, the file must be named either `terraform.tfvars`, `terraform.tfvars.json`, or end with `.auto.tfvars` or `.auto.tfvars.json`. 
If your file has a custom name (like `random_name.tfvars`), you must provide it explicitly using the `-var-file` flag when running Terraform commands. + +You can use the `terraform.tfvars.example` file as a base for your variables. Once you rename this file to `terraform.tfvars`, Terraform loads the variable values from it automatically. + +### List of variables + +- azure_subscription_id + - Your Azure Subscription ID +- resource_group_name + - The name of the resource group where the Databricks Workspace will be deployed +- tags + - A map of tags to assign to the resources +- workspace_name + - The name of the Databricks workspace +- root_storage_name + - The name of the root storage account. Can only consist of lowercase letters and numbers, and must be between 3 and 24 characters long. +- location + - The Azure region to deploy the workspace to. See [supported regions](https://learn.microsoft.com/en-us/azure/databricks/resources/supported-regions). +- vnet_name + - The name of the virtual network +- cidr + - The CIDR address of the virtual network +- subnet_public_cidr + - The CIDR address of the first subnet +- subnet_private_cidr + - The CIDR address of the second subnet + + +## Deploy + +```bash +# Initialize Terraform +terraform init + +# Review the execution plan +terraform plan + +# Apply the configuration +terraform apply +``` + +`terraform apply` asks you to confirm before it makes changes; type `yes` when prompted. The deployment typically takes 10-15 minutes. Once the execution finishes, the terminal will output the URL of the created workspace. + +## Access Your Workspace + +After successful deployment: +```bash +# Get the workspace URL +terraform output workspace_url + +# Get the workspace ID +terraform output workspace_id +``` + +Navigate to the workspace URL and log in with your Databricks credentials. 
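+
+The `terraform output` commands read values declared in `output` blocks in `outputs.tf`. As a sketch only, assuming the workspace resource is addressed as `azurerm_databricks_workspace.this` (as in `databricks.tf`) and that the output is named `workspace_url`, such a block could look like this:
+
+```hcl
+# outputs.tf (sketch): expose the workspace URL after apply
+output "workspace_url" {
+  description = "URL of the deployed Databricks workspace"
+  # The provider exports workspace_url as a bare hostname, so the scheme is added here
+  value       = "https://${azurerm_databricks_workspace.this.workspace_url}/"
+}
+```
+
+`terraform output <name>` only resolves names declared this way, so the name on the command line has to match the label of an `output` block.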
+ +## File Structure + +This project uses a flat, organized structure with purpose-specific files instead of a monolithic `main.tf`: + +``` +tf/ +├── azure.tf # Azure resources +├── databricks.tf # Databricks workspace +├── network.tf # VNet, subnets, and networking +├── outputs.tf # All output values +├── providers.tf # Provider configurations (Azure) +├── terraform.tfvars.example # Configuration template +├── variables.tf # All input variable definitions +``` + +**Note:** There is no `main.tf` file in this project. Instead, resources are organized into descriptive, purpose-specific files. + +Terraform will automatically load all `.tf` files in the directory, so the absence of `main.tf` doesn't affect functionality. + + +## Terraform template examples and more documentation: + +Keep in mind that the git code is not always up to date. You should use these templates as an example and not directly copy and paste. Please note that the code in the template projects is provided for your exploration only and is not formally supported by Databricks with Service Level Agreements (SLAs). They are provided AS-IS, and we do not make any guarantees of any kind. + +- [Deploy with Private Link](https://github.com/databricks/terraform-databricks-examples/tree/main/examples/adb-with-private-link-standard) +- [Security Reference Architecture Template](https://github.com/databricks/terraform-databricks-sra/tree/main/azure) + - This is a template that adheres to the best security practices we recommend. 
+- [Terraform Databricks provider documentation](https://registry.terraform.io/providers/databricks/databricks/latest/docs) +- [Configure a workspace with VNet injection](https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/vnet-inject) + diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/.gitignore b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/.gitignore new file mode 100644 index 0000000..78e7733 --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/.gitignore @@ -0,0 +1,44 @@ +# Local .terraform directories +.terraform/ + +# .tfstate files +*.tfstate +*.tfstate.* + +# Crash log files +crash.log +crash.*.log + +# Exclude all .tfvars files, which are likely to contain sensitive data, such as +# password, private keys, and other secrets. These should not be part of version +# control as they are data points which are potentially sensitive and subject +# to change depending on the environment. +*.tfvars +*.tfvars.json + +# Ignore override files as they are usually used to override resources locally and so +# are not checked in +override.tf +override.tf.json +*_override.tf +*_override.tf.json + +# Ignore transient lock info files created by terraform apply +.terraform.tfstate.lock.info + +# Include override files you do wish to add to version control using negated pattern +# !example_override.tf + +# Include tfplan files to ignore the plan output of command: terraform plan -out=tfplan +# example: *tfplan* + +# Ignore CLI configuration files +.terraformrc +terraform.rc + +# Optional: ignore graph output files generated by `terraform graph` +# *.dot + +# Optional: ignore plan files saved before destroying Terraform configuration +# Uncomment the line below if you want to ignore planout files. 
+# planout \ No newline at end of file diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/azure.tf b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/azure.tf new file mode 100644 index 0000000..8b355bb --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/azure.tf @@ -0,0 +1,5 @@ +resource "azurerm_resource_group" "this" { + name = var.resource_group_name + location = var.location + tags = var.tags +} \ No newline at end of file diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/databricks.tf b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/databricks.tf new file mode 100644 index 0000000..5817b19 --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/databricks.tf @@ -0,0 +1,22 @@ +resource "azurerm_databricks_workspace" "this" { + name = var.workspace_name + resource_group_name = azurerm_resource_group.this.name + location = azurerm_resource_group.this.location + sku = "premium" + tags = var.tags + + custom_parameters { + virtual_network_id = azurerm_virtual_network.this.id + private_subnet_name = azurerm_subnet.private.name + public_subnet_name = azurerm_subnet.public.name + public_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.public.id + private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.private.id + storage_account_name = var.root_storage_name + no_public_ip = true + } + + depends_on = [ + azurerm_subnet_network_security_group_association.public, + azurerm_subnet_network_security_group_association.private + ] +} \ No newline at end of file diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/network.tf b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/network.tf new file mode 100644 index 
0000000..62d8d80 --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/network.tf @@ -0,0 +1,102 @@ +locals { + network_prefix = var.workspace_name +} + +resource "azurerm_virtual_network" "this" { + name = "${local.network_prefix}-vnet" + location = azurerm_resource_group.this.location + resource_group_name = azurerm_resource_group.this.name + address_space = [var.cidr] + tags = var.tags +} + +resource "azurerm_network_security_group" "this" { + name = "${local.network_prefix}-nsg" + location = azurerm_resource_group.this.location + resource_group_name = azurerm_resource_group.this.name + tags = var.tags +} + +resource "azurerm_subnet" "public" { + name = "${local.network_prefix}-public-subnet" + resource_group_name = azurerm_resource_group.this.name + virtual_network_name = azurerm_virtual_network.this.name + address_prefixes = [var.subnet_public_cidr] + default_outbound_access_enabled = false + + delegation { + name = "databricks" + service_delegation { + name = "Microsoft.Databricks/workspaces" + actions = [ + "Microsoft.Network/virtualNetworks/subnets/join/action", + "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action", + "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action"] + } + } +} + +resource "azurerm_subnet_network_security_group_association" "public" { + subnet_id = azurerm_subnet.public.id + network_security_group_id = azurerm_network_security_group.this.id +} + +resource "azurerm_subnet" "private" { + name = "${local.network_prefix}-private-subnet" + resource_group_name = azurerm_resource_group.this.name + virtual_network_name = azurerm_virtual_network.this.name + address_prefixes = [var.subnet_private_cidr] + default_outbound_access_enabled = false + + delegation { + name = "databricks" + service_delegation { + name = "Microsoft.Databricks/workspaces" + actions = [ + "Microsoft.Network/virtualNetworks/subnets/join/action", + 
"Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action", + "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action"] + } + } +} + +resource "azurerm_subnet_network_security_group_association" "private" { + subnet_id = azurerm_subnet.private.id + network_security_group_id = azurerm_network_security_group.this.id +} + +################################################ + +resource "azurerm_public_ip" "this" { + name = "${local.network_prefix}-public-ip" + resource_group_name = azurerm_resource_group.this.name + location = azurerm_resource_group.this.location + allocation_method = "Static" + zones = ["1"] + tags = var.tags +} + +resource "azurerm_nat_gateway" "this" { + name = "${local.network_prefix}-nat-gateway" + resource_group_name = azurerm_resource_group.this.name + location = azurerm_resource_group.this.location + sku_name = "Standard" + idle_timeout_in_minutes = 10 + zones = ["1"] + tags = var.tags +} + +resource "azurerm_nat_gateway_public_ip_association" "this" { + nat_gateway_id = azurerm_nat_gateway.this.id + public_ip_address_id = azurerm_public_ip.this.id +} + +resource "azurerm_subnet_nat_gateway_association" "public_nat_gateway_association" { + subnet_id = azurerm_subnet.public.id + nat_gateway_id = azurerm_nat_gateway.this.id +} + +resource "azurerm_subnet_nat_gateway_association" "private_nat_gateway_association" { + subnet_id = azurerm_subnet.private.id + nat_gateway_id = azurerm_nat_gateway.this.id +} diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/outputs.tf b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/outputs.tf new file mode 100644 index 0000000..76fd6ad --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/outputs.tf @@ -0,0 +1,58 @@ +# ============================================================================= +# Databricks Workspace Outputs +# 
============================================================================= + +output "databricks_workspace_id" { + description = "ID of the Databricks workspace" + value = azurerm_databricks_workspace.this.id +} + +output "workspace_url" { + description = "URL of the Databricks workspace" + value = "https://${azurerm_databricks_workspace.this.workspace_url}/" +} + +# ============================================================================= +# Network Outputs +# ============================================================================= + +output "nat_gateway_public_ip" { + description = "Public IP of the NAT Gateway" + value = azurerm_public_ip.this.ip_address +} + +output "vnet_id" { + description = "ID of the VNet used for the workspace" + value = azurerm_virtual_network.this.id +} + +output "private_subnet_id" { + description = "ID of the private subnet" + value = azurerm_subnet.private.id +} + +output "public_subnet_id" { + description = "ID of the public subnet" + value = azurerm_subnet.public.id +} + +output "nat_gateway_id" { + description = "ID of the NAT Gateway" + value = azurerm_nat_gateway.this.id +} + +output "security_group_id" { + description = "ID of the security group used for the workspace" + value = azurerm_network_security_group.this.id +} + +# ============================================================================= +# Other Azure Resources Outputs +# ============================================================================= + + +output "managed_resource_group_id" { + description = "ID of the managed resource group" + value = azurerm_databricks_workspace.this.managed_resource_group_id +} + diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/providers.tf b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/providers.tf new file mode 100644 index 0000000..1323e3f --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/providers.tf @@ 
-0,0 +1,4 @@ +provider "azurerm" { + subscription_id = var.azure_subscription_id + features {} +} diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/terraform.tfvars.example b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/terraform.tfvars.example new file mode 100644 index 0000000..92a902e --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/terraform.tfvars.example @@ -0,0 +1,27 @@ +# ============================================================================= +# Azure Configuration +# ============================================================================= + +azure_subscription_id = "xxx" +resource_group_name = "new-resource-group" +tags = { + "Owner": "Name Of Owner" +} + +# ============================================================================= +# Databricks Workspace Configuration +# ============================================================================= + +workspace_name = "databricks-workspace" +root_storage_name = "databricksrootstorage" +location = "westeurope" + +# ============================================================================= +# Network Configuration +# ============================================================================= + +vnet_name = "vnet-name" +cidr = "10.0.0.0/16" +subnet_public_cidr = "10.0.1.0/24" +subnet_private_cidr = "10.0.2.0/24" + diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/variables.tf b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/variables.tf new file mode 100644 index 0000000..a5453cc --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/variables.tf @@ -0,0 +1,78 @@ +# ============================================================================= +# Azure Configuration +# ============================================================================= + +variable "azure_subscription_id" 
{ + description = "Your Azure Subscription ID" + type = string +} + +variable "resource_group_name" { + description = "The name of the resource group" + type = string +} + +variable "tags" { + description = "A map of tags to assign to the resources" + type = map(string) + default = {} +} + +# ============================================================================= +# Databricks Workspace Configuration +# ============================================================================= + +variable "workspace_name" { + description = "The name of the Databricks workspace" + type = string +} + +variable "root_storage_name" { + type = string + description = "The root storage name. Only lowercase letters and numbers, 3-24 characters." + validation { + condition = length(var.root_storage_name) >= 3 && length(var.root_storage_name) <= 24 + error_message = "root_storage_name must be between 3 and 24 characters." + } + validation { + condition = can(regex("^[a-z0-9]+$", var.root_storage_name)) + error_message = "root_storage_name can only contain lowercase letters and numbers." + } +} + +variable "location" { + description = "The Azure region to deploy the workspace to" + type = string + validation { + condition = contains([ + "australiacentral", "australiacentral2", "australiaeast", "australiasoutheast", "brazilsouth", "canadacentral", "canadaeast", "centralindia", "centralus", "chinaeast2", "chinaeast3", "chinanorth2", "chinanorth3", "eastasia", "eastus", "eastus2", "francecentral", "germanywestcentral", "japaneast", "japanwest", "koreacentral", "mexicocentral", "northcentralus", "northeurope", "norwayeast", "qatarcentral", "southafricanorth", "southcentralus", "southeastasia", "southindia", "swedencentral", "switzerlandnorth", "switzerlandwest", "uaenorth", "uksouth", "ukwest", "westcentralus", "westeurope", "westindia", "westus", "westus2", "westus3" + ], var.location) + error_message = "Valid values for var.location are standard Azure regions supported by Databricks." 
+ } +} + +# ============================================================================= +# Network Configuration +# ============================================================================= + +variable "vnet_name" { + description = "The name of the virtual network" + type = string +} + +variable "cidr" { + description = "The CIDR address of the virtual network" + type = string + default = "10.0.0.0/20" +} + +variable "subnet_public_cidr" { + description = "The CIDR address of the public (host) subnet" + type = string +} + +variable "subnet_private_cidr" { + description = "The CIDR address of the private (container) subnet" + type = string +} + diff --git a/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/versions.tf b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/versions.tf new file mode 100644 index 0000000..be0d1a1 --- /dev/null +++ b/workspace-setup/terraform-examples/azure/azure-vnet-injection-new-vnet/tf/versions.tf @@ -0,0 +1,12 @@ +terraform { + required_version = "~> 1.3" + + + required_providers { + azurerm = { + source = "hashicorp/azurerm" + version = "~> 4.50" + } + } +} +