2 changes: 2 additions & 0 deletions NEXT_CHANGELOG.md
@@ -15,6 +15,8 @@
### Documentation

* Add documentation on the `service_principal_client_id` attribute of `databricks_app` and related [#5134](https://github.com/databricks/terraform-provider-databricks/pull/5134)
* Add documentation for Unified Terraform Provider ([#5122](https://github.com/databricks/terraform-provider-databricks/pull/5122))

### Exporter

### Internal Changes
121 changes: 121 additions & 0 deletions docs/guides/unified-provider.md
@@ -0,0 +1,121 @@
# Unified Terraform Provider
[![Public Beta](https://img.shields.io/badge/Release_Stage-Public_Beta-orange)](https://docs.databricks.com/aws/en/release-notes/release-types)

## Introduction

The Unified Terraform Provider allows you to manage workspace-level resources and data sources through an account-level provider. This significantly simplifies Terraform configurations and resource management, as only one provider is needed per account.

**Note:** This feature is in [Public Beta](https://docs.databricks.com/aws/en/release-notes/release-types). If you experience any issues, please refer to the [Reporting Issues](#reporting-issues) section below.

## Usage

To manage a workspace-level resource through the account provider, specify the `provider_config` at the resource level with the `workspace_id` of the workspace the resource belongs to.

Review comment (Contributor): You need to have the "account admin" permission. (Let's check this with Alex or LRM team.)

Depending on the internal implementation of the resource or data source, `provider_config` can be either a block or an attribute. For details, please refer to the documentation of the specific resource or data source.

Review comment (Contributor): With attributes, an `=` separates the `provider_config` from its configuration, whereas with blocks, this `=` is omitted.

### Block
```hcl
resource "workspace_level_resource" "this" {
  provider_config {
    workspace_id = "12345"
  }
  ...
}
```

### Attribute
```hcl
resource "workspace_level_resource" "this" {
  provider_config = {
    workspace_id = "12345"
  }
  ...
}
```

Review comment (Contributor, on lines +16 to +34): Remove the subheadings and just add English describing the code blocks. Use actual resources rather than fake ones.
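
To make the single-provider idea concrete, here is a minimal sketch of two resources that live in different workspaces but are managed through one account-level provider. It rests on several assumptions: `workspace_level_resource` is the same placeholder type used in the examples above, both workspace IDs are made up, the `name` argument is purely illustrative, and the `var.*` inputs are assumed to be declared elsewhere. Whether `provider_config` is written as a block or as an attribute depends on the actual resource.

```hcl
// One account-level provider; no aliases or per-workspace providers.
provider "databricks" {
  host          = var.account_host
  account_id    = var.account_id
  client_id     = var.account_client_id
  client_secret = var.account_client_secret
}

// Placeholder resource in a first workspace (hypothetical ID).
resource "workspace_level_resource" "first" {
  provider_config {
    workspace_id = "12345"
  }
  name = "resource in workspace 12345"
}

// Placeholder resource in a second workspace (hypothetical ID),
// managed through the same provider.
resource "workspace_level_resource" "second" {
  provider_config {
    workspace_id = "67890"
  }
  name = "resource in workspace 67890"
}
```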


**Note:** This feature is being rolled out incrementally. Some resources do not yet support the unified provider. Please check the resource-specific documentation to see if the `provider_config` attribute or block is available.

Review comment (Contributor): Suggest not to document this change until after non-SDK resources are supported.


## Migrating to Unified Provider

If you are currently managing both workspace-level and account-level resources and data sources, you likely have multiple provider configurations that you specify for each resource using aliases. For example:
```hcl
// Define an account provider with alias
provider "databricks" {
  alias         = "account"
  host          = var.account_host
  account_id    = var.account_id
  client_id     = var.account_client_id
  client_secret = var.account_client_secret
}

// Create a workspace
resource "databricks_mws_workspaces" "this" {
  provider = databricks.account
  ...
}
```

Once the workspace is created, create a workspace-level provider.
```hcl
// Define a workspace provider with alias
provider "databricks" {
  alias         = "workspace"
  host          = var.workspace_host
  client_id     = var.workspace_client_id
  client_secret = var.workspace_client_secret
}

// Use the workspace provider for workspace-level resources
resource "databricks_workspace_level_resource" "this" {
  // Review comment (Contributor): Let's use real resources just for demonstration
  provider = databricks.workspace
  name     = "resource name"
}
```

Migration to the unified provider happens in two steps:
1. Add `provider_config` and `workspace_id` to the resource without removing the workspace-level provider. Then run `terraform apply` so these values are included in the state.
2. Remove the workspace-level provider.

For example:

Review comment (Contributor): I think we need to include the result after step 1 materialized in docs.

```hcl
// Define an account provider
provider "databricks" {
  host          = var.account_host
  account_id    = var.account_id
  client_id     = var.account_client_id
  client_secret = var.account_client_secret
}

// Create a workspace under the account
resource "databricks_mws_workspaces" "this" {
  account_id   = var.account_id
  aws_region   = "us-east-1"
  compute_mode = "SERVERLESS"
}

// Create a workspace-level resource using provider_config
resource "databricks_workspace_level_resource" "this" {
  provider_config {
    workspace_id = databricks_mws_workspaces.this.workspace_id
  }
  name = "resource name"
}
```
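
The example above shows the end state, after step 2. As a rough sketch of what the intermediate configuration after step 1 might look like, the resource keeps its workspace-level provider and gains `provider_config`. This assumes the placeholder resource accepts `provider_config` as a block and that it can sit alongside the `provider` meta-argument, as the steps describe; `var.workspace_id` is a hypothetical variable introduced only for illustration.

```hcl
// Step 1 (intermediate state): the workspace-level provider is still defined.
provider "databricks" {
  alias         = "workspace"
  host          = var.workspace_host
  client_id     = var.workspace_client_id
  client_secret = var.workspace_client_secret
}

resource "databricks_workspace_level_resource" "this" {
  // Still routed through the workspace-level provider during step 1.
  provider = databricks.workspace

  // Newly added so that workspace_id is written to the state on apply.
  provider_config {
    workspace_id = var.workspace_id // hypothetical variable holding the workspace ID
  }
  name = "resource name"
}
```

After `terraform apply` succeeds with this configuration, the `provider = databricks.workspace` line and the aliased provider block can be removed as step 2.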

## FAQ

* An empty `workspace_id` is not allowed, and `terraform plan` will fail with the error: `"workspace_id string length must be at least 1"`.

  Review comment (Contributor): Let's format these as questions, with the questions themselves as subheaders.

* The `workspace_id` supplied to the resource through `provider_config` must belong to the account for which the provider is configured. If the workspace does not belong to the account, you will receive an error: `"failed to get workspace client, please check the workspace_id provided in the provider_config"`.

  Review comment (Contributor): Or that the authenticated principal doesn't have permission to call the GetWorkspace API.

* Migrating to the unified provider in a single step (i.e., removing the workspace-level provider and adding `provider_config` to the resource simultaneously) will result in an error: `"workspace_id is not set, please set the workspace_id in the provider_config"`. This occurs because the state does not contain `workspace_id` during the refresh phase. Migration must be done in 2 steps as described above.
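
One way to surface the empty `workspace_id` case with a clearer message is to validate the value at the variable level. The following is a minimal sketch using standard Terraform variable validation; the variable name `workspace_id` and the placeholder resource type are assumptions for illustration, not part of the provider.

```hcl
// Hypothetical input variable that carries the target workspace ID.
variable "workspace_id" {
  description = "ID of the workspace that the resource belongs to."
  type        = string

  validation {
    // Reject empty or whitespace-only values before they reach provider_config.
    condition     = length(trimspace(var.workspace_id)) > 0
    error_message = "The workspace_id must be a non-empty string."
  }
}

// Placeholder resource, as used in the examples above.
resource "databricks_workspace_level_resource" "this" {
  provider_config {
    workspace_id = var.workspace_id
  }
  name = "resource name"
}
```

This only guards against empty input; it does not check that the workspace belongs to the account or that the principal can access it.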

## Limitations

There are some limitations to this feature that we plan to address in the near future:

1. Databricks CLI and Azure CLI authentication methods are not currently supported.
2. Some resources do not yet support the unified provider as the support is rolling out incrementally. Please refer to the documentation for each resource or data source to check if they support the `provider_config` attribute or block.

   Review comment (Contributor): This also will go away.

Review comment (Contributor): Maybe this is the right spot to mention the account admin permission restriction?

## Reporting Issues

This feature is in Public Beta. If you encounter any issues, please report them on [GitHub Issues](https://github.com/databricks/terraform-provider-databricks/issues) with the `Unified Provider` label.
50 changes: 42 additions & 8 deletions docs/index.md
@@ -173,7 +173,7 @@ There are currently a number of supported methods to [authenticate](https://docs
!> **Warning** Hard-coding credentials in plain text is not recommended. We strongly recommend using a Terraform backend that supports encryption. Please use [environment variables](#environment-variables), the `~/.databrickscfg` file, encrypted `.tfvars` files, or a secret store of your choice (HashiCorp [Vault](https://www.vaultproject.io/), AWS [Secrets Manager](https://aws.amazon.com/secrets-manager/), AWS [Param Store](https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-parameter-store.html), Azure [Key Vault](https://azure.microsoft.com/en-us/services/key-vault/)).

### Authenticating with GitHub OpenID Connect (OIDC)
The `host` and `client_id` arguments are used for authentication, which maps to the `github-oidc` authentication type.

These can be declared in the provider block or set in the environment variables `DATABRICKS_HOST` and `DATABRICKS_CLIENT_ID` respectively. Example:

@@ -197,10 +197,11 @@ provider "databricks" {
account_id = var.account_id
}
```
Note: Some workspace-level resources can be managed through the account provider. Please see the [unified provider](./guides/unified-provider.md) guide for more details.

### Authenticating with Databricks CLI

The provider can authenticate using the Databricks CLI. After logging in with the `databricks auth login` command to your account or workspace, you only need to specify the name of the profile in your provider configuration. Terraform will automatically read and reuse the cached OAuth token to interact with the Databricks REST API. See [the user-to-machine authentication guide](https://docs.databricks.com/aws/en/dev-tools/cli/authentication#oauth-user-to-machine-u2m-authentication) for more details.

You can specify a [CLI connection profile](https://docs.databricks.com/aws/en/dev-tools/cli/profiles) through `profile` parameter or `DATABRICKS_CONFIG_PROFILE` environment variable:

@@ -230,7 +231,9 @@ provider "databricks" {
}
```

To create resources at both the account and workspace levels, you can create two providers as shown below.

Note: Some workspace-level resources can be managed through the account provider. Please see the [unified provider](./guides/unified-provider.md) guide for more details.

``` hcl
provider "databricks" {
@@ -275,9 +278,11 @@ provider "databricks" {

### Authenticating with Workload Identity Federation (WIF)

Workload Identity Federation can be used to authenticate to Databricks from automated workflows. This is done through the tokens issued by the automation environment. For more details on the environment variables for specific environments, please see: https://docs.databricks.com/aws/en/dev-tools/auth/oauth-federation-provider.

To create resources at both the account and workspace levels, you can create two providers as shown below.

Note: Some workspace-level resources can be managed through the account provider. Please see the [unified provider](./guides/unified-provider.md) guide for more details.

Workspace level provider:
```hcl
@@ -300,7 +305,36 @@ provider "databricks" {
}
```

Note: `auth_type` for GitHub Actions would be `github-oidc`. For more details, please see the document linked above.

### Authenticating with Unified Provider
The Unified Provider allows you to manage workspace-level Terraform resources using an account-level provider. Depending on the resource, specify `provider_config` as either a block or an attribute, with the `workspace_id` of the workspace that the resource belongs to. For more details, please see the [Unified Provider guide](./guides/unified-provider.md).


Example:
```hcl
// Create an account-level provider
provider "databricks" {
  host          = var.account_host
  client_id     = var.client_id
  client_secret = var.client_secret
  account_id    = var.account_id
}

// Create a workspace under the account
resource "databricks_mws_workspaces" "this" {
  ...
}

// Create a workspace-level resource under the workspace above
resource "databricks_workspace_level_resource" "this" {
  // Review comment (Contributor): ditto
  provider_config = {
    workspace_id = databricks_mws_workspaces.this.workspace_id
  }
  ...
}
```


## Special configurations for Azure
@@ -351,7 +385,7 @@ resource "databricks_user" "my-user" {
}
```

Follow the [Configuring OpenID Connect in Azure](https://docs.github.com/en/actions/security-for-github-actions/security-hardening-your-deployments/configuring-openid-connect-in-azure) guide. You can then use the Azure service principal to authenticate with Databricks.


The `ARM_*` environment variables provide a way to share authentication configuration when using the `databricks` provider alongside the [`azurerm` provider](https://registry.terraform.io/providers/hashicorp/azurerm/latest).
@@ -462,7 +496,7 @@ To make Databricks Terraform Provider generally available, we've moved it from [
You should have a [`.terraform.lock.hcl`](https://github.com/databrickslabs/terraform-provider-databricks/blob/v0.6.2/scripts/versions-lock.hcl) file in your state directory that is checked into source control. Running `terraform init` will give you the following warning.

```text
Warning: Additional provider information from registry

The remote registry returned warnings for registry.terraform.io/databrickslabs/databricks:
- For users on Terraform 0.13 or greater, this provider has moved to databricks/databricks. Please update your source in required_providers.