| subcategory | Compute |
|---|---|
Installs a library on a databricks_cluster. Each type of library has a slightly different syntax. Only one type of library can be set within a single resource; otherwise, the plan will fail with an error.
-> This resource can only be used with a workspace-level provider!
-> The databricks_library resource will always start the associated cluster if it's not running, so make sure to have auto-termination configured. It's not possible to atomically change the version of the same library without a cluster restart. Libraries are fully removed from the cluster only after a restart.
The library resource has been migrated from sdkv2 to the plugin framework. If you encounter any problem with this resource and suspect it is due to the migration, you can fall back to sdkv2 by setting the environment variable as follows: `export USE_SDK_V2_RESOURCES="databricks_library"`.
You can install libraries on all clusters with the help of the databricks_clusters data source:

```hcl
data "databricks_clusters" "all" {
}

resource "databricks_library" "cli" {
  for_each   = data.databricks_clusters.all.ids
  cluster_id = each.key
  pypi {
    package = "databricks-cli"
  }
}
```

Installing a JAR uploaded to a Unity Catalog Volume:

```hcl
resource "databricks_file" "app" {
  source = "${path.module}/app-0.0.1.jar"
  path   = "/Volumes/catalog/schema/volume/app-0.0.1.jar"
}

resource "databricks_library" "app" {
  cluster_id = databricks_cluster.this.id
  jar        = databricks_file.app.path
}
```
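The jar argument also accepts S3 URIs (see the argument reference below). A minimal sketch, assuming a hypothetical bucket that the cluster can read, for example via an IAM role / instance profile:

```hcl
resource "databricks_library" "app_s3" {
  cluster_id = databricks_cluster.this.id
  // hypothetical bucket; the cluster needs read access to the object
  jar = "s3://my-bucket/library.jar"
}
```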
Installing artifacts from a Maven repository. You can also optionally specify a `repo` parameter for a custom Maven-style repository, which should be accessible without any authentication. Maven libraries are resolved in the Databricks Control Plane, so the repo should be accessible from it. It can even be a properly configured maven s3 wagon, AWS CodeArtifact, or Azure Artifacts.
resource "databricks_library" "deequ" {
cluster_id = databricks_cluster.this.id
maven {
coordinates = "com.amazon.deequ:deequ:1.0.4"
// exclusions block is optional
exclusions = ["org.apache.avro:avro"]
}
}resource "databricks_file" "app" {
source = "${path.module}/baz.whl"
path = "/Volumes/catalog/schema/volume/baz.whl"
}
resource "databricks_library" "app" {
cluster_id = databricks_cluster.this.id
whl = databricks_file.app.path
Installing Python PyPI artifacts. You can optionally also specify the `repo` parameter for a custom PyPI mirror, which should be accessible without any authentication for the network that the cluster runs in.
-> The `repo` host should be accessible from the Internet by the Databricks control plane. If connectivity to custom PyPI repositories is required, modify the cluster-node `/etc/pip.conf` through databricks_global_init_script.
resource "databricks_library" "fbprophet" {
cluster_id = databricks_cluster.this.id
pypi {
package = "fbprophet==0.6"
// repo can also be specified here
}
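If packages must be resolved from a custom index that is only reachable from the cluster's network, the note above suggests configuring `/etc/pip.conf` through a global init script. A minimal sketch, assuming a hypothetical internal index URL:

```hcl
resource "databricks_global_init_script" "pip_conf" {
  name           = "configure-pip-index"
  enabled        = true
  content_base64 = base64encode(<<-EOT
    #!/bin/bash
    # hypothetical internal PyPI mirror; adjust the URL to your environment
    echo "[global]" > /etc/pip.conf
    echo "index-url = https://pypi.example.com/simple" >> /etc/pip.conf
  EOT
  )
}
```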
Installing Python libraries listed in the requirements.txt file. Only Workspace paths and Unity Catalog Volumes paths are supported. Requires a cluster with DBR 15.0+.
resource "databricks_library" "libraries" {
cluster_id = databricks_cluster.this.id
requirements = "/Workspace/path/to/requirements.txt"
}resource "databricks_dbfs_file" "app" {
source = "${path.module}/foo.egg"
path = "/FileStore/foo.egg"
}
resource "databricks_library" "app" {
cluster_id = databricks_cluster.this.id
egg = databricks_dbfs_file.app.dbfs_path
Installing artifacts from CRAN. You can also optionally specify a `repo` parameter for a custom CRAN mirror.
resource "databricks_library" "rkeops" {
cluster_id = databricks_cluster.this.id
cran {
package = "rkeops"
}
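As with Maven and PyPI, a custom CRAN mirror can be set through the `repo` field. A minimal sketch, assuming a hypothetical mirror URL:

```hcl
resource "databricks_library" "rkeops_mirror" {
  cluster_id = databricks_cluster.this.id
  cran {
    package = "rkeops"
    // hypothetical CRAN mirror
    repo = "https://cran.example.com"
  }
}
```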
The following arguments are supported:
- `cluster_id` - (Required) ID of the databricks_cluster to install the library on.
You must specify exactly one of the following library types:
- `jar` - (Optional) Path to the JAR library. Supported URIs include Workspace paths, Unity Catalog Volumes paths, and S3 URIs. For example: `/Workspace/path/to/library.jar`, `/Volumes/path/to/library.jar` or `s3://my-bucket/library.jar`. If S3 is used, make sure the cluster has read access to the library. You may need to launch the cluster with an IAM role to access the S3 URI.
- `egg` - (Optional, Deprecated) Path to the EGG library. Installing Python egg files is deprecated and is not supported in Databricks Runtime 14.0 and above. Use `whl` or `pypi` instead.
- `whl` - (Optional) Path to the wheel library. Supported URIs include Workspace paths, Unity Catalog Volumes paths, and S3 URIs. For example: `/Workspace/path/to/library.whl`, `/Volumes/path/to/library.whl` or `s3://my-bucket/library.whl`. If S3 is used, make sure the cluster has read access to the library. You may need to launch the cluster with an IAM role to access the S3 URI.
- `requirements` - (Optional) Path to the requirements.txt file. Only Workspace paths and Unity Catalog Volumes paths are supported. For example: `/Workspace/path/to/requirements.txt` or `/Volumes/path/to/requirements.txt`. Requires a cluster with DBR 15.0+.
- `maven` - (Optional) Configuration block for a Maven library. The block consists of the following fields:
  - `coordinates` - (Required) Gradle-style Maven coordinates. For example: `org.jsoup:jsoup:1.7.2`.
  - `repo` - (Optional) Maven repository to install the Maven package from. If omitted, both Maven Central Repository and Spark Packages are searched.
  - `exclusions` - (Optional) List of dependencies to exclude. For example: `["slf4j:slf4j", "*:hadoop-client"]`. See Maven dependency exclusions for more information.
- `pypi` - (Optional) Configuration block for a PyPI library. The block consists of the following fields:
  - `package` - (Required) The name of the PyPI package to install. An optional exact version specification is also supported. For example: `simplejson` or `simplejson==3.8.0`.
  - `repo` - (Optional) The repository where the package can be found. If not specified, the default pip index is used.
- `cran` - (Optional) Configuration block for a CRAN library. The block consists of the following fields:
  - `package` - (Required) The name of the CRAN package to install.
  - `repo` - (Optional) The repository where the package can be found. If not specified, the default CRAN repo is used.
- `provider_config` - (Optional) Configuration block for management through the account provider (see the sketch after this list). This block consists of the following fields:
  - `workspace_id` - (Required) Workspace ID that the resource belongs to. This workspace must be part of the account that the provider is configured with.
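When the resource is managed through the account provider, the workspace must be identified explicitly via `provider_config`. A minimal sketch following the argument description above, assuming a hypothetical workspace ID value:

```hcl
resource "databricks_library" "managed" {
  cluster_id = databricks_cluster.this.id

  provider_config {
    // hypothetical workspace ID; must belong to the account the provider is configured with
    workspace_id = "1234567890123456"
  }

  pypi {
    package = "simplejson"
  }
}
```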
!> Importing this resource is not currently supported.
The following resources are often used in the same context:
- End to end workspace management guide.
- databricks_clusters data to retrieve a list of databricks_cluster ids.
- databricks_cluster to create Databricks Clusters.
- databricks_cluster_policy to create a databricks_cluster policy, which limits the ability to create clusters based on a set of rules.
- databricks_global_init_script to manage global init scripts, which are run on all databricks_cluster and databricks_job.
- databricks_job to manage Databricks Jobs to run non-interactive code in a databricks_cluster.
- databricks_pipeline to deploy Lakeflow Declarative Pipelines.
- databricks_repo to manage Databricks Repos.