Using Databricks Labs Data Generator on Databricks Runtime 14.x #241

@ronanstokes-db

Expected Behavior

No errors

Current Behavior

There is an issue when running the Data Generator in a Unity Catalog environment on the 14.1 runtime. When the number of partitions is not specified, the DataGenerator tries to determine an appropriate number automatically.

When running with this release on a Unity Catalog enabled shared cluster using runtime release 14.1, you will receive an error (exception) if you don't explicitly specify the number of partitions.

This does not affect the other access modes for Unity Catalog enabled clusters, or the 13.3 LTS runtime (which we recommend for now).

Workaround

The workaround is to explicitly specify the number of partitions. We are working on a solution to this and will update this over the next couple of days.
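
A minimal sketch of the workaround, assuming the `partitions` argument of `dbldatagen.DataGenerator` is used to supply the value explicitly. The helper `explicit_partition_count`, its rows-per-partition default, the generator name, and the example column are illustrative assumptions and are not part of the library or of this issue.

```python
import math


def explicit_partition_count(rows, rows_per_partition=500_000, minimum=1):
    """Hypothetical helper: derive a fixed partition count from the row count.

    The 500,000 rows-per-partition default is an arbitrary illustrative value.
    """
    return max(minimum, math.ceil(rows / rows_per_partition))


def build_test_data(spark, rows=1_000_000):
    """Build a small synthetic DataFrame with the partition count set explicitly."""
    # Imported lazily so the helper above can be used without Spark installed.
    import dbldatagen as dg
    from pyspark.sql.types import IntegerType

    spec = (
        dg.DataGenerator(
            spark,
            name="partition_workaround_example",  # illustrative name
            rows=rows,
            # Passing partitions explicitly means the generator does not
            # have to work out a value on its own.
            partitions=explicit_partition_count(rows),
        )
        .withColumn("code1", IntegerType(), minValue=100, maxValue=200)
    )
    return spec.build()
```

In a notebook this would be called as `df = build_test_data(spark)`. Any fixed integer passed as `partitions` should serve the same purpose; the helper only makes the choice reproducible.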

Context

This only applies when using the data generator on Databricks runtime release 14.x on a Unity Catalog enabled cluster in shared access mode.

Your Environment

  • dbldatagen version used: latest
  • Databricks Runtime version: 14.1
  • Cloud environment used: any
