Commit 577f4f4

Merge branch 'master' into ronanstokes-db-patch-1
2 parents cba5f8e + 7275090 commit 577f4f4

52 files changed: +6095 additions, -765 deletions

.github/workflows/codeql-analysis.yml

Lines changed: 3 additions & 3 deletions

@@ -18,7 +18,7 @@ on:
 jobs:
   analyze:
     name: Analyze
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-22.04
 
     strategy:
       fail-fast: false
@@ -31,9 +31,9 @@ jobs:
 
     # Initializes the CodeQL tools for scanning.
     - name: Initialize CodeQL
-      uses: github/codeql-action/init@v2
+      uses: github/codeql-action/init@v3
       with:
         languages: ${{ matrix.language }}
 
     - name: Perform CodeQL Analysis
-      uses: github/codeql-action/analyze@v2
+      uses: github/codeql-action/analyze@v3

.github/workflows/push.yml

Lines changed: 5 additions & 1 deletion

@@ -9,7 +9,7 @@ on:
 jobs:
   tests:
     # Ubuntu latest no longer installs Python 3.9 by default so install it
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-22.04
     steps:
       - name: Checkout
         uses: actions/checkout@v4
@@ -26,6 +26,10 @@ jobs:
       #     key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
       #     restore-keys: |
       #       ${{ runner.os }}-go-
+      - name: Set Java 8
+        run: |
+          sudo update-alternatives --set java /usr/lib/jvm/temurin-8-jdk-amd64/bin/java
+          java -version
 
       - name: Set up Python 3.8
         uses: actions/setup-python@v5
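The `Set Java 8` step above pins the runner's default `java` to a Temurin 8 install via `update-alternatives` and then prints `java -version` as a sanity check. As a hedged illustration (this helper is hypothetical, not part of the repository), a build script could go one step further and parse that output to fail fast when the wrong JVM is active:

```python
import re

def parse_java_major(version_output: str) -> int:
    """Extract the major Java version from `java -version` output.

    Handles both the legacy scheme ("1.8.0_392" -> 8) and the
    modern scheme ("17.0.2" -> 17).
    """
    match = re.search(r'version "([^"]+)"', version_output)
    if match is None:
        raise ValueError("no version string found in output")
    parts = match.group(1).split(".")
    return int(parts[1]) if parts[0] == "1" else int(parts[0])

# Sample stderr text from a Temurin 8 JDK (illustrative, not captured live):
sample = 'openjdk version "1.8.0_392"\nOpenJDK Runtime Environment Temurin'
print(parse_java_major(sample))  # -> 8
```

A guard like this would let the CI job abort before any Spark tests run if `update-alternatives` pointed at an unexpected JDK.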
Lines changed: 12 additions & 12 deletions

@@ -7,17 +7,23 @@ on:
 
 jobs:
   release:
-    runs-on: ${{ matrix.os }}
-    strategy:
-      max-parallel: 1
-      matrix:
-        python-version: [ 3.8 ]
-        os: [ ubuntu-latest ]
+    runs-on: ubuntu-22.04
+    environment: release
+    permissions:
+      # Used to authenticate to PyPI via OIDC and sign the release's artifacts with sigstore-python.
+      id-token: write
+      # Used to attach signing artifacts to the published release.
+      contents: write
 
     steps:
       - name: Checkout
         uses: actions/checkout@v4
 
+      - name: Set Java 8
+        run: |
+          sudo update-alternatives --set java /usr/lib/jvm/temurin-8-jdk-amd64/bin/java
+          java -version
+
       - name: Set up Python 3.8
         uses: actions/setup-python@v5
         with:
@@ -44,9 +50,3 @@ jobs:
 
       - name: Publish a Python distribution to PyPI
         uses: pypa/gh-action-pypi-publish@release/v1
-        with:
-          user: __token__
-          password: ${{ secrets.LABS_PYPI_TOKEN }}
-
-
-
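The release-job change above drops the stored `LABS_PYPI_TOKEN` secret in favor of PyPI trusted publishing: the `id-token: write` permission lets `pypa/gh-action-pypi-publish` obtain a short-lived OIDC token, so no `user`/`password` inputs are needed. A minimal standalone sketch of that model (job and step names are illustrative, and it assumes a matching trusted publisher is configured for the project on PyPI):

```yaml
# Illustrative sketch of an OIDC-based ("trusted publishing") release job.
jobs:
  release:
    runs-on: ubuntu-22.04
    environment: release
    permissions:
      id-token: write   # mints the OIDC token used to authenticate to PyPI
    steps:
      - name: Publish a Python distribution to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1  # no token inputs required
```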

CHANGELOG.md

Lines changed: 29 additions & 3 deletions

@@ -1,12 +1,39 @@
-# Databricks Labs Data Generator Release Notes
+# Databricks Labs Synthetic Data Generator Release Notes
 
 ## Change History
 All notable changes to the Databricks Labs Data Generator will be documented in this file.
 
-### Unreleased
+### unreleased
+
+#### Fixed
+* Updated build scripts to use Ubuntu 22.04 to correspond to the environment in the Databricks runtime
+
+### Version 0.4.0 Hotfix 2
+
+#### Fixed
+* Added basic stock ticker and multi-table sales order standard datasets
+* Added min and max latitude and longitude options for the basic geometries dataset provider
+* Added default max values for numeric data types
+
+### Version 0.4.0 Hotfix 1
+
+#### Fixed
+* Fixed issue with running on serverless environment
+
+
+### Version 0.4.0
 
 #### Changed
+* Updated minimum pyspark version to be 3.2.1, compatible with Databricks runtime 10.4 LTS or later
+* Modified data generator to allow specification of constraints to the data generation process
 * Updated documentation for generating text data.
+* Modified data distributions to use abstract base classes
+* Migrated data distribution tests to use `pytest`
+* Additional standard datasets
+
+#### Added
+* Added classes for constraints on the data generation via new package `dbldatagen.constraints`
+* Added support for standard data sets via the new package `dbldatagen.datasets`
 
 
 ### Version 0.3.6 Post 1
@@ -18,7 +45,6 @@ All notable changes to the Databricks Labs Data Generator will be documented in
 #### Fixed
 * Fixed scenario where `DataAnalyzer` is used on dataframe containing a column named `summary`
 
-
 ### Version 0.3.6
 
 #### Changed
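The changelog above raises the minimum pyspark version to 3.2.1. As a hedged sketch (the helper below is hypothetical, not dbldatagen's actual check), such a version floor has to be enforced by comparing dotted version strings numerically rather than lexically, since a plain string compare would rank "3.10.0" below "3.2.1":

```python
def meets_minimum(version: str, minimum: str = "3.2.1") -> bool:
    """Return True when `version` is at least `minimum`.

    Compares dotted version strings as integer tuples, so "3.10.0"
    correctly ranks above "3.2.1".
    """
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return as_tuple(version) >= as_tuple(minimum)

print(meets_minimum("3.5.0"))  # True: above the new floor
print(meets_minimum("3.0.1"))  # False: the old Spark 3.0.1 baseline no longer qualifies
```

Production code would more likely reach for `packaging.version.Version`, which handles pre-release and post-release suffixes as well.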

CONTRIBUTING.md

Lines changed: 26 additions & 2 deletions

@@ -26,7 +26,7 @@ runtime 9.1 LTS or later.
 
 ## Checking your code for common issues
 
-Run `./lint.sh` from the project root directory to run various code style checks.
+Run `make dev-lint` from the project root directory to run various code style checks.
 These are based on the use of `prospector`, `pylint` and related tools.
 
 ## Setting up your build environment
@@ -45,6 +45,11 @@ Our recommended mechanism for building the code is to use a `conda` or `pipenv`
 
 But it can be built with any Python virtualization environment.
 
+### Spark dependencies
+The builds have been tested against Spark 3.2.1. This requires OpenJDK 1.8.56 or a later version of Java 8.
+The Databricks runtimes use the Azul Zulu version of OpenJDK 8, and we have used these in local testing.
+These are not installed automatically by the build process, so you will need to install them separately.
+
 ### Building with Conda
 To build with `conda`, perform the following commands:
 - `make create-dev-env` from the main project directory to create your conda environment, if using
@@ -70,7 +75,7 @@ To build with `pipenv`, perform the following commands:
 - Run `make dist` from the main project directory
 - The resulting wheel file will be placed in the `dist` subdirectory
 
-The resulting build has been tested against Spark 3.0.1
+The resulting build has been tested against Spark 3.2.1
 
 ## Creating the HTML documentation
 
@@ -153,3 +158,22 @@ Basically it follows the Python PEP8 coding conventions - but method and argumen
 with a lower case letter rather than underscores following Pyspark coding conventions.
 
 See https://legacy.python.org/dev/peps/pep-0008/
+
+## GitHub expectations
+When running the unit tests on GitHub, the environment should match the latest Databricks
+runtime LTS release. While compatibility is preserved on LTS releases from Databricks runtime 10.4 onwards,
+unit tests will be run on the environment corresponding to the latest LTS release.
+
+Libraries will use the same versions as the earliest supported LTS release - currently 10.4 LTS.
+
+This means for the current build:
+
+- Use of Ubuntu 22.04 for the test runner
+- Use of Java 8
+- Use of Python 3.11
+
+See the following resources for more information:
+- https://docs.databricks.com/en/release-notes/runtime/15.4lts.html
+- https://docs.databricks.com/en/release-notes/runtime/10.4lts.html
+- https://github.com/actions/runner-images/issues/10636
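The environment expectations added above (Ubuntu 22.04 runner, Java 8, Python 3.11) can be checked early in a local test session so drift from the CI baseline is flagged immediately. The guard below is an illustrative sketch under those assumptions, not part of the repository:

```python
import sys

# Baseline taken from the "GitHub expectations" section above; the constant
# and helper names are illustrative, not part of dbldatagen.
EXPECTED_PYTHON = (3, 11)

def interpreter_matches(version_info=None, expected=EXPECTED_PYTHON) -> bool:
    """Return True when the interpreter's major.minor matches `expected`."""
    info = sys.version_info if version_info is None else version_info
    return (info[0], info[1]) == expected

# Deterministic examples using explicit tuples:
print(interpreter_matches((3, 11, 4)))   # True: matches the CI baseline
print(interpreter_matches((3, 8, 10)))   # False: a local 3.8 env would be flagged
```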
