Releases: data-dot-all/dataall

v2.2.0

14 Dec 07:32
823d642

What's Changed

This time there are no warnings.

New features 🆕

Enhancements 🥇

Fixes 🪲

  • Add the cloudformation:ContinueUpdateRollback permission to the pivotRole, for administration of linked environment accounts. by @rbernotas in #850
  • Fix Module Enabled Pipelines by @noah-paige in #874
  • Add Athena:UpdateWorkGroup permissions to CDK Exec Policy by @noah-paige in #892
  • Add Pagination to Return Full List Cognito Groups by @noah-paige in #891
  • Remove unnecessary MANAGE_ORGANIZATIONS check by @dlpzx in #887
  • Fix S3DatasetClient upload data by @noah-paige in #909
  • Fix Migration Script for New Deployment by @noah-paige in #908
  • Create frontend config role regardless of custom auth or not in backend by @noah-paige in #913
  • Fix permissions on share workflows by @dlpzx in #914

Documentation 📚

Dependencies

  • Upgrade Athena engine version to v3 by @dlpzx in #886
  • Bump axios from 0.26.1 to 1.6.0 in /frontend by @dependabot in #867
  • Bump certifi from 2022.12.7 to 2023.7.22 in /deploy/custom_resources/custom_authorizer by @dependabot in #910
  • Bump urllib3 from 1.26.15 to 1.26.18 in /deploy/custom_resources/custom_authorizer by @dependabot in #911
  • Bump requests from 2.29.0 to 2.31.0 in /deploy/custom_resources/custom_authorizer by @dependabot in #912

New Contributors 👨‍💻 👩‍💻

Full Changelog: v2.1.0...v2.2.0

v2.1.0

08 Nov 08:00
f917a7a

What's Changed

⚠️ Important: After upgrading to v2.1.0, environment stacks need to be updated before creating or editing datasets. If the environment stack is not updated, dataset creation and other functionalities will fail. There are 3 options to update the environment stacks:

  1. Set the cdk.json parameter enable_update_dataall_stacks_in_cicd_pipeline --> the CICD pipeline automatically updates the environment and dataset stacks
  2. Wait for the overnight update stack task --> same as above, but it runs on a daily schedule
  3. Go to Environment > Stack tab and click on the Update button --> manual update
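
As a rough illustration of option 1, the sketch below sets the flag in an in-memory copy of a cdk.json file. The parameter name comes from this release note; the layout it assumes (a list of deployment environments under context.DeploymentEnvs, each identified by an "envname" key) and the helper name enable_stack_updates are assumptions for illustration, so check them against your own cdk.json before relying on this.

```python
import json
from copy import deepcopy

# Parameter name as given in the release note.
FLAG = "enable_update_dataall_stacks_in_cicd_pipeline"


def enable_stack_updates(cdk_config: dict, envname: str) -> dict:
    """Return a copy of the cdk.json content with the flag set for one environment.

    Assumption: deployment environments are listed under
    context.DeploymentEnvs and carry an "envname" key.
    """
    updated = deepcopy(cdk_config)  # leave the caller's dict untouched
    envs = updated.get("context", {}).get("DeploymentEnvs", [])
    matches = [env for env in envs if env.get("envname") == envname]
    if not matches:
        raise KeyError(f"no deployment environment named {envname!r}")
    for env in matches:
        env[FLAG] = True
    return updated


if __name__ == "__main__":
    # Hypothetical minimal config, used only to show the resulting shape.
    example = {"context": {"DeploymentEnvs": [{"envname": "prod"}]}}
    print(json.dumps(enable_stack_updates(example, "prod"), indent=2))
```

The function returns a new dict instead of editing the file in place, so the change can be reviewed and committed like any other code change before the pipeline picks it up.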

Governance 🏛️

New features 🆕

Enhancements 🥇

  • Fix shell=true semgrep issues by @dlpzx in #760
  • Add global flag to replace and avoid scanning issues on incomplete-sanitization by @dlpzx in #762
  • Allow to submit a share when you are both an approver and a requester by @zsaltys in #793
  • Redirect upon creating a share request by @zsaltys in #799
  • Add frontend and backend feature flags by @zsaltys in #817
  • Make hosted_zone_id optional by @lorchda in #812
  • Add configurable session timeout to Cognito by @manjulaK in #786
  • Modularization of notifications, refactor from core to modules by @dlpzx in #822
  • Add Additional Error Messages for KMS Key lookup on imported dataset by @noah-paige in #748
  • Handle Environment Import of IAM service roles by @noah-paige in #749
  • Add condition when there are no public subnets by @lorchda in #794
  • Check other share exists before clean up by @noah-paige in #769
  • Configure Pytests on Feature Flags by @noah-paige in #764

Fixes 🪲

Dependencies

  • Add resolutions for yarn.lock pinned packages by @dlpzx in #757
  • Upgrade babel to non-vulnerable version 7.23.2 by @dlpzx in #816
  • Bump werkzeug from 2.2.3 to 3.0.1 in /tests by @dependabot in #831
  • Bump werkzeug from 2.3.3 to 3.0.1 in /backend/dataall/base/cdkproxy by @dependabot in #832
  • Bump react-devtools-core from 4.28.0 to 4.28.4 in /frontend by @dependabot in #824

Documentation 📚

New Contributors 👨‍💻 👩‍💻

Special thanks to the new contributors!

Full Changelog: v2.0.0...v2.1.0

v2.0.0

13 Sep 14:51
13c1baf

What's Changed

Major version upgrade ☀️

Data.all v2 is a modular version of data.all that allows customers to easily configure and customize data.all to their needs. In a single config file, the different modules can be configured, enabled or disabled. New features and customizations to the modules can now be added to the source code, as well as complete new modules.

In this release we have carried out a deep refactoring of the backend and frontend packages and the resulting code shows significant differences with the v1.6.2 structure. Refer to the following PRs and issues for more details on the design changes.

⚠️ Breaking changes?
Upgrading from v1.6.2 to v2 does NOT include any breaking changes. Despite the magnitude of the code changes, there are no changes to the architecture diagram or to existing resources. Pre-existing datasets, environments, shares or any other resources are not affected by the upgrade.

Enhancements and fixes 🪲

Documentation 📚

Contributors

Full Changelog: v1.6.2...v2.0.0

v1.6.2

08 Aug 15:14
f235c19

What's Changed

⚠️ This is a patch for V1.6.1. If you are upgrading from a previous version of data.all, please have a look at the "Manual actions required" section. Fresh deployments are unaffected.

  • Add missing KMS keys for canaries by @dlpzx in #619
  • Allow restricted nacls backend VPC by @noah-paige in #626
  • Fix cloudfront stack in case custom domain is given by @dbalintx in #607
  • resolve unnecessary dependency in git_release role by @dlpzx in #623
  • get prefix list ids for dbmigration for infra region by @dlpzx in #624
  • Handle External ID SSM v1.6.1> by @noah-paige in #630

Upgrading from <v1.6.0 to v1.6.2

The externalID used to secure the pivotRole(s) in linked environments will be moved from AWS Secrets Manager to AWS Systems Manager Parameter Store as part of this upgrade.

⚠️ NOTE: If you have deployed data.all with enable_pivot_role_auto_create set to true in your cdk.json, you do not have to perform the manual steps listed below and can simply upgrade to v1.6.2. If not, please continue with the manual steps below.

To retain the same externalID and avoid updating the pivotRole(s) of each linked environment, follow the steps below:

  1. In your data.all deployment account, navigate to AWS Secrets Manager and retrieve the secret value of the external ID (named dataall-externalId-{envname}) --> keep this value somewhere for later reference

  2. Upgrade the code from the existing version to v1.6.2 and commit the latest code changes to deploy via CodePipeline

  3. Once the CodePipeline execution is complete, navigate to SSM Parameter Store in the deployment account and find the externalID parameter (named /dataall/{envname}/pivotRole/externalId) --> replace the existing value with the one retained in Step 1
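
Steps 1 and 3 above can also be scripted instead of using the console. The following is a minimal sketch, not part of data.all: the secret and parameter names are the ones given in the steps, the helper names are hypothetical, and it assumes the secret stores the external ID as a plain string in SecretString. The clients are passed in, so with boto3 you would create them with boto3.client("secretsmanager") and boto3.client("ssm") using credentials for the deployment account.

```python
def secret_name(envname: str) -> str:
    # Secret name pattern from step 1.
    return f"dataall-externalId-{envname}"


def parameter_name(envname: str) -> str:
    # Parameter name pattern from step 3.
    return f"/dataall/{envname}/pivotRole/externalId"


def read_external_id(secrets_client, envname: str) -> str:
    """Step 1: read the current external ID. Run this BEFORE upgrading.

    Assumption: the secret value is a plain string, not a JSON document.
    """
    response = secrets_client.get_secret_value(SecretId=secret_name(envname))
    return response["SecretString"]


def restore_external_id(ssm_client, envname: str, external_id: str) -> None:
    """Step 3: overwrite the new SSM parameter. Run this AFTER the pipeline completes."""
    ssm_client.put_parameter(
        Name=parameter_name(envname),
        Value=external_id,
        Overwrite=True,  # the parameter already exists after the upgrade
    )


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   saved = read_external_id(boto3.client("secretsmanager"), "prod")
#   ... upgrade to v1.6.2 and wait for the CodePipeline execution ...
#   restore_external_id(boto3.client("ssm"), "prod", saved)
```

Keep the value returned by read_external_id somewhere safe between the two calls, since the upgrade in step 2 happens in between.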

Full Changelog: v1.6.1...v1.6.2

v2.0.0-beta1

03 Aug 20:15
9220140

v2.0.0-beta1 Pre-release

Beta pre-release of version 2.0.0, focused on the refactor to modularize data.all. This version includes a modularized backend but not yet a modularized front-end, which will be published with the final release.

⚠️ We recommend installing this release from scratch instead of upgrading an existing system, since this is a pre-production release.

⚠️ WARNING If upgrading, do so from version 1.6.2.

Known issues affecting deployment

In the deployment guide, run step 8 before step 5, then continue from step 5. This is needed because data.all uses the CDK lookup roles during cdk synth, so the accounts must be bootstrapped before running cdk synth locally. Documentation will be updated for the final release.

Known issues

  • #556 Request for share is being sent for invalid environment (CREATE_FAILED)
  • #540 OpenSearch stack failed during backend deploy due to length of policy name
  • #534 Catalog Search along with filters
  • #533 Profille Job run fails
  • #428 Prefix crawling is crawling complete bucket instead of specific folder
  • #374 Error in Monitoring tab in Admin Settings
  • #338 Import of Dashboard / Dataset - Environment selection drop-down list is limited to 5 environments
  • #288 Can't Paginate to view all Folders
  • #625 CDK execution role (custom template) throws S3 access denied error for pivotRole auto-created nested stack
  • Denied share requests show the wrong message to the asking user: approved instead of denied (no effect on actual sharing)
  • Logging of approvals for sharing shows AWSResourceNotFound for some approvals
  • After creating a dataset, a user may temporarily be unable to upload data using the UPLOAD button because of a CORS error, which disappears after some time

What's Changed

New Contributors

  • @blitzmohit made their first contribution in #538

Full Changelog: v1.6.1...v2.0.0-beta1

v1.6.1

25 Jul 10:15
f3baf14

What's Changed

⚠️ We strongly recommend upgrading directly to V1.6.2 and skipping this release. V1.6.2 includes a better implementation of the V1.6.1 fixes ⚠️

  • Fix wrong update of externalId for pivotRole by @dlpzx in #591

Manual actions required

ONLY if you are upgrading!
On the first run, the CodePipeline will fail in the CDK Synth stage unless additional changes are made:

botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::111111111111:assumed-role/SOME ROLE/... is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::222222222222:role/cdk-hnb659fds-lookup-role-22222222222-eu-west-1

CodeBuild needs additional permissions to assume the IAM role in the CDK Synth stage. Since we cannot update this CodeBuild stage without running it, the permissions need to be added manually.

Upgrading from V1.6.0 to v1.6.1

The role that needs to be updated is named <PREFIX>-<GITBRANCH>-codebuild-baseline-role. The role name appears in the error message in the CodeBuild logs.

  1. Go to the IAM role (<PREFIX>-<GITBRANCH>-codebuild-baseline-role) and click on Add permissions > Create inline policy
  2. Switch to the JSON editor and paste the policy below

The policy of the CodeBuild execution role needs to include the following:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::*:role/cdk-hnb659fds-lookup-role*"
        }
    ]
}

  3. After the pipeline has run successfully, go back to the IAM role and remove the manually added policy. The policy is now added as part of infrastructure as code.
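
The temporary inline policy described in these steps can also be attached and removed programmatically. The sketch below is one possible way to do it, under stated assumptions: the policy document is the one shown above, while the policy name and the helper names are hypothetical. The IAM client is passed in, so with boto3 you would use boto3.client("iam"); put_role_policy and delete_role_policy are the standard IAM calls for inline policies.

```python
import json

# Hypothetical name for the temporary inline policy; any unused name works.
POLICY_NAME = "temporary-cdk-lookup-assume-role"

# Policy document copied from the release note.
LOOKUP_ROLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::*:role/cdk-hnb659fds-lookup-role*",
        }
    ],
}


def add_lookup_policy(iam_client, role_name: str) -> None:
    """Attach the inline policy to the CodeBuild role before retrying the pipeline."""
    iam_client.put_role_policy(
        RoleName=role_name,
        PolicyName=POLICY_NAME,
        PolicyDocument=json.dumps(LOOKUP_ROLE_POLICY),
    )


def remove_lookup_policy(iam_client, role_name: str) -> None:
    """Remove the inline policy once the pipeline has run successfully."""
    iam_client.delete_role_policy(RoleName=role_name, PolicyName=POLICY_NAME)


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   iam = boto3.client("iam")
#   add_lookup_policy(iam, "<PREFIX>-<GITBRANCH>-codebuild-baseline-role")
#   ... retry the pipeline, wait for success ...
#   remove_lookup_policy(iam, "<PREFIX>-<GITBRANCH>-codebuild-baseline-role")
```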

Upgrading from <V1.6.0 to v1.6.1

In this case the error points at a different role: one created by CDK, which looks like the following in the CodeBuild logs:

botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::111111111111:assumed-role/dataall-sbx8-cicd-stack-dataallsbx8cdkpipelinePipe-HMXY7D9OX4FM/AWSCodeBuild-30c50765-4529-4d20-99ce-88f82139a82c is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::22222222222:role/cdk-hnb659fds-lookup-role-22222222222-eu-west-1

Find the role and update it as explained in the "Upgrading from V1.6.0 to v1.6.1" section.

Once that is done, retry the CodeBuild Synth stage. In this case you do NOT need to clean up the manually added policies, as this role will be deleted.

Full Changelog: v1.6.0...v1.6.1

v1.6.0

19 Jul 13:27
84c555e

⚠️ Read the IMPORTANT section before upgrading ⚠️
⚠️ We strongly recommend upgrading directly to V1.6.2 ⚠️

What's Changed

New features

  • Add share reason in share requests by @noah-paige in #498
  • Import KMS key in imported datasets by @dlpzx in #515 and #572. Support for pre-existing imported datasets in #578

Security

  • Fine-grained NACLs for backend VPC creation by @noah-paige in #543 and in #573
  • Implement security response headers in Cloudfront distributions by @nikpodsh in #529
  • Sanitize the string to avoid a connection string injection by @nikpodsh in #532
  • Restrict KMS keys' policies by @noah-paige in #524
  • Limit dataset IAM role permissions by @dlpzx in #497
  • Limit environment IAM roles permissions by @dlpzx in #515
  • Limit pivot role (IAM role) permissions by @dlpzx in #535 --> it will only be automatically applied to dataallPivotRole-cdk. Migrate to the auto-created dataallPivotRole-cdk released in V1.4.0 or manually update the dataallPivotRole roles in your environments.
  • Move parameters from Secrets Manager to SSM by @dlpzx in #455
  • Disable profiling results from "secret" and "official" datasets by @dlpzx in #482
  • CDK execution role policy template by @mourya-33 in #562

Bug-fixes

  • Fix deletion of imported Glue database by @dlpzx in #512
  • Removed unused resources and consolidate KMS keys in environment stack by @noah-paige in #524
  • Fix urllib3 dependencies for glue profiling job by @noah-paige in #513
  • Add cookiecutter config and environment variable for datapipelines stacks by @dbalintx in #582
  • v1.6.0 backwards compatibility changes by @dlpzx in #567
  • Add Glue Resource Policy Permissions for cross account share requests by @noah-paige in #579

⚠️ ⚠️ ⚠️ Important ⚠️ ⚠️ ⚠️

Breaking changes

  • ⚠️ IMPORTANT: It is necessary to upgrade to version >V1.5.0 before upgrading to V1.6 to avoid deletion of resources due to the removal of custom resources.
  • ⚠️ IMPORTANT: After upgrading, environments and then datasets must be updated, either by using the cdk.json parameter enable_update_dataall_stacks_in_cicd_pipeline, by waiting for the overnight update stack task, or by manually updating first the environments and then the datasets. If the environment stack is not updated, dataset creation and other functionalities will fail.
  • ⚠️ IMPORTANT: Because of the implementation of #529, the CloudFront distribution will be recreated. This means that the URL of the CloudFront distribution will be new. You can use the new URL directly. If you are using a custom domain with an SSL certificate, remove the CNAMEs (for both frontend and userguide) from the old distributions before upgrading to v1.6, as mentioned in #603
  • ⚠️ IMPORTANT: Additional EC2 permissions are needed in the CDK Synth CodeBuild because of the implementation of #543 --> this can be avoided by upgrading to v1.5.6 before upgrading to v1.6.0 or manually adding the necessary permissions and retrying the pipeline run. Check the PR for more details.
  • Developing locally requires using a role ending in -graphql-role, -awsworker-role or ecs-tasks-role to work with the more restrictive pivotRole trust policy implemented in #535.
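
For the local development requirement in the last bullet, a quick sanity check of the role you are using might look like the sketch below. It only compares the role name against the three suffixes listed above; the helper names are hypothetical, and the actual trust policy logic lives in #535, so treat this as a convenience check and not as a restatement of that policy.

```python
# Role name suffixes listed in the release note.
ALLOWED_SUFFIXES = ("-graphql-role", "-awsworker-role", "ecs-tasks-role")


def role_name_from_arn(arn: str) -> str:
    """Extract the role name from an IAM role ARN or an STS assumed-role ARN."""
    resource = arn.split(":", 5)[5]  # e.g. "role/name" or "assumed-role/name/session"
    parts = resource.split("/")
    if parts[0] == "assumed-role":
        return parts[1]  # the session name follows the role name
    return parts[-1]  # IAM roles may include a path before the name


def matches_local_dev_suffix(arn: str) -> bool:
    """Return True if the role name ends in one of the listed suffixes."""
    return role_name_from_arn(arn).endswith(ALLOWED_SUFFIXES)
```

The ARN of the identity you are currently using is returned by the STS GetCallerIdentity call, for example via boto3.client("sts").get_caller_identity()["Arn"].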

New Contributors 🚀

Full Changelog: v1.5.6...v1.6.0

v1.5.6

12 Jul 10:40
45c5cfb

What's Changed

Bug Fixes

  • Resolve dataset share checks when deleting dataset by @noah-paige in #554

Enhancements

Package updates

New Contributors

Welcome to the project 🎉

Full Changelog: v1.5.5...v1.5.6

v1.5.5

20 Jun 11:25
aa9d3df

What's Changed

  • hotfix: dynamic SQL generation by @chamcca in #514
  • dependabot: upgrade fast-xml-parser, aws-amplify, react-scripts, override react-redux to non-vulnerable version by @dlpzx in #521
  • dependabot: resolve nth-check in sub-dependencies by @dlpzx in #525

New Contributors

Full Changelog: v1.5.4...v1.5.5

v1.5.4

06 Jun 11:41
fa45abd

What's Changed

  • Update CDK Version to v2.77.0 to fix vulnerability with CDK Pipeline role in CDK Pipelines construct by @gmuslia in #484
  • Safe removal of consumption roles and teams with open share requests by @dlpzx in #485
  • Fix typo that destroys storage locations by @dlpzx in #481

Full Changelog: v1.5.3...v1.5.4