2.0.0 - First Release Of Data Output

@neomatamune neomatamune released this 21 Nov 15:27
· 4 commits to main since this release

Release Notes - Version 2.0.0 - First Release Of Data Output

It is time to move on from the 1.x line, which tried to bridge cosmotech API 3.x and 5.x, and fully embrace the break in retrocompatibility.

Now for a full breakdown of the release content by our favorite writer: Mr. Art Ifficial.

Overview

This major release represents a significant evolution of the CosmoTech Acceleration Library (CoAL), introducing CoAL 2.0 with breaking changes that remove retrocompatibility with cosmotech-api versions below 5.0. The release includes comprehensive architectural refactoring, new features for data export, and removal of deprecated functionality.

🚨 Breaking Changes

API Compatibility

  • Removed retrocompatibility with cosmotech-api < 5.0
  • This is a major version bump requiring updates to dependent projects

Removed Deprecated Commands

The following deprecated commands have been removed from csm-data:

  • rds-load-csv - RDS CSV loading functionality
  • rds-send-csv - RDS CSV sending functionality
  • rds-send-store - RDS store sending functionality
  • tdl-load-files - Twin Data Layer file loading
  • tdl-send-files - Twin Data Layer file sending
  • runtemplate-load-handler - Run template handler loading

Removed Modules

  • cosmotech.coal.azure.functions - Deprecated Azure functions module
  • cosmotech.coal.cosmotech_api.twin_data_layer - Legacy Twin Data Layer module
  • cosmotech.coal.cosmotech_api.dataset/* - Legacy dataset submodules (converters, download, upload, utils)
  • cosmotech.coal.cosmotech_api.runner/* - Legacy runner submodules (data, datasets, download, metadata, parameters)
  • cosmotech.coal.utils.postgresql - Moved to cosmotech.coal.postgresql.utils
  • cosmotech.coal.utils.semver - Removed deprecated semver utilities
  • Legacy connection, parameters, run, run_data, run_template, and workspace modules from cosmotech_api

✨ New Features

Output Channel System

A new flexible output channel system for data export operations:

  • Channel Interface: Base interface for output operations (channel_interface.py)
  • AWS S3 Channel: Export data directly to S3 buckets (aws_channel.py)
  • Azure Storage Channel: Export data to Azure Blob Storage (az_storage_channel.py)
  • PostgreSQL Channel: Export data to PostgreSQL databases (postgres_channel.py)
  • Channel Splitter: Multi-destination output support (channel_spliter.py)
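
To make the relationship between the channel interface and the splitter concrete, here is a minimal sketch of the pattern. It is not the CoAL implementation: the class names (OutputChannel, InMemoryChannel, SplitterChannel), the send method, and its signature are all hypothetical, and the in-memory channel only stands in for a real destination such as S3, Azure Blob Storage, or PostgreSQL.

```python
from abc import ABC, abstractmethod
from typing import Iterable, Mapping

Row = Mapping[str, object]


class OutputChannel(ABC):
    """Hypothetical base interface: one method that ships rows to a destination."""

    @abstractmethod
    def send(self, table: str, rows: Iterable[Row]) -> int:
        """Write rows for the given table and return how many were written."""


class InMemoryChannel(OutputChannel):
    """Stand-in destination that keeps rows in a dict keyed by table name."""

    def __init__(self) -> None:
        self.tables: dict[str, list[Row]] = {}

    def send(self, table: str, rows: Iterable[Row]) -> int:
        stored = self.tables.setdefault(table, [])
        before = len(stored)
        stored.extend(rows)
        return len(stored) - before


class SplitterChannel(OutputChannel):
    """Fans the same rows out to every child channel."""

    def __init__(self, channels: list[OutputChannel]) -> None:
        self.channels = channels

    def send(self, table: str, rows: Iterable[Row]) -> int:
        # Materialize once: rows may be a one-shot iterator.
        materialized = list(rows)
        return sum(channel.send(table, materialized) for channel in self.channels)


# Usage: two destinations receive the same two rows.
first, second = InMemoryChannel(), InMemoryChannel()
splitter = SplitterChannel([first, second])
written = splitter.send("stock", [{"id": 1, "qty": 10}, {"id": 2, "qty": 4}])
```

Because the splitter satisfies the same interface as a single channel, calling code does not need to know whether it writes to one destination or several.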

Configuration Management

  • New configuration.py module in cosmotech.coal.utils for centralized configuration loading and management
  • Enhanced configuration handling across the library

New csm-data Command

  • output - New command for flexible data export functionality

🔧 Refactoring & Improvements

CosmoTech API Architecture

Complete restructuring of the API client with modular design:

New /apis/ Directory Structure

  • dataset.py - Dataset management operations
  • meta.py - Metadata operations
  • organization.py - Organization management
  • run.py - Run operations
  • runner.py - Runner management
  • solution.py - Solution operations
  • workspace.py - Workspace management

New /objects/ Directory

  • connection.py - Enhanced connection objects and handling
  • parameters.py - Improved parameter management objects

Cloud Provider Modules

Azure Improvements

  • Enhanced blob.py and storage.py modules
  • Updated ADX modules (auth, runner, store, tables)
  • Improved error handling and authentication

AWS S3 Enhancements

  • Refactored s3.py for better performance and maintainability
  • Improved error handling and retry logic

PostgreSQL Integration

  • New utils.py module with comprehensive PostgreSQL utilities
  • Enhanced runner.py with improved functionality
  • Updated store.py with better operations handling
  • Moved utilities from cosmotech.coal.utils.postgresql to cosmotech.coal.postgresql.utils

Updated Commands

All existing csm-data commands have been updated to work with the new architecture:

  • postgres-send-runner-metadata
  • run-load-data
  • wsf-load-file and wsf-send-file
  • S3 commands (s3-bucket-delete, s3-bucket-download, s3-bucket-upload)
  • Store commands (dump-to-s3, store)

🧪 Testing

Test Suite Restructure

Major reorganization of tests to match the new code structure:

New Test Modules

  • test_apis/test_dataset.py - Dataset API tests
  • test_apis/test_runner.py - Runner API tests
  • test_apis/test_simple_apis.py - Simple API tests
  • test_apis/test_workspace.py - Workspace API tests
  • test_objects/test_connection.py - Connection object tests
  • test_objects/test_parameters.py - Parameters object tests
  • test_store/test_output/* - Complete test suite for output channels
  • test_utils/test_utils_configuration.py - Configuration utility tests

Reorganized Tests

  • Updated AWS S3 tests for new structure
  • Enhanced Azure ADX tests (store, tables, utils)
  • Improved PostgreSQL tests with better organization
  • Removed tests for deprecated modules

Test Statistics

  • 166 files changed
  • 5,548 insertions(+), 12,552 deletions(-)
  • Net reduction of ~7,000 lines while improving functionality

📚 Documentation

Updated Documentation

  • Removed documentation for deprecated RDS and TDL commands
  • Removed runtemplate-load-handler documentation
  • Updated csm-data tutorial to reflect command changes
  • Updated API documentation to match new structure

Removed Tutorial Files

  • tutorial/csm-data/tdl_load_files.bash
  • tutorial/csm-data/tdl_send_files.bash

🛠️ Development & Tooling

Configuration Updates

  • Enhanced .coveragerc for better test coverage tracking
  • Added .flake8 configuration for code linting
  • Updated .pre-commit-config.yaml with new hooks
  • Updated pyproject.toml with current project metadata

Dependencies

  • Consolidated requirements files (removed requirements.extra.txt and requirements.past.txt)
  • Updated requirements.all.txt, requirements.dev.txt, and requirements.txt
  • Added new development dependencies for improved tooling

🔍 Translation Files

Added and updated translation files for internationalization:

  • New translations for output modules
  • Updated dataset service translations
  • Added configuration utility translations
  • Removed translations for deprecated commands

📋 Design Documentation

  • Added designs/data_outputs/requirements.md documenting the output system requirements and architecture

🐛 Bug Fixes

  • Fixed missing lint on files (commit 58fcb76)
  • Various bug fixes and improvements throughout the refactoring process

📦 Migration Guide

For Users of Deprecated Commands

If you were using any of the removed commands, you'll need to:

  1. Update to the new output command for data export operations
  2. Migrate from RDS/TDL commands to the new modular API structure
  3. Review the updated documentation for equivalent functionality
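
As a starting point for that review, the sketch below scans the text of a shell script for csm-data invocations that use one of the removed commands. The command names come from the list above; the function name find_removed_commands and the sample script lines are illustrative assumptions, and the pattern only catches invocations written on a single line.

```python
import re

# Commands removed from csm-data in 2.0.0 (taken from the list in these notes).
REMOVED_COMMANDS = (
    "rds-load-csv",
    "rds-send-csv",
    "rds-send-store",
    "tdl-load-files",
    "tdl-send-files",
    "runtemplate-load-handler",
)

# "csm-data", then anything on the same line (options, a command group),
# then one of the removed command names as a whole token.
_PATTERN = re.compile(
    r"csm-data\b[^\n]*?(?<![\w-])("
    + "|".join(map(re.escape, REMOVED_COMMANDS))
    + r")(?![\w-])"
)


def find_removed_commands(script_text: str) -> list[tuple[int, str]]:
    """Return (line number, command) for the first removed command on each line."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        match = _PATTERN.search(line)
        if match:
            hits.append((number, match.group(1)))
    return hits


# Made-up script content, for illustration only.
sample = (
    "#!/bin/bash\n"
    "csm-data tdl-send-files --dir out\n"
    "csm-data s3-bucket-upload\n"
    "csm-data rds-send-csv\n"
)
hits = find_removed_commands(sample)
```

Each hit still needs a manual decision: the notes do not give a one-to-one replacement for every removed command.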

For API Users

If you were using the CoAL API directly:

  1. Update imports from legacy modules to new /apis/ and /objects/ structure
  2. Replace cosmotech_api.twin_data_layer usage with new modular APIs
  3. Update connection and parameter handling to use new objects
  4. Migrate PostgreSQL utilities from coal.utils.postgresql to coal.postgresql.utils
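
For step 4, the module path change can be applied mechanically. The sketch below rewrites import statements in a source string from the old location to the new one. The two module paths come from these notes; the function name rewrite_postgresql_imports and the placeholder some_helper are hypothetical. It only changes the module path: whether each function kept its name and signature still has to be checked against the updated documentation.

```python
import re

OLD_MODULE = "cosmotech.coal.utils.postgresql"
NEW_MODULE = "cosmotech.coal.postgresql.utils"

# Full dotted path, e.g. "import a.b.c" or "from a.b.c import x".
_DOTTED = re.compile(r"(?<![\w.])" + re.escape(OLD_MODULE) + r"(?!\w)")

# "from cosmotech.coal.utils import postgresql" on a line of its own.
_FROM_PARENT = re.compile(
    r"^([ \t]*)from[ \t]+cosmotech\.coal\.utils[ \t]+import[ \t]+postgresql[ \t]*$",
    re.MULTILINE,
)


def rewrite_postgresql_imports(source: str) -> str:
    """Point imports of the old PostgreSQL utilities module at the new location."""
    source = _DOTTED.sub(NEW_MODULE, source)
    # Keep the local name "postgresql" so the rest of the file is untouched.
    return _FROM_PARENT.sub(
        r"\1from cosmotech.coal.postgresql import utils as postgresql", source
    )


# "some_helper" is a placeholder, not a real CoAL function name.
before = (
    "from cosmotech.coal.utils.postgresql import some_helper\n"
    "import cosmotech.coal.utils.postgresql as pg\n"
    "from cosmotech.coal.utils import postgresql\n"
)
after = rewrite_postgresql_imports(before)
```

Running the function a second time leaves the text unchanged, so it is safe to apply across a code base more than once.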

🎯 Requirements

  • Python: 3.11+
  • CosmoTech API: 5.0+
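
A quick way to check an environment against these requirements before upgrading is sketched below. It assumes the API client is installed under the distribution name cosmotech-api, as the package is called in these notes; the helper names (parse_release, meets_minimum, check_environment) are hypothetical, and the version parsing is deliberately simple (leading numeric components only).

```python
import re
import sys
from importlib.metadata import PackageNotFoundError, version

MIN_PYTHON = (3, 11)
MIN_API = (5, 0)


def parse_release(version_text: str) -> tuple[int, ...]:
    """Keep the leading numeric components, e.g. '5.0.1rc1' becomes (5, 0, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version_text)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version_text: str, minimum: tuple[int, ...]) -> bool:
    """Compare a version string against a minimum, padding short versions with zeros."""
    release = parse_release(version_text)
    padded = release + (0,) * max(0, len(minimum) - len(release))
    return padded >= minimum


def check_environment() -> list[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append(f"Python {sys.version_info[0]}.{sys.version_info[1]} is below 3.11")
    try:
        # Assumption: the distribution name matches the name used in these notes.
        installed = version("cosmotech-api")
    except PackageNotFoundError:
        problems.append("cosmotech-api is not installed")
    else:
        if not meets_minimum(installed, MIN_API):
            problems.append(f"cosmotech-api {installed} is below 5.0")
    return problems


problems = check_environment()
```

For anything beyond a sanity check, a full version parser such as the one in the packaging library handles pre-releases and epochs more faithfully.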

👥 Contributors

This release includes contributions from the CosmoTech development team, representing a significant collaborative effort to modernize and improve the library architecture.


Full Changelog: 1.1.0...2.0.0