@@ -7,6 +7,116 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Version 1.14.0-SNAPSHOT - February 03, 2026
+
+#### Weather Ingestion Module - Dual Storage Implementation for NOAA Data
+
+**Added:**
+- **Dual Storage System** - Simultaneous storage of raw text and JSON formats
+  - Raw text files stored in `raw-data/{source}/{type}/{year}/{month}/{day}/` structure
+  - JSON files stored in `speed-layer/{source}/{type}/{year}/{month}/{day}/` structure
+  - Consistent date partitioning across both storage types
+  - File naming: `{station}_{timestamp}.{ext}` where timestamp is `yyyyMMdd_HHmm`
+
+- **S3UploadService Enhancements** (weather-ingestion)
+  - `uploadWeatherDataDual()` - Recommended method for NOAA data ingestion with dual storage
+  - `uploadRawDataWithPartitioning()` - Enhanced raw data upload with date partitioning
+  - `DualStorageResult` record - Immutable result containing both S3 keys (raw text + JSON)
+    - Compact constructor with validation
+    - Ensures both keys are non-null and non-empty
+    - Accessor methods: `rawTextKey()`, `jsonKey()`
+  - Enhanced metadata tagging for both raw text and JSON uploads
+  - Comprehensive parameter validation in all upload methods
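
The `DualStorageResult` record described above could look roughly like the following. This is a minimal sketch, not the project's source: the changelog confirms only the record name, the two accessors, and the non-null/non-empty checks in a compact constructor. The helper `requireNonEmpty`, the exception types, and the messages are assumptions made for illustration.

```java
import java.util.Objects;

// Sketch of the record named in the changelog. Only the component names
// (rawTextKey, jsonKey) and the validation rule come from the changelog;
// exception types and messages are assumptions.
record DualStorageResult(String rawTextKey, String jsonKey) {

    // Compact constructor: runs before the component fields are assigned.
    DualStorageResult {
        requireNonEmpty(rawTextKey, "rawTextKey");
        requireNonEmpty(jsonKey, "jsonKey");
    }

    // Hypothetical helper: rejects null with NullPointerException and
    // empty strings with IllegalArgumentException.
    private static void requireNonEmpty(String value, String name) {
        Objects.requireNonNull(value, name + " must not be null");
        if (value.isEmpty()) {
            throw new IllegalArgumentException(name + " must not be empty");
        }
    }
}
```

Because validation lives in the compact constructor, no instance can exist with a missing key, so callers can read `rawTextKey()` and `jsonKey()` without further checks.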
+
+- **Enhanced Metadata Tracking** (weather-ingestion)
+  - `s3_raw_key` - S3 key for raw text file location
+  - `s3_json_key` - S3 key for JSON file location
+  - `s3_key` - Legacy field maintained for backward compatibility (points to JSON)
+  - `storage_format` - Set to "dual" to indicate both formats stored
+  - `processor_version` - Updated to "2.1"
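
To make the field list concrete, the sketch below assembles the five metadata entries into a map. The field names and the values `"dual"` and `"2.1"` are taken from the changelog. The class `DualStorageMetadata`, its `build` method, and the choice of a plain `Map<String, String>` are hypothetical, since the changelog does not say how the metadata is represented in code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; only the keys and literal values are from the changelog.
final class DualStorageMetadata {

    private DualStorageMetadata() {}

    static Map<String, String> build(String rawKey, String jsonKey) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("s3_raw_key", rawKey);        // raw text file location
        metadata.put("s3_json_key", jsonKey);      // JSON file location
        metadata.put("s3_key", jsonKey);           // legacy field, points to JSON
        metadata.put("storage_format", "dual");    // both formats stored
        metadata.put("processor_version", "2.1");
        return metadata;
    }
}
```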
+
+- **Documentation** (weather-ingestion)
+  - `S3_BUCKET_SETUP.md` - Comprehensive S3 bucket configuration guide
+    - AWS CLI and Console setup instructions
+    - Lifecycle policies for cost optimization
+    - Bucket structure and partitioning examples
+    - Security best practices
+    - Troubleshooting guide
+  - `SINGLE_STATION_INTEGRATION_TEST.md` - Step-by-step integration testing procedures
+    - Pre-flight checklist
+    - Test execution instructions
+    - Validation commands
+    - Success criteria
+    - Troubleshooting scenarios
+
+**Changed:**
+- **S3UploadService** (weather-ingestion)
+  - `uploadWeatherDataDual()` is now the recommended method for NOAA data ingestion
+  - Partitioning structure now matches between `raw-data/` and `speed-layer/` paths
+  - Improved metadata tagging with source, station-id, data-type, and ingestion-time
+  - Added validation of all parameters in `uploadRawDataWithPartitioning()`
+  - Updated S3 content types: `text/plain` for raw text, `application/json` for JSON
+
+- **SpeedLayerProcessor** (weather-ingestion)
+  - Updated to use dual storage by default via `uploadWeatherDataDual()`
+  - Processor version incremented from "2.0" to "2.1"
+  - Enhanced metadata enrichment with `storage_format` field
+  - Both S3 keys now stored in processed WeatherData metadata
+  - Updated statistics output to indicate dual storage enabled
+  - Improved logging with both raw and JSON file paths
+
+- **S3 Bucket Structure** (weather-ingestion)
+  - Standardized date partitioning: `{year}/{month}/{day}/` for both storage types
+  - File naming convention: `{station}_{timestamp}.{ext}`
+  - Timestamp format: `yyyyMMdd_HHmm` (UTC timezone)
+  - Consistent metadata across both raw and JSON uploads
+  - Example raw path: `raw-data/noaa/metar/2026/02/03/KCLT_20260203_1430.txt`
+  - Example JSON path: `speed-layer/noaa/metar/2026/02/03/KCLT_20260203_1430.json`
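
The path layout and naming convention above can be reproduced with `java.time`. The sketch below is an illustration under stated assumptions, not the `S3UploadService` code: `S3KeySketch` and `buildKey` are hypothetical names, and it uses a single instant for both the date partition and the file timestamp, which the changelog does not confirm.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical key builder following the documented layout:
// {prefix}/{source}/{type}/{year}/{month}/{day}/{station}_{timestamp}.{ext}
final class S3KeySketch {

    // Both formatters are pinned to UTC, matching the changelog's UTC rule.
    private static final DateTimeFormatter PARTITION =
            DateTimeFormatter.ofPattern("yyyy/MM/dd", Locale.ROOT).withZone(ZoneOffset.UTC);
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd_HHmm", Locale.ROOT).withZone(ZoneOffset.UTC);

    private S3KeySketch() {}

    // Assumption: one instant drives both the partition and the file name.
    static String buildKey(String prefix, String source, String type,
                           String station, Instant time, String ext) {
        return String.format("%s/%s/%s/%s/%s_%s.%s",
                prefix, source, type,
                PARTITION.format(time), station, STAMP.format(time), ext);
    }
}
```

Calling `S3KeySketch.buildKey("raw-data", "noaa", "metar", "KCLT", Instant.parse("2026-02-03T14:30:00Z"), "txt")` yields the example raw path listed above. Swapping the prefix to `speed-layer` and the extension to `json` yields the example JSON path.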
+
+**Technical Details:**
+- **Dual Storage Benefits:**
+  - Raw text enables long-term archival and reprocessing
+  - JSON enables fast querying and analysis
+  - Both formats written by a single `uploadWeatherDataDual()` call (S3 has no multi-object transactions, so the two writes are not atomic)
+  - Date partitioning optimizes query performance and cost
+
+- **File Format Specifications:**
+  - Raw text files: `.txt` extension with `text/plain` content type
+  - JSON files: `.json` extension with `application/json` content type
+  - Both include comprehensive S3 metadata for tracking and filtering
+
+- **Time Handling:**
+  - All timestamps in UTC for consistency
+  - Date partitioning uses ingestion time (not observation time)
+  - Handles month and year boundary transitions correctly
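
A short sketch of why pinning the partition to UTC ingestion time makes boundary transitions straightforward: the partition is derived from the calendar fields of a single UTC date-time, so rollover follows the calendar rather than any local clock. `PartitionSketch` and `partitionFor` are hypothetical names, and this is one possible way to compute the prefix, not necessarily how the module does it.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.Locale;

// Hypothetical helper: derives the {year}/{month}/{day}/ partition
// from the ingestion instant, always interpreted in UTC.
final class PartitionSketch {

    private PartitionSketch() {}

    static String partitionFor(Instant ingestionTime) {
        ZonedDateTime utc = ingestionTime.atZone(ZoneOffset.UTC);
        // Locale.ROOT keeps the digits ASCII regardless of the JVM's default locale.
        return String.format(Locale.ROOT, "%04d/%02d/%02d/",
                utc.getYear(), utc.getMonthValue(), utc.getDayOfMonth());
    }
}
```

For example, an ingestion instant of `2025-12-31T23:59:59Z` maps to `2025/12/31/`, and one second later it maps to `2026/01/01/`.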
+
+- **Backward Compatibility:**
+  - Existing single-storage deployments continue to work
+  - Legacy `s3_key` metadata field maintained (points to JSON)
+  - New deployments should use `uploadWeatherDataDual()` method
+  - Graceful handling of missing dual storage fields
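
One way a consumer might handle records written before dual storage existed is to prefer `s3_json_key` and fall back to the legacy `s3_key`. The sketch below illustrates that idea only. `LegacyKeyResolver` and its methods are hypothetical, and the changelog does not describe how the module itself handles missing fields.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical reader-side helper for metadata that may predate dual storage.
final class LegacyKeyResolver {

    private LegacyKeyResolver() {}

    // Prefer the new field; fall back to the legacy s3_key, which points to JSON.
    static Optional<String> jsonKey(Map<String, String> metadata) {
        String key = metadata.get("s3_json_key");
        if (key == null || key.isEmpty()) {
            key = metadata.get("s3_key");
        }
        return Optional.ofNullable(key).filter(k -> !k.isEmpty());
    }

    // The raw key exists only for dual-storage records, so absence is expected.
    static Optional<String> rawKey(Map<String, String> metadata) {
        return Optional.ofNullable(metadata.get("s3_raw_key")).filter(k -> !k.isEmpty());
    }
}
```

Returning `Optional` rather than throwing lets older single-storage records flow through the same code path as new ones.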
+
+**Migration Notes:**
+- New deployments should use `uploadWeatherDataDual()` for NOAA data
+- Existing code using `uploadWeatherData()` (JSON-only) continues to work
+- Legacy `s3_key` field maintained for backward compatibility
+- Update lifecycle policies to handle both `raw-data/` and `speed-layer/` prefixes
+- Recommended lifecycle:
+  - Speed layer JSON: Delete after 30 days (recent data only)
+  - Raw data text: Archive to Glacier after 90 days (long-term storage)
+
+**Build & Quality:**
+- All existing tests passing (0 failures, 0 errors)
+- No breaking changes to public APIs
+- Requires Java 16+ for record types (`DualStorageResult`)
+- AWS SDK S3 client configuration unchanged
+
+**Notes:**
+- Dual storage implementation complete
+- `S3_BUCKET_SETUP.md` documents bucket setup and lifecycle configuration for deployment
+- `SINGLE_STATION_INTEGRATION_TEST.md` validates end-to-end functionality
+- Ready for production deployment with monitoring and lifecycle policies
+
 ### Version 1.13.0-SNAPSHOT - January 28, 2026
 
 #### Weather Storage Module - Phase 4 GSI Implementation & DynamoDB Integration Testing