Commit 6f17c8e

Merge pull request #530 from RADAR-base/release-2.3.0
Release 2.3.0
2 parents 3e6a139 + afec0fd commit 6f17c8e

37 files changed, +711 -275 lines changed

.github/workflows/main.yml

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ jobs:

       - uses: actions/setup-java@v3
         with:
-          distribution: zulu
+          distribution: temurin
           java-version: 17

       - name: Setup Gradle

.github/workflows/publish_snapshots.yml

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ jobs:

       - uses: actions/setup-java@v3
         with:
-          distribution: zulu
+          distribution: temurin
           java-version: 17

       - name: Setup Gradle

.github/workflows/release.yml

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ jobs:

       - uses: actions/setup-java@v3
         with:
-          distribution: zulu
+          distribution: temurin
           java-version: 17

       - name: Setup Gradle

Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-FROM --platform=$BUILDPLATFORM gradle:7.4-jdk17 AS builder
+FROM --platform=$BUILDPLATFORM gradle:7.5-jdk17 AS builder

 RUN mkdir /code
 WORKDIR /code

README.md

Lines changed: 64 additions & 16 deletions
@@ -70,7 +70,7 @@ When upgrading to version 0.6.0 from version 0.5.x or earlier, please follow the
 This package is available as docker image [`radarbase/radar-output-restructure`](https://hub.docker.com/r/radarbase/radar-output-restructure). The entrypoint of the image is the current application. So in all the commands listed in usage, replace `radar-output-restructure` with for example:

 ```shell
-docker run --rm -t --network s3 -v "$PWD/output:/output" radarbase/radar-output-restructure:2.2.1 -o /output /myTopic
+docker run --rm -t --network s3 -v "$PWD/output:/output" radarbase/radar-output-restructure:2.3.0 -o /output /myTopic
 ```

 ## Command line usage
@@ -138,19 +138,65 @@ target:

 Secrets can be provided as environment variables as well:

-| Environment variable | Corresponding value |
-| --- | --- |
-| `SOURCE_S3_ACCESS_TOKEN` | `source.s3.accessToken` |
-| `SOURCE_S3_SECRET_KEY` | `source.s3.secretKey` |
-| `SOURCE_AZURE_USERNAME` | `source.azure.username` |
-| `SOURCE_AZURE_PASSWORD` | `source.azure.password` |
+| Environment variable | Corresponding value |
+|-----------------------------|----------------------------|
+| `SOURCE_S3_ACCESS_TOKEN` | `source.s3.accessToken` |
+| `SOURCE_S3_SECRET_KEY` | `source.s3.secretKey` |
+| `SOURCE_AZURE_USERNAME` | `source.azure.username` |
+| `SOURCE_AZURE_PASSWORD` | `source.azure.password` |
 | `SOURCE_AZURE_ACCOUNT_NAME` | `source.azure.accountName` |
-| `SOURCE_AZURE_ACCOUNT_KEY` | `source.azure.accountKey` |
-| `SOURCE_AZURE_SAS_TOKEN` | `source.azure.sasToken` |
-| `REDIS_URL` | `redis.url` |
+| `SOURCE_AZURE_ACCOUNT_KEY` | `source.azure.accountKey` |
+| `SOURCE_AZURE_SAS_TOKEN` | `source.azure.sasToken` |
+| `REDIS_URL` | `redis.url` |

 Replace `SOURCE` with `TARGET` in the variables above to configure the target storage.
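
The mapping in the table above can be sketched in Kotlin as follows. This is only an illustration, not the application's own code: `storageSecrets` and `secretOverrides` are hypothetical names, and the sketch covers only the variables listed in the table.

```kotlin
// Hypothetical sketch: these names are illustrative and are not taken
// from radar-output-restructure itself.

// Variable suffixes from the README table, shared by SOURCE and TARGET.
val storageSecrets = mapOf(
    "S3_ACCESS_TOKEN" to "s3.accessToken",
    "S3_SECRET_KEY" to "s3.secretKey",
    "AZURE_USERNAME" to "azure.username",
    "AZURE_PASSWORD" to "azure.password",
    "AZURE_ACCOUNT_NAME" to "azure.accountName",
    "AZURE_ACCOUNT_KEY" to "azure.accountKey",
    "AZURE_SAS_TOKEN" to "azure.sasToken",
)

/** Collects the values found in [env], keyed by their configuration path. */
fun secretOverrides(env: Map<String, String>): Map<String, String> {
    val overrides = mutableMapOf<String, String>()
    for (side in listOf("SOURCE", "TARGET")) {
        for ((suffix, key) in storageSecrets) {
            // For example SOURCE_S3_SECRET_KEY -> source.s3.secretKey
            env["${side}_$suffix"]?.let { overrides["${side.lowercase()}.$key"] = it }
        }
    }
    env["REDIS_URL"]?.let { overrides["redis.url"] = it }
    return overrides
}

fun main() {
    // Print only the configuration keys, never the secret values.
    println(secretOverrides(System.getenv()).keys)
}
```

Variables that are not in the table are ignored by this sketch.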

+### Path format
+
+The output path at the target storage is determined by the path format. The class that handles path
+output by default is the `org.radarbase.output.path.FormattedPathFactory`. The default format is
+```
+${projectId}/${userId}/${topic}/${filename}
+```
+Each format parameter is written as a dollar sign followed by the parameter name in curly brackets, for example `${topic}`.
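
To make the placeholder syntax concrete, here is a minimal Kotlin sketch of `${name}` substitution. It is an assumption-laden illustration rather than the `FormattedPathFactory` implementation: `formatPath`, `parameterPattern` and the sample values are all hypothetical.

```kotlin
// Hypothetical sketch of `${name}` substitution; the real
// FormattedPathFactory may work differently.
private val parameterPattern = Regex("""\$\{([^}]*)\}""")

/** Replaces every `${name}` in [format] with the result of [lookup] for that name. */
fun formatPath(format: String, lookup: (String) -> String): String =
    parameterPattern.replace(format) { match -> lookup(match.groupValues[1]) }

fun main() {
    // Sample values, invented for this example.
    val values = mapOf(
        "projectId" to "radar-test",
        "userId" to "user-1",
        "topic" to "android_phone_acceleration",
        "filename" to "20220901_1200.csv",
    )
    val path = formatPath("\${projectId}/\${userId}/\${topic}/\${filename}") { name ->
        values[name] ?: "unknown-$name"
    }
    println(path) // radar-test/user-1/android_phone_acceleration/20220901_1200.csv
}
```

Note that in a regular Kotlin string the dollar sign has to be escaped as `\$`, as the integration test in this commit also does.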
+
+The full set of parameters is listed here:
+```yaml
+paths:
+  # Input directories in source storage
+  inputs:
+    - /testIn
+  # Temporary directory for local file processing.
+  temp: ./output/+tmp
+  # Output directory in target storage
+  output: /output
+  # Output path construction factory
+  factory: org.radarbase.output.path.FormattedPathFactory
+  # Additional properties
+  # properties:
+  #   format: ${projectId}/${userId}/${topic}/${time:mm}/${time:YYYYmmDD_HH'00'}${attempt}${extension}
+  #   plugins: fixed time key value org.example.plugin.MyPathPlugin
+```
+
+The FormattedPathFactory can use multiple plugins to format paths based on a given record.
+The `fixed` plugin has a number of fixed parameters that can be used:
+
+| Parameter | Description |
+|-----------|-------------------------------------------------------------------------|
+| projectId | record project ID |
+| userId | record user ID |
+| sourceId | record source ID |
+| topic | Kafka topic |
+| filename | default time binning with attempt suffix and file extension |
+| attempt | attempt suffix, used when a file with an incompatible format already exists |
+| extension | file extension |
+
+The format should contain at least `filename`, or otherwise both `attempt` and `extension`.
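
The rule about `filename`, `attempt` and `extension` can be expressed as a small check. The Kotlin sketch below is hypothetical: `hasFileNameParameters` is not part of the project, and whether the application validates formats this way is not stated in the README.

```kotlin
/**
 * Hypothetical check, not taken from radar-output-restructure: returns true
 * if [format] contains `${filename}`, or both `${attempt}` and `${extension}`.
 */
fun hasFileNameParameters(format: String): Boolean {
    val names = Regex("""\$\{([^}]*)\}""")
        .findAll(format)
        .map { it.groupValues[1] }
        .toSet()
    return "filename" in names || ("attempt" in names && "extension" in names)
}

fun main() {
    println(hasFileNameParameters("\${projectId}/\${topic}/\${filename}")) // true
    println(hasFileNameParameters("\${projectId}/\${time:HH}\${attempt}\${extension}")) // true
    println(hasFileNameParameters("\${projectId}/\${topic}")) // false
}
```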
+
+There are also plugins that take their own format. The `time` plugin formats a parameter according to the record time. It takes parameters with format `time:<date format>` where `<date format>` should be replaced by a [Java date format](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/time/format/DateTimeFormatter.html), such as `yyyy-MM-dd`. The plugin tries to use the following time fields, in this order: a double `time` in the value struct, `timeStart` double or `start` long in the key struct, `dateTime` string in the value struct, `date` string in the value struct, `timeReceived` double in the value struct or `timeCompleted` double in the value struct. The first valid value is used. If no valid time values are found, `unknown-date` is returned.
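
As a rough sketch of what a `time:<date format>` parameter produces, the Kotlin snippet below formats a record time with `DateTimeFormatter`. It rests on assumptions that the README does not confirm: times are taken to be seconds since the Unix epoch, formatting is done in UTC, and `formatRecordTime` is a hypothetical helper. In `DateTimeFormatter` patterns, `MM` is month-of-year while lowercase `mm` is minute-of-hour.

```kotlin
import java.time.Instant
import java.time.ZoneOffset
import java.time.format.DateTimeFormatter

/**
 * Hypothetical helper: formats [epochSeconds] with [pattern] in UTC, or
 * returns "unknown-date" when no usable time is available.
 * Assumes the time is expressed in seconds since the Unix epoch.
 */
fun formatRecordTime(epochSeconds: Double?, pattern: String): String {
    if (epochSeconds == null || !epochSeconds.isFinite()) return "unknown-date"
    val instant = Instant.ofEpochMilli((epochSeconds * 1000.0).toLong())
    return DateTimeFormatter.ofPattern(pattern)
        .withZone(ZoneOffset.UTC)
        .format(instant)
}

fun main() {
    val time = 1_600_000_000.0 // 2020-09-13T12:26:40Z
    println(formatRecordTime(time, "yyyy-MM-dd")) // 2020-09-13
    println(formatRecordTime(time, "yyyyMMdd_HH'00'")) // 20200913_1200
    println(formatRecordTime(time, "mm")) // 26, since lowercase mm is minute-of-hour
    println(formatRecordTime(null, "yyyy-MM-dd")) // unknown-date
}
```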
+
+The `key` and `value` plugins read values from the key or value structs of a given record. For example, parameter `value:color.red` will attempt to read the value struct, finding first the `color` field and then the enclosed `red` field. If no such value exists, `unknown-value` will be used in the format.
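
The nested lookup for a parameter such as `value:color.red` can be sketched in Kotlin as below. Nested maps stand in for the record structs here, which is an assumption made for the sake of a self-contained example, and `lookup` is a hypothetical function rather than the plugin's actual code.

```kotlin
/**
 * Hypothetical sketch: follows the dot-separated [path] through nested maps
 * and returns "unknown-value" if any step is missing.
 */
fun lookup(struct: Map<String, Any?>, path: String): String {
    var current: Any? = struct
    for (field in path.split('.')) {
        // Stop as soon as the current node is not a map or lacks the field.
        current = (current as? Map<*, *>)?.get(field) ?: return "unknown-value"
    }
    return current.toString()
}

fun main() {
    val value = mapOf("color" to mapOf("red" to 255, "green" to 0))
    println(lookup(value, "color.red")) // 255
    println(lookup(value, "color.blue")) // unknown-value
    println(lookup(value, "size.width")) // unknown-value
}
```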
+
 ### Cleaner

 Source files can be automatically removed by a cleaner process. This checks whether the file has already been extracted and is older than a configured age. This feature is not enabled by default. It can be configured in the `cleaner` configuration section:
@@ -182,7 +228,7 @@ This package requires at least Java JDK 8. Build the distribution with
 and install the package into `/usr/local` with for example
 ```shell
 sudo mkdir -p /usr/local
-sudo tar -xzf build/distributions/radar-output-restructure-2.2.1.tar.gz -C /usr/local --strip-components=1
+sudo tar -xzf build/distributions/radar-output-restructure-2.3.0.tar.gz -C /usr/local --strip-components=1
 ```

 Now the `radar-output-restructure` command should be available.
@@ -192,10 +238,12 @@ Now the `radar-output-restructure` command should be available.
 To implement alternative storage paths, storage drivers or storage formats, put your custom JAR in
 `$APP_DIR/lib/radar-output-plugins`. To load them, use the following options:

-| Parameter | Base class | Behaviour | Default |
-| --------------------------- | --------------------------------------------------- | ------------------------------------------ | ------------------------- |
-| `paths: factory: ...` | `org.radarbase.output.path.RecordPathFactory` | Factory to create output path names with. | ObservationKeyPathFactory |
-| `format: factory: ...` | `org.radarbase.output.format.FormatFactory` | Factory for output formats. | FormatFactory |
-| `compression: factory: ...` | `org.radarbase.output.compression.CompressionFactory` | Factory class to use for data compression. | CompressionFactory |
+| Parameter | Base class | Behaviour | Default |
+|-----------------------------|-------------------------------------------------------|--------------------------------------------|----------------------|
+| `paths: factory: ...` | `org.radarbase.output.path.RecordPathFactory` | Factory to create output path names with. | FormattedPathFactory |
+| `format: factory: ...` | `org.radarbase.output.format.FormatFactory` | Factory for output formats. | FormatFactory |
+| `compression: factory: ...` | `org.radarbase.output.compression.CompressionFactory` | Factory class to use for data compression. | CompressionFactory |

 The respective `<type>: properties: {}` configuration parameters can be used to provide custom configuration of the factory. This configuration will be passed to the `Plugin#init(Map<String, String>)` method.
+
+By adding additional path format plugins to the classpath, the path format of FormattedPathFactory may be expanded with different parameters or lookup engines.

build.gradle.kts

Lines changed: 3 additions & 2 deletions
@@ -12,11 +12,11 @@ plugins {
     id("com.avast.gradle.docker-compose")
     id("com.github.ben-manes.versions")
     id("io.github.gradle-nexus.publish-plugin")
-    id("org.jlleitschuh.gradle.ktlint") version "10.3.0"
+    id("org.jlleitschuh.gradle.ktlint") version "11.0.0"
 }

 group = "org.radarbase"
-version = "2.2.1"
+version = "2.3.0"

 repositories {
     mavenCentral()
@@ -141,6 +141,7 @@ tasks.withType<KotlinCompile> {
         jvmTarget = "17"
         apiVersion = "1.6"
         languageVersion = "1.6"
+        freeCompilerArgs = listOf("-opt-in=kotlin.RequiresOptIn")
     }
 }

gradle.properties

Lines changed: 5 additions & 5 deletions
@@ -2,26 +2,26 @@ kotlin.code.style=official

 kotlinVersion=1.7.10
 dokkaVersion=1.7.10
-dockerComposeVersion=0.16.8
+dockerComposeVersion=0.16.9
 dependencyUpdateVersion=0.42.0
 nexusPublishVersion=1.1.0
-jsoupVersion=1.15.2
+jsoupVersion=1.15.3

 coroutinesVersion=1.6.4
 avroVersion=1.11.1
 snappyVersion=1.1.8.4
-jacksonVersion=2.13.3
+jacksonVersion=2.13.4
 jCommanderVersion=1.82
 almworksVersion=1.1.2
 minioVersion=8.4.3
 guavaVersion=31.1-jre
-opencsvVersion=5.6
+opencsvVersion=5.7.0
 okhttpVersion=4.10.0
 jedisVersion=4.2.3
 slf4jVersion=1.7.36
 log4jVersion=2.18.0
 azureStorageVersion=12.19.0
-nettyVersion=4.1.79.Final
+nettyVersion=4.1.80.Final

 junitVersion=5.9.0
 mockitoKotlinVersion=4.0.0

restructure.yml

Lines changed: 6 additions & 0 deletions
@@ -136,6 +136,7 @@ paths:
   # Additional properties
   # properties:
   #   format: ${projectId}/${userId}/${topic}/${time:mm}/${time:YYYYmmDD_HH'00'}${attempt}${extension}
+  #   plugins: fixed time key value org.example.plugin.MyPathPlugin

 # Individual topic configuration
 topics:
@@ -160,3 +161,8 @@ topics:
     # Disable deduplication
     deduplication:
       enable: false
+  questionnaire_response:
+    # Specify an alternative path format.
+    pathProperties:
+      format: ${projectId}/${userId}/${topic}/${value:name}/${filename}
+      plugins: fixed value

src/integrationTest/java/org/radarbase/output/RestructureS3IntegrationTest.kt

Lines changed: 13 additions & 10 deletions
@@ -14,6 +14,7 @@ import org.radarbase.output.util.objectBuild
 import java.nio.charset.StandardCharsets.UTF_8
 import java.nio.file.Paths

+@OptIn(ExperimentalCoroutinesApi::class)
 class RestructureS3IntegrationTest {
     @Test
     fun integration() = runTest {
@@ -24,17 +25,20 @@ class RestructureS3IntegrationTest {
             secretKey = "minioadmin",
             bucket = "source",
         )
-        val targetConfig = S3Config(
-            endpoint = "http://localhost:9000",
-            accessToken = "minioadmin",
-            secretKey = "minioadmin",
-            bucket = "target",
+        val targetConfig = sourceConfig.copy(bucket = "target")
+        val topicConfig = mapOf(
+            "application_server_status" to TopicConfig(
+                pathProperties = mapOf(
+                    "format" to "\${projectId}/\${userId}/\${topic}/\${value:serverStatus}/\${filename}"
+                )
+            )
         )
         val config = RestructureConfig(
             source = ResourceConfig("s3", s3 = sourceConfig),
             target = ResourceConfig("s3", s3 = targetConfig),
             paths = PathConfig(inputs = listOf(Paths.get("in"))),
-            worker = WorkerConfig(minimumFileAge = 0L)
+            worker = WorkerConfig(minimumFileAge = 0L),
+            topics = topicConfig,
         )
         val application = Application(config)
         val sourceClient = sourceConfig.createS3Client()
@@ -72,7 +76,7 @@ class RestructureS3IntegrationTest {
         }

         val firstParticipantOutput =
-            "output/STAGING_PROJECT/1543bc93-3c17-4381-89a5-c5d6272b827c/application_server_status"
+            "output/STAGING_PROJECT/1543bc93-3c17-4381-89a5-c5d6272b827c/application_server_status/CONNECTED"
         val secondParticipantOutput =
             "output/radar-test-root/4ab9b985-6eec-4e51-9a29-f4c571c89f99/android_phone_acceleration"

@@ -96,16 +100,15 @@ class RestructureS3IntegrationTest {
                 assertEquals(csvContents, targetContent.toString(UTF_8))
             }

-            withContext(Dispatchers.IO) {
+            return@coroutineScope withContext(Dispatchers.IO) {
                 targetClient.listObjects(
                     ListObjectsArgs.Builder().bucketBuild(targetConfig.bucket) {
                         prefix("output")
                         recursive(true)
                         useUrlEncodingType(false)
                     }
                 )
-                    .map { it.get().objectName() }
-                    .toHashSet()
+                    .mapTo(HashSet()) { it.get().objectName() }
             }
         }

src/integrationTest/java/org/radarbase/output/accounting/OffsetRangeRedisTest.kt

Lines changed: 2 additions & 0 deletions
@@ -1,5 +1,6 @@
 package org.radarbase.output.accounting

+import kotlinx.coroutines.ExperimentalCoroutinesApi
 import kotlinx.coroutines.test.runTest
 import org.junit.jupiter.api.AfterEach
 import org.junit.jupiter.api.Assertions.*
@@ -13,6 +14,7 @@ import java.nio.file.Path
 import java.nio.file.Paths
 import java.time.Instant

+@OptIn(ExperimentalCoroutinesApi::class)
 class OffsetRangeRedisTest {
     private lateinit var testFile: Path
     private lateinit var redisHolder: RedisHolder
