Commit a51929c

Improve Spark and Provenance modules
2 parents 53b97b4 + 910dcfc commit a51929c

252 files changed: +27982 -24105 lines changed

.github/workflows/ci.yml

Lines changed: 48 additions & 10 deletions
@@ -3,17 +3,54 @@ name: Trevas CI
 on:
   push:
   pull_request:
-    types: [opened, synchronize, reopened]
+    types: [ opened, synchronize, reopened ]

 jobs:
+  format:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Set up JDK 17
+        uses: actions/setup-java@v4
+        with:
+          distribution: "temurin"
+          java-version: 17
+
+      - name: Cache Maven dependencies
+        uses: actions/cache@v4
+        with:
+          path: ~/.m2/repository
+          key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
+          restore-keys: |
+            ${{ runner.os }}-maven-
+      - name: Verify code format with Spotless
+        run: mvn spotless:check
+  test-excluding-vtl-sdmx:
+    name: Run Trevas tests excluding vtl-sdmx module
+    runs-on: ubuntu-latest
+    needs: format
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+      - name: Set up Maven Central Repository
+        uses: actions/setup-java@v4
+        with:
+          java-version: 17
+          distribution: "adopt"
+      - name: Test
+        run: mvn test -pl '!vtl-sdmx'
   test:
     name: Run Trevas tests
     if: (github.repository != 'InseeFr/Trevas' &&
-      github.event_name == 'push') ||
-      (github.event.pull_request.head.repo.fork == true ||
-      (github.event.pull_request.head.repo.fork == false &&
-      github.event.pull_request.merged == false))
+      github.event_name == 'push') ||
+      (github.event.pull_request.head.repo.fork == true ||
+      (github.event.pull_request.head.repo.fork == false &&
+      github.event.pull_request.merged == false))
     runs-on: ubuntu-latest
+    needs: format
     steps:
       - uses: actions/checkout@v4
         with:
@@ -62,12 +99,13 @@ jobs:
   test-sonar-package:
     name: Run Trevas tests with coverage & sonar checks
     # Trevas main repo commit branch or merged PR
-    if: github.repository == 'InseeFr/Trevas' &&
-      (github.event_name == 'push' ||
-      github.event.pull_request.head.repo.fork == false ||
-      (github.event.pull_request.head.repo.fork == true &&
-      github.event.pull_request.merged == true))
+    if: github.repository == 'InseeFr/Trevas' &&
+      (github.event_name == 'push' ||
+      github.event.pull_request.head.repo.fork == false ||
+      (github.event.pull_request.head.repo.fork == true &&
+      github.event.pull_request.merged == true))
     runs-on: ubuntu-latest
+    needs: format
     steps:
       - uses: actions/checkout@v4
         with:

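The `if:` conditions on the `test` and `test-sonar-package` jobs in the workflow above are easier to reason about when written out as plain boolean functions. The following is a minimal Python sketch of that logic, not part of the commit. It assumes GitHub's documented loose-equality rule, under which a missing value and `false` both compare as the number 0; the function names and the use of `None` for missing pull-request fields are illustrative choices.

```python
from typing import Optional

MAIN_REPO = "InseeFr/Trevas"


def _num(value: Optional[bool]) -> int:
    # Assumption: a missing value (None here) and False both coerce to 0,
    # True coerces to 1, mirroring GitHub's loose `==` comparison.
    return 0 if value is None else int(value)


def loose_eq(value: Optional[bool], literal: bool) -> bool:
    """Model of `github.event.pull_request.<field> == <literal>`."""
    return _num(value) == _num(literal)


def runs_test(repository: str, event_name: str,
              fork: Optional[bool], merged: Optional[bool]) -> bool:
    """Model of the `if:` expression on the `test` job."""
    return (repository != MAIN_REPO and event_name == "push") or (
        loose_eq(fork, True)
        or (loose_eq(fork, False) and loose_eq(merged, False))
    )


def runs_test_sonar_package(repository: str, event_name: str,
                            fork: Optional[bool], merged: Optional[bool]) -> bool:
    """Model of the `if:` expression on the `test-sonar-package` job."""
    return repository == MAIN_REPO and (
        event_name == "push"
        or loose_eq(fork, False)
        or (loose_eq(fork, True) and loose_eq(merged, True))
    )
```

For an unmerged pull request from a fork against the main repository, `runs_test` returns `True` and `runs_test_sonar_package` returns `False`. Under this model, a push to the main repository satisfies both expressions, because the missing pull-request fields compare equal to `false`. Both jobs additionally wait on the new `format` job through `needs: format`, which this sketch does not model.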
CONTRIBUTING.md

Lines changed: 91 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -1,36 +1,45 @@
11
# 🚀 Contributing to Trevas
22

3-
Thank you for your interest in contributing to Trevas! Your help is invaluable in improving this project. Here’s how you can get started.
3+
Thank you for your interest in contributing to Trevas! Your help is invaluable in improving this project. Here’s how you
4+
can get started.
45

56
## 🛠️ Contribution Guidelines
67

78
### 📌 Roles & Responsibilities
89

9-
- **[Insee](https://www.insee.fr/en)** is responsible for the overall **project governance**. They guide the vision of the project, make final decisions, and ensure the project aligns with its long-term goals.
10-
- **[Making Sense](https://making-sense.info/)** are responsible for reviewing **Pull Requests (PRs)**, merging them into the master branch, adjusting the **GitHub project board**, and tagging issues to ensure proper tracking. They ensure the code meets the project's standards before merging and help prioritize tasks.
11-
- **Maintainers** (which can include anyone who is granted this role) help with day-to-day management of the project, such as reviewing code, merging PRs, and managing the project's direction. If you're interested in becoming a maintainer, please see the section below.
10+
- **[Insee](https://www.insee.fr/en)** is responsible for the overall **project governance**. They guide the vision of
11+
the project, make final decisions, and ensure the project aligns with its long-term goals.
12+
- **[Making Sense](https://making-sense.info/)** are responsible for reviewing **Pull Requests (PRs)**, merging them
13+
into the master branch, adjusting the **GitHub project board**, and tagging issues to ensure proper tracking. They
14+
ensure the code meets the project's standards before merging and help prioritize tasks.
15+
- **Maintainers** (which can include anyone who is granted this role) help with day-to-day management of the project,
16+
such as reviewing code, merging PRs, and managing the project's direction. If you're interested in becoming a
17+
maintainer, please see the section below.
1218

1319
### 📌 Issues & Pull Requests
1420

1521
- Every **Pull Request (PR) must be linked to an existing issue**
1622
- When creating a PR, use the following naming convention:
17-
- **`feat/your-feature-name`** for new features
18-
- **`fix/your-bug-fix`** for bug fixes
23+
- **`feat/your-feature-name`** for new features
24+
- **`fix/your-bug-fix`** for bug fixes
1925
- Discuss your ideas in an issue before starting major work
2026

2127
### 👀 Four Eyes Principle & Code Review Practices
2228

23-
To ensure the quality, reliability, and security of Trevas, we follow the **Four Eyes Principle** for all code contributions.
29+
To ensure the quality, reliability, and security of Trevas, we follow the **Four Eyes Principle** for all code
30+
contributions.
2431

2532
#### 🔍 What is the Four Eyes Principle?
2633

27-
The **Four Eyes Principle** means that **at least two people must review and approve any important action** before it is finalized. In the context of Trevas, this applies specifically to:
34+
The **Four Eyes Principle** means that **at least two people must review and approve any important action** before it is
35+
finalized. In the context of Trevas, this applies specifically to:
2836

2937
- Pull Request (PR) reviews and approvals
3038
- Merges into the `develop` or `master` branches
3139
- Structural changes in architecture, testing strategy, or documentation
3240

3341
This principle helps:
42+
3443
- Catch potential bugs or security issues early
3544
- Encourage collaborative development
3645
- Share knowledge across the team
@@ -49,10 +58,12 @@ All Pull Requests should:

 **Only after meeting these conditions can a PR be merged**.

-Maintainers are expected to **uphold these principles** and ensure reviews are thoughtful, constructive, and inclusive.
+Maintainers are expected to **uphold these principles** and ensure reviews are thoughtful, constructive, and
+inclusive.
 We believe that code review is not just a checkpoint, but a space to learn, improve, and grow as a team.

-> 🤝 If you're contributing regularly and want to become a maintainer, don't hesitate to reach out to us at **contact@making-sense.info** — we’d love to welcome more eyes to the team!
+> 🤝 If you're contributing regularly and want to become a maintainer, don't hesitate to reach out to us at *
+*contact@making-sense.info** — we’d love to welcome more eyes to the team!

 ### 🧪 Test-Driven Development (TDD)

@@ -62,20 +73,82 @@ We encourage a **test-driven approach** to ensure code reliability and maintaina
 2. **Implement the code** needed to pass the test
 3. **Refactor** while ensuring all tests remain green

-### 🎨 Code Formatting
+### 🎨 Code Formatting with Spotless
+
+We use [Spotless](https://github.com/diffplug/spotless) to enforce consistent code formatting based on the *
+*[Google Java Style Guide](https://google.github.io/styleguide/javaguide.html)**.
+
+#### 💻 Check formatting
+
+Before committing, run:
+
+```bash
+mvn spotless:check
+```
+
+#### 🛠️ Auto-format your code
+
+To automatically format your code, run:
+
+```bash
+mvn spotless:apply
+```
+
+This ensures that your code:
+
+- Uses **tabs** for indentation
+- Respects a **120-character line limit**
+- Applies **Google Java Format**
+
+All contributions **must pass `spotless:check` in CI**.
+
+### 🧠 IntelliJ IDEA Setup for Spotless
+
+To use Spotless formatting from IntelliJ:
+
+1. **Enable necessary VM options** (required to support Google Java Format):
+
+Go to:
+
+```
+Preferences (macOS) or Settings (Windows/Linux) >
+Build, Execution, Deployment >
+Compiler >
+Java Compiler >
+Additional command line parameters
+```
+
+Add the following VM options:
+
+```
+--add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED
+--add-exports=jdk.compiler/com.sun.tools.javac.code=ALL-UNNAMED
+--add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED
+--add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED
+--add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED
+--add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED
+```
+
+2. **(Optional)**: Automate formatting on save using the *Save Actions* plugin:
+
+- Install the **Save Actions** plugin from the IntelliJ Marketplace
+- Go to `Preferences > Save Actions`
+- Enable **"Activate save actions on save"**
+- Configure it to run an external tool with the command:

-To keep our codebase consistent, please follow these formatting guidelines:
+```bash
+mvn spotless:apply
+```

-- **Indentation**: Use tabs for indentation
-- **Line Length**: Keep lines under 120 characters
-- **Naming**: Use meaningful and descriptive English names for variables, functions, and classes
+> 💡 You can also define a custom keybinding in IntelliJ to run `mvn spotless:apply` easily when needed.

 ## 🔄 Creating a Pull Request

 To contribute, follow these steps:

 1. **Fork** the repository
-2. **Create a branch** from `develop` using the correct prefix (`feat/` or `fix/`). Example: `feat/authentication` or `fix/typo-readme`
+2. **Create a branch** from `develop` using the correct prefix (`feat/` or `fix/`). Example: `feat/authentication` or
+`fix/typo-readme`
 3. **Commit** your changes with a clear and concise message
 4. **Push** your branch to your fork
 5. **Open a Pull Request** targeting the `develop` branch
@@ -100,7 +173,8 @@ Check out our [Project Board](https://github.com/InseeFr/Trevas/projects) to see
 ### 🗣️ Communication

 We encourage **open and transparent discussions**.
-If you have any questions or suggestions, please use [GitHub Issues](https://github.com/InseeFr/Trevas/issues) so the whole community can participate.
+If you have any questions or suggestions, please use [GitHub Issues](https://github.com/InseeFr/Trevas/issues) so the
+whole community can participate.

 ## 📄 License

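The CONTRIBUTING.md changes above ask contributors to run `mvn spotless:check` before committing. One way to make that habitual is a small local hook script. The sketch below is an illustration only and is not part of the commit: the function names are hypothetical, and the `run` parameter exists solely so the logic can be exercised without Maven installed.

```python
import subprocess
import sys
from typing import Callable


def spotless_check(run: Callable[..., subprocess.CompletedProcess] = subprocess.run) -> bool:
    """Return True when `mvn spotless:check` exits with status 0."""
    # Same command as documented in CONTRIBUTING.md; check=False keeps a
    # non-zero exit status from raising so it can be reported below.
    result = run(["mvn", "spotless:check"], check=False)
    return result.returncode == 0


def main(run: Callable[..., subprocess.CompletedProcess] = subprocess.run) -> int:
    """Return the exit status a pre-commit hook would use."""
    if spotless_check(run):
        return 0
    print(
        "Formatting violations found: run `mvn spotless:apply` and stage the result.",
        file=sys.stderr,
    )
    return 1
```

To use it as a Git hook, a script would call `sys.exit(main())` from an executable `.git/hooks/pre-commit` file. A non-zero status makes Git abort the commit.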
README.md

Lines changed: 15 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -6,23 +6,32 @@ Transformation engine and validator for statistics.
66
[![Quality Gate Status](https://sonarcloud.io/api/project_badges/measure?project=InseeFr_Trevas&metric=alert_status)](https://sonarcloud.io/dashboard?id=InseeFr_Trevas)
77
[![Coverage](https://sonarcloud.io/api/project_badges/measure?project=InseeFr_Trevas&metric=coverage)](https://sonarcloud.io/dashboard?id=InseeFr_Trevas)
88
[![Maven Central](https://maven-badges.herokuapp.com/maven-central/fr.insee.trevas/trevas-parent/badge.svg)](https://maven-badges.herokuapp.com/maven-central/fr.insee.trevas/trevas-parent)
9+
[![Javadoc](https://img.shields.io/badge/Javadoc-fr.insee.trevas-ff69b4?logo=java&style=flat-square)](https://javadoc.io/doc/fr.insee.trevas)
910
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
1011
[![Mentioned in Awesome Official Statistics ](https://awesome.re/mentioned-badge.svg)](http://www.awesomeofficialstatistics.org)
1112

12-
Trevas is a Java engine for the Validation and Transformation Language (VTL), an [SDMX standard](https://sdmx.org/?page_id=5096) that allows the formal definition of algorithms to validate statistical data and calculate derived data. VTL is user oriented and provides a technology-neutral and standard view of statistical processes at the business level. Trevas supports the latest VTL version (v2.1, July 2024).
13+
Trevas is a Java engine for the Validation and Transformation Language (VTL),
14+
an [SDMX standard](https://sdmx.org/?page_id=5096) that allows the formal definition of algorithms to validate
15+
statistical data and calculate derived data. VTL is user oriented and provides a technology-neutral and standard view of
16+
statistical processes at the business level. Trevas supports the latest VTL version (v2.1, July 2024).
1317

14-
For actual execution, VTL expressions need to be translated to the target runtime environment. Trevas provides this step for the Java platform, by using the VTL formal grammar and the [Antlr](https://www.antlr.org/) tool. For a given execution, Trevas receives the VTL expression and the data bindings that associate variable names in the expression to actual data sets. The execution results can then be retrieved from the bindings for further treatments.
18+
For actual execution, VTL expressions need to be translated to the target runtime environment. Trevas provides this step
19+
for the Java platform, by using the VTL formal grammar and the [Antlr](https://www.antlr.org/) tool. For a given
20+
execution, Trevas receives the VTL expression and the data bindings that associate variable names in the expression to
21+
actual data sets. The execution results can then be retrieved from the bindings for further treatments.
1522

1623
Trevas provides an abstract definition of a Java VTL engine, as well as two concrete implementations:
1724

18-
- an in-memory engine for relatively small data, for example at design time when developing and testing VTL expressions on data samples
25+
- an in-memory engine for relatively small data, for example at design time when developing and testing VTL expressions
26+
on data samples
1927
- an [Apache Spark](https://spark.apache.org/) engine for Big Data production environments
2028

2129
Other implementations can be easily developed for different contexts.
2230

2331
## Documentation
2432

25-
The documentation can be found in the [docs](https://github.com/InseeFr/Trevas/tree/master/docs) folder and [browsed online](https://inseefr.github.io/Trevas).
33+
The documentation can be found in the [docs](https://github.com/InseeFr/Trevas/tree/master/docs) folder
34+
and [browsed online](https://inseefr.github.io/Trevas).
2635

2736
If you want to contribute, see this [guide](docs/CONTRIBUTING.md).
2837

@@ -48,4 +57,5 @@ Trevas is part of the [sdmx.io](https://www.sdmx.io/) ecosystem.
 <img src="https://awesome.re/mentioned-badge.svg" />
 </p>

-Trevas is referenced by [_Awesome official statistics software_](https://github.com/SNStatComp/awesome-official-statistics-software).
+Trevas is referenced by [_Awesome official statistics
+software_](https://github.com/SNStatComp/awesome-official-statistics-software).

coverage/pom.xml

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@
66
<parent>
77
<groupId>fr.insee.trevas</groupId>
88
<artifactId>trevas-parent</artifactId>
9-
<version>1.9.0</version>
9+
<version>1.10.0</version>
1010
</parent>
1111

1212
<artifactId>coverage</artifactId>
@@ -22,37 +22,37 @@
         <dependency>
             <groupId>fr.insee.trevas</groupId>
             <artifactId>vtl-engine</artifactId>
-            <version>1.9.0</version>
+            <version>1.10.0</version>
         </dependency>
         <dependency>
             <groupId>fr.insee.trevas</groupId>
             <artifactId>vtl-jackson</artifactId>
-            <version>1.9.0</version>
+            <version>1.10.0</version>
         </dependency>
         <dependency>
             <groupId>fr.insee.trevas</groupId>
             <artifactId>vtl-jdbc</artifactId>
-            <version>1.9.0</version>
+            <version>1.10.0</version>
         </dependency>
         <dependency>
             <groupId>fr.insee.trevas</groupId>
             <artifactId>vtl-model</artifactId>
-            <version>1.9.0</version>
+            <version>1.10.0</version>
         </dependency>
         <dependency>
             <groupId>fr.insee.trevas</groupId>
             <artifactId>vtl-parser</artifactId>
-            <version>1.9.0</version>
+            <version>1.10.0</version>
         </dependency>
         <dependency>
             <groupId>fr.insee.trevas</groupId>
             <artifactId>vtl-spark</artifactId>
-            <version>1.9.0</version>
+            <version>1.10.0</version>
         </dependency>
         <dependency>
             <groupId>fr.insee.trevas</groupId>
             <artifactId>vtl-csv</artifactId>
-            <version>1.9.0</version>
+            <version>1.10.0</version>
         </dependency>
         <dependency>
             <groupId>com.fasterxml.jackson.core</groupId>

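The `coverage/pom.xml` change above bumps the same version string in eight places, which makes it easy to miss one. As a rough illustration, the following Python sketch flags `fr.insee.trevas` dependencies whose version differs from the parent version. It is not part of the commit: the embedded sample POM is a shortened, hypothetical excerpt with one deliberately stale entry, and the function name is made up for this example.

```python
import xml.etree.ElementTree as ET

# Standard Maven POM namespace, mapped to the prefix "m" for path lookups.
NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical, shortened POM; vtl-spark is left at 1.9.0 on purpose.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <parent>
    <groupId>fr.insee.trevas</groupId>
    <artifactId>trevas-parent</artifactId>
    <version>1.10.0</version>
  </parent>
  <artifactId>coverage</artifactId>
  <dependencies>
    <dependency>
      <groupId>fr.insee.trevas</groupId>
      <artifactId>vtl-engine</artifactId>
      <version>1.10.0</version>
    </dependency>
    <dependency>
      <groupId>fr.insee.trevas</groupId>
      <artifactId>vtl-spark</artifactId>
      <version>1.9.0</version>
    </dependency>
  </dependencies>
</project>"""


def stale_dependencies(pom_text, group_id="fr.insee.trevas"):
    """Return (parent_version, [(artifactId, version), ...]) for mismatches."""
    root = ET.fromstring(pom_text)
    parent_version = root.findtext("m:parent/m:version", namespaces=NS)
    stale = []
    for dep in root.findall("m:dependencies/m:dependency", NS):
        # Only dependencies from the project's own group are compared.
        if dep.findtext("m:groupId", namespaces=NS) != group_id:
            continue
        version = dep.findtext("m:version", namespaces=NS)
        if version != parent_version:
            stale.append((dep.findtext("m:artifactId", namespaces=NS), version))
    return parent_version, stale


print(stale_dependencies(SAMPLE_POM))
```

Run against the real file, an empty list would indicate that every listed module version matches the parent, as it does in the diff above.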