Commit 533e26c

Script to generate keywords added
ATBD and PUM templates updated
Show Version field in HTML doc
1 parent d17d9c5 commit 533e26c

File tree

11 files changed

+185
-98
lines changed

.github/workflows/render-quarto.yaml

Lines changed: 11 additions & 0 deletions
@@ -36,6 +36,17 @@ jobs:
           sudo apt install -y libreoffice libreoffice-common libreoffice-writer fonts-dejavu
           sudo apt install -y coreutils procps

+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+
+      - name: Generate keywords for documents
+        run: |
+          python -m pip install --upgrade pip
+          pip install keybert ruamel.yaml pyyaml
+          python scripts/generate_keywords.py
+
       - name: Remove cached version of _site
         run: |
           rm -rf _site

metadata/default-before-body.html

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+
+<script>
+document.addEventListener("DOMContentLoaded", function() {
+
+  // Remove link to .docx file
+  //
+  document.querySelectorAll("a[href$='.docx']").forEach(link => link.style.display = 'none');
+
+  // Add Version to the title's metadata
+  //
+  const titleMetaEl = document.querySelector(".quarto-title-meta");
+  const wrapperDiv = document.createElement('div');
+  wrapperDiv.innerHTML = `
+    <div class="quarto-title-meta-heading">Version</div>
+    <div class="quarto-title-meta-contents">
+      <p>{{< meta version >}}</p>
+    </div>
+  `;
+  titleMetaEl.appendChild(wrapperDiv);
+
+});
+
+</script>

metadata/default.yml

Lines changed: 2 additions & 0 deletions
@@ -6,10 +6,12 @@ number-sections: true
 sitemap: true # Enables sitemap generation for web crawlers
 tbl-colwidths: auto
 license: "EUPL (>= 1.2)"
+keywords: "{keywords_str}"
 format:
   html:
     include-in-header:
       - ../../metadata/json-ld.html
+    include-before-body: ../../metadata/default-before-body.html
     code-fold: true # Allow code blocks to be foldable
     self-contained: true # Embed resources like CSS and images
   docx:

scripts/generate_keywords.py

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
+import yaml
+import re
+import glob
+from pathlib import Path
+from keybert import KeyBERT
+from io import StringIO
+from ruamel.yaml import YAML
+from ruamel.yaml.comments import CommentedSeq
+from ruamel.yaml.scalarstring import DoubleQuotedScalarString
+
+
+QUARTO_CONFIG="_quarto.yaml"
+
+kw_model = KeyBERT('paraphrase-mpnet-base-v2')
+
+def get_render_files(quarto_yml_path=QUARTO_CONFIG):
+    with open(quarto_yml_path, 'r') as f:
+        config = yaml.safe_load(f)
+
+    render_list = config.get('project', {}).get('render', [])
+    matched_files = []
+
+    for pattern in render_list:
+        matched = glob.glob(pattern, recursive=True)
+        matched_files.extend([Path(f) for f in matched if f.endswith('.qmd')])
+
+    # Exclude files named 'index.qmd'
+    return [f for f in matched_files if f.name.lower() != 'index.qmd']
+
+def inject_keywords(file_path: Path):
+    with open(file_path, 'r', encoding='utf-8') as f:
+        content = f.read()
+
+    parts = re.split(r'^---\s*$', content, maxsplit=2, flags=re.MULTILINE)
+    if len(parts) < 3:
+        return
+
+    _, yaml_block, body = parts
+
+    yaml = YAML()
+    yaml.preserve_quotes = True
+    yaml_data = yaml.load(StringIO(yaml_block))
+
+    keywords = kw_model.extract_keywords(body,
+                                         keyphrase_ngram_range=(1, 2),
+                                         stop_words='english',
+                                         use_mmr=True,
+                                         diversity=0.2,
+                                         top_n=10)
+
+    # Update keywords field
+    keywords_list = CommentedSeq([DoubleQuotedScalarString(kw) for kw, _ in keywords])
+    keywords_list.fa.set_flow_style()
+    yaml_data['keywords'] = keywords_list
+
+    # Reconstruct YAML as string
+    yaml_out = StringIO()
+    yaml.dump(yaml_data, yaml_out)
+    final_yaml = yaml_out.getvalue().strip()
+
+    # Combine back
+    updated_content = f"---\n{final_yaml}\n---\n{body}"
+
+    with open(file_path, 'w', encoding='utf-8') as f:
+        f.write(updated_content)
+
+
+if __name__ == "__main__":
+    files = get_render_files()
+    for file in files:
+        inject_keywords(file)
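
Review note: the front-matter handling in `inject_keywords` can be tried in isolation, without KeyBERT or ruamel.yaml. The following is a minimal sketch and not part of the commit. The helper name `split_front_matter` and the sample document are hypothetical; only the regular expression and `maxsplit=2` are taken from the script.

```python
import re

# Same delimiter pattern as in scripts/generate_keywords.py:
# a line consisting of '---' with optional trailing whitespace.
FRONT_MATTER_RE = re.compile(r'^---\s*$', flags=re.MULTILINE)


def split_front_matter(text):
    """Return (yaml_block, body), or None if the text has no front matter.

    Hypothetical helper for illustration; the commit performs this split
    inline inside inject_keywords().
    """
    parts = FRONT_MATTER_RE.split(text, maxsplit=2)
    if len(parts) < 3:
        # Fewer than two '---' delimiters: nothing to update.
        return None
    _, yaml_block, body = parts
    return yaml_block, body


# Hypothetical sample document with a horizontal rule in the body.
sample = '---\ntitle: "Demo"\n---\n\n# Preface\n\n---\n\nMore text\n'
result = split_front_matter(sample)
```

Because `maxsplit=2` stops after the second delimiter, a later `---` in the body (for example a horizontal rule) stays in the body and is not treated as metadata. Files without two delimiters are skipped, which matches the early `return` in the script.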

src/guidelines/IT_Architecture_Principles_and_Implementation_Guidelines.qmd

Lines changed: 6 additions & 16 deletions
@@ -1,33 +1,23 @@
 ---
 title: "IT Architecture Principles and Implementation Guidelines"
 subtitle: "Copernicus Land Monitoring Service"
-author: "European Environment Agency (EEA)"
+author: "European Environment Agency (EEA)"
 version: "1.4a"
 date: "2025-03-06"
 product-name: IT Architecture Principles and Implementation Guidelines
 description: "IT Architecture Principles and Implementation Guidelines"
-keywords: ["Copernicus Land Monitoring Service, CLMS IT Architecture, European Environment Agency, IT Principles and Guidelines, IT Ecosystem, IT Security, EUPL Licensing, Reproducibility, Reusability, Transparency, Scalability, Maintainability, Resilient IT Solutions, Modular IT Architecture, Continuous Integration"]
-

 metadata-files:
-  - ../../metadata/default.yml
-
+  - ../../metadata/default.yml
+
 format:
-  pdf: default
-  html:
+  html:
     css: ../styles/styles.css
   docx:
-    reference-doc: ../styles/template-guideline.docx
-
+    reference-doc: ../styles/template-guideline.docx
+  pdf: default
 ---

-```{=html}
-<script>
-document.addEventListener("DOMContentLoaded", function() {
-  document.querySelectorAll("a[href$='.docx']").forEach(link => link.style.display = 'none');
-});
-</script>
-```

 # Preface {#preface}

src/styles/template-atbd.docx

12 Bytes
Binary file not shown.

src/styles/template-guideline.docx

82 Bytes
Binary file not shown.

src/styles/template-pum.docx

233 Bytes
Binary file not shown.

src/templates/CLMS_ATBD_Template.qmd

Lines changed: 34 additions & 38 deletions
@@ -2,27 +2,19 @@
 title: "Product SHORT NAME ALGORITHM THEORETICAL BASIS DOCUMENT (ATBD)"
 subtitle: "ATBD Copernicus Land Monitoring Service -- Product full name"
 date: "2022-10-06"
-version: Issue x.y (“(x) version of the document” + “.”+ “(y) version of the document update”)
+version: Issue x.y (“(x) version of the document” + “.”+ “(y) version of the document
+  update”)
 product-name: Product Name
 description: "Product DESCRIPTION"
-
 metadata-files:
-  - ../../metadata/default.yml
-
+  - ../../metadata/default.yml
+
 format:
   docx:
     reference-doc: ../styles/template-atbd.docx
 hidden: true
 ---

-```{=html}
-<script>
-document.addEventListener("DOMContentLoaded", function() {
-  document.querySelectorAll("a[href$='.docx']").forEach(link => link.style.display = 'none');
-});
-</script>
-```
-
 {{< pagebreak >}}

 ::: {.subtitle custom-style="subtitle"}
@@ -221,7 +213,7 @@ In detail, the document is structured as follows:
 # Methodology [(mandatory chapter)]{.yellow-box custom-style="yellow-box-line"}

 :::{.yellow-box custom-style="yellow-box"}
-Describes the workflow, input datasets, production methodology including data pre-processing, processing, post-processing and quality control and internal validation.
+Describes the workflow, input datasets, production methodology including data pre-processing, processing, post-processing and quality control and product verification.

 In the "Methodology Description" chapter, some of the subchapters are classified as mandatory, but only if they are relevant to the methodology being described. Optional subchapters can be included or omitted, allowing flexibility to adapt the structure to the specific characteristics of the product.

@@ -260,18 +252,24 @@ Provide an overview table of methodology. Follow the example below:
 <td>Example: 10m spatial resolution</td>
 </tr>
 <tr>
-<td>Validation and Accuracy</td>
-<td>Example: Validation using ground-truth data, accuracy assessment with RMSE calculations.</td>
+<td>Verification and Accuracy</td>
+<td>Example: Product verification using ground-based reference, accuracy assessment with RMSE calculations.</td>
 </tr>
 </tbody>
 </table>
 ```
 {{< pagebreak >}}

+## Theoretical Background [(optional subchapter)]{.yellow-box custom-style="yellow-box-line"}
+
+:::{.yellow-box custom-style="yellow-box"}
+Provides an overview of concepts, common definitions and assumptions of the product and links it to related previous studies and work to allow an understanding of the product and applied methodology.
+:::
+
 ## Methodology and workflow [(mandatory subchapter)]{.yellow-box custom-style="yellow-box-line"}

 :::{.yellow-box custom-style="yellow-box"}
-Provides an overview of the product generation, highlighting the workflow, the input datasets, key processes involved, and algorithms used.
+Provides an overview of the product generation, highlighting the workflow, the input datasets, key processes involved, and algorithms used. Ideally, the description is complemented by a traceability diagram that allows users to understand the uncertainty budget of the product as a result of its inputs and workflow (if needed, add subchapter).

 Below is a generic example that should be adapted to the specific needs of the product to provide an overview of the production process workflow.
 :::
@@ -286,7 +284,7 @@ Below is a generic example that should be adapted to the specific needs of the p
 Explain which datasets (e.g., Sentinel-x, DEM, or other CLMS products) serve as input for the product. Clarify internal dependencies within the product, such as primary data used to generate secondary outputs. Reference external thematic data where applicable and ensure the workflow diagram illustrates the connections between different datasets.
 :::

-## Pre-processing [(mandatory subchapter)]{.yellow-box custom-style="yellow-box-line"}
+## Pre-processing [(optional subchapter)]{.yellow-box custom-style="yellow-box-line"}

 :::{.yellow-box custom-style="yellow-box"}
 Details the initial steps taken to prepare the data for further processing, including corrections and adjustments to raw data. Sub-chapter structure optional.
@@ -348,23 +346,23 @@ Outlines the workflow and methods used during the processing stage, detailing ea
 Provides detailed descriptions of the algorithms used in the processing stage.
 :::

-## Post-processing [(mandatory subchapter)]{.yellow-box custom-style="yellow-box-line"}
+## Post-processing [(optional subchapter)]{.yellow-box custom-style="yellow-box-line"}

 :::{.yellow-box custom-style="yellow-box"}
-Presents tasks typically involved in the refinement, validation, and finalization of the datasets such has manual steps, filtering and smoothing, integration of ancillary data, reprojecting, resampling or merging, classification improvements and preparing data for end-users. A subchapter structure as suggested with previous sections may be introduced if necessary.
+Presents tasks typically involved in the refinement, quality assurance and control, and finalization of the datasets such has manual steps, filtering and smoothing, integration of ancillary data, reprojecting, resampling or merging, classification improvements and preparing data for end-users. A subchapter structure as suggested with previous sections may be introduced if necessary.
 :::

 ## Output products [(mandatory subchapter)]{.yellow-box custom-style="yellow-box-line"}

 :::{.yellow-box custom-style="yellow-box"}
-Specifies the types of output products generated (e.g., status layer / change, aggregated, expert, reference layers / and other datasets).
+Specifies the types of output products generated (e.g., status layer/change/auxiliary layer, reference layer, near-real time).
 :::

 {{< pagebreak >}}
 # Quality control and production verification [(mandatory subchapter)]{.yellow-box custom-style="yellow-box-line"}

 :::{.yellow-box custom-style="yellow-box"}
-Discusses the methods and criteria used to assess the internal validation process (i.e. production verification), the quality and accuracy of the products, ensuring they meet the required standards and specifications.
+Discusses the methods and criteria used to assess the internal quality assurance and control process as well as production verification, the quality and accuracy of the products, ensuring they meet the required standards and specifications.
 :::

 {{< pagebreak >}}
@@ -381,7 +379,6 @@ Discusses any challenges or limitations of the current methodology, algorithm, i
 Presents information on the terms of use, citation guidelines, and technical support for the products.
 :::

-{{< pagebreak >}}
 ## Terms of use [(mandatory subchapter)]{.yellow-box custom-style="yellow-box-line"}

 :::{.yellow-box custom-style="yellow-box"}
@@ -391,17 +388,18 @@ Here an example of a standard text.
 :::

 :::{.grey-box custom-style="grey-box"}
-The Terms of Use for the product(s) described in this document, acknowledge that:
-
-"Free, full and open access to the products and services of the Copernicus Land Monitoring Service is made on the conditions that:
+The Terms of Use for the product(s) described in this document, acknowledge the following:
+Free, full and open access to the products and services of the Copernicus Land Monitoring Service is made on the conditions that:

 1. When distributing or communicating Copernicus Land Monitoring Service products and services (data, software scripts, web services, user and methodological documentation and similar) to the public, users shall inform the public of the source of these products and services and shall acknowledge that the Copernicus Land Monitoring Service products and services were produced "with funding by the European Union".

 2. Where the Copernicus Land Monitoring Service products and services have been adapted or modified by the user, the user shall clearly state this.

 3. Users shall make sure not to convey the impression to the public that the user\'s activities are officially endorsed by the European Union.

-The user has all intellectual property rights to the products he/she has created based on the Copernicus Land Monitoring Service products and services." [^1]
+The user has all intellectual property rights to the products he/she has created based on the Copernicus Land Monitoring Service products and services.
+
+[Consult Data policy — Copernicus Land Monitoring Service for further details](https://land.copernicus.eu/en/data-policy)
 :::

 ## Citation [(mandatory subchapter)]{.yellow-box custom-style="yellow-box-line"}
@@ -413,17 +411,20 @@ Here are examples of standard text to be used according to the situation:
 :::

 :::{.grey-box custom-style="grey-box"}
-When **planning to publish a publication (scientific, commercial, etc.)**, it should be explicitly mentioned:
+When **planning to publish a publication (scientific, commercial, etc.)**, it shall explicitly mention:
+
+> "This publication has been prepared using European Union\'s Copernicus Land Monitoring Service information; \<insert all relevant DOI links here, if applicable\>"

-> "This publication has been prepared using European Union\'s Copernicus Land Monitoring Service information; \<insert all relevant DOI links here, if applicable\>"^1^
+When developing a **product or service using the products or services of the Copernicus Land Monitoring Service**, it shall explicitly mention:

-When developing a **product or service using the products or services of the Copernicus Land Monitoring Service**, it should explicitly mentioned:
+> "Generated using European Union\'s Copernicus Land Monitoring Service information; \<insert all relevant DOI links here, if applicable\>"

-> "Generated using European Union\'s Copernicus Land Monitoring Service information; \<insert all relevant DOI links here, if applicable\>"^1^
+When **redistributing a part of the Copernicus Land Monitoring Service (product, dataset, documentation, picture, web service, etc.)**, it shall explicitly mention:

-When **redistributing a part of the Copernicus Land Monitoring Service (product, dataset, documentation, picture, web service, etc.)**, it should explicitly mentioned:
+> "European Union\'s Copernicus Land Monitoring Service information; \<insert all relevant DOI links here, if applicable\>"
+
+[Consult Data policy — Copernicus Land Monitoring Service for further details](https://land.copernicus.eu/en/data-policy)

-> "European Union\'s Copernicus Land Monitoring Service information; \<insert all relevant DOI links here, if applicable\>"^1^
 :::

 ## Product technical support [(mandatory subchapter)]{.yellow-box custom-style="yellow-box-line"}
@@ -435,8 +436,7 @@ Here an example of a standard text.
 :::

 :::{.grey-box custom-style="grey-box"}
-Product technical support is provided by the product custodian through Copernicus Land Monitoring Service desk[^2]. Product technical support does not include software specific user support or general GIS or remote sensing support.
-
+Product technical support is provided by the product custodian through [Copernicus Land Monitoring Service -- Service desk](https://land.copernicus.eu/en/data-policy). Product technical support does not include software specific user support or general GIS or remote sensing support.
 More information on the products can be found on the Copernicus Land Monitoring Service website (https://land.copernicus.eu/)
 :::

@@ -472,7 +472,3 @@ Lists references cited throughout the document.
 :::{.yellow-box custom-style="yellow-box"}
 Annexes should be kept to a minimum but can be placed as needed to present technical details, data and information repository, resources on product development.
 :::
-
-[^1]: [Copernicus Land Monitoring Service - Data policy](https://land.copernicus.eu/en/data-policy)
-
-[^2]: [Copernicus Land Monitoring Service -- Service desk](https://land.copernicus.eu/en/data-policy)
