Commit dafc4bb

chore: update ekglib docs and components
1 parent cd5681e commit dafc4bb

50 files changed, +428 -251 lines changed

.vscode/settings.json

Lines changed: 10 additions & 13 deletions
@@ -1,14 +1,11 @@
 {
-    "python.testing.unittestArgs": [
-        "-v",
-        "-s",
-        "./tests",
-        "-p",
-        "test_*.py"
-    ],
-    "python.testing.pytestEnabled": false,
-    "python.testing.nosetestsEnabled": false,
-    "python.testing.unittestEnabled": true,
-    "python.linting.pylintEnabled": true,
-    "python.linting.enabled": true
-}
+    // Python formatter configuration
+    "[python]": {
+        "editor.defaultFormatter": "charliermarsh.ruff",
+        "editor.formatOnSave": true,
+        "editor.codeActionsOnSave": {
+            "source.fixAll": "explicit",
+            "source.organizeImports": "explicit"
+        }
+    }
+}

README.md

Lines changed: 52 additions & 32 deletions
@@ -1,7 +1,5 @@
 # ekglib
 
-> ⚠️ **Warning:** This is a pre-alpha project. Everything in this repo is subject to heavy change.
-
 A Python Library for various tasks in an EKG DataOps operation.
 
 ## Badges
@@ -13,39 +11,67 @@ A Python Library for various tasks in an EKG DataOps operation.
 [![Linting & Formatting: ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
 [![Type checking: mypy](https://img.shields.io/badge/type%20checking-mypy-blue.svg)](https://mypy.readthedocs.io/)
 
-**Links:**
+## Metadata Parsers
 
-- 📖 [Documentation](https://ekgf.github.io/ekglib/)
-- 🐛 [Issue Tracker](https://github.com/EKGF/ekglib/issues)
-- 💬 [Discussions](https://github.com/EKGF/ekglib/discussions)
+- [Concept Parser](concept_parser/README.md)
+- [Persona Parser](persona_parser/README.md)
+- [Story Validate Rule Parser](dataops_rule_parser/README.md)
+- [Story Vaidate Rules Capture](dataops_rules_capture/README.md)
+- [Story Validate Rules Executor](dataops_rules_execute/README.md)
+- [Use Case Parser](use_case_parser/README.md)
+- [User Story Parser](user_story_parser/README.md)
 
----
+## Capture Steps
 
-## Metadata Parsers
+- [Xlsx Parser](xlsx_parser/README.md)
+- [LDAP Parser](ldap_parser/README.md)
 
-- [Concept Parser](concept_parser/)
-- [Persona Parser](persona_parser/)
-- [Story Validate Rule Parser](dataops_rule_parser/)
-- [Story Vaidate Rules Capture](dataops_rules_capture/)
-- [Story Validate Rules Executor](dataops_rules_execute/)
-- [Use Case Parser](use_case_parser/)
-- [User Story Parser](user_story_parser/)
+## Maturity Model Tools
 
-## Capture Steps
+- [Maturity Model Parser](maturity_model_parser/README.md)
+
+## Pipelines and Export
+
+- [Pipeline Framework](pipeline/README.md)
+- [Step Export](step_export/README.md)
+
+## LDAP Variants
+
+- [LDAP Parser to File](ldap_parser_to_file/README.md)
+- [LDAP Parser to S3](ldap_parser_to_s3/README.md)
 
-- [Xlsx Parser](xlsx_parser/)
-- [LDAP Parser](ldap_parser/)
+## Storage and Data Access
 
----
+- [S3 Helpers](s3/README.md)
+- [Data Sources](data_source/README.md)
+- [Datasets](dataset/README.md)
+
+## Knowledge Graph and SPARQL Utilities
+
+- [KG IRI Utilities](kgiri/README.md)
+- [SPARQL Helpers](sparql/README.md)
+- [Namespaces](namespace/README.md)
+- [Ontologies and Resources](resources/README.md)
+
+## Core Utilities
+
+- [Logging Utilities](log/README.md)
+- [String Utilities](string/README.md)
+- [Git Utilities](git/README.md)
+- [Exceptions](exceptions/README.md)
+- [MIME Helpers](mime/README.md)
+- [Main CLI Entrypoint](main/README.md)
 
 ## Installation
 
-**Using `uv` (recommended)**
+`ekglib` is not yet published on PyPI. You can install it directly from GitHub.
 
-If you are using `uv` to manage your project, add `ekglib` as a dependency:
+**Using `uv` from GitHub (recommended)**
+
+Add `ekglib` as a dependency from GitHub:
 
 ```bash
-uv add ekglib
+uv add --git https://github.com/EKGF/ekglib.git
 ```
 
 You can then run the provided CLI tools via `uv`:
@@ -59,19 +85,19 @@ uv run pipeline-example --help
 To install the CLI tools as global commands (similar to `pipx`):
 
 ```bash
-uv tool install ekglib
+uv tool install --git https://github.com/EKGF/ekglib.git
 
 xlsx-parser --help
 user-story-parser --help
 pipeline-example --help
 ```
 
-**Using `pip`**
+**Using `pip` from GitHub**
 
-If you prefer to use `pip` directly:
+If you prefer to use `pip`, you can install from the GitHub repo:
 
 ```bash
-python -m pip install ekglib
+python -m pip install "git+https://github.com/EKGF/ekglib.git"
 ```
 
 The console scripts will then be available on your `PATH`:
@@ -82,8 +108,6 @@ user-story-parser --help
 pipeline-example --help
 ```
 
----
-
 ## Development setup (from source)
 
 If you cloned this repository and want to work on `ekglib` itself:
@@ -94,8 +118,6 @@ uv sync
 
 This will create and populate a virtual environment using `uv` based on `pyproject.toml`.
 
----
-
 ## Tests
 
 To run all tests:
@@ -110,8 +132,6 @@ To run a single test:
 uv run pytest tests/<path-to-test> -k <name-of-test>
 ```
 
----
-
 ## Packaging
 
 ```bash

pyproject.toml

Lines changed: 0 additions & 1 deletion
@@ -34,7 +34,6 @@ dependencies = [
     "python-dateutil",
     "rdflib",
     "requests",
-    "setuptools",
     "SPARQLWrapper",
     "stringcase",
     "toml",

src/ekglib/concept_parser/parse.py

Lines changed: 2 additions & 2 deletions
@@ -30,7 +30,7 @@ def __init__(self, input_file_name, verbose):
             for s, p, o in self.g:
                 print((s, p, o))
 
-    def check(self):
+    def check(self) -> None:
         for concept in self.g.subjects(RDF.type, CONCEPT.Concept):
             log_item('Concept', concept)
             for rdfsLabel in self.g.objects(concept, RDFS.label):
@@ -41,7 +41,7 @@ def check(self):
         # TODO: Nothing much happens here yet
         #
 
-    def dump(self, output_file):
+    def dump(self, output_file: str | Path | None) -> None:
         if not output_file:
             warning('You did not specify an output file, no output file created')
             return
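
The new `dump` signature accepts a `str`, a `Path`, or `None`, and returns early when no output file is given. A minimal sketch of that guard follows, assuming a hypothetical helper named `resolve_output_path` that is not part of the commit. Note that the `X | Y` annotation syntax is only evaluated natively on Python 3.10 and later; the `__future__` import keeps the annotation unevaluated on older interpreters.

```python
from __future__ import annotations  # defers annotation evaluation on older Pythons

from pathlib import Path


def resolve_output_path(output_file: str | Path | None) -> Path | None:
    """Hypothetical helper: normalize an optional output argument to a Path."""
    if not output_file:
        # Same guard as in `dump`: both None and '' mean "no output file".
        return None
    return Path(output_file)
```

A caller can then branch on `None` once instead of repeating the truthiness check at every use site.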

src/ekglib/data_source/README.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+# data_source
+
+Abstractions and helpers for describing and working with input data sources in `ekglib` pipelines and parsers.
+
+This package provides a small set of reusable building blocks used across multiple components.
+
+

src/ekglib/data_source/various.py

Lines changed: 2 additions & 1 deletion
@@ -1,8 +1,9 @@
 import os
 from argparse import ArgumentParser
+from typing import Any
 
 
-def set_cli_params(parser: ArgumentParser) -> None:
+def set_cli_params(parser: ArgumentParser) -> Any:
     ekg_data_source_code = os.getenv('EKG_DATA_SOURCE_CODE', None)
     group = parser.add_argument_group('Data Source')
     if ekg_data_source_code:
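
The hunk above stops at the `if ekg_data_source_code:` branch, so the rest of `set_cli_params` is not visible here. One common pattern for environment-backed CLI options is sketched below: the option is optional when the variable is set and required otherwise. The function name `add_data_source_args` and the `--data-source-code` flag are assumptions made for illustration, and the sketch returns nothing, whereas the real function's return value is not shown in the diff.

```python
import os
from argparse import ArgumentParser


def add_data_source_args(parser: ArgumentParser) -> None:
    """Hypothetical stand-in for `set_cli_params`; the flag name is assumed."""
    env_value = os.getenv('EKG_DATA_SOURCE_CODE')
    group = parser.add_argument_group('Data Source')
    if env_value:
        # Environment variable is set: use it as the default, flag is optional.
        group.add_argument('--data-source-code', default=env_value)
    else:
        # No environment variable: the caller must pass the flag explicitly.
        group.add_argument('--data-source-code', required=True)
```

With `EKG_DATA_SOURCE_CODE=ldap` exported, `parser.parse_args([])` would yield `data_source_code == 'ldap'`, and an explicit flag on the command line still overrides it.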

src/ekglib/dataops_rule_parser/parse.py

Lines changed: 21 additions & 19 deletions
@@ -5,19 +5,15 @@
 
 import owlrl
 import rdflib
-from rdflib import Graph, Literal, URIRef, OWL, RDF, RDFS, XSD
+from rdflib import OWL, RDF, RDFS, XSD, Graph, Literal, URIRef
 
 from ..data_source import set_cli_params as data_source_set_cli_params
-from ..kgiri import (
-    EKG_NS,
-    set_kgiri_base,
-    set_kgiri_base_replace,
-    set_cli_params as kgiri_set_cli_params,
-    kgiri_replace_iri_in_literal,
-)
-from ..log import error, warning, log_error, log_list, log_item, log_iri
+from ..kgiri import EKG_NS, kgiri_replace_iri_in_literal
+from ..kgiri import set_cli_params as kgiri_set_cli_params
+from ..kgiri import set_kgiri_base, set_kgiri_base_replace
+from ..log import error, log_error, log_iri, log_item, log_list, warning
 from ..main import load_rdf_file_into_graph
-from ..namespace import RULE, PROV, RAW, DATAOPS, DATASET
+from ..namespace import DATAOPS, DATASET, PROV, RAW, RULE
 
 OWL._fail = (
     False  # workaround for this issue: https://github.com/RDFLib/OWL-RL/issues/53
@@ -95,11 +91,11 @@ def load_ontology(self, ontology_file_name: Path):
         log_item('Loading Ontology', ontology_file_name)
         load_rdf_file_into_graph(self.g, ontology_file_name)
 
-    def load_ontologies(self):
+    def load_ontologies(self) -> None:
         for ontology_file_name in ontology_file_names:
             self.load_ontology(self.ontologies_root / ontology_file_name)
 
-    def rdfs_infer(self):
+    def rdfs_infer(self) -> None:
         owlrl.RDFSClosure.RDFS_Semantics(self.g, True, True, True)
         closure_class = owlrl.return_closure_class(
             owl_closure=True, rdfs_closure=True, owl_extras=True, trimming=True
@@ -172,10 +168,10 @@ def set_sort_key(self):
             return '99-obfuscate'
         return f'10-{set_key}'
 
-    def key(self):
+    def key(self) -> str:
         return f'{self.set_key()}-{self.rule_file.parent.stem}'
 
-    def sort_key(self):
+    def sort_key(self) -> str:
         return f'{self.set_sort_key()}-{self.rule_file.parent.stem}'
 
     def create_rule_set(self):
@@ -221,7 +217,7 @@ def check_rule(self, rule_iri) -> bool:
     # Check to see if in the same directory as the rule.ttl file we have a .sparql_endpoint file
     # with the given name. If so, return its contents as a Literal.
     #
-    def check_sparql_file_name(self, sparql_file_name):
+    def check_sparql_file_name(self, sparql_file_name: str) -> Literal | None:
         if self.verbose:
             log_item('Checking', self.rule_file.parent / sparql_file_name)
         sparql_file_name_full_path = self.rule_file.parent / sparql_file_name
@@ -233,7 +229,9 @@ def check_sparql_file_name(self, sparql_file_name):
             )
         return rdflib.Literal(sparql_file_name_full_path.read_text())
 
-    def process_sparql_literal(self, rule, sparql_literal):
+    def process_sparql_literal(
+        self, rule: URIRef, sparql_literal: Literal | None
+    ) -> None:
         if not sparql_literal:
             return
         self.replace_literal_triple(
@@ -249,16 +247,18 @@ def process_sparql_literal(self, rule, sparql_literal):
     #
     # Add a triple to the graph with the given sparql_endpoint literal.
     #
-    def add_literal_triple(self, s, p, o):
+    def add_literal_triple(self, s: URIRef, p: URIRef, o: Literal) -> None:
         if self.verbose:
             print('Adding triple <{0}> - <{1}> - "{2}"'.format(s, p, o))
         self.g.add((s, p, o))
 
-    def replace_literal_triple(self, s, p1, p2, o):
+    def replace_literal_triple(
+        self, s: URIRef, p1: URIRef, p2: URIRef, o: Literal
+    ) -> None:
         self.g.remove((s, p1, None))
         self.add_literal_triple(s, p2, o)
 
-    def dump(self, output_file) -> int:
+    def dump(self, output_file: str | Path | None) -> int:
         if not output_file:
             warning('You did not specify an output file, no output file created')
             return 1
@@ -311,3 +311,5 @@ def main() -> int:
 
 if __name__ == '__main__':
     exit(main())
+if __name__ == '__main__':
+    exit(main())
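
The `key` and `sort_key` methods annotated in this file build strings with a numeric prefix (`10-` or `99-`), which suggests rule sets are ordered by plain string comparison so that the `99-obfuscate` group runs last. The sketch below illustrates that idea only. The free function `sort_key`, the condition that selects the `99-` prefix, and the sample rule names are all assumptions; the actual condition in `set_sort_key` is not visible in the hunk.

```python
def sort_key(set_key: str, rule_dir_name: str) -> str:
    """Hypothetical free-function version of the prefix-based sort key."""
    # Assumption: only a set named 'obfuscate' gets the '99-' prefix.
    prefix = '99-obfuscate' if set_key == 'obfuscate' else f'10-{set_key}'
    return f'{prefix}-{rule_dir_name}'


# Made-up (set key, rule directory) pairs, in arbitrary input order.
rules = [
    ('obfuscate', 'mask-names'),
    ('validate', 'check-labels'),
    ('generic', 'add-types'),
]

# Lexicographic ordering puts every '10-...' key before any '99-...' key.
ordered = sorted(rules, key=lambda rule: sort_key(*rule))
```

Because both prefixes have two digits, string comparison and numeric comparison agree here; a scheme mixing one-digit and two-digit prefixes would not sort as intended.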

src/ekglib/dataops_rules_capture/capture.py

Lines changed: 1 addition & 1 deletion
@@ -103,7 +103,7 @@ def capture_rule_file(self, rule_file: Path, rule_directory_iri: URIRef):
         for triple in processor.g:
             self.g.add(triple)
 
-    def s3_file_name(self):
+    def s3_file_name(self) -> str:
         return f'raw-data-dataops-rules-{self.data_source_code}.ttl.gz'
 
     def export(self) -> int:

src/ekglib/dataset/README.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+# dataset
+
+Helpers and types for representing datasets used by `ekglib` tools and pipelines.
+
+These utilities are typically consumed by higher-level parsers and exporters rather than used directly.
+
+

src/ekglib/exceptions/README.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+# exceptions
+
+Common exception types and error utilities shared across `ekglib`.
+
+Centralizing these exceptions helps keep error handling consistent between components.
+
+
