Commit e9ecbfb

Adding RoHub Documentation (#36)
* Adding RoHub Documentation * Adding rohub to mkdocs.yml
1 parent 5398cee commit e9ecbfb

File tree

2 files changed

+273
-0
lines changed

docs/rohub.md

Lines changed: 272 additions & 0 deletions
@@ -0,0 +1,272 @@
1+
2+
# Using RoHub Python API
3+
4+
## Setup
5+
To get started with RoHub from code, install the [RoHub Python API](https://gitlab.pcss.pl/daisd-public/rohub/rohub-api). One option is a conda environment `yml` file such as the one below, which installs the package in editable mode into your current directory.
6+
```yml
name: rohub_dev
channels:
  - conda-forge
dependencies:
  - python=3.12
  - sparqlwrapper
  - pip
  - pip:
      - "--editable=git+https://gitlab.pcss.pl/daisd-public/rohub/rohub-api.git#egg=rohub"
```
17+
You can also install from the `develop` branch, which may contain functions that are not yet in the latest release:
18+
19+
```bash
--editable git+https://gitlab.pcss.pl/daisd-public/rohub/rohub-api.git@develop#egg=rohub
```
22+
23+
After creating the environment, specify which RoHub endpoint you want to use. There are two endpoints: [Production](https://www.rohub.org/) and [Development](https://rohub2020-rohub.apps.paas-dev.psnc.pl/). To work with the Development endpoint, find the location of the local installation (`pip show rohub`) and copy a file named `.env` into that directory. Both steps can be done with a single command:
24+
```bash
cp -v .env "$(pip show rohub | awk -F': ' '/Editable project location/ {print $2}')/.env"
```
27+
This is a sample `.env` file that connects to the Development endpoint:
28+
29+
```bash
[API_SECTION]
API_URL = https://rohub2020-rohub.apps.paas-dev.psnc.pl/api/

[KEYCLOAK_SECTION]
KEYCLOAK_CLIENT_ID = rohub2020-cli
KEYCLOAK_CLIENT_SECRET = 714617a7-87bc-4a88-8682-5f9c2f60337d
KEYCLOAK_URL = https://keycloak-dev.apps.paas-dev.psnc.pl/auth/realms/rohub/protocol/openid-connect/token
```
38+
The value of `API_URL` specifies the endpoint. For Development, it should be:
39+
```bash
API_URL = https://rohub2020-rohub.apps.paas-dev.psnc.pl/api/
```
42+
and for Production:
43+
```bash
API_URL = https://api.rohub.org/api/
```
46+
If you copy the `.env` file to the wrong location, the package automatically works with the Production endpoint.
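Before copying a `.env` file into place, it can help to check which endpoint it points to. The sketch below is only an illustration: it assumes the file is INI-style, as in the sample above, and reads it with Python's standard `configparser`. The helper names `configured_api_url` and `endpoint_name` are hypothetical and not part of the RoHub API, and the `paas-dev` check is a heuristic based on the two URLs listed above.

```python
import configparser

# Sample contents, copied from the Development example above.
SAMPLE_ENV = """
[API_SECTION]
API_URL = https://rohub2020-rohub.apps.paas-dev.psnc.pl/api/
"""


def configured_api_url(env_text: str) -> str:
    """Return the API_URL value from INI-style .env contents (hypothetical helper)."""
    # Interpolation is disabled so that '%' characters in values are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(env_text)
    return parser.get("API_SECTION", "API_URL")


def endpoint_name(api_url: str) -> str:
    """Guess the endpoint from the URL; a heuristic, not an official rule."""
    return "Development" if "paas-dev" in api_url else "Production"


url = configured_api_url(SAMPLE_ENV)
print(endpoint_name(url), url)
```

After installation, `rohub.settings.API_URL` (used in the Login section) remains the authoritative way to see which endpoint the package actually picked up.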
47+
48+
## Login
49+
Before you start coding, you need to create an account. [Production](https://www.rohub.org/) and [Development](https://rohub2020-rohub.apps.paas-dev.psnc.pl/) require separate accounts. After creating an account, use your username and password to log in to RoHub and start working with research objects:
50+
51+
```python
import rohub

# Shows which endpoint the package is configured to use.
print("ROHub API URL:", rohub.settings.API_URL)

user_name = "your username"
user_pwd = "your password"

rohub.login(username=user_name, password=user_pwd)
```
61+
62+
Now you can list your uploaded research objects, and print their properties:
63+
64+
```python
my_ros = rohub.list_my_ros()

for _, row in my_ros.iterrows():
    ro_id = row["identifier"]  # avoid shadowing the built-in id()
    ro = rohub.ros_load(ro_id)
    print("RO type:", ro.ros_type)
    # Print each property only if it exists and is not empty.
    if hasattr(ro, "title") and ro.title:
        print("RO title:", ro.title)
    if hasattr(ro, "authors") and ro.authors:
        print("RO authors:", ro.authors)
    if hasattr(ro, "description") and ro.description:
        print("RO description:", ro.description)
    if hasattr(ro, "research_areas") and ro.research_areas:
        print("RO research areas:", ro.research_areas)
    if hasattr(ro, "creation_date") and ro.creation_date:
        print("RO creation date:", ro.creation_date)
    if hasattr(ro, "last_modified_date") and ro.last_modified_date:
        print("RO last modified date:", ro.last_modified_date)
    if hasattr(ro, "doi") and ro.doi:
        print("RO DOI:", ro.doi)
    if hasattr(ro, "url") and ro.url:
        print("RO URL:", ro.url)
    if hasattr(ro, "metadata") and ro.metadata:
        print("RO metadata:", ro.metadata)
```
90+
91+
## Uploading Research Objects
92+
You can upload a research object from a zip archive as follows:
93+
94+
```python
import rohub

# Illustrative metadata; these values are not passed to ros_upload below.
research_areas = ["Environmental research"]
title = "NFDI4ING Model Validation with NextFlow"
description = f"Description for {title}"

zip_path = "./ro-crate-metadata-21b6d1a7d9acc988.zip"
RO = rohub.ros_upload(path_to_zip=zip_path)
print(f"Identifier: {RO.identifier}")
```
104+
105+
After a research object is successfully uploaded or created, the API returns its `identifier`, which can be used to view the object on the portal. For example, if the identifier is `716b082f-57f0-45de-84b2-4440ae8bcf57`, you can view it at https://w3id.org/ro-id-dev/716b082f-57f0-45de-84b2-4440ae8bcf57 (Development) or https://w3id.org/ro-id/716b082f-57f0-45de-84b2-4440ae8bcf57 (Production), depending on which endpoint you are working with.
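The mapping from identifier to URL can be captured in a small helper. This is a minimal sketch, assuming only the two `w3id.org` prefixes mentioned above; the function `ro_url` and the endpoint names `"production"` and `"development"` are hypothetical and not part of the RoHub API.

```python
# Prefixes taken from the two example URLs above.
RO_ID_PREFIXES = {
    "production": "https://w3id.org/ro-id/",
    "development": "https://w3id.org/ro-id-dev/",
}


def ro_url(identifier: str, endpoint: str = "development") -> str:
    """Build the persistent URL of a research object (hypothetical helper)."""
    if endpoint not in RO_ID_PREFIXES:
        raise ValueError(f"Unknown endpoint: {endpoint!r}")
    return RO_ID_PREFIXES[endpoint] + identifier


print(ro_url("716b082f-57f0-45de-84b2-4440ae8bcf57", "development"))
```

The same URL is also the IRI that identifies the research object in SPARQL queries, so the helper can be reused when building queries.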
106+
107+
## Accessing Research Objects
108+
109+
You can access research objects via code (the API) or a SPARQL endpoint. There are two SPARQL endpoints: [Production](https://rohub2020-api-virtuoso-route-rohub2020.apps.paas.psnc.pl/sparql) and [Development](https://rohub2020-api-virtuoso-route-rohub.apps.paas-dev.psnc.pl/sparql).
110+
111+
You can query the properties of an object by its identifier:
112+
113+
```sparql
SELECT *
WHERE {
    <https://w3id.org/ro-id-dev/716b082f-57f0-45de-84b2-4440ae8bcf57> ?p ?o .
}
```
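The same query can also be sent from Python. The sketch below assumes the `sparqlwrapper` package from the environment file in the Setup section and uses its standard `SPARQLWrapper` client; the helper names `properties_query` and `run_select` are hypothetical, and the endpoint URL is passed in as a parameter rather than hard-coded.

```python
def properties_query(ro_iri: str) -> str:
    """Build a query that lists all properties of one research object."""
    return f"SELECT *\nWHERE {{\n    <{ro_iri}> ?p ?o .\n}}"


def run_select(endpoint: str, query: str) -> list:
    """Send a SELECT query and return the raw result bindings."""
    # Imported here so that the query builder above works without the package.
    from SPARQLWrapper import JSON, SPARQLWrapper

    client = SPARQLWrapper(endpoint)
    client.setQuery(query)
    client.setReturnFormat(JSON)
    results = client.query().convert()
    return results["results"]["bindings"]


query = properties_query(
    "https://w3id.org/ro-id-dev/716b082f-57f0-45de-84b2-4440ae8bcf57"
)
print(query)
```

Calling `run_select(endpoint, query)` with one of the SPARQL endpoint URLs listed above would then return one dictionary per result row, with the keys `p` and `o`.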
119+
120+
## Adding Annotations
121+
You can annotate a resource as well. For example, for a research object with identifier `716b082f-57f0-45de-84b2-4440ae8bcf57` you can add a list of properties and values:
122+
123+
```python
RO = rohub.ros_load("716b082f-57f0-45de-84b2-4440ae8bcf57")
annotation_json = [
    {
        "property": "http://hasStudySubject.com",
        "value": "http://inspire.ec.europa.eu/metadata-codelist/TopicCategory/environment"
    }
]
add_annotations_result = RO.add_annotations(body_specification_json=annotation_json)
```
133+
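After you add these annotations, they appear in the results of SPARQL queries such as the following (use the `ro-id-dev` prefix instead when working with the Development endpoint):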
134+
```sparql
SELECT *
WHERE {
    <https://w3id.org/ro-id/716b082f-57f0-45de-84b2-4440ae8bcf57> <http://hasStudySubject.com> ?o .
}
```
140+
141+
## Access to `ro-crate-metadata.json` Contents
142+
143+
After uploading or creating a research object, you can access all the data in `ro-crate-metadata.json` as triples. To get started, let's assume we have a research object with identifier `716b082f-57f0-45de-84b2-4440ae8bcf57`. The contents are in a named graph, but in order to find it, we have to query its Dataset:
144+
```sparql
SELECT *
WHERE {
    GRAPH ?g {
        <https://w3id.org/ro-id-dev/716b082f-57f0-45de-84b2-4440ae8bcf57> a <http://schema.org/Dataset> .
    }
}
```
152+
This query should return a single value, which is the named graph for `ro-crate-metadata.json`. In our example, the named graph should be:
153+
```
https://w3id.org/ro-id-dev/716b082f-57f0-45de-84b2-4440ae8bcf57/.ro/annotations/b169ef47-3493-4dda-baa8-618c28e35ed2.ttl
```
156+
Now that we have found the named graph, we can query it:
157+
```sparql
SELECT *
WHERE {
    GRAPH <https://w3id.org/ro-id-dev/716b082f-57f0-45de-84b2-4440ae8bcf57/.ro/annotations/b169ef47-3493-4dda-baa8-618c28e35ed2.ttl> {
        ?s ?p ?o .
    }
}
```
165+
This query returns the triples that are in the `ro-crate-metadata.json` file. For example, if the file contains a node such as:
166+
```json
{
    "@id": "#variable_young_modulus_8",
    "@type": "schema:PropertyValue",
    "rdfs:label": "young_modulus",
    "schema:unitCode": {
        "@id": "unit:PA"
    },
    "schema:value": 210000000000.0
},
```
177+
Then the query for the specific node `#variable_young_modulus_8`, followed by part of its results, would be:
178+
179+
```sparql
SELECT *
WHERE {
    GRAPH <https://w3id.org/ro-id-dev/716b082f-57f0-45de-84b2-4440ae8bcf57/.ro/annotations/b169ef47-3493-4dda-baa8-618c28e35ed2.ttl> {
        <http://w3id.org/ro-id/rohub/model#variable_young_modulus_8> ?p ?o .
    }
}
```
187+
188+
| ?p | ?o |
|----------|----------|
|"http://www.w3.org/1999/02/22-rdf-syntax-ns#type"|"http://schema.org/PropertyValue"|
|"http://www.w3.org/2000/01/rdf-schema#label"|"young_modulus"|
|"http://schema.org/value"|"210000000000.0"|
|"http://schema.org/unitCode"|"unit:PA"|
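When a query like this is sent programmatically, the endpoint typically answers in the W3C SPARQL JSON results format, where each row is a "binding" that maps a variable name to a typed value. The sketch below flattens such a response into plain dictionaries. The helper name `bindings_to_rows` is hypothetical, and `SAMPLE_RESPONSE` is a hand-written illustration shaped after two rows of the table above, not actual server output.

```python
def bindings_to_rows(response: dict) -> list:
    """Flatten a SPARQL JSON response into a list of {variable: value} dicts."""
    variables = response["head"]["vars"]
    rows = []
    for binding in response["results"]["bindings"]:
        # A variable can be unbound in a row; represent that as None.
        rows.append(
            {var: binding[var]["value"] if var in binding else None
             for var in variables}
        )
    return rows


# Hand-written sample in the SPARQL JSON results shape (illustrative only).
SAMPLE_RESPONSE = {
    "head": {"vars": ["p", "o"]},
    "results": {
        "bindings": [
            {
                "p": {"type": "uri", "value": "http://www.w3.org/2000/01/rdf-schema#label"},
                "o": {"type": "literal", "value": "young_modulus"},
            },
            {
                "p": {"type": "uri", "value": "http://schema.org/value"},
                "o": {"type": "literal", "value": "210000000000.0"},
            },
        ]
    },
}

for row in bindings_to_rows(SAMPLE_RESPONSE):
    print(row["p"], "->", row["o"])
```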
194+
195+
## Sample parameter extraction from Snakemake provenance research object
196+
197+
It is possible to upload the Snakemake research object artifact created by snakemake-metadat4ing-reporter-plugin to RoHub and query the workflow input and output parameters.
198+
199+
After the artifact is uploaded, the API returns an identifier. Suppose that in our case it is `a1485323-9904-438d-b188-794a71e58ea3`. We can run the following query on the [Development](https://rohub2020-api-virtuoso-route-rohub.apps.paas-dev.psnc.pl/sparql) SPARQL endpoint and see the results:
201+
202+
```sparql
PREFIX schema: <http://schema.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX m4i: <http://w3id.org/nfdi4ing/metadata4ing#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT DISTINCT ?value_element_size ?value_max_von_mises_stress_gauss_points ?tool_name
WHERE {
    # ---- Input UUID ----
    VALUES ?uuid { "a1485323-9904-438d-b188-794a71e58ea3" }

    # ---- Construct the full Dataset IRI from UUID ----
    BIND(IRI(CONCAT("https://w3id.org/ro-id-dev/", ?uuid)) AS ?dataset)

    # ---- Find which graph contains that dataset ----
    {
        SELECT ?g WHERE {
            GRAPH ?g { ?dataset a schema:Dataset . }
        }
    }

    # ---- Query the graph found above ----
    GRAPH ?g {
        ?processing_step a m4i:Method ;
            m4i:hasParameter ?element_size ;
            m4i:hasParameter ?element_order ;
            m4i:hasParameter ?element_degree ;
            m4i:investigates ?max_von_mises_stress_gauss_points ;
            m4i:implementedByTool ?tool .

        ?max_von_mises_stress_gauss_points a schema:PropertyValue ;
            rdfs:label "max_von_mises_stress_nodes" ;
            schema:value ?value_max_von_mises_stress_gauss_points .

        ?element_order a schema:PropertyValue ;
            rdfs:label "element_order" ;
            schema:value "1" .

        ?element_degree a schema:PropertyValue ;
            rdfs:label "element_degree" ;
            schema:value "1" .

        ?element_size a schema:PropertyValue ;
            rdfs:label "element_size" ;
            schema:value ?value_element_size .

        ?tool a schema:SoftwareApplication ;
            rdfs:label ?tool_name .

        FILTER (LCASE(str(?tool_name)) = "fenics-dolfinx" || LCASE(str(?tool_name)) = "kratosmultiphysics-all")
    }
}
ORDER BY ?tool_name xsd:decimal(?value_element_size)
```
257+
The output should look like:
258+
259+
| value_element_size | value_max_von_mises_stress_gauss_points | tool_name              |
|---------------------|-----------------------------------------|------------------------|
| 0.003125            | 299783353.3636479                       | fenics-dolfinx         |
| 0.00625             | 299475432.93192506                      | fenics-dolfinx         |
| 0.0125              | 300129622.72171265                      | fenics-dolfinx         |
| 0.025               | 299791507.5586339                       | fenics-dolfinx         |
| 0.05                | 296013209.51795876                      | fenics-dolfinx         |
| 0.1                 | 273190934.3950997                       | fenics-dolfinx         |
| 0.003125            | 298100064.0                             | kratosmultiphysics-all |
| 0.00625             | 296148032.0                             | kratosmultiphysics-all |
| 0.0125              | 291662080.0                             | kratosmultiphysics-all |
| 0.025               | 283087904.0                             | kratosmultiphysics-all |
| 0.05                | 263992000.0                             | kratosmultiphysics-all |
| 0.1                 | 226270384.0                             | kratosmultiphysics-all |
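Once the result rows are available in Python, they can be grouped per tool for comparison. The sketch below is one possible way to do this, assuming each row is a dictionary of strings keyed by the column names above. The helper `stress_by_tool` is hypothetical, and `SAMPLE_ROWS` repeats only three rows of the table for brevity.

```python
from collections import defaultdict
from decimal import Decimal


def stress_by_tool(rows: list) -> dict:
    """Group (element size, stress) pairs by tool, sorted by element size."""
    grouped = defaultdict(list)
    for row in rows:
        # Decimal mirrors the xsd:decimal cast used in the ORDER BY clause.
        size = Decimal(row["value_element_size"])
        stress = float(row["value_max_von_mises_stress_gauss_points"])
        grouped[row["tool_name"]].append((size, stress))
    return {tool: sorted(points) for tool, points in grouped.items()}


# Three rows copied from the table above, as string values.
SAMPLE_ROWS = [
    {"value_element_size": "0.1",
     "value_max_von_mises_stress_gauss_points": "273190934.3950997",
     "tool_name": "fenics-dolfinx"},
    {"value_element_size": "0.003125",
     "value_max_von_mises_stress_gauss_points": "299783353.3636479",
     "tool_name": "fenics-dolfinx"},
    {"value_element_size": "0.1",
     "value_max_von_mises_stress_gauss_points": "226270384.0",
     "tool_name": "kratosmultiphysics-all"},
]

for tool, points in stress_by_tool(SAMPLE_ROWS).items():
    for size, stress in points:
        print(tool, size, stress)
```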

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -42,4 +42,5 @@ nav:
4242
- Linear Elasticity: "benchmarks/linear elasticity"
4343
#- Plasticity: benchmarks/plasticity
4444
- "zz_bibliography.md"
45+
- rohub.md
4546
