|
| 1 | + |
| 2 | +# Using RoHub Python API |
| 3 | + |
| 4 | +## Setup |
| 5 | +To get started with RoHub from code, install the [RoHub Python API](https://gitlab.pcss.pl/daisd-public/rohub/rohub-api). One way is to create a conda environment `yml` file, which installs the package in editable mode into your current directory: |
| 6 | +```yml |
| 7 | +name: rohub_dev |
| 8 | +channels: |
| 9 | +- conda-forge |
| 10 | +dependencies: |
| 11 | +- python=3.12 |
| 12 | +- sparqlwrapper |
| 13 | +- pip |
| 14 | +- pip: |
| 15 | +    - "--editable=git+https://gitlab.pcss.pl/daisd-public/rohub/rohub-api.git#egg=rohub" |
| 16 | +``` |
| 17 | +You can also install from the `develop` branch, which may contain functions that are not yet in the latest release: |
| 18 | +
|
| 19 | +```bash |
| 20 | +--editable git+https://gitlab.pcss.pl/daisd-public/rohub/rohub-api.git@develop#egg=rohub |
| 21 | +``` |
| 22 | + |
| 23 | +After creating the environment, specify which RoHub endpoint you want to use. There are two endpoints: [Production](https://www.rohub.org/) and [Development](https://rohub2020-rohub.apps.paas-dev.psnc.pl/). To work with the Development endpoint, copy a file named `.env` into the directory of the local installation (shown by `pip show rohub`). Both steps can be done with a single command: |
| 24 | +```bash |
| 25 | +cp -v .env "$(pip show rohub | awk -F': ' '/Editable project location/ {print $2}')/.env" |
| 26 | +``` |
| 27 | +This is a sample `.env` file that connects to the Development endpoint: |
| 28 | + |
| 29 | +```bash |
| 30 | +[API_SECTION] |
| 31 | +API_URL = https://rohub2020-rohub.apps.paas-dev.psnc.pl/api/ |
| 32 | + |
| 33 | +[KEYCLOAK_SECTION] |
| 34 | +KEYCLOAK_CLIENT_ID = rohub2020-cli |
| 35 | +KEYCLOAK_CLIENT_SECRET = 714617a7-87bc-4a88-8682-5f9c2f60337d |
| 36 | +KEYCLOAK_URL = https://keycloak-dev.apps.paas-dev.psnc.pl/auth/realms/rohub/protocol/openid-connect/token |
| 37 | +``` |
| 38 | +The value of `API_URL` specifies the endpoint. For Development, it should be: |
| 39 | +```bash |
| 40 | +API_URL = https://rohub2020-rohub.apps.paas-dev.psnc.pl/api/ |
| 41 | +``` |
| 42 | +and for Production: |
| 43 | +```bash |
| 44 | +API_URL = https://api.rohub.org/api/ |
| 45 | +``` |
| 46 | +If you copy the `.env` file to the wrong location, the API automatically falls back to the Production endpoint. |
| 47 | + |
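Because a misplaced `.env` falls back to Production, it can help to check which endpoint a file points to before copying it. The following is a minimal sketch, assuming the `.env` file uses the INI-style layout shown above; the helper name `describe_endpoint` is hypothetical and not part of the RoHub API.

```python
import configparser

# Endpoint URLs taken from the examples above.
KNOWN_ENDPOINTS = {
    "https://rohub2020-rohub.apps.paas-dev.psnc.pl/api/": "Development",
    "https://api.rohub.org/api/": "Production",
}


def describe_endpoint(env_text: str) -> str:
    """Return the endpoint name that an INI-style .env text points to."""
    parser = configparser.ConfigParser()
    parser.read_string(env_text)
    # A missing section or key yields an empty string, reported as "Unknown".
    api_url = parser.get("API_SECTION", "API_URL", fallback="")
    return KNOWN_ENDPOINTS.get(api_url.strip(), "Unknown")


sample_env = """
[API_SECTION]
API_URL = https://rohub2020-rohub.apps.paas-dev.psnc.pl/api/
"""
print(describe_endpoint(sample_env))
```

The Login example below prints `rohub.settings.API_URL`, which shows the URL the package actually uses after the file has been copied.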
| 48 | +## Login |
| 49 | +Before you start coding, you need to create an account. [Production](https://www.rohub.org/) and [Development](https://rohub2020-rohub.apps.paas-dev.psnc.pl/) require separate accounts. After creating an account, use your username and password to log in to RoHub and start working with research objects: |
| 50 | + |
| 51 | +```python |
| 52 | +import rohub |
| 53 | + |
| 54 | +print("ROHub API URL:", rohub.settings.API_URL) |
| 55 | + |
| 56 | +user_name = "your username" |
| 57 | +user_pwd = "your password" |
| 58 | + |
| 59 | +rohub.login(username=user_name, password=user_pwd) |
| 60 | +``` |
| 61 | + |
| 62 | +Now you can list your uploaded research objects, and print their properties: |
| 63 | + |
| 64 | +```python |
| 65 | +my_ros = rohub.list_my_ros() |
| 66 | + |
| 67 | +for _, row in my_ros.iterrows(): |
| 68 | +    ro_id = row["identifier"] |
| 69 | +    ro = rohub.ros_load(ro_id) |
| 70 | +    print("RO type:", ro.ros_type) |
| 71 | +    if hasattr(ro, "title") and ro.title: |
| 72 | +        print("RO title:", ro.title) |
| 73 | +    if hasattr(ro, "authors") and ro.authors: |
| 74 | +        print("RO authors:", ro.authors) |
| 75 | +    if hasattr(ro, "description") and ro.description: |
| 76 | +        print("RO description:", ro.description) |
| 77 | +    if hasattr(ro, "research_areas") and ro.research_areas: |
| 78 | +        print("RO research areas:", ro.research_areas) |
| 79 | +    if hasattr(ro, "creation_date") and ro.creation_date: |
| 80 | +        print("RO creation date:", ro.creation_date) |
| 81 | +    if hasattr(ro, "last_modified_date") and ro.last_modified_date: |
| 82 | +        print("RO last modified date:", ro.last_modified_date) |
| 83 | +    if hasattr(ro, "doi") and ro.doi: |
| 84 | +        print("RO DOI:", ro.doi) |
| 85 | +    if hasattr(ro, "url") and ro.url: |
| 86 | +        print("RO URL:", ro.url) |
| 87 | +    if hasattr(ro, "metadata") and ro.metadata: |
| 88 | +        print("RO metadata:", ro.metadata) |
| 89 | +``` |
| 90 | + |
| 91 | +## Uploading Research Objects |
| 92 | +You can upload a research object from a zip archive as follows: |
| 93 | + |
| 94 | +```python |
| 95 | +import rohub |
| 96 | + |
| 97 | +research_areas = ["Environmental research"] |
| 98 | +title = "NFDI4ING Model Validation with NextFlow" |
| 99 | +description = f"Description for {title}" |
| 100 | +zip_path = "./ro-crate-metadata-21b6d1a7d9acc988.zip" |
| 101 | +RO = rohub.ros_upload(path_to_zip=zip_path) |
| 102 | +print(f"Identifier: {RO.identifier}") |
| 103 | +``` |
| 104 | + |
| 105 | +After a research object is successfully uploaded or created, the API returns its `identifier`, which you can use to view the object on the portal. For example, if the identifier is `716b082f-57f0-45de-84b2-4440ae8bcf57`, you can view it at https://w3id.org/ro-id-dev/716b082f-57f0-45de-84b2-4440ae8bcf57 (Development) or https://w3id.org/ro-id/716b082f-57f0-45de-84b2-4440ae8bcf57 (Production), depending on which endpoint you are working with. |
| 106 | + |
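The mapping from identifier to portal link can be sketched as a small helper. This is only an illustration based on the two example links above; the function name `ro_url` and the endpoint keys are hypothetical and not part of the RoHub API.

```python
# Base IRIs taken from the example links above.
PORTAL_BASES = {
    "development": "https://w3id.org/ro-id-dev/",
    "production": "https://w3id.org/ro-id/",
}


def ro_url(identifier: str, endpoint: str = "development") -> str:
    """Build the w3id link for a research object identifier."""
    return PORTAL_BASES[endpoint] + identifier


print(ro_url("716b082f-57f0-45de-84b2-4440ae8bcf57"))
```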
| 107 | +## Accessing Research Objects |
| 108 | + |
| 109 | +You can access research objects via code (API) or via a SPARQL endpoint. There are two SPARQL endpoints, [Production](https://rohub2020-api-virtuoso-route-rohub2020.apps.paas.psnc.pl/sparql) and [Development](https://rohub2020-api-virtuoso-route-rohub.apps.paas-dev.psnc.pl/sparql). |
| 110 | + |
| 111 | +You can query the properties of a research object by its identifier: |
| 112 | + |
| 113 | +```sparql |
| 114 | +SELECT * |
| 115 | +WHERE { |
| 116 | + <https://w3id.org/ro-id-dev/716b082f-57f0-45de-84b2-4440ae8bcf57> ?p ?o . |
| 117 | +} |
| 118 | +``` |
| 119 | + |
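The same query can also be sent from Python. The sketch below assumes the `sparqlwrapper` package from the environment file in Setup is installed and that the endpoint returns JSON results; the helper names `build_properties_query` and `run_query` are hypothetical.

```python
def build_properties_query(ro_iri: str) -> str:
    """Return a SPARQL query listing all properties of one research object."""
    return f"SELECT *\nWHERE {{\n  <{ro_iri}> ?p ?o .\n}}"


def run_query(endpoint_url: str, query: str) -> list:
    """Send the query and return the result bindings as a list of dicts."""
    # Imported here so the query builder works without the dependency.
    from SPARQLWrapper import SPARQLWrapper, JSON

    client = SPARQLWrapper(endpoint_url)
    client.setQuery(query)
    client.setReturnFormat(JSON)
    return client.query().convert()["results"]["bindings"]


query = build_properties_query(
    "https://w3id.org/ro-id-dev/716b082f-57f0-45de-84b2-4440ae8bcf57"
)
print(query)
# Uncomment to query the Development endpoint (requires network access):
# rows = run_query(
#     "https://rohub2020-api-virtuoso-route-rohub.apps.paas-dev.psnc.pl/sparql",
#     query,
# )
```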
| 120 | +## Adding Annotations |
| 121 | +You can annotate a resource as well. For example, for a research object with identifier `716b082f-57f0-45de-84b2-4440ae8bcf57` you can add a list of properties and values: |
| 122 | + |
| 123 | +```python |
| 124 | +RO = rohub.ros_load("716b082f-57f0-45de-84b2-4440ae8bcf57") |
| 125 | +annotation_json = [ |
| 126 | + { |
| 127 | + "property": "http://hasStudySubject.com", |
| 128 | + "value": "http://inspire.ec.europa.eu/metadata-codelist/TopicCategory/environment" |
| 129 | + } |
| 130 | +] |
| 131 | +add_annotations_result = RO.add_annotations(body_specification_json=annotation_json) |
| 132 | +``` |
| 133 | +After you add these annotations, they appear in SPARQL query results: |
| 134 | +```sparql |
| 135 | +SELECT * |
| 136 | +WHERE { |
| 137 | + <https://w3id.org/ro-id/716b082f-57f0-45de-84b2-4440ae8bcf57> <http://hasStudySubject.com> ?o . |
| 138 | +} |
| 139 | +``` |
| 140 | + |
| 141 | +## Access to `ro-crate-metadata.json` Contents |
| 142 | + |
| 143 | +After uploading or creating a research object, you can access all the data in `ro-crate-metadata.json` as triples. To get started, let's assume we have a research object with identifier `716b082f-57f0-45de-84b2-4440ae8bcf57`. The contents are in a named graph, but in order to find it, we have to query its Dataset: |
| 144 | +```sparql |
| 145 | +SELECT * |
| 146 | +WHERE { |
| 147 | + GRAPH ?g { |
| 148 | + <https://w3id.org/ro-id-dev/716b082f-57f0-45de-84b2-4440ae8bcf57> a <http://schema.org/Dataset> . |
| 149 | + } |
| 150 | +} |
| 151 | +``` |
| 152 | +This query should return a single value, which is the named graph for `ro-crate-metadata.json`. In our example, the named graph should be: |
| 153 | +``` |
| 154 | +https://w3id.org/ro-id-dev/716b082f-57f0-45de-84b2-4440ae8bcf57/.ro/annotations/b169ef47-3493-4dda-baa8-618c28e35ed2.ttl |
| 155 | +``` |
| 156 | +Now that we have found the named graph, we can query it: |
| 157 | +```sparql |
| 158 | +SELECT * |
| 159 | +WHERE { |
| 160 | + GRAPH <https://w3id.org/ro-id-dev/716b082f-57f0-45de-84b2-4440ae8bcf57/.ro/annotations/b169ef47-3493-4dda-baa8-618c28e35ed2.ttl> { |
| 161 | + ?s ?p ?o . |
| 162 | + } |
| 163 | +} |
| 164 | +``` |
| 165 | +This query returns the triples that are in the `ro-crate-metadata.json` file. For example, if the file contains a node such as: |
| 166 | +```json |
| 167 | +{ |
| 168 | + "@id": "#variable_young_modulus_8", |
| 169 | + "@type": "schema:PropertyValue", |
| 170 | + "rdfs:label": "young_modulus", |
| 171 | + "schema:unitCode": { |
| 172 | + "@id": "unit:PA" |
| 173 | + }, |
| 174 | + "schema:value": 210000000000.0 |
| 175 | +}, |
| 176 | +``` |
| 177 | +Then the query for the specific node `#variable_young_modulus_8`, together with part of its results, would be: |
| 178 | + |
| 179 | +```sparql |
| 180 | +SELECT * |
| 181 | +WHERE { |
| 182 | + GRAPH <https://w3id.org/ro-id-dev/716b082f-57f0-45de-84b2-4440ae8bcf57/.ro/annotations/b169ef47-3493-4dda-baa8-618c28e35ed2.ttl> { |
| 183 | + <http://w3id.org/ro-id/rohub/model#variable_young_modulus_8> ?p ?o . |
| 184 | + } |
| 185 | +} |
| 186 | +``` |
| 187 | + |
| 188 | +| ?p | ?o | |
| 189 | +|----------|----------| |
| 190 | +|"http://www.w3.org/1999/02/22-rdf-syntax-ns#type"|"http://schema.org/PropertyValue"| |
| 191 | +|"http://www.w3.org/2000/01/rdf-schema#label"|"young_modulus"| |
| 192 | +|"http://schema.org/value"|"210000000000.0"| |
| 193 | +|"http://schema.org/unitCode"|"unit:PA"| |
| 194 | + |
| 195 | +## Sample parameter extraction from Snakemake provenance research object |
| 196 | + |
| 197 | +It is possible to upload the Snakemake research object artifact created by `snakemake-metadat4ing-reporter-plugin` to RoHub and then query the workflow input and output parameters. |
| 198 | + |
| 199 | +After the artifact is uploaded, the API returns an identifier; suppose that in our case it is |
| 200 | +`a1485323-9904-438d-b188-794a71e58ea3`. We can run the following query on the [Development](https://rohub2020-api-virtuoso-route-rohub.apps.paas-dev.psnc.pl/sparql) endpoint and see the results: |
| 201 | + |
| 202 | +```sparql |
| 203 | +PREFIX schema: <http://schema.org/> |
| 204 | +PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> |
| 205 | +PREFIX m4i: <http://w3id.org/nfdi4ing/metadata4ing#> |
| 206 | +PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> |
| 207 | +
|
| 208 | +SELECT DISTINCT ?value_element_size ?value_max_von_mises_stress_gauss_points ?tool_name |
| 209 | +WHERE { |
| 210 | + # ---- Input UUID ---- |
| 211 | + VALUES ?uuid { "a1485323-9904-438d-b188-794a71e58ea3" } |
| 212 | +
|
| 213 | + # ---- Construct the full Dataset IRI from UUID ---- |
| 214 | + BIND(IRI(CONCAT("https://w3id.org/ro-id-dev/", ?uuid)) AS ?dataset) |
| 215 | +
|
| 216 | + # ---- Find which graph contains that dataset ---- |
| 217 | + { |
| 218 | + SELECT ?g WHERE { |
| 219 | + GRAPH ?g { ?dataset a schema:Dataset . } |
| 220 | + } |
| 221 | + } |
| 222 | +
|
| 223 | + # ---- Query the graph found above ---- |
| 224 | + GRAPH ?g { |
| 225 | + ?processing_step a m4i:Method ; |
| 226 | + m4i:hasParameter ?element_size ; |
| 227 | + m4i:hasParameter ?element_order ; |
| 228 | + m4i:hasParameter ?element_degree ; |
| 229 | + m4i:investigates ?max_von_mises_stress_gauss_points ; |
| 230 | + m4i:implementedByTool ?tool . |
| 231 | +
|
| 232 | + ?max_von_mises_stress_gauss_points a schema:PropertyValue ; |
| 233 | + rdfs:label "max_von_mises_stress_nodes" ; |
| 234 | + schema:value ?value_max_von_mises_stress_gauss_points . |
| 235 | +
|
| 236 | + ?element_order a schema:PropertyValue ; |
| 237 | + rdfs:label "element_order" ; |
| 238 | + schema:value "1" . |
| 239 | +
|
| 240 | + ?element_degree a schema:PropertyValue ; |
| 241 | + rdfs:label "element_degree" ; |
| 242 | + schema:value "1" . |
| 243 | +
|
| 244 | + ?element_size a schema:PropertyValue ; |
| 245 | + rdfs:label "element_size" ; |
| 246 | + schema:value ?value_element_size . |
| 247 | +
|
| 248 | + ?tool a schema:SoftwareApplication ; |
| 249 | + rdfs:label ?tool_name . |
| 250 | +
|
| 251 | + FILTER (LCASE(str(?tool_name)) = "fenics-dolfinx" || LCASE(str(?tool_name)) = "kratosmultiphysics-all") |
| 252 | + } |
| 253 | +} |
| 254 | +ORDER BY ?tool_name xsd:decimal(?value_element_size) |
| 255 | +
|
| 256 | +``` |
| 257 | +The output should look like: |
| 258 | + |
| 259 | +| value_element_size | value_max_von_mises_stress_gauss_points | tool_name | |
| 260 | +|---------------------|-----------------------------------------|------------------------| |
| 261 | +| 0.003125 | 299783353.3636479 | fenics-dolfinx | |
| 262 | +| 0.00625 | 299475432.93192506 | fenics-dolfinx | |
| 263 | +| 0.0125 | 300129622.72171265 | fenics-dolfinx | |
| 264 | +| 0.025 | 299791507.5586339 | fenics-dolfinx | |
| 265 | +| 0.05 | 296013209.51795876 | fenics-dolfinx | |
| 266 | +| 0.1 | 273190934.3950997 | fenics-dolfinx | |
| 267 | +| 0.003125 | 298100064.0 | kratosmultiphysics-all | |
| 268 | +| 0.00625 | 296148032.0 | kratosmultiphysics-all | |
| 269 | +| 0.0125 | 291662080.0 | kratosmultiphysics-all | |
| 270 | +| 0.025 | 283087904.0 | kratosmultiphysics-all | |
| 271 | +| 0.05 | 263992000.0 | kratosmultiphysics-all | |
| 272 | +| 0.1 | 226270384.0 | kratosmultiphysics-all | |