Commit 14502d0

[irods/irods#7773] Policy Cookbook: Access info for opened replicas
1 parent 7a1d4da commit 14502d0

1 file changed

+150
-0
lines changed

docs/administrators/policy_cookbook.md

Lines changed: 150 additions & 0 deletions
@@ -798,3 +798,153 @@ The implementation can be modified to apply only to specific resources, collecti
### A note about servers before 4.2.9

The 4.2.9 release of iRODS introduced dramatic changes to the way internal data transfers and data object finalization occur, so servers older than 4.2.9 could behave differently from what is documented here. **Step 2b** should be available in servers with version 4.2.8 and older, and should behave similarly to what is documented here. An older database operation called `mod_data_obj_meta`, which is similar to `data_object_finalize` and is no longer used, can also be leveraged to implement similar policy (although it uses a different, non-JSON interface).

## Retrieve information about opened replicas from `R_DATA_MAIN`

When implementing policy or a rule which deals with opened replicas, you may need more information about the opened replica from the `R_DATA_MAIN` table in the catalog, such as size, host resource, or mtime.

This can be achieved by querying the iRODS Catalog using [Language-Integrated GenQuery](../../plugins/irods_rule_language/#language-integrated-general-query) or [GenQuery2 microservices](../../doxygen/msi__genquery2_8hpp.html). However, all the information in the catalog pertaining to an opened replica with a valid L1 file descriptor is already available in memory in the connected iRODS agent; all you have to do is ask.

Note: This recipe only pertains to system metadata for replica information found in `R_DATA_MAIN`. It does not give access to metadata AVUs annotated to the data object of the opened replica.

### How to do it ...

The key to accessing the in-memory information about the opened replica is the [`msi_get_file_descriptor_info` microservice](../../doxygen/microservices_2src_2get__file__descriptor__info_8cpp.html). This microservice returns a string representing a JSON object containing a `DataObjInfo` with information about the opened replica, and the `DataObjInp` structure which was used to open the data object in the first place.

Here is an example of what this structure looks like:
```javascript
{
    "bytes_written": -1,
    "checksum": "",
    "checksum_flag": 0,
    "copies_needed": 0,
    "create_mode": 384,
    "data_object_info": {
        "backup_resource_name": "",
        "checksum": "",
        "collection_id": 10010,
        "condition_input": [
            {
                "key": "resc_hier",
                "value": "demoResc"
            },
            {
                "key": "selected_hierarchy",
                "value": "demoResc"
            }
        ],
        "data_access": "",
        "data_access_index": 0,
        "data_comments": "",
        "data_create": "01740005951",
        "data_expiry": "00000000000",
        "data_id": 10015,
        "data_map_id": 0,
        "data_mode": "384",
        "data_modify": "01740006513",
        "data_owner_name": "rods",
        "data_owner_zone": "tempZone",
        "data_size": 12,
        "data_type": "generic",
        "destination_resource_name": "",
        "file_path": "/var/lib/irods/Vault/home/rods/foo",
        "flags": 0,
        "in_pdmo": "",
        "is_replica_current": true,
        "next": null,
        "object_path": "/tempZone/home/rods/foo",
        "other_flags": 0,
        "registering_user_id": 0,
        "replica_number": 0,
        "replica_status": 1,
        "resource_hierarchy": "demoResc",
        "resource_id": 10013,
        "resource_name": "demoResc",
        "special_collection": null,
        "status_string": "",
        "sub_path": "",
        "version": "",
        "write_flag": 0
    },
    "data_object_input_replica_flag": 1,
    "data_object_input": {
        "condition_input": [
            {
                "key": "resc_hier",
                "value": "demoResc"
            },
            {
                "key": "selected_hierarchy",
                "value": "demoResc"
            }
        ],
        "data_size": -1,
        "in_pdmo": "",
        "in_use": true,
        "l3descInx": 3,
        "lock_file_descriptor": 0,
        "number_of_threads": 0,
        "object_path": "/tempZone/home/rods/foo",
        "offset": 0,
        "open_flags": 0,
        "open_type": 2,
        "operation_status": 0,
        "operation_type": 0,
        "other_data_object_info": null,
        "plugin_data": null,
        "purge_cache_flag": 0,
        "remote_l1_descriptor_index": 0,
        "remote_zone_host": null,
        "replica_status": 1,
        "replica_token": "",
        "replication_data_object_info": null,
        "source_l1_descriptor_index": 0,
        "special_collection": null
    },
    "stage_flag": 0
}
```
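
Some values in this structure are encoded rather than directly readable. The following is a minimal sketch in plain Python, run outside of iRODS and not part of the recipe itself. It assumes that `data_create` and `data_modify` are zero-padded seconds since the Unix epoch and that `create_mode` is a POSIX permission mode expressed in decimal. The helper names `decode_irods_timestamp` and `decode_mode` are hypothetical.

```python
from datetime import datetime, timezone


def decode_irods_timestamp(value: str) -> datetime:
    """Convert a zero-padded epoch-seconds string to a timezone-aware UTC datetime.

    Assumption: the string holds seconds since the Unix epoch.
    """
    # int() parses base 10, so the leading zero padding is simply ignored.
    return datetime.fromtimestamp(int(value), tz=timezone.utc)


def decode_mode(mode: int) -> str:
    """Render a decimal permission mode as a 4-digit octal string (chmod style)."""
    return format(mode, "04o")


if __name__ == "__main__":
    # Values copied from the example structure above.
    print(decode_irods_timestamp("01740005951").isoformat())  # data_create
    print(decode_irods_timestamp("01740006513").isoformat())  # data_modify
    print(decode_mode(384))                                   # create_mode
```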
In order to access this information via the microservice, we need a handle to a JSON structure and a JSON pointer indicating which key's value we want. For example, if we wanted to know the creation time of the replica, we would use `"/data_object_info/data_create"`.
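
To make the pointer syntax concrete, here is a small standalone sketch in plain Python of how such a pointer walks a parsed JSON document. It assumes the pointer follows RFC 6901 syntax; it only illustrates the lookup and is not how the iRODS JSON microservices are implemented. The function name `resolve_json_pointer` and the trimmed-down `SAMPLE` document are hypothetical.

```python
import json


def resolve_json_pointer(document, pointer: str):
    """Follow an RFC 6901 JSON pointer through parsed JSON and return the value."""
    if pointer == "":
        return document  # The empty pointer refers to the whole document.
    if not pointer.startswith("/"):
        raise ValueError(f"JSON pointer must start with '/': {pointer!r}")
    current = document
    for token in pointer[1:].split("/"):
        # Undo the two escape sequences defined by RFC 6901 ("~1" first, then "~0").
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]  # Array elements are addressed by index.
        else:
            current = current[token]       # Object members are addressed by key.
    return current


# A trimmed-down copy of the structure shown above.
SAMPLE = json.loads("""
{
    "create_mode": 384,
    "data_object_info": {
        "condition_input": [
            {"key": "resc_hier", "value": "demoResc"},
            {"key": "selected_hierarchy", "value": "demoResc"}
        ],
        "data_create": "01740005951",
        "object_path": "/tempZone/home/rods/foo"
    }
}
""")

if __name__ == "__main__":
    print(resolve_json_pointer(SAMPLE, "/data_object_info/data_create"))
    print(resolve_json_pointer(SAMPLE, "/data_object_info/condition_input/0/value"))
```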

In this example, we will implement a dynamic PEP for the DataObjRead API, which uses an `OpenedDataObjInp` structure as part of its signature. In this PEP, we do not know the logical path of the data object which is being read, and the `OpenedDataObjInp` structure does not have this information. However, it does give us access to an L1 file descriptor. Using this, we can access information about the opened replica which already resides in the iRODS agent's memory. Here is the implementation:
```python
pep_api_data_obj_read_pre(*instance_name, *comm, *opened_data_obj_inp, *data_obj_read_out_bbuf)
{
    # Extract the L1 descriptor from the OpenedDataObjInp.
    *fd = *opened_data_obj_inp.l1descInx;

    # Get the file descriptor information using the L1 descriptor as a string representing a JSON object.
    msi_get_file_descriptor_info(int(*fd), *json_output_str);

    # Specify a JSON pointer to the desired information.
    *object_path_json_ptr = "/data_object_info/object_path";

    # Parse the JSON string to get a handle to a proper JSON object.
    msiStrlen(*json_output_str, *json_output_strlen);
    msi_json_parse(*json_output_str, int(*json_output_strlen), *json_handle);

    # Use the JSON pointer in the JSON handle to obtain the value therein.
    msi_json_value(*json_handle, *object_path_json_ptr, *object_path);

    # Don't forget to free the JSON handle before exiting. Failing to do so may result in memory leaks.
    msi_json_free(*json_handle);

    # ... Use the value which was retrieved above ...
    writeLine("serverLog", "[*object_path] is about to be read!");
}
```

### Let's see it in action!

The above dynamic PEP is not terribly interesting as it just prints a message to the server log. Here is one way to trigger this PEP:
```bash
$ istream read /tempZone/home/rods/foo
hello everyone
```

The server log should contain a message like this:
```bash
$ tail -n1 /var/log/irods/irods.log | jq '.log_message'
"writeLine: inString = [/tempZone/home/rods/foo] is about to be read!\n"
```
