11"""
2- ExternalResources
3- =================
2+ HERD: HDMF External Resources Data Structure
3+ ==============================================
44
55This is a user guide to interacting with the
6- :py:class:`~hdmf.common.resources.ExternalResources ` class. The ExternalResources type
6+ :py:class:`~hdmf.common.resources.HERD ` class. The HERD type
77is experimental and is subject to change in future releases. If you use this type,
88please provide feedback to the HDMF team so that we can improve the structure and
99access of data stored with this type for your use cases.
1010
1111Introduction
1212-------------
13- The :py:class:`~hdmf.common.resources.ExternalResources ` class provides a way
13+ The :py:class:`~hdmf.common.resources.HERD ` class provides a way
1414to organize and map user terms from their data (keys) to multiple entities
1515from the external resources. A typical use case for external resources is to link data
1616stored in datasets or attributes to ontologies. For example, you may have a
1717dataset ``country`` storing locations. Using
18- :py:class:`~hdmf.common.resources.ExternalResources ` allows us to link the
18+ :py:class:`~hdmf.common.resources.HERD ` allows us to link the
1919country names stored in the dataset to an ontology of all countries, enabling
2020more rigid standardization of the data and facilitating data query and
2121introspection.
2222
2323From a user's perspective, one can think of the
24- :py:class:`~hdmf.common.resources.ExternalResources ` as a simple table, in which each
24+ :py:class:`~hdmf.common.resources.HERD ` as a simple table, in which each
2525row associates a particular ``key`` stored in a particular ``object`` (i.e., Attribute
2626or Dataset in a file) with a particular ``entity`` (i.e., a term of an online
2727resource). That is, ``(object, key)`` refer to parts inside a
2828file and ``entity`` refers to an external resource outside the file, and
29- :py:class:`~hdmf.common.resources.ExternalResources ` allows us to link the two. To
29+ :py:class:`~hdmf.common.resources.HERD ` allows us to link the two. To
3030reduce data redundancy and improve data integrity,
31- :py:class:`~hdmf.common.resources.ExternalResources ` stores this data internally in a
31+ :py:class:`~hdmf.common.resources.HERD ` stores this data internally in a
3232collection of interlinked tables.
3333
3434* :py:class:`~hdmf.common.resources.KeyTable` where each row describes a
4545 :py:class:`~hdmf.common.resources.ObjectKey` pair identifying which keys
4646 are used by which objects.
4747
48- The :py:class:`~hdmf.common.resources.ExternalResources ` class then provides
48+ The :py:class:`~hdmf.common.resources.HERD ` class then provides
4949convenience functions to simplify interaction with these tables, allowing users
50- to treat :py:class:`~hdmf.common.resources.ExternalResources ` as a single large table as
50+ to treat :py:class:`~hdmf.common.resources.HERD ` as a single large table as
5151much as possible.
5252
53- Rules to ExternalResources
53+ Rules to HERD
5454---------------------------
55- When using the :py:class:`~hdmf.common.resources.ExternalResources ` class, there
55+ When using the :py:class:`~hdmf.common.resources.HERD ` class, there
5656are rules to how users store information in the interlinked tables.
5757
58581. Multiple :py:class:`~hdmf.common.resources.Key` objects can have the same name.
5959 They are disambiguated by the :py:class:`~hdmf.common.resources.Object` associated
6060 with each, meaning we may have keys with the same name in different objects, but for a particular object
6161 all keys must be unique.
62- 2. In order to query specific records, the :py:class:`~hdmf.common.resources.ExternalResources ` class
62+ 2. In order to query specific records, the :py:class:`~hdmf.common.resources.HERD ` class
6363 uses '(file, object_id, relative_path, field, key)' as the unique identifier.
64643. :py:class:`~hdmf.common.resources.Object` can have multiple :py:class:`~hdmf.common.resources.Key`
6565 objects.
7474 Use the format provided by the resource. For example, Identifiers.org uses the ID ``ncbigene:22353``
7575 but the NCBI Gene uses the ID ``22353`` for the same term.
76768. In a majority of cases, :py:class:`~hdmf.common.resources.Object` objects will have an empty string
77- for 'field'. The :py:class:`~hdmf.common.resources.ExternalResources ` class supports compound data_types.
77+ for 'field'. The :py:class:`~hdmf.common.resources.HERD ` class supports compound data_types.
7878 In that case, 'field' would be the field of the compound data_type that has an external reference.
79799. In some cases, the attribute that needs an external reference is not an object with a 'data_type'.
8080 The user must then use the nearest object that has a data type to be used as the parent object. When
8585 has :py:class:`~hdmf.common.resources.File` along the parent hierarchy.
8686"""
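To make the "collection of interlinked tables" and the uniqueness rules above more concrete, here is a minimal sketch in plain Python. It does not use HDMF and is not how HERD is implemented. The class ``TinyResources``, its attributes, and the ``example.org`` URIs are hypothetical and exist only for illustration. The sketch assumes that rows in separate tables reference each other by integer index, that the same key name may appear under different objects, and that ``(file, object_id, relative_path, field, key)`` identifies a record.

```python
# Illustrative stand-in for interlinked tables; NOT the HDMF implementation.
# All names here (TinyResources, to_rows, ...) are hypothetical.
class TinyResources:
    """Toy normalized store: rows link to each other by integer index."""

    def __init__(self):
        self.objects = []      # (file, object_id, relative_path, field)
        self.keys = []         # key names; duplicates allowed across objects
        self.entities = []     # (entity_id, entity_uri)
        self.object_keys = []  # (object_idx, key_idx) links
        self.entity_keys = []  # (entity_idx, key_idx) links

    def add_ref(self, file, object_id, key, entity_id, entity_uri,
                relative_path="", field=""):
        obj = (file, object_id, relative_path, field)
        if obj not in self.objects:
            self.objects.append(obj)
        obj_idx = self.objects.index(obj)

        # Reuse the key if this object already has one with the same name, so
        # (file, object_id, relative_path, field, key) stays unique.
        key_idx = None
        for o, k in self.object_keys:
            if o == obj_idx and self.keys[k] == key:
                key_idx = k
                break
        if key_idx is None:
            self.keys.append(key)
            key_idx = len(self.keys) - 1
            self.object_keys.append((obj_idx, key_idx))

        # Entities are stored once and linked to every key that uses them.
        ent = (entity_id, entity_uri)
        if ent not in self.entities:
            self.entities.append(ent)
        ent_idx = self.entities.index(ent)
        if (ent_idx, key_idx) not in self.entity_keys:
            self.entity_keys.append((ent_idx, key_idx))

    def to_rows(self):
        """Flatten the linked tables into one list of dict rows."""
        rows = []
        for obj_idx, key_idx in self.object_keys:
            file, object_id, relative_path, field = self.objects[obj_idx]
            for ent_idx, k in self.entity_keys:
                if k == key_idx:
                    entity_id, entity_uri = self.entities[ent_idx]
                    rows.append({
                        "file": file, "object_id": object_id,
                        "relative_path": relative_path, "field": field,
                        "key": self.keys[key_idx],
                        "entity_id": entity_id, "entity_uri": entity_uri,
                    })
        return rows


store = TinyResources()
# Placeholder URIs; the IDs are the NCBI Taxonomy IDs for human and mouse.
human = ("NCBI_TAXON:9606", "https://example.org/taxon/9606")
mouse = ("NCBI_TAXON:10090", "https://example.org/taxon/10090")
store.add_ref("file-1", "obj-a", "Homo sapiens", *human)
store.add_ref("file-1", "obj-b", "Homo sapiens", *human)  # same key name, other object
store.add_ref("file-1", "obj-a", "Mus musculus", *mouse)
rows = store.to_rows()
```

In this sketch the key name ``Homo sapiens`` is stored twice (once per object) while its entity is stored only once, which is the kind of redundancy reduction the interlinked layout is meant to provide. The flattened ``rows`` correspond to the "single large table" view.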
8787######################################################
88- # Creating an instance of the ExternalResources class
88+ # Creating an instance of the HERD class
8989# ----------------------------------------------------
9090
9191# sphinx_gallery_thumbnail_path = 'figures/gallery_thumbnail_externalresources.png'
92- from hdmf .common import ExternalResources
92+ from hdmf .common import HERD
9393from hdmf .common import DynamicTable , VectorData
94- from hdmf import Container , ExternalResourcesManager
94+ from hdmf import Container , HERDManager
9595from hdmf import Data
9696import numpy as np
9797import os
9898# Ignore experimental feature warnings in the tutorial to improve rendering
9999import warnings
100- warnings .filterwarnings ("ignore" , category = UserWarning , message = "ExternalResources is experimental*" )
100+ warnings .filterwarnings ("ignore" , category = UserWarning , message = "HERD is experimental*" )
101101
102102
103103# Class to represent a file
104- class ExternalResourcesManagerContainer (Container , ExternalResourcesManager ):
104+ class HERDManagerContainer (Container , HERDManager ):
105105 def __init__ (self , ** kwargs ):
106- kwargs ['name' ] = 'ExternalResourcesManagerContainer '
106+ kwargs ['name' ] = 'HERDManagerContainer '
107107 super ().__init__ (** kwargs )
108108
109109
110- er = ExternalResources ()
111- file = ExternalResourcesManagerContainer (name = 'file' )
110+ er = HERD ()
111+ file = HERDManagerContainer (name = 'file' )
112112
113113
114114###############################################################################
115115# Using the add_ref method
116116# ------------------------------------------------------
117- # :py:func:`~hdmf.common.resources.ExternalResources .add_ref`
117+ # :py:func:`~hdmf.common.resources.HERD .add_ref`
118118# is a wrapper function provided by the
119- # :py:class:`~hdmf.common.resources.ExternalResources ` class that simplifies adding
120- # data. Using :py:func:`~hdmf.common.resources.ExternalResources .add_ref` allows us to
119+ # :py:class:`~hdmf.common.resources.HERD ` class that simplifies adding
120+ # data. Using :py:func:`~hdmf.common.resources.HERD .add_ref` allows us to
121121# treat new entries similar to adding a new row to a flat table, with
122- # :py:func:`~hdmf.common.resources.ExternalResources .add_ref` taking care of populating
122+ # :py:func:`~hdmf.common.resources.HERD .add_ref` taking care of populating
123123# the underlying data structures accordingly.
124124
125125data = Data (name = "species" , data = ['Homo sapiens' , 'Mus musculus' ])
@@ -165,7 +165,7 @@ def __init__(self, **kwargs):
165165 entity_uri = 'http://www.informatics.jax.org/marker/MGI:1343464'
166166)
167167
168- # Note: :py:func:`~hdmf.common.resources.ExternalResources .add_ref` internally resolves the object
168+ # Note: :py:func:`~hdmf.common.resources.HERD .add_ref` internally resolves the object
169169# to the closest parent, so that ``er.add_ref(container=genotypes, attribute='genotype_name')`` and
170170# ``er.add_ref(container=genotypes.genotype_name, attribute=None)`` will ultimately both use the ``object_id``
171171# of the ``genotypes.genotype_name`` :py:class:`~hdmf.common.table.VectorData` column and
@@ -197,12 +197,12 @@ def __init__(self, **kwargs):
197197)
198198
199199###############################################################################
200- # Visualize ExternalResources
200+ # Visualize HERD
201201# ------------------------------------------------------
202- # Users can visualize `~hdmf.common.resources.ExternalResources ` as a flattened table or
202+ # Users can visualize `~hdmf.common.resources.HERD ` as a flattened table or
203203# as separate tables.
204204
205- # `~hdmf.common.resources.ExternalResources ` as a flattened table
205+ # `~hdmf.common.resources.HERD ` as a flattened table
206206er .to_dataframe ()
207207
208208# The individual interlinked tables:
@@ -216,13 +216,13 @@ def __init__(self, **kwargs):
216216###############################################################################
217217# Using the get_key method
218218# ------------------------------------------------------
219- # The :py:func:`~hdmf.common.resources.ExternalResources .get_key`
219+ # The :py:func:`~hdmf.common.resources.HERD .get_key`
220220# method will return a :py:class:`~hdmf.common.resources.Key` object. In the current version of
221- # :py:class:`~hdmf.common.resources.ExternalResources `, duplicate keys are allowed; however, each key needs a unique
221+ # :py:class:`~hdmf.common.resources.HERD `, duplicate keys are allowed; however, each key needs a unique
222222# linking Object. In other words, each combination of (file, container, relative_path, field, key)
223- # can exist only once in :py:class:`~hdmf.common.resources.ExternalResources `.
223+ # can exist only once in :py:class:`~hdmf.common.resources.HERD `.
224224
225- # The :py:func:`~hdmf.common.resources.ExternalResources .get_key` method will be able to return the
225+ # The :py:func:`~hdmf.common.resources.HERD .get_key` method will be able to return the
226226# :py:class:`~hdmf.common.resources.Key` object if the :py:class:`~hdmf.common.resources.Key` object is unique.
227227genotype_key_object = er .get_key (key_name = 'Rorb' )
228228
@@ -232,18 +232,18 @@ def __init__(self, **kwargs):
232232 container = species ['Species_Data' ],
233233 key_name = 'Ursus arctos horribilis' )
234234
235- # The :py:func:`~hdmf.common.resources.ExternalResources .get_key` also will check the
235+ # The :py:func:`~hdmf.common.resources.HERD .get_key` also will check the
236236# :py:class:`~hdmf.common.resources.Object` for a :py:class:`~hdmf.common.resources.File` along the parent hierarchy
237- # if the file is not provided as in :py:func:`~hdmf.common.resources.ExternalResources .add_ref`
237+ # if the file is not provided as in :py:func:`~hdmf.common.resources.HERD .add_ref`
238238
239239###############################################################################
240240# Using the add_ref method with a key_object
241241# ------------------------------------------------------
242242# Multiple :py:class:`~hdmf.common.resources.Object` objects can use the same
243243# :py:class:`~hdmf.common.resources.Key`. To use an existing key when adding
244- # new entries into :py:class:`~hdmf.common.resources.ExternalResources `, pass the
244+ # new entries into :py:class:`~hdmf.common.resources.HERD `, pass the
245245# :py:class:`~hdmf.common.resources.Key` object instead of the 'key_name' to the
246- # :py:func:`~hdmf.common.resources.ExternalResources .add_ref` method. If a 'key_name'
246+ # :py:func:`~hdmf.common.resources.HERD .add_ref` method. If a 'key_name'
247247# is used, a new :py:class:`~hdmf.common.resources.Key` will be created.
248248
249249er .add_ref (
@@ -258,7 +258,7 @@ def __init__(self, **kwargs):
258258###############################################################################
259259# Using the get_object_entities
260260# ------------------------------------------------------
261- # The :py:class:`~hdmf.common.resources.ExternalResources .get_object_entities` method
261+ # The :py:class:`~hdmf.common.resources.HERD .get_object_entities` method
262262# allows the user to retrieve all entities and key information associated with an `Object` in
263263# the form of a pandas DataFrame.
264264
@@ -269,7 +269,7 @@ def __init__(self, **kwargs):
269269###############################################################################
270270# Using the get_object_type
271271# ------------------------------------------------------
272- # The :py:class:`~hdmf.common.resources.ExternalResources .get_object_entities` method
272+ # The :py:class:`~hdmf.common.resources.HERD .get_object_entities` method
273273# allows the user to retrieve all entities and key information associated with an `Object` in
274274# the form of a pandas DataFrame.
275275
@@ -285,9 +285,9 @@ def __init__(self, **kwargs):
285285# column/field is associated with different ontologies, then use field='x' to denote that
286286# 'x' is using the external reference.
287287
288- # Let's create a new instance of :py:class:`~hdmf.common.resources.ExternalResources `.
289- er = ExternalResources ()
290- file = ExternalResourcesManagerContainer (name = 'file' )
288+ # Let's create a new instance of :py:class:`~hdmf.common.resources.HERD `.
289+ er = HERD ()
290+ file = HERDManagerContainer (name = 'file' )
291291
292292data = Data (
293293 name = 'data_name' ,
@@ -307,28 +307,28 @@ def __init__(self, **kwargs):
307307)
308308
309309###############################################################################
310- # Write ExternalResources
310+ # Write HERD
311311# ------------------------------------------------------
312- # :py:class:`~hdmf.common.resources.ExternalResources ` is written as a zip file of
312+ # :py:class:`~hdmf.common.resources.HERD ` is written as a zip file of
313313# the individual tables written to tsv.
314314# The user provides the path, which contains the name of the directory.
315315
316316er .to_norm_tsv (path = './' )
317317
318318###############################################################################
319- # Read ExternalResources
319+ # Read HERD
320320# ------------------------------------------------------
321- # Users can read :py:class:`~hdmf.common.resources.ExternalResources ` from the tsv format
321+ # Users can read :py:class:`~hdmf.common.resources.HERD ` from the tsv format
322322# by providing the path to the directory.
323323
324- er_read = ExternalResources .from_norm_tsv (path = './' )
324+ er_read = HERD .from_norm_tsv (path = './' )
325325os .remove ('./er.zip' )
326326
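The write and read steps above describe a zip archive of individual tables stored as tsv. As a rough illustration of that general pattern only, the following sketch uses the standard library (``csv``, ``io``, ``zipfile``) to round-trip a few small tables. The helper names ``write_tables_zip`` and ``read_tables_zip``, the member file names, and the table contents are assumptions made for this example; they are not the layout or API that ``to_norm_tsv`` and ``from_norm_tsv`` use.

```python
import csv
import io
import os
import tempfile
import zipfile


def write_tables_zip(tables, zip_path):
    """Write each table (a list of rows) as '<name>.tsv' inside one zip file."""
    with zipfile.ZipFile(zip_path, "w") as zf:
        for name, rows in tables.items():
            buf = io.StringIO()
            writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
            writer.writerows(rows)
            zf.writestr(f"{name}.tsv", buf.getvalue())


def read_tables_zip(zip_path):
    """Read every tsv member of the zip back into a dict of row lists."""
    tables = {}
    with zipfile.ZipFile(zip_path) as zf:
        for member in zf.namelist():
            name = os.path.splitext(member)[0]
            text = zf.read(member).decode("utf-8")
            tables[name] = list(csv.reader(io.StringIO(text), delimiter="\t"))
    return tables


# Hypothetical table contents; the first row of each table is its header.
tables = {
    "keys": [["key"], ["Homo sapiens"], ["Mus musculus"]],
    "entities": [
        ["entity_id", "entity_uri"],
        ["NCBI_TAXON:9606", "https://example.org/taxon/9606"],
        ["NCBI_TAXON:10090", "https://example.org/taxon/10090"],
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tables.zip")
    write_tables_zip(tables, path)
    round_trip = read_tables_zip(path)
```

Keeping one tsv per table preserves the normalized layout on disk, so the tables can be inspected or edited individually with ordinary spreadsheet or command-line tools.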
327327###############################################################################
328- # Using TermSet with ExternalResources
328+ # Using TermSet with HERD
329329# ------------------------------------------------
330330# :py:class:`~hdmf.term_set.TermSet` allows for an easier way to add references to
331- # :py:class:`~hdmf.common.resources.ExternalResources `. These enumerations take the place of the
331+ # :py:class:`~hdmf.common.resources.HERD `. These enumerations take the place of the
332332# entity_id and entity_uri parameters. :py:class:`~hdmf.common.resources.Key` values will have
333333# to match the name of the term in the :py:class:`~hdmf.term_set.TermSet`.
334334from hdmf .term_set import TermSet