Commit babaebf

updated docs and user/dev examples with better notes
1 parent 5337aa0 commit babaebf

14 files changed (+40, -42 lines changed)

docs/backends.rst

Lines changed: 5 additions & 5 deletions
@@ -38,12 +38,12 @@ DuckDB
 :members:
 :special-members: __init__

-SQLAlchemy
------------
+.. SQLAlchemy
+.. -----------

-.. automodule:: dsi.backends.sqlalchemy
-:members:
-:special-members: __init__
+.. .. automodule:: dsi.backends.sqlalchemy
+.. :members:
+.. :special-members: __init__

 GUFI
 ------

docs/core.rst

Lines changed: 2 additions & 2 deletions
@@ -60,9 +60,9 @@ by creating a DSI database in the process, or retrieving an existing DSI databas
 Examples
 --------
 Examples below display various ways users can incorporate DSI into their data science workflows.
-They are located in ``examples/developer/`` and must be run from that directory.
+They must be executed from their directory in ``examples/developer/``

-Most of them either load or refer to data from ``examples/clover3d/``.
+To run them successfully, please unzip ``clover3d.zip`` located in ``examples/clover3d/``, and execute ``requirements.extras.txt``.

 Example 1: Intro use case
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
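
Note on the setup step added in this hunk: the instruction to unzip ``clover3d.zip`` can be scripted. The sketch below is one possible way to do it with the standard library, not part of the commit. The helper name ``unpack_archive`` and the relative archive path are assumptions. The updated example scripts in this commit read from ``../clover3d/``, so the archive contents are presumably expected directly in that directory, but that is an inference from the diff rather than something the archive layout confirms. Likewise, "execute ``requirements.extras.txt``" presumably means installing it with ``pip install -r requirements.extras.txt``; the file's location in the repository is not shown here.

```python
import zipfile
from pathlib import Path


def unpack_archive(archive: Path) -> list[str]:
    """Extract a zip archive into the directory that contains it.

    Returns the names of the extracted members.
    """
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(archive.parent)
        return zf.namelist()


if __name__ == "__main__":
    # Assumed location, relative to examples/developer/ or examples/user/.
    archive = Path("../clover3d/clover3d.zip")
    if archive.exists():
        print(unpack_archive(archive))
    else:
        print(f"{archive} not found; run this from an examples directory")
```

Extracting next to the archive keeps the relative paths used by the example scripts valid regardless of where the repository is cloned.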

docs/introduction.rst

Lines changed: 14 additions & 8 deletions
@@ -35,16 +35,21 @@ The DSI API is broken into three main categories:
 Expected Data Standards
 ~~~~~~~~~~~~~~~~~~~~~~~~

-Before using DSI, users are expected to preprocess their data into a standardized format.
-DSI's actions are strict and will not commit any database actions if the data is unstable.
+Before using DSI, users should first standardize their data in a format that can be represented as a table.
+DSI supports many widely used formats, including, but not limited to, CSV, JSON, YAML, TOML, and in-memory dictionaries.
+If the data is structured in a unique format, users can create an external DSI reader by following the steps in :ref:`custom_reader`.

-This can be achieved by organizing the data so it can be represented as a table in DSI.
-It is also expected that each data point is a discrete value, rather than a complex data structure.
-If metadata is crucial to data representation, users should ensure it is stored with the data to be captured by DSI actions.
+When using a DSI-supported Reader, each data point is expected to be a discrete value — not a nested structure.
+Users must flatten any nested data to ensure compatibility with DSI.

-Users expecting to load a complex schema into DSI should also consider which columns in tables will be related to each other.
-This requires prior knowledge of primary and foreign keys, and writing a JSON file to represent this schema.
-For more information on creating a schema compatible with DSI, please view :ref:`user_schema_example_label`.
+Metadata is important for many data workflows and should be stored with the data when relevant.
+For example, if simulation parameters are required for future analysis, that metadata should be included in the same table as the data.
+
+Advanced users familiar with database relationships can also load a complex relational schema into DSI alongside their data.
+This requires prior knowledge of primary and foreign keys, as well as how columns across tables should be related.
+
+Users must load this relational schema as a JSON into DSI using a Schema Reader.
+For more information on formatting the schema file correctly, refer to :ref:`user_schema_example_label`.

 DSI Readers/Writers
 ~~~~~~~~~~~~~~~~~~~~
@@ -62,6 +67,7 @@ Currently, DSI has the following Readers:
 - Schema (A complex schema reader)
 - YAML1
 - TOML1
+- Collection (To load a Python dictionary or OrderedDict)
 - Bueno
 - Ensemble (Reader to ingest ensemble data. Ex: the `Wildfire ensemble dataset <https://github.com/lanl/dsi/tree/main/examples/wildfire>`_ .
 Assumes each data row is a separate sim.)
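
Note on the new "Expected Data Standards" text: the requirement that every data point be a discrete value, with metadata stored in the same table, may be easier to follow with a small illustration. The sketch below uses only the standard library and makes no DSI calls. The helper ``flatten``, the sample records, and the column-oriented output layout are all assumptions for illustration; the exact dictionary shape the Collection reader expects is not shown in this diff.

```python
from collections import OrderedDict


def flatten(record: dict, parent: str = "", sep: str = "_") -> dict:
    """Flatten nested dictionaries into a single level of discrete values."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else str(key)
        if isinstance(value, dict):
            # Recurse, prefixing child keys with the parent column name.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Two hypothetical simulation runs; "params" is metadata kept with the data.
runs = [
    {"run": 1, "params": {"dt": 0.04, "cells": 960}, "energy": 29.8},
    {"run": 2, "params": {"dt": 0.02, "cells": 1920}, "energy": 30.1},
]

# Column-oriented table (assumed layout): one list of values per column.
table = OrderedDict()
for run in runs:
    for column, value in flatten(run).items():
        table.setdefault(column, []).append(value)

print(dict(table))
```

Only nested dictionaries are handled here; lists or other containers inside a record would need their own flattening rule.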

docs/python_api.rst

Lines changed: 13 additions & 6 deletions
@@ -69,9 +69,9 @@ Examples of each data card standard for the Wildfire dataset can be found in ``e
 User Examples
 --------------
 Examples below display various ways users can incorporate DSI into their data science workflows.
-They are located in ``examples/user/`` and must be run from that directory.
+They must be executed from their directory in ``examples/user/``

-All of them either load or refer to data in ``examples/clover3d/``.
+To run them successfully, please unzip ``clover3d.zip`` located in ``examples/clover3d/``, and execute ``requirements.extras.txt``.

 Example 1: Intro use case
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -93,15 +93,20 @@ Printing various data and metadata from a DSI backend - number of tables, list o

 Example 4: Find data
 ~~~~~~~~~~~~~~~~~~~~
-Finding data from an active DSI backend that matches an input query - a string or a number.
-Prints all matches by default. If ``True`` is passed as an additional argument, returns rows of the first table that satisfies the query.
+Finding data from an active DSI backend that matches an input object.
+
+If using ``search()``, the input can be a string or number.
+If using ``find()``, the input must be a string in the form of a condition - [column] [operator] [value].
+
+By default, all matches are printed. If ``True`` is passed as an additional argument, the matching rows are returned as a DataFrame instead.

 .. literalinclude:: ../examples/user/4.find.py

 Example 5: Update data
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Updating data from the edited output of ``find()``. Input can be output of either ``find()``, ``query()``, or ``get_table()``.
-Users must NOT change metadata columns starting with **`dsi_`** even if adding new rows.
+Updating data from the edited output of ``find()``. Users must NOT modify metadata columns starting with **`dsi_`** even when adding new rows.
+
+The input can be the output of either ``find()``, ``search()``, ``query()``, or ``get_table()``.

 .. literalinclude:: ../examples/user/5.update.py

@@ -110,6 +115,8 @@ Example 6: Query data
 Querying data from an active DSI backend.
 Users can either use ``query()`` to view specific data with a SQL statement, or ``get_table()`` to view all data from a specified table.

+By default, all matches are printed. If ``True`` is passed as an additional argument, the matching rows are returned as a DataFrame instead.
+
 .. literalinclude:: ../examples/user/6.query.py

 Example 7: Complex schema with data
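
Note on the ``find()`` wording added above: the "[column] [operator] [value]" condition form can be illustrated with a small stand-alone sketch. This is not DSI's implementation and it calls no DSI functions. The helpers ``parse_condition`` and ``filter_rows``, the operator set, and the sample rows are all hypothetical; the operators DSI actually accepts are not listed in this diff.

```python
import operator

# Assumed operator set, for illustration only.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_condition(condition: str):
    """Split '[column] [operator] [value]' into its three parts."""
    column, op, value = condition.split(maxsplit=2)
    if op not in OPS:
        raise ValueError(f"unsupported operator: {op!r}")
    try:
        parsed = float(value)  # numeric value
    except ValueError:
        parsed = value.strip("'\"")  # quoted or bare string value
    return column, op, parsed


def filter_rows(rows, condition):
    """Return the rows (dicts) whose column satisfies the condition.

    Assumes the column's type is comparable with the parsed value.
    """
    column, op, value = parse_condition(condition)
    return [row for row in rows if column in row and OPS[op](row[column], value)]


# Hypothetical rows standing in for a table.
rows = [
    {"state": 1, "density": 0.2},
    {"state": 2, "density": 1.0},
]
print(filter_rows(rows, "density > 0.5"))
```

A ``search()``-style lookup, by contrast, takes a bare string or number with no column or operator, which is why the two inputs are described separately in the updated text.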

examples/developer/10.notebook.py

Lines changed: 1 addition & 2 deletions
@@ -4,9 +4,8 @@
 terminal_notebook = Terminal()

 #read data
-# UNZIP clover3d.zip INSIDE EXAMPLES/CLOVER3D/
 terminal_notebook.load_module('plugin', 'Schema', 'reader', filename="../clover3d/schema.json")
-terminal_notebook.load_module('plugin', 'Cloverleaf', 'reader', folder_path="../clover3d/clover3d/")
+terminal_notebook.load_module('plugin', 'Cloverleaf', 'reader', folder_path="../clover3d/")

 #ingest data to Sqlite backend
 terminal_notebook.load_module('backend','Sqlite','back-write', filename='jupyter_data.db')

examples/developer/2.ingest.py

Lines changed: 1 addition & 2 deletions
@@ -3,8 +3,7 @@

 terminal_ingest = Terminal()

-# UNZIP clover3d.zip INSIDE EXAMPLES/CLOVER3D/
-terminal_ingest.load_module('plugin', 'Cloverleaf', 'reader', folder_path="../clover3d/clover3d/")
+terminal_ingest.load_module('plugin', 'Cloverleaf', 'reader', folder_path="../clover3d/")

 terminal_ingest.load_module('backend','Sqlite','back-write', filename='data.db')


examples/developer/3.schema.py

Lines changed: 1 addition & 3 deletions
@@ -3,15 +3,13 @@

 terminal = Terminal()

-# UNZIP clover3d.zip INSIDE EXAMPLES/CLOVER3D/
 terminal.load_module('plugin', 'Schema', 'reader', filename="../clover3d/schema.json")

-terminal.load_module('plugin', 'Cloverleaf', 'reader', folder_path="../clover3d/clover3d/")
+terminal.load_module('plugin', 'Cloverleaf', 'reader', folder_path="../clover3d/")

 terminal.load_module('backend','Sqlite','back-write', filename='schema_data.db')

 terminal.artifact_handler(interaction_type='ingest')

-# EXECUTE requirements.extras.txt to be able to run this
 terminal.load_module('plugin', 'ER_Diagram', 'writer', filename = 'er_diagram.png')
 terminal.transload()

examples/developer/4.visualize.py

Lines changed: 0 additions & 3 deletions
@@ -25,7 +25,4 @@
 # prints numerical stats for only 'input'
 terminal_visualize.summary("input")

-# prints numerical stats for only 'input' and prints first 5 rows of the actual table
-terminal_visualize.summary("input", 5)
-
 terminal_visualize.close()

examples/developer/5.process.py

Lines changed: 0 additions & 1 deletion
@@ -7,7 +7,6 @@
 terminal_process.load_module('backend','Sqlite','back-read', filename='schema_data.db')
 terminal_process.artifact_handler(interaction_type="process")

-# EXECUTE requirements.extras.txt to be able to run this
 terminal_process.load_module('plugin', 'ER_Diagram', 'writer', filename = 'er_diagram.png')

 terminal_process.load_module('plugin', "Table_Plot", "writer", table_name = "output", filename = "output_plot.png")

examples/user/2.read.py

Lines changed: 1 addition & 2 deletions
@@ -4,8 +4,7 @@
 read_dsi = DSI("data.db") # Target a backend, defaults to SQLite if not defined

 #dsi.read(path, reader)
-# UNZIP clover3d.zip INSIDE EXAMPLES/CLOVER3D/
-read_dsi.read("../clover3d/clover3d/", 'Cloverleaf') # Read data into memory
+read_dsi.read("../clover3d/", 'Cloverleaf') # Read data into memory

 #dsi.display(table_name)
 read_dsi.display("input") # Print the specific table's data from the Cloverleaf data
