README.md: 9 additions & 2 deletions
@@ -59,9 +59,16 @@ The [`resulting SVG file`][res] of the run shows method chain calls along with i
## How does it work?
-
`pyjviz` visualization of `pyjanitor` method chains is based on dumping RDF triples describing `pyjanitor` method calls into a log file. Each call of a pyjanitor-registered method (including native pandas method calls) is traced, and the relevant data is saved into the RDF log. The resulting RDF log file contains a graph of method calls in which the user can find the trace of method execution as well as other data useful for visual inspection. The structure of the RDF graph saved into the log file is described in rdflog.shacl.ttl.
+
`pyjviz` provides a way to create a log file that contains an RDF graph of program behaviour. The visualization features provided in the package itself are based on translating the RDF graph into the Graphviz dot language. `pyjanitor` method chains are represented using a certain RDF data schema (ref here to shacl defs). Using the `pandas` extensions API, the arguments and return values of `pyjanitor` (and `pandas`) method calls are saved into the RDF log.
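
As a rough illustration of the RDF-to-dot translation, the sketch below turns a handful of subject-predicate-object triples into Graphviz dot text. The triples, the `ex:` names, and the `triples_to_dot` helper are hypothetical and only stand in for real log contents; the actual schema is the one given by the package's SHACL definitions, and pyjviz itself may perform the translation differently.

```python
# Hypothetical triples standing in for an RDF log of one method chain.
# Each triple is (subject, predicate, object); all names are made up.
TRIPLES = [
    ("ex:df0", "ex:clean_names", "ex:df1"),
    ("ex:df1", "ex:dropna", "ex:df2"),
]


def triples_to_dot(triples, graph_name="method_chain"):
    """Render triples as a Graphviz digraph: one labelled edge per triple.

    Assumes node and edge names contain no double quotes.
    """
    lines = [f"digraph {graph_name} {{"]
    for subj, pred, obj in triples:
        lines.append(f'  "{subj}" -> "{obj}" [label="{pred}"];')
    lines.append("}")
    return "\n".join(lines)


dot_text = triples_to_dot(TRIPLES)
print(dot_text)
```

The resulting text can be passed to the Graphviz `dot` tool to produce an SVG, if Graphviz is installed.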
-
NOTE THAT visualisation of the pyjviz RDF log is not the main goal of the provided package. The Graphviz-based visualization available in the package is rather a reference implementation with quite limited (but still useful) capabilities. The RDF log is kept in files and queried using the rdflib-provided SPARQL implementation.
+
> **Note**
+
> Visualisation of the pyjviz RDF graph is not the main goal of the provided package. The Graphviz-based visualization available in the package is rather a reference implementation with quite limited (but still useful) capabilities.
+
+
From the `pyjviz` point of view, Python objects have an `object identity` and an `object state`. Both are treated as abstract, i.e. in most cases they have no visual representation useful to the user. However, `pyjviz` introduces the notion of an `object state _Carbon_Copy_`, or ObjectCC. An object carbon copy is a well-defined representation of a Python object that is useful to the user as a visual primitive, and possibly more than that.
+
+
E.g. the simplest form of a pandas dataframe 'carbon copy' can be obtained by taking the output of the head() method and converting it to HTML, i.e. the result of the df.head().to_html() call. A more comprehensive CC would be a dataframe plot as generated by the .plot method and saved as a byte sequence. Note that a 'carbon copy' does not necessarily capture all details of the original object state. If one needs the precise object state, one has to use a CC class which guarantees that. In the example above, such a CC would be based on the .to_csv method.
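
The difference between a lossy and a fuller carbon copy can be sketched with plain pandas calls. The function names `glance_cc` and `precise_cc` are hypothetical, chosen for this illustration only; they are not part of pyjviz.

```python
import io

import pandas as pd


def glance_cc(df: pd.DataFrame) -> str:
    """Lossy carbon copy: only the first rows, rendered as an HTML table."""
    return df.head().to_html()


def precise_cc(df: pd.DataFrame) -> str:
    """Fuller carbon copy: every row, serialized as CSV text."""
    return df.to_csv(index=False)


df = pd.DataFrame({"a": range(10), "b": list("abcdefghij")})

html_cc = glance_cc(df)   # keeps only the first five rows
csv_cc = precise_cc(df)   # keeps all ten rows

# The CSV copy can be parsed back into a dataframe with all rows;
# the HTML glance has already dropped rows 5..9.
restored = pd.read_csv(io.StringIO(csv_cc))
print(len(restored), "rows restored from the CSV copy")
```

For a simple frame like this one the CSV text is enough to rebuild the data; frames with richer dtypes or a meaningful index may need extra care when reading the CSV back.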
+
+
How a particular call argument, return value, or other Python object is saved into the RDF log is specified using the CCGlance `carbon copy` class. For a pandas dataframe it saves just the shape of the dataframe and its head() output serialized as HTML. If the user wants a different CC of the object, it is always possible to use the .cc() method (ref here, rename .pin() to .cc()).
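
One way to picture how a carbon copy class decides what gets logged is the sketch below. `GlanceSketch`, `CsvSketch`, and `carbon_copy` are hypothetical names invented here; they are not the pyjviz `CCGlance` class or its `.cc()` method, whose actual interfaces are not shown in this README. The sketch only mirrors the described behaviour: a glance that keeps the shape and the `head()` HTML, plus a way to ask for a different copy.

```python
import pandas as pd


class GlanceSketch:
    """Hypothetical 'glance' carbon copy: shape plus head() rendered as HTML."""

    def capture(self, df: pd.DataFrame) -> dict:
        return {"shape": df.shape, "head_html": df.head().to_html()}


class CsvSketch:
    """Hypothetical carbon copy that keeps the full content as CSV text."""

    def capture(self, df: pd.DataFrame) -> dict:
        return {"csv": df.to_csv(index=False)}


def carbon_copy(df: pd.DataFrame, cc_class=GlanceSketch) -> dict:
    """Capture df with the given carbon copy class (glance unless told otherwise)."""
    return cc_class().capture(df)


df = pd.DataFrame({"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]})

default_cc = carbon_copy(df)           # shape and head() HTML only
full_cc = carbon_copy(df, CsvSketch)   # explicit request for a fuller copy
print(default_cc["shape"])
```

Keeping the capture logic in small classes means a new kind of copy (for example a plot saved as bytes) could be added without touching the code that does the logging.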