11Implementation
22==============
3- <!-- High level overview + low level overview -->
3+
4+ The system is implemented in Python, using the `Kivy` framework for the
5+ frontend and several scientific tools such as `Numpy`, `Scipy`, `Pandas` and,
6+ most importantly, `scikit-learn` for the backend.
47
58
69First Iteration
710---------------
8- ![Sketch of the first interface](images/sketch_1.png)
9-
1011For the first iteration the priority was to get a proof of concept in order to
1112see where the difficulties can appear, with a few simple classifiers and
1213cross-validation techniques. As such a button-based interface with very limited
@@ -20,7 +21,7 @@ Trees, but gives good results in wide variety of problems.
2021All these classifiers have few parameters on their respective sklearn
2122implementations, and for this prototype the interface did not allow modifying
2223any of them, as it would have cluttered the interface and was not a necessary feature.
23- Also all of them are classifiers, as it simplies the interface, since
24+ Also all of them are classifiers, as it simplifies the interface, since
2425regressors and clustering have some incompatibilities.
2526
2627Apart from the temporary interface the backend had to be built. Since the
@@ -33,8 +34,6 @@ executed those.
3334
3435Second Iteration
3536----------------
36- ![Sketch of the second interface](images/sketch_2.png)
37-
3837For the second interface the drag and drop feel was the main priority.
3938As such, after developing the tab panel, draggable boxes were developed; these
4039boxes needed to be connected through pins.
@@ -100,6 +99,121 @@ receives it).
10099
101100For more information about internal package distribution check appendix A.
102101
102+
103+ Making a Connection
104+ -------------------
105+ One of the most complex parts is the connection, reconnection and deletion of
106+ connections between blocks; it involves several actors, asynchronous callbacks
107+ and a very strong coupling between all elements.
108+
109+ ![Widget Tree](images/hierarchical.pdf)
110+
111+ In order to understand how connections are made it is necessary to understand
112+ how `Kivy` handles input.
113+ At surface level `Kivy` follows the traditional event-based input management,
114+ with the event propagating downwards from the root.
115+ However, while traditionally input events are only passed down to components
116+ that are at the event position, `Kivy` passes the events to almost all children
117+ by default. This is done because on phones (one of `Kivy`'s targets is Android)
118+ gestures tend to start outside the actual widget they intend to affect.
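
The propagation rule described above can be illustrated without `Kivy` itself. The following is a minimal sketch in plain Python, assuming a toy `Widget` class and a `PinWidget` subclass (both names are hypothetical and only mimic the behaviour described here, they are not the framework's classes): the parent offers the event to every child, and each child decides on its own whether the touch concerns it.

```python
class Widget:
    """Simplified stand-in for a GUI widget (hypothetical, not Kivy's class)."""

    def __init__(self, name, x=0, y=0, width=0, height=0):
        self.name = name
        self.x, self.y = x, y
        self.width, self.height = width, height
        self.children = []

    def add_widget(self, child):
        self.children.append(child)

    def collide_point(self, px, py):
        # True when the point lies inside this widget's rectangle.
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)

    def on_touch_down(self, pos, offered):
        # The event is offered to every child, wherever the touch landed.
        offered.append(self.name)
        for child in self.children:
            if child.on_touch_down(pos, offered):
                return True  # a child consumed the event, stop propagating
        return False


class PinWidget(Widget):
    def on_touch_down(self, pos, offered):
        offered.append(self.name)
        # The widget itself decides whether the touch concerns it.
        return self.collide_point(*pos)


root = Widget("root")
root.add_widget(PinWidget("pin_a", x=0, y=0, width=10, height=10))
root.add_widget(PinWidget("pin_b", x=50, y=50, width=10, height=10))

offered = []
handled = root.on_touch_down((55, 55), offered)
print(offered, handled)  # ['root', 'pin_a', 'pin_b'] True
```

Note that `pin_a` receives the event even though the touch is nowhere near it; filtering by position is the responsibility of each widget.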
119+
120+ In `Kivy` there are three main input events: `on_touch_down`, which is called
121+ when a touch starts (here, a mouse button press); `on_touch_move`, which is
122+ notified when the touch moves, i.e. a finger moves across the screen or, in
123+ this case, the mouse moves; and `on_touch_up`, which fires on release.
124+
125+ Let's represent the possible actions as use cases, where the outer \* represents
126+ `on_touch_down`, - represents `on_touch_move`, and the inner \* `on_touch_up`:
127+
128+ * (On pin) Start a connection
129+ * (On connection) Modify a connection
130+     - Follow cursor
131+     - (On pin) Typecheck
132+         * (On a pin) Establish connection if possible
133+         * (Elsewhere) Remove connection
134+
135+ The logic is split into two main cases: creating a connection and modifying an
136+ existing one.
137+ Creating a connection involves creating one end of the connection, both
138+ visually and logically, and preparing the line that will follow the cursor.
139+ On the other hand, modifying a connection means removing the end that is being
140+ touched.
141+ These two cases can be handled by different classes, pin in the first case and
142+ connection in the second.
143+ Moving and finishing the connection are the same in both cases.
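
The split between the two starting cases and the shared move and finish steps can be sketched as follows. This is only an illustration under stated assumptions: the `Pin` and `Connection` classes, the `kind` attribute used for the type check and all function names are hypothetical, and the visual side is reduced to remembering the cursor position.

```python
class Pin:
    def __init__(self, name, kind):
        self.name = name
        self.kind = kind  # assumed type tag used for the type check


class Connection:
    def __init__(self, start):
        self.start = start  # the end that stays attached
        self.end = None     # the free end that follows the cursor
        self.cursor = None


def touch_down_on_pin(pin):
    """Case 1: the pin starts a new connection with one end attached."""
    return Connection(start=pin)


def touch_down_on_connection(connection, touched_end):
    """Case 2: the connection detaches the end that is being touched."""
    if touched_end is connection.start:
        connection.start = connection.end
    connection.end = None
    return connection


def touch_move(connection, pos):
    """Shared step: the free end follows the cursor."""
    connection.cursor = pos


def touch_up(connection, pin_under_cursor):
    """Shared step: True if the connection is established, False if it
    has to be removed (no pin under the cursor or type mismatch)."""
    if pin_under_cursor is not None and pin_under_cursor.kind == connection.start.kind:
        connection.end = pin_under_cursor
        return True
    return False


a, b, c = Pin("a", "data"), Pin("b", "data"), Pin("c", "model")
conn = touch_down_on_pin(a)
touch_move(conn, (3, 4))
established = touch_up(conn, b)  # same kind, so the connection is kept
```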
144+
145+ Without getting too deep into implementation details, ends cannot just be
146+ removed: there are visual bindings that have to be unbound, and when a
147+ connection is destroyed it also has to unbind the logical connections of the
148+ pins themselves. Destruction only happens inside `on_touch_up`, but it can be
149+ either the pin's or the blackboard's `on_touch_up`, depending on whether the
150+ connection is destroyed because the pin violates type safety or because there
151+ is no pin under the cursor, respectively.
152+ For this reason connection has high-level functions that do the unbind, rebind
153+ and deletion of ends, as long as the necessary elements are passed (dependency
154+ injection pattern).
155+
156+ ![Connections between elements](images/logical.pdf)
157+
158+
159+ Intermediate Representation
160+ ---------------------------
161+ The visual blocks represent a visual-dataflow language; however, the backend
162+ uses a simpler representation of the relations between the blocks, which in turn
163+ helps decouple the backend from the frontend.
164+
165+ The frontend blocks are translated by the function `to_ir`, which merely performs
166+ trivial transformations to achieve the desired intermediate representation
167+ and runs in $\mathcal{O}(n)$, with $n$ being the number of pins.
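
To give an idea of why a single linear pass is enough, here is a minimal sketch of what such a translation might look like. It is not the project's `to_ir`: the frontend classes (`Block`, `InputPin`, `OutputPin`), the `connect` helper and the use of plain dictionaries keyed by `hash()` are all assumptions made for illustration.

```python
class InputPin:
    def __init__(self):
        self.origin = None  # the output pin feeding this input, if any


class OutputPin:
    def __init__(self):
        self.destinations = []  # the input pins fed by this output


class Block:
    def __init__(self, function, n_inputs, n_outputs):
        self.function = function
        self.input_pins = [InputPin() for _ in range(n_inputs)]
        self.output_pins = [OutputPin() for _ in range(n_outputs)]


def connect(out_pin, in_pin):
    out_pin.destinations.append(in_pin)
    in_pin.origin = out_pin


def to_ir_sketch(blocks):
    """Visit every block and pin once, storing only hashes and functions."""
    ir = {"inputs": {}, "blocks": {}, "outputs": {}}
    for block in blocks:
        ir["blocks"][hash(block)] = {
            "inputs": [hash(p) for p in block.input_pins],
            "function": block.function,
            "outputs": [hash(p) for p in block.output_pins],
        }
        for pin in block.input_pins:
            ir["inputs"][hash(pin)] = {
                "origin": None if pin.origin is None else hash(pin.origin),
                "block": hash(block),
            }
        for pin in block.output_pins:
            ir["outputs"][hash(pin)] = {
                "destinations": [hash(d) for d in pin.destinations],
                "block": hash(block),
            }
    return ir


source = Block(lambda: None, n_inputs=0, n_outputs=1)
sink = Block(lambda: None, n_inputs=1, n_outputs=0)
connect(source.output_pins[0], sink.input_pins[0])
ir = to_ir_sketch([source, sink])
```

Since every input pin has at most one origin, the total work stays proportional to the number of pins.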
168+
169+ Let's represent the types in a more strongly typed language than Python.
170+
171+ ~~~ haskell
172+ type Id = Int -- The hash is an integer
173+ data Inputs = Inputs { origin :: Id, block :: Id }
174+ data Blocks = Blocks { inputs :: [Id], function :: IO a -> IO a,
175+                        outputs :: [Id] }
176+ data Outputs = Outputs { destinations :: [Id], block :: Id }
177+ data IR = IR { inputs :: Map Id Inputs, blocks :: Map Id Blocks,
178+                outputs :: Map Id Outputs }
179+ ~~~
180+
181+ As we can see in the Haskell definition, the intermediate representation is
182+ just three Maps: one for blocks, one for input pins and one for output pins.
183+ But the maps do not contain the pins themselves, merely unique hashes (Int in
184+ this case).
185+ This reflects the fact that pins model only relationships, not state.
186+ The only non-hash values in `IR` are the block functions.
187+ These functions are indeed impure, but earlier in the literature review it was
188+ established that dataflow programming is mainly side-effect free, so why do
189+ they involve side effects?
190+
191+ There are actually two reasons. First, in the actual Python programs these
192+ types do not exist, at least not in an enforceable way, so when translating
193+ them to Haskell the `function` field represents the "worst case", that is to
194+ say only a few functions will actually end up producing side-effects.
195+ The second and more important reason is that blocks actually execute
196+ themselves, meaning the block function does not have parameters; it relies on
197+ getting the values from the input pins and sets the values of the output
198+ pins, leaving us with the work of setting those input pins and retrieving
199+ results from the output pins.
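
A minimal sketch of such a self-executing block is shown below. The `ValuePin` and `AddBlock` classes are hypothetical and only illustrate the calling convention: the function takes no arguments and returns nothing, so the caller has to fill the input pins beforehand and read the output pin afterwards.

```python
class ValuePin:
    """Hypothetical pin that temporarily holds a value."""

    def __init__(self):
        self.value = None


class AddBlock:
    """Hypothetical block that adds its two inputs."""

    def __init__(self):
        self.inputs = [ValuePin(), ValuePin()]
        self.output = ValuePin()

    def function(self):
        # No parameters and no return value: all data flows through the pins,
        # which is why the Haskell type has to admit side effects.
        self.output.value = self.inputs[0].value + self.inputs[1].value


block = AddBlock()
block.inputs[0].value = 2    # the caller sets the input pins...
block.inputs[1].value = 3
block.function()             # ...lets the block execute itself...
result = block.output.value  # ...and retrieves the result from the output pin
```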
200+
201+ This goes against the previously stated "pins represent relationships, not
202+ state". In fact an alternative implementation was created in which the
203+ function returned a tuple of results, and it was the compiler's job to
204+ associate the output pins with each of the elements of the tuple. This was done
205+ using the same mechanism as now, saving into a dictionary, the difference
206+ being that while currently the values appear on the output pins and have to be
207+ moved into the dictionary (or otherwise a reference to the pin itself must be
208+ kept in the dictionary), in that case the values were fed directly to the
209+ algorithm.
210+ However this proved limiting: the code became more complex since more checks had
211+ to be done, there was no obvious advantage, and side-effects did not disappear
212+ but merely became harder to perform.
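
For comparison, the tuple-returning alternative could have looked roughly like the sketch below. This is a reconstruction under assumptions, not the discarded code: `split_block`, `run_pure_block` and the `store` dictionary keyed by pin id are hypothetical names. The length check hints at the kind of extra validation this approach requires.

```python
def split_block(values):
    """Hypothetical pure block: inputs arrive as arguments, outputs as a tuple."""
    middle = len(values) // 2
    return values[:middle], values[middle:]


def run_pure_block(function, input_ids, output_ids, store):
    """Feed values from the dictionary and map the result tuple to output ids."""
    args = [store[pin_id] for pin_id in input_ids]
    results = function(*args)
    # Extra check that the pin-based version does not need.
    if len(results) != len(output_ids):
        raise ValueError("block returned a different number of results "
                         "than it has output pins")
    for pin_id, value in zip(output_ids, results):
        store[pin_id] = value


store = {1: [1, 2, 3, 4]}  # pin id -> value
run_pure_block(split_block, input_ids=[1], output_ids=[2, 3], store=store)
```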
213+
214+ <!-- Talk about function composition -->
215+
216+
103217[^blackboard]: Blackboard is the canvas where the blocks and connections
104218 are laid down.
105219[^MVC]: Model View Controller is a software pattern.