| 1 | +# Refactor of DataFlows to Include Event Type |
| 2 | + |
| 3 | +A large part of DFFML is the concept of a DataFlow. |
| 4 | + |
| 5 | +Change the orchestrator context `run` so that it yields three objects: context, event
| 6 | +type, and results.
| 7 | + |
| 8 | +Currently we have a lot of code that looks like this: |
| 9 | + |
| 10 | +```python |
| 11 | +for ctx, results in run(dataflow, [... inputs ...]): |
| 12 | +    print("The results of", ctx, "are", results)
| 13 | +``` |
| 14 | + |
| 15 | +When this project is over, those `for` loops will look like this: |
| 16 | + |
| 17 | +```python |
| 18 | +for ctx, event, data in run(dataflow, [... inputs ...]): |
| 19 | +    if event == EventType.OUTPUT:
| 20 | +        print("The results of", ctx, "are", data)
| 21 | +    elif event == EventType.INPUT:
| 22 | +        print("An input entered network for context", ctx, ":", data)
| 23 | +``` |
| 24 | + |
| 25 | +The way things currently work is that the `run` function `yield`s when the |
| 26 | +context is finished running. It `yield`s the context that was running and the |
| 27 | +results. |
| 28 | + |
| 29 | +We need to add another part to a DataFlow so we can yield `Input`s. The event
| 30 | +type would be `INPUT`, and in the DataFlow we should add a section for events.
| 31 | +In the events section, under `INPUT`, we could specify when an
| 32 | +input should be yielded. We would use that inputs section to specify which transitions
| 33 | +between operations should be yielded.
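
As a rough illustration of the idea above, here is a minimal sketch of what an events section and its check might look like. Everything in it is an assumption made for illustration: the `EVENTS_SECTION` layout, the operation names (`fetch`, `parse`, `store`), and the `should_yield_input` helper are hypothetical and are not part of DFFML's current DataFlow format.

```python
from typing import Dict, List, Tuple

# Hypothetical "events" section of a DataFlow (not an existing DFFML schema).
# Under INPUT we list the transitions, as (producing operation, consuming
# operation) pairs, whose inputs should be yielded to the caller.
EVENTS_SECTION: Dict[str, List[Tuple[str, str]]] = {
    "INPUT": [("fetch", "parse")],
}


def should_yield_input(
    events_section: Dict[str, List[Tuple[str, str]]],
    origin: str,
    destination: str,
) -> bool:
    """Return True if an input moving from origin to destination is listed
    under INPUT in the (hypothetical) events section."""
    return (origin, destination) in events_section.get("INPUT", [])


# The fetch -> parse transition is listed, parse -> store is not.
selected = should_yield_input(EVENTS_SECTION, "fetch", "parse")
skipped = should_yield_input(EVENTS_SECTION, "parse", "store")
```

The real design may well key events differently (for example by definition name rather than by operation pair); settling that is part of the project.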
| 34 | + |
| 35 | +This will enable us to run a DataFlow and `yield` not only
| 36 | +the results, but also the data that's moving through the network as the DataFlow is
| 37 | +running. This allows developers to build applications that show the progress of
| 38 | +a DataFlow as it's running.
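
To make the progress-reporting idea concrete, the following is a small self-contained sketch of a generator that yields `(ctx, event, data)` triples while work is in flight. It is not DFFML's orchestrator: `run_with_events` and `collect` are stand-ins invented for this example, the `EventType` enum is an assumed shape for the proposed event type, and the use of an async generator is also an assumption.

```python
import asyncio
import enum
from typing import Any, AsyncIterator, List, Tuple


class EventType(enum.Enum):
    # Assumed shape of the proposed event type; member names follow the text.
    INPUT = "input"
    OUTPUT = "output"


async def run_with_events(
    ctx: str, inputs: List[int]
) -> AsyncIterator[Tuple[str, EventType, Any]]:
    """Stand-in for an orchestrator context run (hypothetical, not DFFML's
    run): yields an INPUT event per input, then one OUTPUT event."""
    total = 0
    for value in inputs:
        # Report the input as it enters the simulated network.
        yield ctx, EventType.INPUT, value
        await asyncio.sleep(0)  # give other tasks a chance to run
        total += value
    # The context is finished; report its results.
    yield ctx, EventType.OUTPUT, {"sum": total}


async def collect(ctx: str, inputs: List[int]) -> List[Tuple[str, EventType, Any]]:
    """Consume the event stream the way a progress display might."""
    seen = []
    async for event_ctx, event, data in run_with_events(ctx, inputs):
        seen.append((event_ctx, event, data))
    return seen


events = asyncio.run(collect("demo", [1, 2, 3]))
```

A caller that only cares about final results can ignore every event except `EventType.OUTPUT`, which keeps existing loops easy to port.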
| 39 | + |
| 40 | +## Skills |
| 41 | + |
| 42 | +- Python |
| 43 | +- Refactoring a large codebase |
| 44 | +- Asyncio knowledge would be very helpful here |
| 45 | + |
| 46 | +## Difficulty |
| 47 | + |
| 48 | +Intermediate/Hard |
| 49 | + |
| 50 | +## Estimated Time Required |
| 51 | + |
| 52 | +350 hours |
| 53 | + |
| 54 | +## Related Readings |
| 55 | + |
| 56 | +- https://intel.github.io/dffml/master/contributing/gsoc/2022/index.html |
| 57 | + |
| 58 | +## Getting Started |
| 59 | + |
| 60 | +- Read the contributing guidelines |
| 61 | + - https://intel.github.io/dffml/master/contributing/index.html |
| 62 | +- Go through the quickstart |
| 63 | + - https://intel.github.io/dffml/master/quickstart/model.html |
| 64 | +- Go through the data flow related docs and tutorials |
| 65 | + - https://intel.github.io/dffml/master/tutorials/dataflows/index.html |
| 66 | + - https://intel.github.io/dffml/master/examples/integration.html |
| 67 | + - https://intel.github.io/dffml/master/examples/shouldi.html |
| 68 | + - https://intel.github.io/dffml/master/examples/dataflows.html |
| 69 | + - https://intel.github.io/dffml/master/examples/mnist.html |
| 70 | + - https://intel.github.io/dffml/master/examples/flower17/flower17.html |
| 71 | + - https://intel.github.io/dffml/master/examples/webhook/index.html |
| 72 | +- Read about what data flows are and how they work |
| 73 | + - https://intel.github.io/dffml/master/concepts/index.html#dataflows |
| 74 | + - https://intel.github.io/dffml/master/concepts/dataflow.html |
| 75 | +- Come up with a basic example where the user will see inputs moving through the |
| 76 | + network. |
| 77 | + - Make it simple and include a few operations. |
| 78 | + - Get the DataFlow running. |
| 79 | +- Look at the code in `dffml/df/memory.py` and understand how it relates to the |
| 80 | + docs covering DataFlows conceptually. |
| 81 | +- Plan out everything that needs to change within `dffml/df/memory.py`, along with the other
| 82 | +  code and examples that would change as a result.
| 83 | + |
| 84 | +## Potential Mentors |
| 85 | + |
| 86 | +- [John Andersen](https://github.com/pdxjohnny) |
| 87 | +- [Saksham Arora](https://github.com/sakshamarora1) |
| 88 | + |
| 89 | +## Tracking and Discussion |
| 90 | + |
| 91 | +This project is related to the following issues. Please discuss and ask |
| 92 | +questions in the issue comments. Please also ping mentors on |
| 93 | +[Gitter](https://gitter.im/dffml/community) when you post on the following |
| 94 | +issues so that they are sure to see that you've commented. |
| 95 | + |
| 96 | +- https://github.com/intel/dffml/issues/919 |