Batch Learning on task graph
Implementing a Spark-like batch learning system on the task graph would be a useful exercise for us: first, to test our implementation; second, to have a user-level package that we can talk about to drum up interest.
There are two kinds of tasks that we need to implement: a parameter server, and data slaves (each holding a segment of the data). The parameter server is responsible for holding the parameters and carrying out the optimization. Note that the optimization is an iterative process on the parameters; say we use gradient descent. At each epoch, the current parameters (a shared variable in Spark) are passed to each data slave through the task topology. Each data slave computes the gradient of the loss function on its data shard at the current parameters, then reduces the gradients (an accumulator, which can be implemented easily via buffered channels) from its children back up to its parent.
The data slaves can provide RDD support naturally. Applications define how their data is processed as a pipeline from the source. So every time a node recovers from a failure, it naturally reruns the code (in task init) that brings the data from the source and processes it into some in-memory data structure, before it starts listening for the parent's meta-ready signal.