-
Notifications
You must be signed in to change notification settings - Fork 7
Description
Hi! I've been using your package, and I've run into an issue with the DRL solvers.
The function get_action passes the state into the neural network policy.μ, but this state is computed in convert_s to be the hash of the ASTState. My impression is that a hash value is not really a meaningful input to a NN, and it seems like it would invalidate much of the learning, effectively reducing the DRL algorithms to a random search.
If the GrayBox interface is extended to allow a Vector{Float64} state to be specified and stored in the ASTState at each update, this can be extracted in convert_s and passed into the NN. I've made those changes in my local copy to get things working, but perhaps there's a solution that's more in line with your vision for the package, in terms of genericity, etc. If you'd like, I can submit a pull request.