SARSA not working in GridWorld_td #24

With the default agent parameters, except for spec.update set to 'sarsa', the agent does not converge to the optimal solution.

// agent parameter spec to play with (this gets eval()'d on Agent reset)
var spec = {}
spec.update = 'sarsa'; // 'qlearn' or 'sarsa'
spec.gamma = 0.9; // discount factor, [0, 1)
spec.epsilon = 0.2; // initial epsilon for epsilon-greedy policy, [0, 1)
spec.alpha = 0.1; // value function learning rate
spec.lambda = 0.1; // eligibility trace decay, [0,1). 0 = no eligibility traces
spec.replacing_traces = true; // use replacing or accumulating traces
spec.planN = 0; // number of planning steps per iteration. 0 = no planning

spec.smooth_policy_update = true; // non-standard, updates policy smoothly to follow max_a Q
spec.beta = 0.1; // learning rate for smooth policy update

![image](https://user-images.githubusercontent.com/10457709/29272687-69ce38ec-8101-11e7-9916-1040d6521a5c.png)
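
For readers comparing the two settings of spec.update, here is a minimal sketch of how the one-step targets of the two rules differ. It is not the library's code: `tdTarget` and `epsilonGreedy` are hypothetical helpers written for illustration, and eligibility traces and the smooth policy update are left out. One possible factor in the behaviour above, stated as an assumption rather than a diagnosis, is that SARSA is on-policy: with epsilon held at 0.2 it learns the values of the epsilon-greedy policy it follows, which need not match the values Q-learning learns for the greedy policy.

```javascript
// Hypothetical helper, not part of the library: one-step TD target for a
// tabular agent. qNext holds the Q-values of the next state, and aNext is
// the action the behaviour policy actually selected in that state.
function tdTarget(update, r, gamma, qNext, aNext) {
  if (update === 'qlearn') {
    // Off-policy: bootstrap from the best action in the next state.
    return r + gamma * Math.max(...qNext);
  }
  if (update === 'sarsa') {
    // On-policy: bootstrap from the action that will actually be taken.
    return r + gamma * qNext[aNext];
  }
  throw new Error('unknown update rule: ' + update);
}

// Hypothetical epsilon-greedy selection over one state's Q-values.
// rand is injectable so the behaviour can be checked deterministically.
function epsilonGreedy(q, epsilon, rand = Math.random) {
  if (rand() < epsilon) {
    return Math.floor(rand() * q.length); // explore: uniform random action
  }
  return q.indexOf(Math.max(...q)); // exploit: greedy action
}

// The tabular update is the same for both rules; only the target differs.
function tdUpdate(qOld, target, alpha) {
  return qOld + alpha * (target - qOld);
}
```

Whenever the exploratory action differs from the greedy one, the two targets diverge, so the two rules can settle on different value estimates even with otherwise identical parameters.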

