Policy tree - double robust scores and rewards #149

In layman's terms, can you please explain how the policy_tree function takes the double_robust_scores (i.e., the Gamma.matrix from a causal forest, which is interpreted as rewards) and uses them to generate the rules? Are positive or negative values considered desirable in these reward matrices? I would generally think that lower or negative treatment effects are desirable, but I am not clear on how these reward values are generated in a causal forest.

I am asking because, in my current project, where I use my causal forest's doubly robust scores as input to the policy tree, I am getting suboptimal results when I evaluate the policy tree on the test subsample (compared with the truth in that same test subsample). I therefore wanted to confirm whether the policy tree actually seeks to _reduce_ the incidence of a binary outcome, rather than _increase_ it.

