Open
Description
The current implementation of SplitHonestExamples does not follow the description in "Generalized Random Forests" (Athey et al.).
yggdrasil-decision-forests/yggdrasil_decision_forests/learner/decision_tree/training.cc
Lines 4171 to 4191 in 48e5581
    void SplitHonestExamples(
        const absl::Span<const UnsignedExampleIdx> selected_examples,
        const float leaf_rate, utils::RandomEngine* random_engine,
        std::vector<UnsignedExampleIdx>& leaf_examples,
        std::vector<UnsignedExampleIdx>& working_selected_examples) {
      std::uniform_real_distribution<float> dist_01;
      // Reduce the risk of std::vector re-allocations.
      const float error_margin = 1.1f;
      leaf_examples.reserve(selected_examples.size() * leaf_rate * error_margin);
      working_selected_examples.reserve(selected_examples.size() *
                                        (1.f - leaf_rate) * error_margin);
      for (const auto& example : selected_examples) {
        if (dist_01(*random_engine) < leaf_rate) {
          leaf_examples.push_back(example);
        } else {
          working_selected_examples.push_back(example);
        }
      }
    }
The caption of Algorithm 1 in the GRF paper specifies that:
SplitSample randomly divides a set into two evenly-sized, non-overlapping halves.
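However, the current YDF honesty implementation does not guarantee this property. Each example is assigned to leaf_examples independently with probability leaf_rate, so the two sets are non-overlapping but their sizes are random (the leaf set size is binomially distributed) rather than fixed. Even with leaf_rate = 0.5, the two sets are generally not evenly sized.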
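One possible way to get fixed-size, non-overlapping sets is to shuffle the examples and cut the shuffled list at a fixed index. The following is only a minimal sketch of that idea, not a proposed patch: `SplitHonestExamplesExact` and the `ExampleIdx` alias are hypothetical names, and `std::mt19937` and `std::vector` stand in for YDF's `utils::RandomEngine` and `absl::Span` so the example compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for YDF's UnsignedExampleIdx.
using ExampleIdx = std::uint32_t;

// Sketch (assumption, not YDF code): splits `selected_examples` into two
// non-overlapping sets whose sizes are fixed by `leaf_rate` instead of being
// random. With leaf_rate = 0.5 the sizes differ by at most one.
void SplitHonestExamplesExact(
    const std::vector<ExampleIdx>& selected_examples, const float leaf_rate,
    std::mt19937& random_engine, std::vector<ExampleIdx>& leaf_examples,
    std::vector<ExampleIdx>& working_selected_examples) {
  assert(leaf_rate >= 0.f && leaf_rate <= 1.f);

  // Shuffle a copy so the caller's example list is left untouched.
  std::vector<ExampleIdx> shuffled(selected_examples.begin(),
                                   selected_examples.end());
  std::shuffle(shuffled.begin(), shuffled.end(), random_engine);

  // Number of examples reserved for the leaf values, rounded to nearest
  // and clamped to the number of available examples.
  const std::size_t n = shuffled.size();
  const std::size_t num_leaf = std::min(
      n, static_cast<std::size_t>(
             std::llround(static_cast<double>(n) * leaf_rate)));
  const auto cut = shuffled.begin() + static_cast<std::ptrdiff_t>(num_leaf);

  // Append to the outputs, as the push_back calls in the original do.
  leaf_examples.insert(leaf_examples.end(), shuffled.begin(), cut);
  working_selected_examples.insert(working_selected_examples.end(), cut,
                                   shuffled.end());
}
```

With 10 examples and leaf_rate = 0.5, this sketch always yields 5 leaf examples and 5 working examples. Unlike the current loop, it does not preserve the input order within each set, and it costs one extra copy of the example indices; whether either matters for the rest of the training code is not something this issue establishes.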