Skip to content

Conversation

@josh-audio
Copy link
Collaborator

@josh-audio josh-audio commented May 2, 2021

This PR adds a Repeater widget. This widget is used to manage the lifecycles of a set of widgets based on a vector in the state tree.

This is not quite done but it works, and I would really appreciate feedback.

9MJo0XOJ9R


This widget requires four things to function:

  1. A user-defined lens to get from the outer state to some im::Vector<InnerState>, where InnerState is the state of each child widget
  2. A closure to get some hashable ID from InnerState. This can just return InnerState if it is already hashable
  3. A closure to create a child widget based on some InnerState
  4. A closure to lay out a provided list of widgets based on OuterState, with the assumption that the im::Vector<InnerState> is trivially accessible. The state vector is guaranteed to have a one-to-one mapping to the widget vector. The user must lay out all children, similar to a regular layout function.

In return, the repeater ensures that one widget exists per data item in the im::Vector<InnerState>, and maintains their paint and event order.

Some general questions I have:

  • Does this have to use im::Vector? It certainly made it easier to wrangle all the generics. What might the barrier be to generalizing it to any iterable collection?
    • Just tried this, see comment below
  • The diff code is not an asymptotic nightmare, but are there obvious ways to improve it?
  • Ergonomics aren't my strong suit. Are there ways to make the constructor less messy?

TODO:

  • Move the layout closure to a builder function and make it optional
  • Detect when the user has duplicate keys and warn them about it
  • Update readme
  • Make sure the function boxes are needed
  • Profile for simpler diff algorithm
  • Clean up bad merge

@josh-audio
Copy link
Collaborator Author

josh-audio commented May 2, 2021

Just tried making this generic over any IntoIterator. It's not trivial. The current implementation requires collection items that can be looked up trivially by index, and IntoIterator does not guarantee this.

I can think of two ways to fix this, neither of which I like:

  • Duplicate the widget and manually change to support both im::Vector and im::HashMap
  • Offload the burden of creating collection <-> item lenses to the user

My use-case has data in HashMaps. I think I can put the data in sorted vectors and have a worst-case o(log(n)) lookup, which is totally reasonable. I still have some unsolved querying issues that might push me toward vectors anyway.

It would be nice to support unordered collections, but that may be necessarily outside the scope of this PR.

/// A separator widget.
pub struct Separator {
size: KeyOrValue<f64>,
width: KeyOrValue<f64>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks fishy, why add an unused field?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably a bad merge. Thanks for pointing this out, I'll clean it up.

let mut children_changed = false;

// Start
let mut stale = HashSet::new();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably have done it that way at first. And maybe what follows is moot because I'm not familiar enough with Rust internals. But assuming C++ knowledge transposes somewhat, I'd say it's unlikely that you will have such a large collection that a linear search into a Vec becomes a bottleneck. And so instead of using a complex HashSet-based add/remove, I'd simply compare 2 sorted vectors, which can be done in linear time. Most likely for moderate-sized collections this will be both simpler and faster.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and if you can use vectors, then it sounds to me like you can use itertools.merge_join_by() to greatly reduce the work on iterating, and just focus on the handling of added/removed items.

Copy link
Collaborator Author

@josh-audio josh-audio May 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're likely right that the simplest solution is also the fastest in most cases. It could make sense to branch by collection size here? I don't want this to ever be a bottleneck since update is called a lot, but I also want to avoid unnecessary performance overhead in the average case.

I'd simply compare 2 sorted vectors

The inputs are not sorted, but I do care about their order. The correct ordering of the widgets is determined by the position of each item in the vector from data. Since data represents the most current state, my goal is to get the ordering of my widgets to match that ordering, so in a sense the only valid sorting order is the order that the vector from data is already in. But the order of the vector from old_data is important too, as it should always match the current order of my widgets when the algorithm starts. So in a sense my goal is to find what operations need to be applied to old_data's vector to turn it into data's vector, and then apply those operations to my widget list.

Aside from this pre-existing ordering, I have nothing to sort by. Each data item has an ID that can be fetched using the self.id_getter closure. This ID must be hashable but not necessarily orderable. Maybe it's not too restricting to add Ord as a constraint? But then I still have to allocate memory for two new arrays and eat some CPU and memory cycles to sort them (since I still need the original ordering), and at that point I can't imagine there are performance wins at any collection size since both methods eat memory. The advantage with the current method is that, by creating a HashMap with <ID, target_index>, you gain O(1) lookup of the target index for each item. I need to perform this lookup for the sort at the end. If that lookup becomes O(n), then the algorithm has O(log(n)*n^2) worst-case complexity, whereas it currently has worst-case O(n*log(n)).

As I said above, I think it may still be possible to squeeze out some performance using a somewhat more in-place method, but this is speculative and I do want to profile some before I commit to implementing it.

@richard-uk1
Copy link
Collaborator

Hey @Secondflight, how do you feel about this PR. Are there any blockers in the main druid crates? (If there are I'll make a note of them in the top comment).

@josh-audio
Copy link
Collaborator Author

Hi @derekdreery, there's nothing blocking that I know of. I'm struggling to find bandwidth for this but I'm definitely still interested in finishing it up.

@richard-uk1
Copy link
Collaborator

@Secondflight cool, all sounds good :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants