Feature: Added extra scanning operators to EventStream.#147
Feature: Added extra scanning operators to EventStream.#147SgtSwagrid wants to merge 5 commits intoraquo:masterfrom
Conversation
|
Thanks for the PR! Just to set the expectations straight out – the upcoming V18 release is full, unfortunately I don't have the bandwidth to add more stuff to it (there's more to it than already-made code). With that it mind, some comments:
These specific ones are intermittent and should happen very rarely, I wouldn't worry about it. I probably got the test setup wrong.
The signal version of this exists for a purpose – it allows StrictSignal to return a StrictSignal (and not just Signal) from scanLeft operators. EventStream has no such need, so I'm not sure if extracting these into a similar trait makes sense. Maybe it does, but that's a question for quite a bit later in V19. I'm still unsure of this new *Ops architecture that I just made for V18. I think I might need to try getting rid of the BaseObservable type (replacing Self type param with a type member) to move forward on it, but I haven't even started on that yet. Operators in the PR:
Other operators – not sure. We can't feasibly copy over the entire collections API onto streams, so we need to be selective in where we apply our efforts, mostly for the reasons below. There are three main issues with the design and implementation of these operators (and any operators really):
So as you see it's unfortunately a lot more complicated to add well-behaved operators to Airstream than to just use helpers locally – it's quite possible that your code doesn't care much about starts / stops and errors because it doesn't use either of these, but anything that's in Airstream needs to carefully account for that – when it doesn't, it inevitably becomes a bug report. I've just finished my last sprint for V18 and I still need to write tons of documentation for everything in that release; I won't have the time budget to spend much time on design / review of new operators any time soon, so right now, this is the extent of the feedback I can provide, I'm sorry that it's not more positive, it's just a non-trivial problem to design, and I'm very low on resources for that. I think some of these operators can make it to V19 but they would need a significant amount of work and testing and it would be likely months before I can dedicate the time to design and review it properly. The silver lining is that you don't need my blessing to use these operators, you've implemented them all "in userland" so you can simply add them to your Airstream observables as extension methods. If you have bigger ideas for more Airstream operators you could even publish a library with all of them, if you're ok with the issues I mentioned about error handling etc. (it's perfectly fine to be ok with those issues – I sometimes make quick ad-hoc operators for my own app code as well, when I can't afford the time to implement them more carefully in Airstream core). |
|
Thanks for taking the time to look over my bag of goodies, I appreciate it and your feedback is very helpful!
That's absolutely fine, I don't care, and I'm in no rush here, so V19 or later is perfect.
Don't let me tell you want to do, but I've had some success with 'Ops'-ifying things in the past. I think it's a good direction; even where it's not necessary for creating versions with different return types, it still makes it clear which 'helper' operations rely on which underlying functionality. (Since you also want Signal versions of these ops, it may even make sense to introduce a common 'ScanLeftOps' from which the Signal and Event versions both inherit, if we find that the semantics are indeed the same for some of the ops.)
I can't see why that change would be necessary. Have you tried writing e.g.
For
Nice! I thought For
Remaining are In my eyes, the most unclear operation here would be I would like to include The remaining three (
I had briefly considered this issue. However, I quickly brushed it aside as I assumed the semantics of the underlying So either the counts reset to
Now this is something that I hadn't thought about at all. Your suggestion sounds reasonable to me: errors shouldn't be counted but they shouldn't block everything either. Again, I'll test and document this. We could also consider adding a matching
I'd be happy to do this if/where you think it would be helpful to have. Although, many of the ops don't seem to make sense for signals at all. In fact, I would argue that all of these don't make semantic sense for a signal, unless we think of a signal as a kind of "pseudo-event stream" over its updates, rather than strictly as a time-varying value. I guess there is some precedent for thinking of a signal in this way in Airstream, though. Other ConsiderationsSomething that you didn't mention was overflow. While it seems unlikely to me that someone would ever exceed |
No change in behaviour. Borrowed from #147
Unfortunately the signal version of scanLeft needs a different initial value signature (it's a function), so that operator itself can't be shared. But other operators like
I don't remember why exactly I had to implement BaseObservable like this, but it had something to do with being unable to write a more specific Self type, similar to what you mentioned, that would compile. But I didn't know (enough) about type members back then, so I was blind to the possibility of using them. I have since used type members in Laminar and I think they have a chance of working better for this. But that doesn't really affect your operators aside from simple packaging.
Makes sense, protest approved. Matching collections lib is actually a good heuristic here, and the
At first pass the logic seems reasonable, I'm only unsure about the naming. I try to keep naming pattern sort of consistent with the collections lib, so I originally had
Hmm, I think a single
Let's keep the optional
Aside from implementation details, it does not seem that there is a compelling use-case for the
I'm ok to include the
Currently we only directly expose event indices in
Nah, those are crazy numbers. We don't even need a Long, an Int would do fine. We use Int-s for event indices elsewhere. There's zero chance anyone would actually emit 2 billion events in a single observable, let alone want to read the index of all those events. If Airstream was used on the backend I would review this assumption, but as we're working with client side code, Long is way overkill for this. The downside of Long-s is that it's awkward to define them for values that are typically very small, and they are also not very efficient on Scala.js. Currently Airstream does not use Long-s anywhere except in one interface where it's required for interop with Java's FlowPublisher.
If we have to choose then the prevailing Airstream semantic is to not to reset, so yes, we should follow that. Although it would be good to provide a Error handlingI thought about this for a bit and I think one robust way to solve this is to add two parameters to the existing scanLeft operator:
I'm not sure where to add them – in the first argument list or the second. Probably the first. It'll be kinda ugly anyway, if the user actually specified those params, but... seems unavoidable. if resetOnStop is true, then:
if ignoreErrors is true, then:
This is for the scanLeft public API, but scanLeft is implemented in terms of scanLeftRecover, which should also have the resetOnStop argument, but the public scanLeftRecover method should probably should not expose a ignoreErrors argument, since scanLeftRecover already has built-in logic to handle upstream errors (the Failure cases). So maybe both scanLeft and scanLeftRecover should just call into the ScanLefSignal class. And scanLeftRecover brings me to...
Not a typo. When user provides an A => B, we wrap it into Try: I wouldn't say that there is a good justification for this, I did this as a speculative performance tradeoff, but honestly I don't know if it's even a good tradeoff, especially now years later. I would be open to reviewing this decision throughout Airstream with a careful implementation that is the lowest runtime overhead possible (e.g. ideally don't want to create stray instances of Try on every event, so probably a try-catch based wrapper, unless it notably affects performance in the typical no-catch case), but that's a separate question for which the solution would be an all-encompassing upgrade throughout Airstream. As far as the operators in this PR are concerned, we can keep using the same assumption that the user code will not to throw if provided as a Try[A] => Try[B] function. Signal versions of these operatorsOn a high level, we think of Signals as holders of state, and streams as pipelines transforming events, but the line in the sand can get blurry sometimes, and the implementation details, such as Signals having memory of their last value, or streams having async operators, sometimes dictate their use and integration with each other. I'm trying not to withhold operators from Signals unless they truly don't make sense. I would say that operators that boil down to a variation of
Why no filter on Signals: because signals always need to have a value, so we can't let you filter out the signal's initial value. Why async: because it's essentially a filter, and also because pulling from pull-based signals (which we're increasingly adding support for) doesn't really work the way you'd expect with async delays. "But what if I want filter (or async) ops on a signal and I'm ok with not filter-ing (or async-ing) the signal's initial value:" then you can use
MiscI ran scalafmtAll on master to reduce unrelated diff |
Returning signals vs streamsOne more issue, since we went quite deep already. The But, all the operators that are implemented in terms of Take The other ones, If you look at the operators available on streams, those of them that have any kind of internal state and can be implemented as signals (i.e. don't have filtering or async logic), are all implemented as signals. The only exceptions are:
In general, stream operators that return streams don't tend to track any significant state, and that's for a good reason – streams aren't well designed to manage state, signals are. So with So can we reasonably make I'm guessing |
2b3bba6 to
f90fc3c
Compare
7bded4a to
9139a7a
Compare
|
A quick note on nomenclature: The reduction-style operators of the Scala stdlib differ along two dimensions:
In our case, whether or not this should count as keeping all values or only the final one is open to interpretation. I suppose your use of If Airstream's Some options include:
I still feel like |
If e.g. an I would suggest that the initial value should evaluate eagerly just throw the error up the current call stack. Otherwise the only other option is to make the fold fail forever (until restart). |
This has been implemented in #153. |
Note: Deferred until after #153, which aims to strengthen the underlying
scanLeft.Change 1: Refactor of EventStream
Refactored EventStream to move operations related to scanLeft into a new trait ScanLeftEventOps, in a similar manner as was already done for Signal in the form of ScanLeftSignalOps. In particular, the following methods were relocated:
The motivation for the above was a desire to add some additional operators related to scanning without cluttering EventStream.
Change 2: Addition of scanning operators
Added the following operators to EventStream via the trait ScanLeftEventOps:
Many of the above are directly based on or inspired by equivalents from the Scala standard library (e.g. reduceLeft, zipWithIndex, sliding, grouped). The semantics of each is probably exactly as you'd expect, and is well documented in each case.
The main character here is sliding. This is something I've needed/used many times in my own projects, and I'd be very surprised if others haven't had the same experience. It just seems like a sensible thing to have. The remainder of the functions here are mostly either variants of sliding, or used in the implementation of sliding.
Example Usage
Consider the case where you need to produce a stream of a mouse movement deltas (e.g. for aiming in a first-person game or panning in a map). Unfortunately, MouseEvent (triggered by mouse movement, among other things) will only tell you the current cursor position. Given an
EventStream[MouseEvent], we can use pairs to instead obtain a stream of movements:Testing
Manual Testing
All code was manually inspected and tested to rule out obvious logic errors. The possibility of more subtle errors is considered unlikely but not entirely ruled out.
Automated Tests
Additionally, ScanLeftSpec was created and filled with automated tests that cover the new functions. All of these new tests pass when using
sbt +test. There are some test failures inthree-level from-future flatMapSwitchandresetOnStop=true + drop(1), which appear to be unrelated.Linting
The code was reformatted with
sbt scalafmtAll,which also updated some unrelated files.LLM Usage
All code was written by me by hand, with the exception of ScanLeftSpec which was written entirely by Claude and manually inspected.