Initialize with a list of ZoneIds. Fixes #180#182
Initialize with a list of ZoneIds. Fixes #180#182PhotoKevin wants to merge 4 commits intoRomanIakovlev:masterfrom
Conversation
commit cdab7ad Merge: d025c9f ddcaf60 Author: kevin <kevin@blackholeofphotography.com> Date: Sun Jan 4 19:59:49 2026 -0500 Merge branch 'InitializeByZoneNames' of https://github.com/PhotoKevin/timeshape into InitializeByZoneNames commit d025c9f Author: kevin <kevin@blackholeofphotography.com> Date: Sun Jan 4 19:59:43 2026 -0500 Added an example of the new initializer method. commit ddcaf60 Merge: c54ff64 ff837e7 Author: Kevin <kevin@blackholeofphotography.com> Date: Sun Jan 4 19:48:28 2026 -0500 Merge branch 'RomanIakovlev:master' into InitializeByZoneNames commit c54ff64 Author: kevin <kevin@blackholeofphotography.com> Date: Sat Jan 3 13:34:06 2026 -0500 Parameter order on assertEquals was backwards assertEquals parameters were reversed. Changed to expected, actual. commit 180966d Author: kevin <kevin@blackholeofphotography.com> Date: Sat Jan 3 13:31:18 2026 -0500 New initialization functions that use a list of ZoneIds. Added initialization functions to use a lists of ZoneIds. Users may find it easier to determine which time zones they're interested in instead what the bounding box is for, say, all of Europe. Added documentation comments for parameters.
| ZoneId zoneId = ZoneId.of(zoneIdName); | ||
| getPolygons(f).forEach(polygon -> { | ||
| if (timeZones.contains (zoneId)) { |
There was a problem hiding this comment.
I think you could move the timeZones.contains(zoneId) check outside of the getPolygons(f).forEach loop, because it doesn't depend on the polygon, only on the zoneId.
| ZoneId zoneId = ZoneId.of(zoneIdName); | |
| getPolygons(f).forEach(polygon -> { | |
| if (timeZones.contains (zoneId)) { | |
| ZoneId zoneId = ZoneId.of(zoneIdName); | |
| if (timeZones.contains (zoneId)) { | |
| getPolygons(f).forEach(polygon -> { |
There was a problem hiding this comment.
Good catch. It came from copying the original method.
|
|
||
|
|
||
| @SuppressWarnings("SizeReplaceableByIsEmpty") | ||
| static Index build(Stream<Geojson.Feature> features, int size, List<ZoneId> timeZones, boolean accelerateGeometry) { |
There was a problem hiding this comment.
I think Set works better here.
| static Index build(Stream<Geojson.Feature> features, int size, List<ZoneId> timeZones, boolean accelerateGeometry) { | |
| static Index build(Stream<Geojson.Feature> features, int size, Set<ZoneId> timeZones, boolean accelerateGeometry) { |
| * @return an initialized instance of {@link TimeZoneEngine} | ||
| */ | ||
|
|
||
| public static TimeZoneEngine initialize(List<ZoneId> timeZones, |
There was a problem hiding this comment.
This method looks like mostly a duplicate of the existing initialize method, with the exception of added timeZones parameter.
If you look at other initialize overloads, they don't duplicate the initialization logic, but rather make calls the initialize(double minLat, double minLon, double maxLat, double maxLon, boolean accelerateGeometry, TarArchiveInputStream f) overloaded method with the default values for parameters they do not accept. Would it be possible to do the same for this new overload?
This approach requires providing the default values for the parameters, and for timeZones I can think of 3 options for default values:
- Set of all the time zones supported by JVM, can be done with https://docs.oracle.com/javase/8/docs/api/java/time/ZoneId.html#getAvailableZoneIds--. One downside is that it returns
Set<String>, notSet<ZoneId>, so we'd have to fall back to strings equality check internally, but we should still exposeZoneIdas the parameter type. - Use empty set as default value and treat it specially in the
Index.buildmethod (if it's empty then make no checks on it, just include all the time zone from the data). - Replace inclusion check with exclusion in
Index.build, and use a complement of all the time zone ids as the default value. Advantage compared to option 2 is that there's no need to special-case the empty set inIndex.build.
What do you think of these options? I personally lean towards the option 2, as it incurs the least amount of overhead during initialization, but I'm also open to 1 and 3, since I don't believe checking a set membership requires a lot of compute.
There was a problem hiding this comment.
- It could be also done by converting the Set to Set before passing. More time, more memory, but tiny compared to the overall size of parsing the data files.
2 This seems best to me, although maybe as a null parameter instead of an empty set. I could go either way.
- I'm not sure of the overhead of checking a set membership, probably in the same realm as hoisting that if statement out. I'd have to look at the underlying implementation to be sure. I'd guess it's either a sorted array or a hash table since sets have to check on duplicates and that would hurt. But like option 1, it'd be lost in the noise of decompressing and reading data.tar.zstd.
There was a problem hiding this comment.
Ok let's go for option 2 then, but I'd prefer to avoid nulls and rather use Optional<Set<ZoneId>>.
There was a problem hiding this comment.
As I implemented this, I came to the conclusion that I've missed your point. Why have the time zones be optional? If someone doesn't want to specify a set of zones and instead take in the entire planet then they can call
public static TimeZoneEngine initialize()
or
public static TimeZoneEngine initialize(boolean accelerateGeometry)
Unless you mean to make an initialize that takes both a bounding box and a set of time zones and then decides which to use depending if the time zone set is present, or not.
The 'real' methods are:
public static TimeZoneEngine initialize(double minLat,
double minLon,
double maxLat,
double maxLon,
boolean accelerateGeometry,
TarArchiveInputStream f) {
public static TimeZoneEngine initialize(Optional<Set<ZoneId>> timeZones,
boolean accelerateGeometry,
TarArchiveInputStream f) {
Both of those do have duplicate code on the featureStream Spliternator which could probably be refactored out. But everything else is a layer around those two.
There was a problem hiding this comment.
Yes, I was talking about adding the Optional<Set<ZoneId>> to the main initialize method, so that it looks like:
public static TimeZoneEngine initialize(double minLat,
double minLon,
double maxLat,
double maxLon,
boolean accelerateGeometry,
TarArchiveInputStream f,
Optional<Set<ZoneId>> timeZones) {
However now that I look at this signature, it seems that the bounding box params (min-max lat-lon ones) don't make a lot of sense in the same method as the timeZones, so you're right, they should probably be 2 separate methods as you suggested in your previous comment. So if you can factor out the Spliterator and related code to avoid duplication between the two, it would be the best outcome.
There was a problem hiding this comment.
Yeah, a user would either use a bounding box, or a set of ZoneId but never both at the same time. We're totally on the same page now.
It'll be couple days before I can get to the Spliterator change, I'm heading out of town.
| * @param accelerateGeometry Increase query speed at the expense of memory utilization | ||
| * @return an initialized instance of {@link TimeZoneEngine} | ||
| */ | ||
| public static TimeZoneEngine initialize(List<ZoneId> timeZones, boolean accelerateGeometry) { |
There was a problem hiding this comment.
The same comment regarding the overloads and code duplication as above. If we can figure out a good default value for the timeZones, we won't need this method at all.
There was a problem hiding this comment.
On the subject of duplication, I'm going to put the number of time zones into a constant at the top of the file.
commit cdab7ad
Merge: d025c9f ddcaf60
Author: kevin kevin@blackholeofphotography.com
Date: Sun Jan 4 19:59:49 2026 -0500
commit d025c9f
Author: kevin kevin@blackholeofphotography.com
Date: Sun Jan 4 19:59:43 2026 -0500
commit ddcaf60
Merge: c54ff64 ff837e7
Author: Kevin kevin@blackholeofphotography.com
Date: Sun Jan 4 19:48:28 2026 -0500
commit c54ff64
Author: kevin kevin@blackholeofphotography.com
Date: Sat Jan 3 13:34:06 2026 -0500
commit 180966d
Author: kevin kevin@blackholeofphotography.com
Date: Sat Jan 3 13:31:18 2026 -0500