Skip to content

h3shape_to_cells_experimental takes string containment modes#436

Merged
ajfriend merged 14 commits intomasterfrom
aj/exp_contain
Jan 27, 2025
Merged

h3shape_to_cells_experimental takes string containment modes#436
ajfriend merged 14 commits intomasterfrom
aj/exp_contain

Conversation

@ajfriend
Copy link
Collaborator

@ajfriend ajfriend commented Dec 31, 2024

  • Just use string containment modes for h3shape_to_cells_experimental
  • Add polygon_to_cells and h3shape_to_cells_experimental and polygon_to_cells_experimental to docs
  • Avoid repeating tests when checking coverage

@ajfriend ajfriend changed the title polygon to cells just takes string arguments h3shape_to_cells_experimental takes string containment modes Dec 31, 2024
Comment on lines 573 to 578
containment_modes = {
'center': 0,
'full': 1,
'overlap': 2,
'bbox_overlap': 3,
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on names? I took a stab here to shorten things, but we should also align with upstream. I could go either way with bbox_overlap vs overlap_bbox, but this way each containment mode has a distinct first letter and should be less likely to accidentally select overlap_bbox when going for overlap.

@ajfriend
Copy link
Collaborator Author

Aside: The test configuration is getting a little wonky with things like:

test:
	./env/bin/pytest tests/test_lib --cov=h3 --cov=tests/test_lib --cov-fail-under=100

	./env/bin/pip install cython
	./env/bin/cythonize tests/test_cython/cython_example.pyx
	./env/bin/pytest tests/test_cython --cov=tests/test_cython

It will probably be easy for the makefile and the github actions files to get out of sync. Hopefully we can get back to a single test command when we figure out the Cython compilation issue and can have everything at 100% coverage.

@jmealo
Copy link

jmealo commented Dec 31, 2024

Looks good so far.

I'm reviewing this on mobile, so I likely missed it...

Are we checking that the input geometry is valid? Invalid geometry could lead to invalid memory access (unless h3 fixed this recently).

@ajfriend
Copy link
Collaborator Author

Are we checking that the input geometry is valid? Invalid geometry could lead to invalid memory access (unless h3 fixed this recently).

I don't think we currently do a ton of validation, but it it something we could consider adding more of in the future.

The H3 C library does some, and can halt and return an error code, but I'm not sure of the extent of validation it does. Maybe @nrabinowitz or @isaacbrodsky would be more familiar?

@jmealo, are you aware of any good test cases? Or maybe examples of other libraries that do the kind of validation you're thinking of?

@jmealo
Copy link

jmealo commented Dec 31, 2024

It was my understanding that this specific function was marked experimental in the h3 library because it failed fuzzing/valgrind-type tests.

In PostGIS we're going to be calling either ST_IsValid or ST_MakeValid to make it safe to call. In my testing, the overhead was extremely negligible.

For the memory safety and avoiding undefined behavior benefits alone it seems well worth it.

IIRC that function comes from GDAL. If the validation happens in Python, I would consider adding a boolean flag skipValidation that defaults to false.

This would be safe for folks who don't RTFM and for folks who do, provided they validate geometry elsewhere, no additional performance overhead.

I would add an unsafe_ prefix or explicitly name the function and input parameters to make it abundantly clear to only pass previously validated geometry or use a boolean flag to skip validation that defaults to false.

Edited: Clarified thoughts on safety versus performance.

@isaacbrodsky
Copy link
Collaborator

It was my understanding that this specific function was marked experimental in the h3 library because it failed fuzzing/valgrind-type tests.

The fuzzer issue related to underestimating the size of the output buffer with maxPolygonToCellsExperimentalSize. The function was made safe by passing in the size of the output buffer. This allows the function to detect that it would write past the end of buffer and return an error if so. There should not be any outstanding fuzzer or valgrind failures on this function. (There may be a few timeouts.)

@jmealo
Copy link

jmealo commented Dec 31, 2024 via email

return _out_collection(mv)


def polygon_to_cells_experimental(h3shape, res, flags=0):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the name of the function in the C library and on the docs site. Why is the alias being removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we discussed this previously, and the preferred function name in h3-py is h3shape_to_cells since polygon doesn't account for multipolygons. We added the alias polygon_to_cells() in h3-py just as a pointer for folks to discover the preferred function.

As this function is experimental, and we already have the pointer, I think adding an additional alias is redundant. Also, we'll get back to the status quo when we merge the experimental function into the primary function.

The resolution of the output hexagons
flags : int
Polygon to cells flags, such as containment mode.
flag : int
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not seem like an idiomatic change to me. While containment modes cannot be combined, I think the intention was to expand flags to cover other cases such as spherical vs Cartesian.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Cython function are internal and not part of the externally supported API, so we're free to change this in the future without officially breaking anything.

flags = ContainmentMode[flags]
except KeyError as e:
raise ValueError('Unrecognized flags: ' + flags) from e
if isinstance(flags, ContainmentMode):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I would prefer to keep the enum option in addition to the stringly-typed form.

Would typing pick up the available string options and offer them as suggestions? I don't see them in type annotations right now.

Copy link
Collaborator Author

@ajfriend ajfriend Jan 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think typing in the future is the more natural way to handle this.

We don't currently have types, but they could be added. We don't use them in the area/length functions, for example, to specify the output units. Thus, the current form is a smaller change that's more consistent with the existing library design, and doesn't preclude adding typing or enums in the future.

Would typing pick up the available string options and offer them as suggestions? I don't see them in type annotations right now.

Yes, that's my understanding, if we were to add the type annotations.

Copy link
Collaborator Author

@ajfriend ajfriend Jan 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And actually, if you're interested in adding support for IDE tooling like autocomplete/suggestions, I think we should do that in a separate issue or PR. That's outside the scope of this PR (or #432), as it affects the library as a whole.

And I'd definitely welcome that discussion, as I think it is needed!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know what? The function is experimental, so let's just try it and see how it works. I do still think we should separately take a look at typing in the library holistically.

isaacbrodsky added a commit to isaacbrodsky/h3 that referenced this pull request Jan 24, 2025
@ajfriend ajfriend merged commit 1ac484a into master Jan 27, 2025
42 checks passed
@ajfriend ajfriend deleted the aj/exp_contain branch January 27, 2025 00:57
@ajfriend ajfriend mentioned this pull request Jan 27, 2025
isaacbrodsky added a commit to uber/h3 that referenced this pull request Jan 27, 2025
* add docs for polygonToCellsExperimental

* remove

* python flags

* update for uber/h3-py#436
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants