Skip to content

Conversation

@holoskii
Copy link

@holoskii holoskii commented Oct 5, 2025

Summary

Adds the geodesic traversal mode to polygonToCellsExperimental, letting us follow true great-circle edges when filling massive polygons (think continental footprints and polar caps). The PR builds the geodesic stack from the ground up—new vector math helpers, bounding caps, iterator wiring, and plenty of validation coverage—while keeping the default planar behavior unchanged (and safely rejecting geodesic flags on the legacy API).

Highlights

  • New geodesic mode

    • Introduces FLAG_GEODESIC_MASK (with helpers) so callers can opt into geodesic coverage; validates acceptable containment modes (FULL, OVERLAPPING) at both maxPolygonToCellsSizeExperimental and polygonToCellsExperimental.
    • Dispatches a dedicated geodesic iterator path that reuses the existing traversal logic but evaluates cells using great-circle geometry.
  • Core geometry plumbing

    • Adds GeodesicPolygon acceleration structures + helpers (geodesicPolygonCreate/Destroy, containment, boundary/cap intersection tests).
    • Extends the vector toolkit (latLngToVec3, cross/dot/normalize, distance helpers) and new conversions: vec3dToCell, cellToVec3, cellToGeodesicBoundary, cellToSphereCap.
    • Pulls bounding-cap tables into sphereCapTables.h for both prod code and tests; exposes SphereCap/AABB helpers.
  • Iterator & polyfill updates

    • Refactors polygon iterators into polyfill_iterator.h (with _extra context) and wires in geodesic_iterator.c.
    • Ensures geodesic flags short-circuit the legacy planar polygonToCells path.
  • Benchmarks, fuzzers, and tests

    • Benchmarks: geodesic variants for existing benchmark.
    • Fuzzers now exercise geodesic flag permutations.
    • Added new test suites: testGeodesicPolygonToCellsExperimental, testSphereCap, testVec3d, plus extended testBBoxInternal, testPolygonToCells(Experimental) flag validation, and memory-failure coverage.
    • Hooks new tests into CMakeLists.txt/CMakeTests.cmake.
  • Documentation

    • Updates the Regions API docs to explain the geodesic flag, trade-offs, and usage tips.

Attribution

This geodesic polyfill work originated at Yellowbrick Data (where I’m currently employed). We’re upstreaming it to the H3 project so the broader community can benefit and use it for coverage.

@CLAassistant
Copy link

CLAassistant commented Oct 5, 2025

CLA assistant check
All committers have signed the CLA.

@coveralls
Copy link

coveralls commented Oct 6, 2025

Coverage Status

coverage: 97.09% (-1.9%) from 98.946%
when pulling f3df47e on holoskii:geodesic_coverage
into 0ac92fb on uber:master.

@ajfriend
Copy link
Collaborator

ajfriend commented Oct 9, 2025

Thanks @holoskii for the contribution! Geodesic polyfill is an exciting addition that I think a lot of folks would use, myself included!

Please bear with us as we try to find the time to properly review this PR. Some initial questions from my end:

  • Is it fair to characterize this PR as providing two major improvements:
    1. geodesic mode for "polyfill", for polygons that would otherwise already be well-handled in the existing planar mode in polygonToCellsExperimental, and
    2. the ability to do polyfill on polygons larger or tricker (e.g. around poles or the antimeridian) than those that could be handled in the previous algorithm (and, if so, would these improvements also generalize to the existing planar mode?)
  • This may become obvious after I'm able to review more closely, but how much code can we share between the planar and geodesic algorithms?
  • Is it necessary to expose Vec3d and GeodesicCellBoundary in the public api (h3api.h.in), or can we keep those structs internal?
  • You mentioned that this mode is slower than the existing algorithm. How much slower? Do you have any example benchmarks you could share?
  • Any thoughts on what to do about the failing CI checks? Or would you like some input from the team?

Copy link
Collaborator

@isaacbrodsky isaacbrodsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution and opening this PR! Please let me know if you have questions. I am beginning to review the contribution and will add more comments as I read & test.

Comment on lines +508 to +511
void _geoToHex2d(const LatLng *g, int res, int *face, Vec2d *v) {
Vec3d p = latLngToVec3(g);
_vec3dToHex2d(&p, res, face, v);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a code change perspective, I note this is a little surprising because it appears it affects the indexing code path as well. (Not necessarily wrong, just something to review in detail)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted common bits into a helper to remove duplication. The original planar path behavior should be unchanged

Comment on lines 58 to 62
if (flags == CONTAINMENT_FULL || flags == CONTAINMENT_OVERLAPPING) {
uint32_t geodesicFlags = flags;
FLAG_SET_GEODESIC(geodesicFlags);
run(&geoPolygon, geodesicFlags, res);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want geodesic mode to be in a separate fuzzer to avoid fuzzer timeout issues.

Come to think of it, if we have more complicated input for the flags, we may want to also want to begin having a fuzzer for polygons (with or without holes) that also tries various flag values.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • The fuzzer timeout was due to a bug, not algorithmic slowness. After the fix, there should be no more timeouts.
  • I kept planar and geodesic in a single fuzzer since they share the same entry point/interface. If runtime becomes an issue, we can first raise the timeout before splitting targets.
  • For flags, there’s a test that enumerates all combinations: TEST(invalidFlagsExperimental).

@ajfriend ajfriend added this to the v4.5 milestone Oct 16, 2025
@holoskii
Copy link
Author

Fixes / changes

  1. Introduce an explicit epsilon for the SphereCap check. The previous iteration relied on tolerance built into precomputed cosine values; in practice, we also need an additional epsilon.
  2. Use GeodesicPolygon* instead of void* in the iterator structure.
  3. Fix the MSVC build by using a macro instead of a constant.
  4. Move GeodesicCellBoundary and Vec3d to their own headers instead of defining them in the public API header.
  5. Move internal headers to the include directory.
  6. Add a new geodesic benchmark.
  7. Rebase on the latest master.

Performance

I added a dedicated geodesic benchmark that exercises small, medium, and large shapes.

  • Small cells: up to 60× slower.
  • Medium cells / large shapes: ~20×.
  • “Primary” geodesic use case (huge shapes + large cells, e.g., NY ↔ London great-circle path): ~2.5×.

Notes:

  • At Yellowbrick Data, we’ve seen ~10× performance improvements by pushing heavy inlining (by moving code to headers), branch hints, etc. I avoided invasive changes here to keep code style intact; the current PR aims for minimal intrusion.

@holoskii
Copy link
Author

Thanks @holoskii for the contribution! Geodesic polyfill is an exciting addition that I think a lot of folks would use, myself included!

Please bear with us as we try to find the time to properly review this PR. Some initial questions from my end:

* Is it fair to characterize this PR as providing two major improvements:
  
  1. geodesic mode for "polyfill", for polygons that would otherwise already be well-handled in the existing planar mode in `polygonToCellsExperimental`, and
  2. the ability to do polyfill on polygons larger or tricker (e.g. around poles or the antimeridian) than those that could be handled in the previous algorithm (and, if so, would these improvements also generalize to the existing planar mode?)

* This may become obvious after I'm able to review more closely, but how much code can we share between the planar and geodesic algorithms?

* Is it necessary to expose `Vec3d` and `GeodesicCellBoundary` in the public api (`h3api.h.in`), or can we keep those structs internal?

* You mentioned that this mode is slower than the existing algorithm. How much slower? Do you have any example benchmarks you could share?

* Any thoughts on what to do about the failing CI checks? Or would you like some input from the team?

1.1 - Yes.

1.2 - The geodesic approach behaves better for large polygons and near poles/antimeridian due to great-circle behavior, but it’s architecturally different from the planar path; there isn’t a practical feature we can “back-port” without re-implementing substantial pieces.

2 - We only share FaceIjk and iterator code between the two implementations. Intersection/containment logic and related optimization structures are written from scratch for the geodesic mode.

3 - Performance metrics are now included in the main PR comment (see “Performance” above) with the new benchmark.

@holoskii holoskii requested a review from isaacbrodsky October 27, 2025 09:48
@holoskii
Copy link
Author

A few small fixes for the CI:

  • examples did not compile because of "constants.h" included in h3api.h. Removed that include to keep api header standalone
  • fuzzer hit a legitimate timeout. Fixed this by limiting resolution and number of points for geodesic runs

@holoskii
Copy link
Author

holoskii commented Nov 3, 2025

@isaacbrodsky was right, separate fuzzer is a way to go. Old fuzzers left not enough time until timeout in each iteration. In the last commit I've added a separate geodesic fuzzer. PR is now fully ready for review

@holoskii
Copy link
Author

Hi @nrabinowitz @dfellis @isaacbrodsky - friendly bump on this.
Current state: fixed CI issues, added benchmarks, answered questions and addressed the comments.

@isaacbrodsky
Copy link
Collaborator

Hi @nrabinowitz @dfellis @isaacbrodsky - friendly bump on this. Current state: fixed CI issues, added benchmarks, answered questions and addressed the comments.

Thanks for updating the PR and for the bump! We appreciate the contribution and know we need to review this, so it's just a matter of finding time to read through it in depth. We would like to bring this in for the next release of H3.

Comment on lines +173 to +177
src/h3lib/include/polyfill_iterator.h
src/h3lib/include/geodesic_iterator.h
src/h3lib/include/geodesic_polygon.h
src/h3lib/include/geodesic_polygon_internal.h
src/h3lib/include/geodesic_cell_boundary.h
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very minor, but we can rename these files to camelCase as with other source files

Comment on lines +29 to +33
LatLng stateVerts[] = {{0.645778804554910, -1.903190792178713},
{0.645682811446050, -1.780975856637062},
{0.715595465293187, -1.781167842854781},
{0.715578012000667, -1.903262350678044},
{0.645778804554910, -1.903190792178713}};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think either these coordinates are inverted compared to the London/NYC coordinates below, or other way around?

Comment on lines +40 to +54
static int populateGeoLoop(GeoLoop *g, const uint8_t *data, size_t *offset,
size_t size) {
if (size < *offset + sizeof(int)) {
return 1;
}
int numVerts = *(const int *)(data + *offset);
*offset = *offset + sizeof(int);
g->numVerts = numVerts;
if (size < *offset + sizeof(LatLng) * numVerts) {
return 1;
}
g->verts = (LatLng *)(data + *offset);
*offset = *offset + sizeof(LatLng) * numVerts;
return 0;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make a follow up issue to deduplicate the populateGeoLoop function between the three files its now in

Comment on lines +77 to +79
if (res == 0) {
res = 1; // resolution 1 tests more code paths compared to 0
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If res=0 is a plausible input we should exercise it under the fuzzer too.

uint32_t flags = CONTAINMENT_OVERLAPPING;
FLAG_SET_GEODESIC(flags);

resetMemoryCounters(1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to test the case that maxPolygonToCellsSizeExperimental fails due to memory allocation failing?

// of a distortion vertex on the last edge
int additionalIteration = length == NUM_HEX_VERTS ? 1 : 0;

// convert each vertex to lat/lng
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// convert each vertex to lat/lng
// convert each vertex to xyz

Comment on lines +365 to +385
* Encodes a Vec3d coordinate to the corresponding icosahedral face and
* squared euclidean distance to that face center.
*
* @param g The spherical coordinates to encode.
* @param res The desired H3 resolution for the encoding.
* @param h The FaceIJK address of the containing cell at resolution res.
* @param v3d The Vec3d coordinates to encode.
* @param face The icosahedral face containing the coordinates.
* @param sqd The squared euclidean distance to its icosahedral face center.
*/
void _geoToFaceIjk(const LatLng *g, int res, FaceIJK *h) {
// first convert to hex2d
Vec2d v;
_geoToHex2d(g, res, &h->face, &v);
static void _vec3dToClosestFace(const Vec3d *v3d, int *face, double *sqd) {
// determine the icosahedron face
*face = 0;
// The distance between two farthest points is 2.0, therefore the square of
// the distance between two points should always be less or equal than 4.0 .
*sqd = 5.0;
for (int f = 0; f < NUM_ICOSA_FACES; ++f) {
double sqdT = vec3DistSq(&faceCenterPoint[f], v3d);
if (sqdT < *sqd) {
*face = f;
*sqd = sqdT;
}
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move this function to the bottom of the file if possible to minimize the diff

Comment on lines +40 to +43
if (iter->_polygon->geoloop.numVerts <= 0) {
iterErrorPolygonCompact(iter, E_DOMAIN);
return NULL;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a zero-vertex geodesic polygon is not allowed?

Comment on lines +44 to +59
if (iter->_polygon->numHoles > 0 && iter->_polygon->holes == NULL) {
iterErrorPolygonCompact(iter, E_DOMAIN);
return NULL;
}
for (int i = 0; i < iter->_polygon->numHoles; i++) {
if (iter->_polygon->holes[i].numVerts <= 0) {
iterErrorPolygonCompact(iter, E_DOMAIN);
return NULL;
}
}
GeodesicPolygon *poly = geodesicPolygonCreate(iter->_polygon);
if (!poly) {
iterErrorPolygonCompact(iter, E_MEMORY_ALLOC);
return NULL;
}
iter->geodesicPoly = poly;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of these cases don't seem to be tested. numHoles being set while holes is NULL makes sense to check although if someone mismatched the holse and numHoles there is not a lot we can do about it in general.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding test coverage, do we want to update this PR to cover all cases, or merge this PR and then open additional PRs to complete the coverage of new code?

return E_DOMAIN;
}

*out = (GeodesicLoop){0};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
*out = (GeodesicLoop){0};
*out = {0};

I assume this is not needed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants