[GH-2545] Add ST_Collect_Agg aggregate function #2546
Agree. Let's all use |
Pull request overview
This PR adds a new spatial aggregate function ST_Collect_Agg that collects geometries into multi-geometries without dissolving boundaries, differentiating it from the existing ST_Union_Agg function.
Key Changes:
- Implements the ST_Collect_Agg aggregate function, which preserves individual geometries and duplicates
- Adds DataFrame API support for both Scala and Python
- Includes comprehensive test coverage for various geometry types and edge cases
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/AggregateFunctions.scala | Core implementation of ST_Collect_Agg aggregator class |
| spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/st_aggregates.scala | DataFrame API methods for ST_Collect_Agg |
| spark/common/src/main/scala/org/apache/sedona/sql/UDF/Catalog.scala | Registration of ST_Collect_Agg in function catalog |
| spark/common/src/test/scala/org/apache/sedona/sql/aggregateFunctionTestScala.scala | SQL-based tests for various use cases |
| spark/common/src/test/scala/org/apache/sedona/sql/dataFrameAPITestScala.scala | DataFrame API test for basic functionality |
| python/sedona/spark/sql/st_aggregates.py | Python API implementation with documentation |
| python/tests/sql/test_aggregate_functions.py | Python tests for aggregate function behavior |
| python/tests/sql/test_dataframe_api.py | Python DataFrame API test |
| docs/api/sql/AggregateFunction.md | Documentation with usage examples |
Did you read the Contributor Guide?
Is this PR related to a ticket?
[GH-XXX] my subject. Closes #<issue_number>
What changes were proposed in this PR?
- Add ST_Collect_Agg as a new spatial aggregate function
- Collects all geometries in a column into a multi-geometry (MultiPoint, MultiLineString, MultiPolygon, or GeometryCollection)
- Unlike ST_Union_Agg, this function does not dissolve boundaries - it simply collects geometries
- Add ST_Collect_Agg class in AggregateFunctions.scala
- Add DataFrame API support in st_aggregates.scala
- Register function in Spark SQL catalog
- Add Python API in st_aggregates.py
- Add Scala and Python tests
- Add documentation
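To make the collect semantics concrete, here is a minimal pure-Python sketch. It is not the Sedona implementation. The `collect` helper, the tuple-based geometry representation, and the rule for choosing the output type (one shared input type gives the matching multi-type, anything else gives GeometryCollection) are all assumptions made for illustration. The handling of an empty input is likewise only a choice made in this sketch.

```python
from typing import List, Tuple

# Hypothetical toy geometry: (type name, coordinate text). Illustration only,
# not a Sedona or JTS type.
Geometry = Tuple[str, str]

# Assumed mapping from a single input type to its multi-geometry counterpart.
_MULTI_TYPE = {
    "Point": "MultiPoint",
    "LineString": "MultiLineString",
    "Polygon": "MultiPolygon",
}


def collect(geoms: List[Geometry]) -> Tuple[str, List[Geometry]]:
    """Gather geometries into one container without merging or deduplicating."""
    kinds = {kind for kind, _ in geoms}
    if len(kinds) == 1 and next(iter(kinds)) in _MULTI_TYPE:
        # All inputs share one simple type: use the matching multi-type.
        out_type = _MULTI_TYPE[next(iter(kinds))]
    else:
        # Mixed (or, in this sketch, empty) input falls back to a collection.
        out_type = "GeometryCollection"
    # Members are kept as-is, duplicates included; nothing is dissolved.
    return out_type, list(geoms)


points = [("Point", "1 1"), ("Point", "1 1"), ("Point", "2 2")]
mixed = [("Point", "1 1"), ("Polygon", "0 0, 1 0, 1 1, 0 0")]

same_type_result = collect(points)
mixed_result = collect(mixed)
```

In this sketch the duplicate point appears twice in the result. A union-style aggregate would instead merge the inputs into a single dissolved geometry.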
How was this patch tested?
Did this PR include necessary documentation updates?