@alexeyegorov
Contributor

Did you read the Contributor Guide?

Is this PR related to a ticket?

  • No:
    • this is a documentation update. The PR name follows the format [DOCS] my subject

What changes were proposed in this PR?

Major change

This PR adds the documentation based on the discussion about spatial joins in Sedona.
During some experiments, we found a workaround to perform a spatial left join without broadcasting either dataset (e.g. when both datasets are too large to broadcast).

Minor change

There was a minor typo in the documentation of the distance join which is fixed here.
If mixing the two changes in one PR is not wanted, the typo fix can be dropped.

How was this patch tested?

Did this PR include necessary documentation updates?

This PR does not affect any public API.
It is just updating documentation and suggests a solution to perform a spatial left join.

@jiayuasu
Member

@alexeyegorov thanks for the PR. Would you please run pre-commit run --all-files locally to fix the failed CI?

@alexeyegorov
Contributor Author

@jiayuasu Sorry I missed that. Now it is done.

Hope to get some feedback on the content. :)

@jiayuasu
Member

@alexeyegorov The pre-commit still failed.

In addition, I think the full outer join and COALESCE are not needed if the target is a left join from dfa.geom to dfb.geom?

I think the query below is what you need:

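-- Step 1: inner spatial join to find the matching (a_id, b_id) pairs
WITH inner_join AS (
    SELECT
        dfA.a_id,
        dfB.b_id
    FROM dfA, dfB
    WHERE ST_INTERSECTS(dfA.geometry, dfB.geometry)
)
-- Step 2: ordinary left join on a_id; dfA rows without a match get b_id = NULL
SELECT
    dfA.*,
    inner_join.b_id
FROM dfA
LEFT JOIN inner_join
ON dfA.a_id = inner_join.a_id;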
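
The shape of the query above can be tried without a Sedona cluster. The following is only a minimal sketch, not the code from the PR: it assumes SQLite as a stand-in engine, encodes each geometry as a hypothetical 'xmin,ymin,xmax,ymax' bounding-box string, and replaces ST_INTERSECTS with a toy user-defined function named boxes_intersect. The helper spatial_left_join and the sample rows are likewise made up for illustration. It shows nothing about how Sedona plans the join; it only checks that the inner join followed by a left join on a_id keeps unmatched dfA rows.

```python
import sqlite3


def boxes_intersect(a: str, b: str) -> int:
    """Toy stand-in for ST_INTERSECTS (assumption, not a Sedona API).

    Each argument is an axis-aligned box encoded as 'xmin,ymin,xmax,ymax'.
    Boxes that merely touch count as intersecting.
    """
    ax1, ay1, ax2, ay2 = map(float, a.split(","))
    bx1, by1, bx2, by2 = map(float, b.split(","))
    return int(ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2)


def spatial_left_join(rows_a, rows_b):
    """Run the inner-join-then-left-join pattern on in-memory tables.

    rows_a: iterable of (a_id, box) tuples; rows_b: iterable of (b_id, box).
    Returns (a_id, b_id) pairs; b_id is None for dfA rows with no match.
    """
    conn = sqlite3.connect(":memory:")
    conn.create_function("boxes_intersect", 2, boxes_intersect)
    conn.execute("CREATE TABLE dfA (a_id INTEGER, geometry TEXT)")
    conn.execute("CREATE TABLE dfB (b_id INTEGER, geometry TEXT)")
    conn.executemany("INSERT INTO dfA VALUES (?, ?)", rows_a)
    conn.executemany("INSERT INTO dfB VALUES (?, ?)", rows_b)
    query = """
        WITH inner_join AS (
            SELECT dfA.a_id, dfB.b_id
            FROM dfA, dfB
            WHERE boxes_intersect(dfA.geometry, dfB.geometry)
        )
        SELECT dfA.a_id, inner_join.b_id
        FROM dfA
        LEFT JOIN inner_join ON dfA.a_id = inner_join.a_id
        ORDER BY dfA.a_id, inner_join.b_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    # Hypothetical sample data: box 1 overlaps both dfB boxes, box 2 overlaps none.
    sample_a = [(1, "0,0,2,2"), (2, "10,10,11,11")]
    sample_b = [(100, "1,1,3,3"), (101, "1.5,1.5,5,5")]
    print(spatial_left_join(sample_a, sample_b))
    # prints [(1, 100), (1, 101), (2, None)]
```

Note that a dfA row matching several dfB rows appears once per match, which is the usual left join behaviour.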

@alexeyegorov
Contributor Author

> @alexeyegorov The pre-commit still failed.

Sorry, noob mistake! Fixed this now.

> In addition, I think the full outer join and COALESCE are not needed if the target is a left join from dfa.geom to dfb.geom?

That's a great hint. I slightly overcomplicated that. I will update the code after checking it against the data a few more times.

Would a suggested dbt macro be helpful here? There is no dedicated dbt documentation in here yet.

Member

@jiayuasu jiayuasu left a comment


LGTM now. Thank you!

@jiayuasu jiayuasu merged commit 16caf71 into apache:master Dec 16, 2025
9 checks passed

2 participants