@joto
Collaborator

@joto joto commented Oct 3, 2022

Handle specially the case where a multipoint/multilinestring geometry contains only a single point/linestring because of missing input data.

joto added 2 commits October 3, 2022 11:10
Handle the case specially where there is only a single
point/linestring in a multipoint/multilinestring geometry due to missing
input data.
@mboeringa

@joto,

Just wondering: is there any technical reason to turn a single point/linestring that should have been a multi-geometry, but isn't because objects are missing (e.g. in a regional extract), into a non-multi geometry type?

Multipoints and multilinestrings that contain only a single point/linestring are still valid according to SF (Simple Features), aren't they, or am I wrong?

Of course, there is no real objection to turning them into a point/linestring instead, unless the geometry column's type was set to the multi-variant rather than the generic geometry type, in which case PostgreSQL would reject the insertion.

Maybe simply documenting this specific behavior in the osm2pgsql manual would be enough.

@joto
Collaborator Author

joto commented Oct 3, 2022

@mboeringa The immediate technical reason is that single geometries are simpler and more efficient to handle than multi-geometries. That isn't a big deal here, because the case shouldn't come up often. But considering all the different ways of handling the corner cases, the behaviour does make sense:

  1. You can call as_multipoint() on nodes or relations, and as_multilinestring()/as_multipolygon() on ways or relations. That can make user code simpler, because the same code can handle the different cases.
  2. You want the node/way case to be efficient, i.e. to always generate a single geometry even when using those as_multi*() functions.
  3. It is easy to detect that we have a single geometry when writing into a multi-geometry column and just turn the single geom into a multi-geometry (which we do). This doesn't work the other way around, though. So that's another reason to go with the single geom.
  4. If you want to split multi geoms into several rows in the database, each with only a single geom, the code for iterating over a multi geom will still do the right thing with a single geom.
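
To make points 3 and 4 concrete, here is a small Python model of the behaviour. It is only a sketch for illustration, not osm2pgsql code: the classes and the functions `as_multilinestring()`, `to_multi_column()`, `to_single_column()` and `geometries()` below are hypothetical stand-ins, and raising an error for an empty member list is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple, Union

Coord = Tuple[float, float]


@dataclass(frozen=True)
class LineString:
    coords: Tuple[Coord, ...]


@dataclass(frozen=True)
class MultiLineString:
    parts: Tuple[LineString, ...]


Geometry = Union[LineString, MultiLineString]


def as_multilinestring(lines: List[LineString]) -> Geometry:
    """Hypothetical stand-in: build a geometry from the members that are available.

    If only one member is present (e.g. the others are missing from an
    extract), return the single geometry instead of a one-part multi.
    """
    if not lines:
        # Assumption of this sketch; the real handling of empty input may differ.
        raise ValueError("no member linestrings available")
    if len(lines) == 1:
        return lines[0]
    return MultiLineString(tuple(lines))


def to_multi_column(geom: Geometry) -> MultiLineString:
    """Point 3: a single geometry can always be wrapped when writing to a multi column."""
    if isinstance(geom, LineString):
        return MultiLineString((geom,))
    return geom


def to_single_column(geom: Geometry) -> LineString:
    """The other direction fails as soon as there is more than one part."""
    if isinstance(geom, LineString):
        return geom
    if len(geom.parts) == 1:
        return geom.parts[0]
    raise ValueError("a multi-geometry with several parts does not fit a single-geometry column")


def geometries(geom: Geometry) -> Iterator[LineString]:
    """Point 4: iterating works the same for single and multi geometries."""
    if isinstance(geom, LineString):
        yield geom
    else:
        yield from geom.parts
```

In this model, code that splits a geometry into one row per part can loop over `geometries(geom)` without caring whether `as_multilinestring()` returned a single or a multi geometry.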

And yes, the documentation isn't all that good around these issues. This is mostly because our thinking on it and our experience with it are still evolving, so I keep patching the documentation here and there, but at some point I have to do a more systematic write-up of all this.

@mboeringa

@joto,

Thanks for the detailed answer. Having your arguments documented here in the repository, in the relevant pull request and in answer to my questions, is already a start in documenting it... Should be fine for now, and I understand it is all still a bit fluid with ongoing development.

@lonvia lonvia merged commit e39c56b into osm2pgsql-dev:master Oct 4, 2022
@joto joto deleted the fix-multi-with-single branch October 4, 2022 06:49

3 participants