Leverage geospatial type composability in domain objects

Description

I noticed that all of the geospatial type constructors in the Python DSE driver take tuples rather than lower-level domain objects (e.g. LineString takes a tuple of (x, y) tuples rather than a tuple of Point objects; Polygon behaves similarly). I get that this is concise, but IMO it also makes the API more complex. Because the points of a line-string are stored as a tuple of (x, y) coords rather than a tuple of Point objects, the end-user doesn't see Points come out of line_string.coords when semantically that's what they are. In addition, the __str__ impls of all these types effectively repeat logic because they all muck with these coords instead of calling out to the contained objects' text-serialization methods.

On the consistency side, Java, NodeJS, and Ruby expose these relationships:

  • A Polygon is defined by an ordered collection of LineStrings. When you ask for the exterior_ring, you get a LineString object. When you ask for interior_rings, you get an array of LineString objects.

  • A LineString is defined by an ordered collection of Points and similarly exposes its collection of Point objects.

  • A Point is defined by its x and y (float) attributes.
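As a rough illustration of the relationships listed above, here is a minimal Python sketch of composed types. The class bodies, the `coords_text` helper, and the `points` attribute are hypothetical names made up for this example; this is not the driver's actual implementation, only a sketch of how each `__str__` could delegate to the contained objects instead of re-implementing the coordinate formatting.

```python
class Point:
    """Hypothetical composed Point (not the driver's class)."""

    def __init__(self, x, y):
        self.x = float(x)
        self.y = float(y)

    def coords_text(self):
        # The only place where a coordinate pair is turned into text.
        return "%r %r" % (self.x, self.y)

    def __str__(self):
        return "POINT (%s)" % self.coords_text()


class LineString:
    """Hypothetical LineString defined by an ordered collection of Points."""

    def __init__(self, points=()):
        self.points = tuple(points)

    def coords_text(self):
        # Delegates to each contained Point.
        return ", ".join(p.coords_text() for p in self.points)

    def __str__(self):
        return "LINESTRING (%s)" % self.coords_text()


class Polygon:
    """Hypothetical Polygon defined by an ordered collection of LineStrings."""

    def __init__(self, exterior_ring, interior_rings=()):
        self.exterior_ring = exterior_ring
        self.interior_rings = tuple(interior_rings)

    def __str__(self):
        # Delegates to each contained LineString.
        rings = (self.exterior_ring,) + self.interior_rings
        return "POLYGON (%s)" % ", ".join("(%s)" % r.coords_text() for r in rings)


ring = LineString([Point(0, 0), Point(1, 0), Point(1, 1), Point(0, 0)])
poly = Polygon(ring)
print(poly.exterior_ring.points[1])  # a Point object, not an (x, y) tuple
print(poly)
```

With this shape, asking a Polygon for its exterior_ring yields a LineString, and asking a LineString for its points yields Point objects, which is the behavior described for the other drivers.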

Environment

None

Pull Requests

None

Activity

Adam Holmberg
June 10, 2016, 3:03 AM
Edited

This API was intentionally chosen to interoperate with existing libraries in the Python ecosystem (Shapely being the predominant example). These "domain objects" are only there to materialize results. They return subtypes that can be used to construct instances in other libraries, and they are constructed from similar generic types. There is no reason to return more of these internal types unless we intend to implement the rich functionality found in other libraries. We also do not assume those other libraries are present, and do not change behavior even if they are.

Adam Holmberg
June 10, 2016, 3:07 AM

Also, keep in mind that while other drivers can choose and commit to an internal dependency for implementing the richer features, we have reasons for minimizing base dependencies.

Alan Boudreault
June 11, 2016, 2:04 AM

Since our extension doesn't aim to be GIS feature rich, I agree that it should be as minimal as possible. My understanding is that we should only provide a basic API to deserialize geo types from the DB. There are many GIS libs in Python and they are all used (Shapely, OGR, etc.). I prefer to let users choose one of these feature-rich libs and benefit from all of its GIS capabilities.

Adam Holmberg
June 11, 2016, 2:16 AM

Thanks for the suggestion. We thought this through in the original implementation and revisited for this ticket. I think composing these basic types does not buy anything other than naming the container, and ultimately it reduces ease-of-use with other libraries.

If we find a broad user base for a particular integration, we might be able to provide some conversion utilities with soft dependencies, but I think these basic types will remain generic.
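For illustration only, a conversion utility with a soft dependency could look roughly like the sketch below. The function name `to_shapely_linestring` is hypothetical and not part of the driver; the only third-party call assumed is Shapely's `shapely.geometry.LineString`, which accepts a sequence of coordinate tuples. The import happens inside the function, so Shapely is needed only by users who call it.

```python
def to_shapely_linestring(coords):
    """Convert a tuple of (x, y) tuples into a Shapely LineString.

    Hypothetical helper: Shapely is imported lazily, so it stays an
    optional (soft) dependency rather than a base requirement.
    """
    try:
        from shapely.geometry import LineString
    except ImportError:
        raise ImportError(
            "Shapely is required for this conversion; install it separately"
        )
    return LineString(coords)


coords = ((0.0, 0.0), (1.0, 1.0))  # generic tuples, as the basic types hold them
try:
    line = to_shapely_linestring(coords)
    print(line.length)
except ImportError as exc:
    print("conversion unavailable:", exc)
```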

Sandeep Tamhankar
June 11, 2016, 3:48 AM

Got it. I hadn't thought about ease of interoperability with other libraries. I totally get that you don't want to force such dependencies, and that you want to make it easy for the user to take our data and dump it into these third-party libraries (and vice versa).

IMO, you wouldn't lose a lot by having a 'raw' method on the driver's Point, LineString, and Polygon that returns tuples you can feed to other libraries (and conversely a constructor or static fromRaw that takes such tuples), and then you get a prettier API in the driver (I do like naming my containers!)
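To make the idea concrete, here is a minimal sketch of what that could look like. These classes are hypothetical and do not exist in the driver; `from_raw` is just the Python-style spelling of the fromRaw suggested above, and the `points` attribute is an assumed name.

```python
class Point:
    """Hypothetical Point with raw-tuple conversion."""

    def __init__(self, x, y):
        self.x = float(x)
        self.y = float(y)

    @classmethod
    def from_raw(cls, coords):
        # coords is an (x, y) tuple
        x, y = coords
        return cls(x, y)

    def raw(self):
        return (self.x, self.y)


class LineString:
    """Hypothetical LineString holding Point objects."""

    def __init__(self, points=()):
        self.points = tuple(points)

    @classmethod
    def from_raw(cls, coords):
        # coords is a sequence of (x, y) tuples
        return cls(Point.from_raw(c) for c in coords)

    def raw(self):
        # Tuple of (x, y) tuples, suitable for handing to other libraries.
        return tuple(p.raw() for p in self.points)


line = LineString.from_raw(((0, 0), (1, 2)))
print(line.points[1].x, line.points[1].y)
print(line.raw())
```

The trade-off is one extra call at the boundary with a third-party library in exchange for named containers inside the driver.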

In any case, I can see the argument that most users will probably pass geo-data along to third-party libraries for analysis and push geo-data back to DSE from those third-party libraries.

Won't Do

Assignee

Unassigned

Reporter

Sandeep Tamhankar

Fix versions

None

Labels

PM Priority

None

External issue ID

None

Doc Impact

None

Reviewer

None

Size

None

Pull Request

None

Components

Affects versions

Priority

Major