Cluster computing: Location services using spatial data

Monday, July 29, 2013

Location services using spatial data

In applications like the NerdDinner samples posted online, location data is often represented as a latitude and a longitude, stored in the database as non-nullable doubles or floats. Stores or restaurants near a given location are found by computing the Cartesian distance between the latitude and longitude of that location and those of each store or restaurant, usually in a database function or stored procedure. The table of stores or restaurants is scanned to compute each row's distance from the input location, and the results are ordered by that distance. The list is then filtered to the top handful and displayed on a map, for example with the Google Maps API. Application developers find it easy to write controllers that use these database objects, or LINQ to SQL, to render the appropriate views.
However, this approach has some limitations. First, it doesn't scale when there are hundreds of stores or restaurants in the radius of interest, because every query scans the whole table. Second, it doesn't answer frequently repeated queries such as finding the points inside a polygon, for example the one formed by a zip code. Queries could also find the distance between two points more efficiently if the location were stored in the GEOGRAPHY or GEOMETRY data type in SQL Server. The geography data type stores ellipsoidal (round-earth) data such as GPS latitude and longitude, and the geometry data type stores data in a Euclidean (flat) coordinate system. One could then have a table such as:
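The scan-and-sort lookup described above can be sketched in a few lines. This is only an illustration in Python, not the NerdDinner code: the store names, the coordinates, and the function names `planar_distance` and `nearest_stores` are all made up for the example, and the distance is the flat Cartesian distance in degrees rather than a true distance on the earth's surface.

```python
import math

# Hypothetical sample rows: (name, latitude, longitude)
STORES = [
    ("A", 47.61, -122.33),
    ("B", 47.62, -122.35),
    ("C", 47.70, -122.20),
    ("D", 47.50, -122.40),
]

def planar_distance(lat1, lon1, lat2, lon2):
    """Cartesian distance in degrees, treating latitude/longitude as flat x/y."""
    return math.hypot(lat1 - lat2, lon1 - lon2)

def nearest_stores(lat, lon, stores, top=3):
    """Scan every row, rank by distance to (lat, lon), and keep the top few."""
    ranked = sorted(stores, key=lambda s: planar_distance(lat, lon, s[1], s[2]))
    return ranked[:top]

nearest = nearest_stores(47.61, -122.34, STORES, top=2)
print([name for name, _, _ in nearest])
```

Every call touches every row before sorting, so the cost of a lookup grows with the size of the table rather than with the number of nearby stores.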
ZipCodes
- ZipCodeId
- Code
- StateID
- Boundary
- Center Point
Boundary is the polygon formed by the zip code, and Center Point is the central location within it. The distance from a store to a zip code can be calculated using this center point, and a store's membership in a zip code can be tested against the boundary. The geography data type also lets you perform clustering analytics, which answer questions such as how many stores or restaurants satisfy a certain spatial condition and/or match certain attributes. Queries like these are typically served by a spatial index. R-trees are the classic data structure for this; SQL Server's own spatial indexes are built on B-trees over a hierarchical grid decomposition of space.
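To make the zip-code membership question concrete, here is a minimal sketch of a point-in-polygon test using ray casting, followed by a count of the stores inside a boundary. It is an illustration in Python under simplifying assumptions: the boundary is treated as a flat polygon of (latitude, longitude) vertices, and the names `point_in_polygon`, `stores_in_boundary`, and the sample data are hypothetical. A spatial database would answer the same question with its own operators (SQL Server, for instance, has STIntersects) and a spatial index instead of a loop.

```python
def point_in_polygon(lat, lon, boundary):
    """Ray-casting test: True if (lat, lon) lies inside the polygon.

    boundary is a list of (lat, lon) vertices; the polygon is treated as flat.
    """
    inside = False
    n = len(boundary)
    for i in range(n):
        lat1, lon1 = boundary[i]
        lat2, lon2 = boundary[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which this edge crosses the point's latitude.
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside

def stores_in_boundary(stores, boundary):
    """Return the names of the stores whose location falls inside the boundary."""
    return [name for name, lat, lon in stores if point_in_polygon(lat, lon, boundary)]

# Hypothetical square "zip code" and stores, in arbitrary flat coordinates.
ZIP_BOUNDARY = [(0, 0), (0, 10), (10, 10), (10, 0)]
SAMPLE_STORES = [("A", 2, 3), ("B", 12, 3), ("C", 9, 9)]

print(stores_in_boundary(SAMPLE_STORES, ZIP_BOUNDARY))
```

Counting the returned names gives the kind of aggregate the clustering questions above ask for, such as the number of stores in a zip code.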
Spatial data types such as the geography data type are now supported in Entity Framework 5, as described here, and are therefore available in LINQ, as explained here. .NET also supports these data types through the SqlGeography and SqlGeometry types, which translate and map easily to their equivalent SQL Server data types.
One approach is to store the coordinates in a separate table whose primary key is the pair of latitude and longitude, declared unique so that no pair repeats. This approach is questionable because the uniqueness constraint on locations carries a maintenance overhead. For example, two stores could refer to the same point, and rows that are no longer referenced might need to be cleaned up. Locations also change ownership: store A could take over a location previously owned by store B while B never updates its location. Moreover, stores could undergo renames or conversions. It may therefore be better to keep the spatial data, even if it repeats, alongside the information about the store it describes.
Map APIs such as Google Maps or Bing Maps let you work with spatial data types, with the usual caveat that locations should not be stored or cached independently.
References: StackOverflow and MSDN
