Location queries
Location is a data type. It can be represented either as a
point or a polygon, and each helps answer questions such as finding the top
3 stores nearest to a geographic point or the stores within a region. Since it is a
data type, there is some standardization available. SQL Server defines not one
but two data types for the purpose of specifying location: the Geography data
type and the Geometry data type. The Geography data type stores
ellipsoidal (round-earth) data such as GPS latitude and longitude, and the Geometry data type
stores data in a Euclidean (flat) coordinate system. The point and the polygon are
examples of Geography instances. Both the Geography and the Geometry data
types must refer to a spatial reference system, and since there are many of them,
each value must be associated with a specific one. This is done with the
help of a parameter called the Spatial Reference Identifier, or SRID for short.
SRID 4326 identifies WGS 84, the well-known system used by GPS, which gives positions in the
form of latitude/longitude. Translation of an address to a
Latitude/Longitude/SRID tuple is supported with the help of built-in functions
that simply drill down progressively from the overall coordinate span. A table such as ZipCode could have an
identifier, code, state, boundary, and center point with the help of these two
data types. The boundary is the polygon formed by
the zip code, and the center point is the central location within it. Distances
between stores, and their membership in a zip code, can be calculated based on this
center point. The Geography data type also lets us perform clustering analytics
which answers questions such as the number of stores or restaurants satisfying
a certain spatial condition and/or matching certain attributes. These are
implemented using R-Tree data structures that support such clustering
techniques. The geometry data type supports operations such as area and
distance because it translates to coordinates.
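As a rough illustration of the distance calculation against a center point mentioned above, the following Python sketch computes great-circle distances with the haversine formula and returns the nearest stores. The store names and coordinates are made up, the function names are hypothetical, and a spherical Earth of radius 6371 km is assumed. The Geography data type works on an ellipsoid, so its results would differ slightly.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_stores(center, stores, k=3):
    """Return the k stores closest to center, as (name, distance_km) pairs."""
    lat, lon = center
    ranked = sorted(
        ((name, haversine_km(lat, lon, s_lat, s_lon)) for name, (s_lat, s_lon) in stores.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:k]


# Hypothetical data: a zip code center point and a handful of stores.
zip_center = (47.6101, -122.2015)
stores = {
    "Store A": (47.6150, -122.2000),
    "Store B": (47.6740, -122.1215),
    "Store C": (47.6062, -122.3321),
    "Store D": (47.5301, -122.0326),
}

for name, km in nearest_stores(zip_center, stores):
    print(f"{name}: {km:.1f} km")
```

Sorting every store by distance is fine for a handful of rows; for large tables a spatial index would narrow the candidates first.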
The operations performed with these data types include
the distance between two geography objects, the range around a point
such as a buffer or a margin, and the intersection of two
geographic locations. Other methods supported
by these data types include contains, overlaps, touches, and within.
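To make the contains and within methods concrete, here is a minimal Python sketch of a point-in-polygon test using ray casting. It treats coordinates as planar, as the Geometry data type does, and the function names and sample boundary are hypothetical. It is only an illustration of the idea, not a description of how the database implements these methods.

```python
def polygon_contains(polygon, point):
    """Ray-casting test: True if point lies inside polygon.

    polygon is a list of (x, y) vertices in order. Coordinates are treated
    as planar, and points exactly on the boundary are not handled specially.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


def within(point, polygon):
    """'within' is 'contains' with the arguments swapped."""
    return polygon_contains(polygon, point)


# Hypothetical zip code boundary (a simple rectangle) and two store locations.
boundary = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
print(polygon_contains(boundary, (1.0, 1.0)))  # store inside the boundary
print(within((5.0, 1.0), boundary))            # store outside the boundary
```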
A note about the use of these data types follows. One
approach is to store the coordinates in a separate table whose primary key
is the pair of latitude and longitude, declared
unique so that no pair repeats. Such an
approach is questionable because the uniqueness constraint on locations carries a
maintenance overhead. For example, two locations could refer to the same point,
and unreferenced rows might then need to be cleaned up. Locations also change
ownership: store A could take over a location that was previously owned
by store B, while B never updates its location. Moreover, stores could undergo
renames or conversions. Thus, it may be better to keep the spatial data
together with the information about the location, even if coordinates repeat.
Also, these data types do not participate in set operations. Those are easy to perform
on collections and enumerables in the programming language of choice, and
usually consist of the following four steps: initialize the answer, accumulate
for each row, merge when combining the results from parallel workers, and return the
answer on termination. These steps resemble a map-reduce
algorithm. These data types and operations perform better with the help of a spatial
index. Spatial indexes are like indexes on other data types and are
stored as B-trees. Since a B-tree is an ordinary one-dimensional index, the
two-dimensional spatial data is reduced to one dimension by
means of tessellation, which divides the area into small subareas and records
the subareas that intersect each spatial instance. For example, for the
geography data type, the entire globe is divided into hemispheres and each
hemisphere is projected onto a plane. When a given geography instance covers
one or more subsections or tiles, the spatial index has an entry for
each tile that is covered. The geometry data type has its own
rectangular coordinate system that you define, which you can use to specify the
boundaries or the ‘bounding box’ that the spatial index covers. Visualizers
support overlays with spatial data, which are popular with mapping applications
that superimpose information over the map with the help of transparent layers.
An example is the Azure Maps with GeoFence as described here.
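Returning to the spatial index, the tessellation step described above can be sketched in Python as follows. A bounding box is divided into a fixed grid, each object records the cells that its own rectangle overlaps, and a plain dictionary stands in for the B-tree. This is a single-level grid written for illustration under those assumptions; the class and method names are hypothetical, and it is not a description of how SQL Server lays out its index internally.

```python
from collections import defaultdict


class GridIndex:
    """Single-level grid tessellation over a fixed bounding box."""

    def __init__(self, xmin, ymin, xmax, ymax, cells_per_side=4):
        self.xmin, self.ymin = xmin, ymin
        self.n = cells_per_side
        self.cell_w = (xmax - xmin) / cells_per_side
        self.cell_h = (ymax - ymin) / cells_per_side
        # cell id -> set of object ids; a sorted B-tree would play this role
        self.cells = defaultdict(set)

    def _cell_range(self, lo, hi, origin, size):
        """Grid rows or columns overlapped by [lo, hi], clamped to the grid."""
        first = max(0, min(self.n - 1, int((lo - origin) // size)))
        last = max(0, min(self.n - 1, int((hi - origin) // size)))
        return range(first, last + 1)

    def _tiles(self, rect):
        """Yield the one-dimensional cell ids overlapped by rect = (x1, y1, x2, y2)."""
        x1, y1, x2, y2 = rect
        for row in self._cell_range(y1, y2, self.ymin, self.cell_h):
            for col in self._cell_range(x1, x2, self.xmin, self.cell_w):
                yield row * self.n + col  # 2-D cell flattened to a single key

    def insert(self, obj_id, rect):
        """Record one index entry per tile that the object's rectangle overlaps."""
        for tile in self._tiles(rect):
            self.cells[tile].add(obj_id)

    def candidates(self, rect):
        """Objects sharing at least one tile with rect; an exact test would follow."""
        found = set()
        for tile in self._tiles(rect):
            found |= self.cells.get(tile, set())
        return found


index = GridIndex(0, 0, 100, 100, cells_per_side=4)  # each cell is 25 x 25
index.insert("store-1", (10, 10, 12, 12))   # fits in one tile
index.insert("region-7", (20, 20, 60, 30))  # spans several tiles
print(sorted(index.candidates((0, 0, 24, 24))))
print(sorted(index.candidates((80, 80, 90, 90))))
```

The index only returns candidates that share a tile with the query; an exact check such as contains or intersects would then be run on that smaller set.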