Predicates are expected to evaluate the same way
regardless of which layer they are implemented in. If we have a set of
predicates combined with an OR clause rather than an AND clause, then each
predicate produces its own result set, and the same records may appear in more
than one of them. When we filter on one predicate and also allow matches on
another, the two result sets must be merged into one before the result is
returned to the caller. Since the merged set may contain duplicates, the merge
may have to return only the distinct elements, which is easily done by
comparing the unique identifiers of the records in each result set.
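The merge described above can be sketched as follows; the `id` field standing in for the record's unique identifier is an assumption for illustration:

```python
def merge_distinct(result_sets):
    # Merge the results from each OR-branch predicate, keeping only one
    # copy of any record that matched more than one predicate.
    seen, merged = set(), []
    for results in result_sets:
        for record in results:
            if record["id"] not in seen:  # compare unique identifiers
                seen.add(record["id"])
                merged.append(record)
    return merged
```

The first occurrence of a record wins, so the merged order follows the order in which the predicates were evaluated.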
The result must be selected before
determining the section that is returned to the user. This section is
identified by a start and offset pair over the enumeration of the results. If
the queries remain the same over time and the requests vary only in their
paging parameters, then we can even cache the result and return only the paged
section. The API persists the predicate and its result set in a cache so that
subsequent paging-only calls yield the same responses. This can even be done as
part of predicate evaluation by simply passing the well-known limit and
offset parameters directly in the SQL query. In an enumerator we do this with
Skip and Take. An OData client performs client-driven paging using the $skip
and $top query options.
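The cache-then-page idea can be sketched as below; `PagedQueryCache` and its `evaluate` callback are hypothetical names, with the callback standing in for the layer that actually evaluates the predicate against the store:

```python
class PagedQueryCache:
    """Cache the full result of a predicate so that repeated calls
    varying only in their paging parameters reuse the same result set."""

    def __init__(self, evaluate):
        self._evaluate = evaluate  # predicate -> full result list
        self._cache = {}

    def page(self, predicate, skip, top):
        # Evaluate the predicate once, then serve any paged section
        # from the cached result (the Skip/Take of an enumerator).
        if predicate not in self._cache:
            self._cache[predicate] = self._evaluate(predicate)
        return self._cache[predicate][skip:skip + top]
```

A second call with the same predicate but different $skip/$top values never reaches the database again.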
When the technology involved merely wants to
expose the database to the web, as OData is popularly, albeit incorrectly, used
for, each SQL object is exposed directly over the web API as a resource. Some
queries are more difficult to write in OData than others. For example,
oDataClient.Resource.Where(x =>
x.Name.GetHashCode() % ParallelWorkersCount == WorkerIndex).ToList()
will not achieve the desired partitioning of a
lengthy list of resources for faster, more efficient parallel data access,
because GetHashCode() has no translation to an OData query option,
and must be rewritten as something like:
oDataClient.Resource.Where(x =>
x.Name.StartsWith("A")).ToList()
:
oDataClient.Resource.Where(x =>
x.Name.StartsWith("Z")).ToList()
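The A-through-Z rewrite amounts to generating one prefix-filtered query per partition. A sketch of the generated requests, reusing the Resource entity set and Name property from the example (the URL shape follows OData's startswith filter function):

```python
import string

def prefix_filters(entity_set="Resource"):
    # One request per leading letter; together the 26 queries cover
    # the partitions that GetHashCode() % N was meant to produce.
    return [
        f"/{entity_set}?$filter=startswith(Name,'{letter}')"
        for letter in string.ascii_uppercase
    ]
```

Each worker then takes a disjoint subset of these queries, giving the parallel data access the hash-based predicate could not express.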
The system query options involved here are $filter,
$select, $orderby, $count, $top, and $expand, the last of which helps with
joins. Although a great deal of parity can be achieved between SQL and OData
with the help of these query options, the REST interface is not a
replacement for the analytical queries possible with purely language-based
options such as those available from U-SQL, LINQ, or Kusto. Those have their
own place higher up the stack, in the business or application logic layer; at
the lower levels close to the database, where a web interface separates the
stored data from its access, these primitives present a challenge as
well as an opportunity.
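Assembling these system query options into a request URL is mechanical; a sketch, with the base address and entity set name purely illustrative:

```python
from urllib.parse import urlencode

def build_query(base, entity_set, **options):
    # Turn keyword arguments such as filter=... and top=... into the
    # $-prefixed OData system query options on the request URL.
    query = {f"${name}": value for name, value in options.items()}
    return f"{base}/{entity_set}?" + urlencode(query, safe="$")
```

For instance, build_query("http://host/service.svc", "Orders", filter="Total gt 100", top=10) yields a URL carrying both $filter and $top.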
Let us look at how an OData service is written. We
begin with a database, accessible with a connection string, that stores the
data for the entities in the form of tables. A web project with an entity
data model is then written to prepare a data model from the database. The web
project can be implemented with a SOAP-based WCF service or with REST-based
web APIs and Entity Framework. Each API is added by creating an association
between the entity and the API. Taking the WCF example further, since it
provides terminology for all parts of the service and is not obsolete, a type
is declared deriving from DataService, and in its InitializeService method the
config.SetEntitySetAccessRule is specified. Then the JSONPSupportBehavior
attribute is added to the service class so that end users can get the data
in a well-known format that makes it readable. The service definition, at say http://<odata-endpoint>/service.svc, can be requested in
JSON or XML format to allow clients to build applications using the objects
representing entities. The observation here is that this uses a data model which
is not limited to SQL databases, so the problem is isolated away from the
database and narrowed down to the operations over the data model. In fact,
OData has never been about merely exposing the database on the web. We choose
which entities are accessed over the web, and we can expand their reach with
the OASIS standard. OASIS is a global consortium that drives the development,
convergence, and adoption of web standards. Another observation is that we
need not even use Entity Framework for the data model. Some experts argue that
OData's main use case is the create, update, and delete of entities over the
web, and that querying should be facilitated by APIs from web services, where
rich programmability for writing queries already exists. While it is true that
language-based options can come in the compute layer formed by the web
services, the exposure remains a common theme of REST API design, whether the
REST API sits over a service or over a database. The filter predicate used in
those APIs will eventually be pushed down into the data persistence layer. In
our case, we chose the example of a GetHashCode() operator, which is
language-based rather than a database notion. As the partitioning example
above demonstrates, supporting such a hash on an entity involves adding a
computed column to its persistence. Once that is available, the predicate can
automatically be pushed into the database for maximum performance and
scalability.
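One caveat the pushdown relies on is a stable hash: .NET's GetHashCode() can differ across processes, so the persisted computed column needs a deterministic function. A sketch, assuming records are plain dictionaries and using CRC32 as the stand-in stable hash:

```python
import zlib

def stable_hash(name: str) -> int:
    # A persisted computed column must use a deterministic hash;
    # CRC32 serves here, unlike a process-specific GetHashCode().
    return zlib.crc32(name.encode("utf-8"))

def partition_filter(records, worker_index, worker_count):
    # Equivalent of Where(x => hash(x.Name) % count == index), which the
    # database can answer itself once the hash column is persisted.
    return [r for r in records
            if stable_hash(r["Name"]) % worker_count == worker_index]
```

Because the hash is deterministic, every worker sees the same partitioning no matter where or when it runs, and the partitions are disjoint and complete.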
Shaping the data to support simpler
queries and their execution is not purely a technical challenge. The boundary
between data and compute is complicated by claims to ownership,
responsibilities, and jurisdictions. In fact, clients writing OData
applications are often forced to work without any changes to the master data.
At that point, there are two options for these applications. The first
involves translating the queries into ones that work on the existing data,
such as the example shown above. The second involves scoping down the size of
the data retrieved, with techniques such as incremental update polling,
paging, and sorting, and then performing the complex query operations
in memory on that limited set of data. Both of these options are usually
sufficient to alleviate the problem encountered.
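The second option can be sketched as a loop that pages the data down and applies the complex predicate in memory; `fetch_page` is a hypothetical callback standing in for a paged OData read, assumed to return an empty list past the end of the data:

```python
def query_in_memory(fetch_page, predicate, page_size=100):
    # Scope down what is transferred by paging, then run the complex
    # predicate in memory on each limited batch of records.
    skip, matched = 0, []
    while True:
        page = fetch_page(skip=skip, top=page_size)
        if not page:
            break
        matched.extend(record for record in page if predicate(record))
        skip += page_size
    return matched
```

The predicate here can be arbitrarily complex since it never has to be translated into OData query options.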
The strategic problem, where the data is large and
the queries from OData clients are arbitrarily complex, can be resolved with
the help of a partition function and scatter-gather processing by the clients
themselves. This is comparable to the partition key that forms part of the URI
path qualifier in the REST interface to the CosmosDB store.
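The scatter-gather step can be sketched as follows; `fetch_partition` is a hypothetical callback standing in for one partitioned query, and the thread pool stands in for the parallel client workers:

```python
from concurrent.futures import ThreadPoolExecutor

def scatter_gather(fetch_partition, partition_keys, max_workers=4):
    # Scatter one query per partition key across worker threads, then
    # gather and flatten the partial results in partition order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        parts = pool.map(fetch_partition, partition_keys)
    return [record for part in parts for record in part]
```

The partition keys could be the A-through-Z name prefixes from the earlier rewrite, or the values of a partition function persisted alongside the data.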
OData also provides the ability to batch requests. The
HTTP specification must be followed when composing the batch and its
responses. A new batch handler is created and passed in when mapping the route
for the OData service, which enables batching and response consolidation.
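A batch is a single POST to the service's $batch endpoint whose multipart/mixed body wraps the individual HTTP requests. A minimal sketch of composing such a payload, following the multipart shape the OData convention prescribes:

```python
import uuid

def build_batch(requests):
    # Wrap each (method, url) pair as one application/http part of a
    # multipart/mixed body, the payload POSTed to the $batch endpoint.
    boundary = f"batch_{uuid.uuid4()}"
    parts = []
    for method, url in requests:
        parts.append(
            f"--{boundary}\r\n"
            "Content-Type: application/http\r\n"
            "Content-Transfer-Encoding: binary\r\n\r\n"
            f"{method} {url} HTTP/1.1\r\n\r\n"
        )
    body = "".join(parts) + f"--{boundary}--\r\n"
    content_type = f"multipart/mixed; boundary={boundary}"
    return content_type, body
```

The returned content type carries the boundary so the server can split the body back into its constituent requests and consolidate the responses.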