This is a continuation of a series of articles that describe operational
considerations for hosting solutions on the Azure public cloud.
The series makes several references to best practices drawn from the
documentation for the Azure public cloud. The previous article focused on the
antipatterns to avoid, specifically the cloud readiness antipatterns. This
article focuses on the extraneous fetching antipattern.
When services call datastores, they retrieve data
for a business operation, but they often retrieve more than the operation
needs, which results in unnecessary I/O overhead and reduced responsiveness.
This antipattern can occur when the application tries to reduce the number of
requests by fetching more than required. This is a form of overcompensation and
is commonly seen with catalog operations because the filtering is delegated to
the middle tier. For example, a user may need to see only a subset of the
details and probably does not need to see all the products at once, yet a large
dataset is retrieved from the catalog. Even if the user is browsing the entire
catalog, paginating the results avoids this antipattern.
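As a rough illustration of paginating a catalog, the sketch below uses Python
with the standard library sqlite3 module. The products table, its columns, and
the fetch_page helper are hypothetical names chosen for this example, and the
keyset style of paging (continuing after the last seen id) is one possible
choice rather than the only way to paginate.

```python
import sqlite3


def fetch_page(conn, page_size, after_id=0):
    """Return one page of (id, name) rows whose id is greater than after_id."""
    return conn.execute(
        "SELECT id, name FROM products WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    ).fetchall()


# Hypothetical in-memory catalog with 100 products.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO products (id, name, description) VALUES (?, ?, ?)",
    [(i, f"product-{i}", "long description") for i in range(1, 101)],
)

# Each call transfers at most page_size rows instead of the whole catalog.
first_page = fetch_page(conn, page_size=10)
second_page = fetch_page(conn, page_size=10, after_id=first_page[-1][0])
```

The caller passes the last id it has seen to get the next page, so a user
browsing the whole catalog still pulls only one page per request.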
Another example of this problem is an
inappropriate choice in design or code where, for example, a service gets all
the product details via Entity Framework and then keeps only a subset of
the fields while discarding the rest. Yet another example is when the
application retrieves data to perform an aggregation that could be done by the
database instead. The application calculates total sales by getting every
record for all orders sold instead of executing a query where the aggregation
and predicates are pushed down to the store. Other manifestations come about
when Entity Framework is used with LINQ to Entities. In this case, the filtering
is done in memory after retrieving all the rows from the table because a certain
method in the predicate could not be translated to a query. A call to
AsEnumerable is a hint that there is a problem because filtering on an
IEnumerable is done on the client side rather than in the database. The
default for LINQ to Entities is IQueryable, which pushes the filters to the data
source.
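The difference between filtering on the client and pushing the predicate to the
data source can be sketched outside Entity Framework as well. The example below
is a Python and sqlite3 analogue, not LINQ code, and the orders table and its
columns are assumptions made for illustration.

```python
import sqlite3

# Hypothetical orders table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("west", 10.0), ("east", 20.0), ("west", 30.0), ("east", 40.0)],
)

# Extraneous fetching: every row is transferred, then filtered in memory.
all_rows = conn.execute(
    "SELECT id, region, amount FROM orders ORDER BY id"
).fetchall()
west_in_memory = [row for row in all_rows if row[1] == "west"]

# Pushed-down predicate: the store returns only the matching rows.
west_pushed_down = conn.execute(
    "SELECT id, region, amount FROM orders WHERE region = ? ORDER BY id",
    ("west",),
).fetchall()
```

Both approaches produce the same result, but the first one transfers the entire
table to get it.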
Fetching all the columns from a table when only
a few are relevant is another classic example of this antipattern. Even though
this might have worked when the table was only a few columns wide, it changes
the game when the table adds several more columns. Similarly, performing
aggregation in the database rather than in memory on the application side
overcomes this antipattern.
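A minimal sketch of both points, again in Python with sqlite3 and a hypothetical
order_lines table: the first query pulls every column of every row so the
application can add up the totals itself, while the second asks the database for
the single aggregated value.

```python
import sqlite3

# Hypothetical order_lines table; the notes column stands in for wide,
# rarely needed data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_lines ("
    "id INTEGER PRIMARY KEY, product TEXT, quantity INTEGER, "
    "unit_price REAL, notes TEXT)"
)
conn.executemany(
    "INSERT INTO order_lines (product, quantity, unit_price, notes) "
    "VALUES (?, ?, ?, ?)",
    [
        ("widget", 2, 5.0, "gift wrap"),
        ("gadget", 1, 20.0, "fragile"),
        ("widget", 3, 5.0, ""),
    ],
)

# Extraneous fetching: all columns and all rows, aggregated in the application.
rows = conn.execute("SELECT * FROM order_lines").fetchall()
total_in_memory = sum(row[2] * row[3] for row in rows)

# Aggregation in the database: one row with one value comes back.
total_in_database = conn.execute(
    "SELECT SUM(quantity * unit_price) FROM order_lines"
).fetchone()[0]

# Projection: request only the column that is actually needed.
product_names = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT product FROM order_lines ORDER BY product"
    )
]
```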
As with data access best practices, some
performance considerations hold true here as well. Partitioning data
horizontally may reduce contention. Operations that support unbounded queries
can implement pagination. Features that are built right into the data store can
be leveraged. Some calculations, especially summations, need not be repeated.
Queries that return a lot of results can be further filtered. Not all
operations can be offloaded to the database, but those for which the database
is highly optimized can be.
A few ways to detect this antipattern include
identifying slow workloads or transactions, observing behavioral patterns
exhibited by the system due to limits, correlating the instances of slow
workloads with those patterns, identifying the data stores being used,
identifying any slow-running queries that reference these data stores, and
performing a resource-specific analysis of how the data is used and consumed.
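Paginating results, fetching only the required columns, and pushing filtering
and aggregation down to the data store are some of the ways to mitigate this
antipattern.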
Some of the metrics that help with detecting and mitigating
the extraneous fetching antipattern include total bytes per minute, average
bytes per transaction, and requests per minute.
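As a sketch of how these three metrics might be derived, the Python example
below assumes a hypothetical request log of (timestamp in seconds, response
size in bytes) pairs and treats each request as one transaction. The
traffic_metrics function and the sample data are illustrative only and do not
correspond to any particular monitoring tool.

```python
from collections import defaultdict


def traffic_metrics(samples):
    """Summarize (timestamp_seconds, response_bytes) pairs, one per request.

    Returns (bytes per minute, requests per minute, average bytes per request).
    """
    bytes_per_minute = defaultdict(int)
    requests_per_minute = defaultdict(int)
    for timestamp, size in samples:
        minute = int(timestamp // 60)  # bucket requests by whole minute
        bytes_per_minute[minute] += size
        requests_per_minute[minute] += 1
    total_requests = sum(requests_per_minute.values())
    total_bytes = sum(bytes_per_minute.values())
    average = total_bytes / total_requests if total_requests else 0.0
    return dict(bytes_per_minute), dict(requests_per_minute), average


# Hypothetical log: two requests in the first minute, one in the second.
samples = [(0, 1000), (30, 3000), (65, 500)]
per_minute_bytes, per_minute_requests, avg_bytes = traffic_metrics(samples)
```

A high average number of bytes per request alongside a modest request rate is
the kind of signal that would prompt a closer look at what each query returns.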