This is a continuation of a series of articles on a crowdsourcing application, including the most recent article. The original problem statement is repeated here for context.
Social engineering applications provide a wealth of information to the end user, but the questions and answers they carry are always limited to just that – the social circle. Advice solicited about personal circumstances is rarely appropriate for forums that remain in public view. It is also difficult to find the right forum or audience where responses can be obtained quickly. When we want more opinions in a discreet manner, without the knowledge of those around us, the options become fewer and fewer. In addition, crowdsourcing opinions on a personal topic is not readily available via applications. This document tries to envision an application that meets this requirement.
The previous article continued the elaboration on the use of public cloud services for provisioning the queue, the document store, and compute. It also discussed the messaging platform required to support this social-engineering application. The problems encountered with social engineering are well defined and have precedents in various commercial applications. They are primarily about the feed for each user and the propagation of solicitations to the crowd. The previous article described selective fan-out. When clients wake up, they can request that their state be refreshed. This avoids the write update, because the data does not need to be sent out until a client asks for it. If, instead, the queue sends messages back to the clients, it is a fan-out process. The devices can choose to check in at selective times, and the server can be selective about which clients to update. Both methods work well in certain situations. The fan-out happens on both writes and reads, and it can be made selective in either case; it can be limited during both pull and push. Disabling writes to all devices can significantly reduce cost, because the other devices can load the updates only when reading. It also helps to keep track of which clients are active over a period so that only those clients get preference.
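To make the trade-off concrete, here is a minimal, in-memory sketch of selective fan-out under the assumptions above: recently active clients receive pushed updates (fan-out on write), while inactive clients load updates only when they next check in (fan-out on read). The class, the field names, and the five-minute activity window are illustrative assumptions, not details of the messaging platform itself.

```python
import time
from collections import defaultdict

class SelectiveFanout:
    ACTIVE_WINDOW = 300.0  # seconds; clients seen this recently get pushes

    def __init__(self):
        self.timelines = defaultdict(list)   # author -> [(timestamp, message)]
        self.inbox = defaultdict(list)       # client -> messages pushed to it
        self.last_seen = defaultdict(float)  # client -> last check-in time
        self.follows = defaultdict(set)      # client -> authors it follows

    def publish(self, author, message, followers):
        # Fan-out on write, limited to recently active followers; the message
        # also lands on the author's timeline for clients that pull later.
        now = time.time()
        self.timelines[author].append((now, message))
        for client in followers:
            if now - self.last_seen[client] < self.ACTIVE_WINDOW:
                self.inbox[client].append(message)

    def check_in(self, client):
        # Fan-out on read: a waking client drains its pushed messages and
        # pulls anything newer than its last check-in from followed authors.
        since = self.last_seen[client]
        pulled = [m for author in self.follows[client]
                  for ts, m in self.timelines[author] if ts > since]
        pushed, self.inbox[client] = self.inbox[client], []
        self.last_seen[client] = time.time()
        # Active clients may receive a message on both paths; de-duplicate.
        return list(dict.fromkeys(pushed + pulled))

feed = SelectiveFanout()
feed.follows["bob"] = {"alice"}
feed.follows["carol"] = {"alice"}
feed.check_in("bob")  # bob is now active; carol has never checked in
feed.publish("alice", "new solicitation", followers={"bob", "carol"})
print(feed.check_in("bob"))    # pushed immediately: ['new solicitation']
print(feed.check_in("carol"))  # loaded only on read: ['new solicitation']
```

Note that the server pays the write cost only for active clients, while inactive clients impose no cost until they read.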
In this section, we talk about the monolithic persistence antipattern, which must be avoided. This antipattern occurs when a single data store hurts performance due to resource contention. Additionally, the use of multiple data stores can help with the virtualization of data and queries.
A specific example of this antipattern is when the crowdsourced application sends transactional records, logs, metrics, and events to the same database. Online transaction processing benefits from a relational store, but logs and metrics can be moved to a log index store and a time-series database, respectively. Usually, a single data store works well for transactional data, but this does not mean documents need to be stored in the same data store. A blob store or document database can be used in addition to the regular transactional database so that individual documents can be shared without any impact on business operations. Each document can then have its own web-accessible address.
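As a hedged sketch of that separation, the following uses the azure-storage-blob Python SDK to upload a document and return its web-accessible address; the connection string, the container name, and the function name are placeholders for illustration.

```python
from azure.storage.blob import BlobServiceClient

def store_document(conn_str: str, doc_id: str, payload: bytes) -> str:
    """Upload one document and return its web-accessible address."""
    service = BlobServiceClient.from_connection_string(conn_str)
    blob = service.get_blob_client(container="documents", blob=f"{doc_id}.json")
    blob.upload_blob(payload, overwrite=True)
    # Only the returned URL needs to be kept with the transactional record,
    # so sharing a document never loads the relational store.
    return blob.url
```

Only the URL is stored alongside the transactional record, so document sharing never competes with business operations for the relational store.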
This antipattern can be fixed in one of several ways. First, the data types must be listed and their corresponding data stores assigned. Many data types can be bound to the same database, but when they differ, each must be passed to the data store that handles it best. Second, the data access patterns for each data type must be analyzed; if the data type is a document, a CosmosDB instance is a good choice. Third, if the database instance is not suitable for all the data access patterns of a given data type, it must be scaled up; a premium SKU will likely benefit this case.
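A sketch of the first two steps might look like the following, where the data types are enumerated and each is bound to a store suited to its access pattern. The mapping below is an illustrative assumption consistent with the choices discussed above, not a prescription.

```python
# Hypothetical data-type-to-store assignments for this application.
STORE_FOR_TYPE = {
    "transaction": "relational database (OLTP)",
    "document":    "document database (e.g., CosmosDB)",
    "log":         "log index store",
    "metric":      "time-series database",
}

def route(record_type: str) -> str:
    # Unlisted types default to the transactional store until their access
    # pattern has been analyzed (the second step above).
    return STORE_FOR_TYPE.get(record_type, "relational database (OLTP)")

print(route("metric"))  # time-series database
```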
Detection of this antipattern is easier with monitoring tools and the built-in supportability features of the database layer. If database activity reveals significant processing and contention alongside a very low data rate, this antipattern is likely manifesting.
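As one rough illustration of that signature, the heuristic below flags monitoring samples where most database time is spent waiting while few rows are returned. The field names (lock_wait_ms, io_wait_ms, cpu_ms, rows_returned) and the thresholds are assumptions for this sketch, not values from any particular monitoring product.

```python
def shows_monolithic_persistence(samples) -> bool:
    """Flag heavy processing and contention paired with a low data rate."""
    if not samples:
        return False
    wait = sum(s["lock_wait_ms"] + s["io_wait_ms"] for s in samples)
    busy = sum(s["cpu_ms"] for s in samples)
    rows = sum(s["rows_returned"] for s in samples)
    # A high share of time spent waiting plus few rows returned per sample
    # matches the signature described above.
    return (wait + busy) > 0 and wait / (wait + busy) > 0.5 \
        and rows / len(samples) < 100
```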
Examining the work performed by the database in terms of data types, narrowed down by callers and scenarios, may reveal the culprits that are likely causing this antipattern.
Finally, periodic assessments must be
performed on the data storage tier.