Introduction:
The history of data is often as important as the data itself. For example, the
finance, healthcare and insurance industries often track the history of
portions of their data for audit and reporting purposes. Cosmos DB forms the
storage layer for many microservices in Azure. This article explains the
change feed feature of this store.
Description:
Cosmos DB exposes an API over the underlying log of changes to the documents
in a collection. For users familiar with SQL Server, this is the equivalent of
change data capture. The changes are recorded incrementally and can be
distributed across one or more consumers for parallel processing, which
enables a variety of applications. The change feed records inserts and updates
but not deletions. Usually only the most recent version of a changed item is
available; intermediate changes are not visible.
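The behaviour described above can be illustrated with a small in-memory model. This is only a sketch: `ToyContainer`, its methods and the integer `_ts` counter are hypothetical stand-ins invented for illustration and do not call the Cosmos DB SDK. It shows that overwritten intermediate versions and hard deletes never reach a reader of the feed.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToyContainer:
    """In-memory stand-in for a container (hypothetical, not the Cosmos DB SDK)."""

    items: dict[str, dict[str, Any]] = field(default_factory=dict)
    clock: int = 0  # logical timestamp, bumped on every write

    def upsert(self, doc: dict[str, Any]) -> None:
        # Each write replaces the stored item, so earlier versions are lost.
        self.clock += 1
        self.items[doc["id"]] = {**doc, "_ts": self.clock}

    def delete(self, doc_id: str) -> None:
        # A hard delete leaves nothing behind for the feed to report.
        self.items.pop(doc_id, None)

    def read_change_feed(self, since: int = 0) -> tuple[list[dict[str, Any]], int]:
        """Return items changed after `since` (oldest first) and a new checkpoint."""
        changed = sorted(
            (d for d in self.items.values() if d["_ts"] > since),
            key=lambda d: d["_ts"],
        )
        return changed, self.clock
```

A reader that upserts item "a" twice, then creates and deletes item "b", sees a single entry for "a" carrying its latest state. Passing the returned checkpoint back as `since` yields only the changes made after that point.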
The change feed is not meant to solve every versioning requirement for data
in Cosmos DB. A full history requires a Document Versioning Pattern, which can
be described as follows:
1. Intent – Each entity in a collection maintains the history of its changes
when it is updated.
2. Motivation – The history of entities is tracked throughout their
lifecycle.
3. Applicability – The pattern covers usages such as auditing, reporting and
analysis.
4. Structure – In order to keep the state of the objects, every update must
be turned into an append operation.
5. Participants – A materialized view is made possible with the change feed.
6. Consequences – The pattern should work for both short and long histories.
If performance degrades as the history grows, it might not suit all use cases
for versioning.
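The Structure item above, where every update becomes an append, can be sketched as follows. This is a minimal in-memory illustration, not a Cosmos DB implementation: the `VersionedStore` class, the `entityId` and `version` fields and the `<entity>:v<n>` id scheme are all assumptions made for the example.

```python
from typing import Any


class VersionedStore:
    """Append-only store: saving an entity adds a new version document."""

    def __init__(self) -> None:
        self._versions: dict[str, list[dict[str, Any]]] = {}

    def save(self, entity_id: str, data: dict[str, Any]) -> dict[str, Any]:
        history = self._versions.setdefault(entity_id, [])
        version = len(history) + 1
        doc = {
            **data,
            "id": f"{entity_id}:v{version}",  # hypothetical id scheme
            "entityId": entity_id,
            "version": version,
        }
        history.append(doc)  # never overwrite an earlier version
        return doc

    def current(self, entity_id: str) -> dict[str, Any]:
        """Latest version of the entity."""
        return self._versions[entity_id][-1]

    def history(self, entity_id: str) -> list[dict[str, Any]]:
        """All versions, oldest first."""
        return list(self._versions[entity_id])
```

Because each version is written as a new document, each one would appear in the change feed as an insert, which is what allows a consumer to maintain a materialized view of the current state.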
The change feed allows a “soft marker” to be set on items when they are
updated, and consumers can filter on that marker when processing items in the
change feed. This makes it possible to record deletes, which the change feed
does not capture on its own: the item is flagged as deleted with an update
instead of being removed outright. Inserts and updates are recorded by the
change feed automatically.
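A consumer-side filter for such a marker might look like the sketch below. The field name `deleted` and the helper names are assumptions chosen for illustration. The `ttl` property is added on the assumption that time-to-live is enabled on the container, so that the flagged item is physically removed later.

```python
from typing import Any, Iterable

SOFT_DELETE_FIELD = "deleted"  # hypothetical marker name


def mark_deleted(doc: dict[str, Any], ttl_seconds: int = 3600) -> dict[str, Any]:
    """Return a copy of `doc` flagged as deleted.

    Writing this copy back is an ordinary update, so it shows up in the feed.
    """
    return {**doc, SOFT_DELETE_FIELD: True, "ttl": ttl_seconds}


def split_changes(
    changes: Iterable[dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Separate a batch of change feed items into (live, soft-deleted)."""
    live: list[dict[str, Any]] = []
    deleted: list[dict[str, Any]] = []
    for doc in changes:
        (deleted if doc.get(SOFT_DELETE_FIELD) else live).append(doc)
    return live, deleted
```

The consumer can then apply upserts for the live items and record a delete event for each flagged item before it expires.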
Change feed items arrive in the order of their modification time. This sort
order is guaranteed per logical partition key; there is no guaranteed order
across partition key values.
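Since the ordering holds per logical partition key, a consumer that depends on order can bucket a batch by partition key and process each bucket in sequence. The helper below is a sketch under that assumption; the function name and the `tenant` field used in the usage note are hypothetical.

```python
from collections import defaultdict
from typing import Any, Iterable


def group_by_partition_key(
    changes: Iterable[dict[str, Any]], pk_field: str
) -> dict[Any, list[dict[str, Any]]]:
    """Bucket change feed items by partition key, keeping feed order in each bucket."""
    groups: dict[Any, list[dict[str, Any]]] = defaultdict(list)
    for doc in changes:
        groups[doc[pk_field]].append(doc)
    return dict(groups)
```

For example, with `tenant` as the partition key field, `group_by_partition_key(batch, "tenant")` returns one list per tenant, each in the order the items appeared in the batch.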
In a multi-region Azure Cosmos DB account, the change feed continues to work
across a manual failover of the write region and remains contiguous.
Conclusion:
The change feed captures data changes so that they can be used for auditing,
reporting and analysis.