Events as a measurement of Cloud Database performance
In the previous post, we talked about cloud databases. They come with the benefits of
the cloud and can still offer performance on par with an on-premises database, perhaps with
better Service-Level Agreements. Today we look at a test case that measures the performance of a cloud database, say Azure Cosmos DB,
by using it as a queue. I could not find a comparable study that matches
this test case.
Test case: Let us say I stream millions of events per minute to Cosmos DB,
and each event is a tuple <EventID, Byte[] payload>. The payload is
irrelevant except that it forces a data copy for each event and so
adds processing time, which is why I use the same payload for every event.
The EventIDs range from 1 to Integer.MaxValue. I want to
store these events in a big table where I
use the 255-byte row key to support horizontal scaling of the table.
Then I want to run a query every few seconds that tells me the
top 50 events by count in the table.
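The event shape described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1 KiB payload size, the `row_key` layout (a bucket prefix followed by the zero-padded ID) and the `buckets` and `distinct_ids` parameters are hypothetical choices that do not come from the test case itself.

```python
import random
from typing import Iterator, NamedTuple

INT_MAX = 2**31 - 1            # Integer.MaxValue
MAX_ROW_KEY_BYTES = 255        # row key budget mentioned in the test case
SHARED_PAYLOAD = bytes(1024)   # assumed 1 KiB payload, reused by every event


class Event(NamedTuple):
    event_id: int
    payload: bytes


def row_key(event_id: int, buckets: int = 64) -> str:
    """Hypothetical key layout: '<bucket>-<zero-padded id>'.

    The bucket prefix spreads neighbouring IDs across partitions.
    """
    if not 1 <= event_id <= INT_MAX:
        raise ValueError("event_id out of range")
    key = f"{event_id % buckets:04d}-{event_id:010d}"
    if len(key.encode("utf-8")) > MAX_ROW_KEY_BYTES:
        raise ValueError("row key exceeds 255 bytes")
    return key


def generate_events(n: int, distinct_ids: int = 1000, seed: int = 0) -> Iterator[Event]:
    """Yield n events whose IDs repeat, so a top-50 count is meaningful."""
    rng = random.Random(seed)
    upper = min(distinct_ids, INT_MAX)
    for _ in range(n):
        yield Event(rng.randint(1, upper), SHARED_PAYLOAD)
```

Drawing IDs from a small range makes them repeat often enough for the top-50 query to have something to rank; a real run would replace the generator with the actual event source.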
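The query will be:
SELECT DISTINCT events.ID AS id,
COUNT(*) OVER ( PARTITION BY events.ID ) AS count
FROM events
ORDER BY count DESC;
With DISTINCT collapsing the repeated rows, this returns the same result as a GROUP BY on events.ID with COUNT(*). Only the first 50 rows of the result are needed.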
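The relationship between the windowed count and a plain GROUP BY can be checked locally. The sketch below uses Python's built-in sqlite3 module as a stand-in, not Cosmos DB, and assumes an SQLite build with window function support (3.25 or later). The table layout, the function names and the `event_count` alias are illustrative.

```python
import sqlite3


def build_events_table(event_ids):
    """Create an in-memory table with one row per event."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER NOT NULL, payload BLOB)")
    conn.executemany(
        "INSERT INTO events (id, payload) VALUES (?, ?)",
        [(event_id, b"x") for event_id in event_ids],
    )
    return conn


def top_by_group(conn, limit=50):
    """Top events by count using GROUP BY."""
    sql = (
        "SELECT id, COUNT(*) AS event_count FROM events "
        "GROUP BY id ORDER BY event_count DESC, id ASC LIMIT ?"
    )
    return conn.execute(sql, (limit,)).fetchall()


def top_by_window(conn, limit=50):
    """Top events by count using a window function.

    DISTINCT is needed because the window form yields one row per event,
    not one row per ID.
    """
    sql = (
        "SELECT DISTINCT id, COUNT(*) OVER (PARTITION BY id) AS event_count "
        "FROM events ORDER BY event_count DESC, id ASC LIMIT ?"
    )
    return conn.execute(sql, (limit,)).fetchall()
```

Both functions break ties on id so that the two result lists can be compared directly.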
This is a read-only query so the writes to the table should
not suffer any performance impact. In other words, the analysis query is
separate from the transactional queries to insert the events in a table.
In order to tune the performance of the cloud database, I apply
the following client-side settings:
· Set the connection mode to Direct
· Set the protocol to TCP
· Avoid startup latency on the first request
· Collocate clients in the same Azure region as the database
· Increase the number of threads/tasks
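Of the settings above, increasing the number of threads/tasks is the one that is easy to sketch without a real account. The example below is a minimal illustration, not the Cosmos DB SDK: `InMemorySink` is a hypothetical stand-in for the database client, and the `workers` value is an arbitrary assumption to be tuned against the real service.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class InMemorySink:
    """Hypothetical stand-in for a database client; NOT the Cosmos DB SDK."""

    def __init__(self):
        self._lock = threading.Lock()
        self.rows = []

    def write(self, event_id, payload):
        copied = bytearray(payload)  # forces a data copy per event
        with self._lock:
            self.rows.append((event_id, copied))


def ingest(sink, event_ids, payload, workers=16):
    """Write every event through a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so that worker exceptions surface here.
        list(pool.map(lambda event_id: sink.write(event_id, payload), event_ids))
    return len(sink.rows)
```

With a real client the writes are network-bound, which is where extra threads or asynchronous tasks would be expected to help; the in-memory sink only shows the structure.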
This helps to reduce the processing time of the events as they
make their way to the table. I will attempt to provision more than
100,000 Request Units (RU) per second.
This high throughput lets the table grow quickly, so we can observe whether the
analysis query becomes slower over time as the data size
increases massively.
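To relate the event rate to the RU target, a rough back-of-the-envelope calculation helps. The sketch below assumes a fixed RU charge per write. The value 5.0 used in the usage note is a placeholder, not a measured figure, since the real charge depends on item size and indexing and should be taken from the request charge the service reports.

```python
def required_ru_per_second(events_per_minute: float, ru_per_write: float) -> float:
    """Throughput needed to absorb the given write rate."""
    return events_per_minute / 60 * ru_per_write


def max_events_per_minute(ru_per_second: float, ru_per_write: float) -> float:
    """Write rate that a given RU/s budget can sustain."""
    return ru_per_second / ru_per_write * 60
```

For example, with an assumed charge of 5.0 RU per write, `required_ru_per_second(1_200_000, 5.0)` gives 100000.0, so 1.2 million events per minute would consume the whole 100,000 RU/s budget before the analysis query is counted.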
Conclusion: This test case and its pricing will help determine
whether it is practical to store high-volume traffic, such as that
from mobile fleets or IoT devices, in the database.