Cluster computing

Tuesday, May 28, 2013

Cloud computing gives us the ability to develop applications that are virtualized across hardware and software stacks. Applications are no longer monolithic but split into modules, each of which can reside on a different VM with its own software and hardware stack. Virtual machines, operating systems, server products and hosts can be different for each module. These modules can still give the user the same experience as if the user were interacting with a single application. Sign-on, for example, could happen only once while the user visits different modules. Application storage, caching and services are now supported on dedicated resources.
If we want to provide APIs for our services, they can be scoped to individual services, and different services can meet different needs. The APIs can be REST-based, which broadens the reach of the services.
Let us take the example of provisioning a stack trace service that iterates over the dump files in a collection folder and populates a data store with the stack traces read from each dump. In this case, we could expect the following APIs from the stack trace service:
IEnumerable<string> GetStackTrace(Stream dumpFileStream); // returns the stack trace read from a dump file stream
IEnumerable<string> ResolveSymbols(IEnumerable<string> stackTrace, IEnumerable<string> symbolPath); // resolves symbols so that the stack trace can be pretty-printed
IEnumerable<string> GetStackTrace(string pathToDumpAndSymbols); // combines the two operations above
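
To show how the combined overload could be layered on the other two, here is a minimal sketch. The interface name IStackTraceService, the base class, and the TextDumpStackTraceService stand-in are all hypothetical; the stand-in simply treats a dump as lines of text, which real dump files are not. It also assumes that pathToDumpAndSymbols names a dump file whose symbols sit in the same folder, which the API list above does not specify.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical contract mirroring the three operations listed above.
public interface IStackTraceService
{
    IEnumerable<string> GetStackTrace(Stream dumpFileStream);
    IEnumerable<string> ResolveSymbols(IEnumerable<string> stackTrace, IEnumerable<string> symbolPath);
    IEnumerable<string> GetStackTrace(string pathToDumpAndSymbols);
}

// Supplies only the combined overload. Reading a dump and resolving symbols
// stay abstract because they depend on the debugging tools in use.
public abstract class StackTraceServiceBase : IStackTraceService
{
    public abstract IEnumerable<string> GetStackTrace(Stream dumpFileStream);

    public abstract IEnumerable<string> ResolveSymbols(
        IEnumerable<string> stackTrace, IEnumerable<string> symbolPath);

    public IEnumerable<string> GetStackTrace(string pathToDumpAndSymbols)
    {
        // Assumption: the path names a dump file and its symbols are in the same folder.
        string symbolFolder =
            Path.GetDirectoryName(Path.GetFullPath(pathToDumpAndSymbols)) ?? ".";

        using (Stream dump = File.OpenRead(pathToDumpAndSymbols))
        {
            // ToList() forces evaluation before the stream is disposed.
            List<string> raw = GetStackTrace(dump).ToList();
            return ResolveSymbols(raw, new[] { symbolFolder }).ToList();
        }
    }
}

// Stand-in for illustration only: treats the dump as text, one frame per line.
public sealed class TextDumpStackTraceService : StackTraceServiceBase
{
    public override IEnumerable<string> GetStackTrace(Stream dumpFileStream)
    {
        var frames = new List<string>();
        var reader = new StreamReader(dumpFileStream); // the caller owns the stream
        var line = reader.ReadLine();
        while (line != null)
        {
            frames.Add(line);
            line = reader.ReadLine();
        }
        return frames;
    }

    public override IEnumerable<string> ResolveSymbols(
        IEnumerable<string> stackTrace, IEnumerable<string> symbolPath)
    {
        // Placeholder "resolution": annotate each frame with the symbol path used.
        string symbols = string.Join(";", symbolPath);
        return stackTrace.Select(frame => frame + " [symbols: " + symbols + "]").ToList();
    }
}
```

A real implementation would replace the stand-in with calls into whatever debugger or dump-reading library the service uses; the point of the sketch is only that the third operation needs no logic of its own beyond composing the first two.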

Next, the data table that we populate, called StackTraces, will have attributes such as source information, bucket information and the stack trace itself.

We can then enable all LINQ-based operations on this entity.

This entity will be displayed by a service or front end that is independent of the stack trace population service. The front end could be read-only, allowing users to aggregate, search and sort the stack traces from dumps.
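
The entity and the read-only operations described above might look roughly like the following. This is a sketch under assumptions: the class name StackTraceRecord, its three string properties, and the two query helpers are hypothetical, and the queries run over an in-memory sequence rather than a real data store.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row of the StackTraces table; the property names are assumptions.
public sealed class StackTraceRecord
{
    public string Source { get; set; } = "";
    public string Bucket { get; set; } = "";
    public string StackTrace { get; set; } = "";
}

public static class StackTraceQueries
{
    // Aggregate and sort: number of dumps per bucket, largest bucket first,
    // with ties broken by bucket name.
    public static List<KeyValuePair<string, int>> CountByBucket(
        IEnumerable<StackTraceRecord> rows)
    {
        return rows
            .GroupBy(r => r.Bucket)
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            .OrderByDescending(p => p.Value)
            .ThenBy(p => p.Key, StringComparer.Ordinal)
            .ToList();
    }

    // Search and sort: rows whose stack trace mentions the given text,
    // ordered by source.
    public static List<StackTraceRecord> Search(
        IEnumerable<StackTraceRecord> rows, string text)
    {
        return rows
            .Where(r => r.StackTrace.IndexOf(text, StringComparison.OrdinalIgnoreCase) >= 0)
            .OrderBy(r => r.Source, StringComparer.Ordinal)
            .ToList();
    }
}
```

Against a real store the same queries would typically be written over IQueryable, and whether a given operator (such as the case-insensitive IndexOf used here) is translated by the provider would need to be checked.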

In this case we have separated the producer and consumer modules of our system, and they are ready to be hosted on different VMs. For example, the producer service could sit on the same server as the collection folder and have large storage, since the dumps can be on the order of gigabytes and their collections can be arbitrarily large. The consumer is more of a three-tier web application and can be hosted on an app server. The data table can be in a cloud data store on yet another VM or storage account.
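
The separation can be sketched in code as two classes that know nothing about each other and share only the table. Everything named here is hypothetical: the IStackTraceTable interface, the in-memory table standing in for a cloud data store, the "*.dmp" file pattern, and the choice of the top frame as the bucket are illustrative assumptions, not part of the design above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical row of the StackTraces table.
public sealed class StackTraceRecord
{
    public string Source { get; set; } = "";
    public string Bucket { get; set; } = "";
    public string StackTrace { get; set; } = "";
}

// The only thing the producer and the consumer share.
public interface IStackTraceTable
{
    void Add(StackTraceRecord row);
    IEnumerable<StackTraceRecord> ReadAll();
}

// In-memory stand-in for a table in a cloud data store.
public sealed class InMemoryStackTraceTable : IStackTraceTable
{
    private readonly List<StackTraceRecord> rows = new List<StackTraceRecord>();
    private readonly object gate = new object();

    public void Add(StackTraceRecord row) { lock (gate) rows.Add(row); }

    // Returns a snapshot so readers never observe a list being modified.
    public IEnumerable<StackTraceRecord> ReadAll() { lock (gate) return rows.ToList(); }
}

// Producer: runs next to the collection folder and only writes to the table.
public sealed class DumpCollector
{
    private readonly IStackTraceTable table;
    private readonly Func<string, IEnumerable<string>> getStackTrace;

    public DumpCollector(IStackTraceTable table, Func<string, IEnumerable<string>> getStackTrace)
    {
        this.table = table;
        this.getStackTrace = getStackTrace;
    }

    public int CollectFrom(string collectionFolder)
    {
        int count = 0;
        // Assumption: dump files carry a .dmp extension.
        foreach (string dumpPath in Directory.EnumerateFiles(collectionFolder, "*.dmp"))
        {
            List<string> frames = getStackTrace(dumpPath).ToList();
            table.Add(new StackTraceRecord
            {
                Source = dumpPath,
                Bucket = frames.FirstOrDefault() ?? "(empty)", // assumption: bucket by top frame
                StackTrace = string.Join(Environment.NewLine, frames)
            });
            count++;
        }
        return count;
    }
}

// Consumer: runs on the app server and only reads from the table.
public sealed class StackTraceViewer
{
    private readonly IStackTraceTable table;

    public StackTraceViewer(IStackTraceTable table) { this.table = table; }

    public List<StackTraceRecord> ByBucket(string bucket)
    {
        return table.ReadAll()
            .Where(r => r.Bucket == bucket)
            .OrderBy(r => r.Source, StringComparer.Ordinal)
            .ToList();
    }
}
```

Because each side depends only on the table interface, swapping the in-memory table for a remote store would not require changes to either the collector or the viewer, which is what lets them live on different VMs.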

This design of two services and one table can scale to add other functionality, and the information the two services already share in the data table is adequate for diagnostics, audit and tracking.
