Monday, December 6, 2021

Big Compute vs Big Data architectural styles for implementing a cloud service

 

A web service for the cloud must be well suited to the business purpose it serves, not only in its functionality but also in the non-functional aspects recorded in its Service-Level Agreements. The choice of architecture for a web service contributes significantly to this. Here we review the choice between the Big Compute and Big Data architectural styles.

The Big Compute architectural style refers to workloads that require many cores, such as image rendering, fluid dynamics, financial risk modeling, oil exploration, drug design and engineering stress analysis. The scale-out of the computational tasks is enabled by their discrete, isolated and finite nature, where some input is taken in raw form and processed into an output. The scale-out can be adjusted to suit the demands of the workload, and the outputs can be conflated as is customary with map-reduce problems. Since the tasks run independently and in parallel, they are loosely coupled, and network latency for message exchanges between tasks is kept to a minimum. The commodity VMs drawn from the infrastructure are usually at the higher end of the compute in that tier. Simulations and number crunching, such as for astronomical calculations, involve hundreds if not thousands of such compute units.
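The scatter-gather shape of such workloads can be sketched in a few lines. This is a minimal illustration, not a real renderer: render_tile and run_big_compute are hypothetical names, and a thread pool stands in for the VM pool. The essential properties from above are preserved: the tasks are discrete, isolated and finite, and the partial outputs are conflated at the end, map-reduce style.

```python
from concurrent.futures import ThreadPoolExecutor

def render_tile(tile):
    # Stand-in for an isolated compute task: raw input in, output out.
    # A real workload would be image rendering, risk modeling, etc.
    return sum(x * x for x in tile)

def run_big_compute(raw_input, chunk_size=4, workers=4):
    # Split the input into discrete, finite, independent tasks.
    tasks = [raw_input[i:i + chunk_size]
             for i in range(0, len(raw_input), chunk_size)]
    # Fan the tasks out across the worker pool; no task communicates
    # with another, so there is no inter-task network latency.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(render_tile, tasks))
    # Conflate the partial outputs, as is customary with map-reduce.
    return sum(partials)
```

Because the tasks share nothing, the chunk size and worker count can be tuned independently of the task logic, which is what makes the style easy to scale out.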

Some of the benefits of this architecture include the following: 1) high performance due to the parallelization of tasks, 2) the ability to scale out to an arbitrarily large number of cores, 3) the ability to utilize a wide variety of compute units, and 4) dynamic allocation and deallocation of compute.

Some of the challenges faced with this architecture include the following: managing the VM architecture, handling the volume of number crunching, provisioning thousands of cores on time, and the diminishing returns from additional cores.
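The diminishing returns are usually quantified with Amdahl's law: speedup is capped by the fraction of the workload that remains serial, no matter how many cores are added. A quick sketch (amdahl_speedup is an illustrative helper, not from the original text):

```python
def amdahl_speedup(parallel_fraction, cores):
    # Amdahl's law: the serial fraction (1 - p) bounds the speedup.
    # Even with infinite cores, speedup approaches 1 / (1 - p).
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)
```

With a workload that is 95% parallelizable, 16 cores yield roughly a 9x speedup, while 4096 cores still stay below the 20x ceiling, which is why provisioning ever more cores eventually stops paying off.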

Some of the best practices demonstrated by this architectural style include the following: it exposes a well-designed API to the client, auto-scales to handle changes in load, caches semi-static data, uses a CDN to host static content, uses polyglot persistence when appropriate, and partitions data to improve scalability, reduce contention, and optimize performance.
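The data-partitioning practice above typically comes down to routing each key deterministically to one of N shards so that no single store becomes a point of contention. A minimal sketch, assuming hash-based partitioning (partition_for is an illustrative name):

```python
import zlib

def partition_for(key, num_partitions):
    # Hash-based partitioning spreads keys evenly across shards,
    # reducing contention on any single store. zlib.crc32 gives a
    # stable hash across processes, unlike the built-in hash(),
    # which is randomized per interpreter run.
    return zlib.crc32(key.encode("utf-8")) % num_partitions
```

The same function must be used by every writer and reader so that a given key always lands on the same partition.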

Some of the examples of this architectural style include applications that leverage the Azure Batch managed service to run uploaded code and data artifacts on a VM pool. In this case, Azure Batch provisions the VMs, assigns the tasks, and monitors their progress. It can automatically scale out the VMs in response to the workload. When Microsoft HPC Pack is deployed, the HPC cluster can burst to Azure to handle peak workloads.
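The automatic scale-out that Azure Batch performs amounts to a rule that sizes the pool from the task backlog. A toy version of such a rule, written in Python rather than Batch's own autoscale formula language (pool_size_for and its parameters are illustrative assumptions, not Batch API):

```python
import math

def pool_size_for(pending_tasks, tasks_per_node=4, max_nodes=10):
    # Grow the pool in proportion to the backlog of pending tasks,
    # capped at a maximum node count; keep at least one node warm.
    return max(1, min(max_nodes, math.ceil(pending_tasks / tasks_per_node)))
```

A real Batch autoscale formula evaluates metrics such as pending-task counts on a schedule, but the shape is the same: backlog in, node count out, bounded above by a cost cap.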
