Saturday, February 11, 2017

A tutorial on asynchronous programming for DevOps:
Introduction – DevOps engineers increasingly write services for automation and for integrating new functionality into underlying systems. Services that chain one or more operations often incur delays that exceed what users will tolerate on a web page. Consequently, engineers face the challenge of giving the user an early response even when the resource the user requested is not yet available. The following are some of the techniques commonly employed:
1) Background tasks – Using a database and transactional behavior, engineers traditionally chained one or more actions within the same transaction scope. However, when each action takes a long time to complete, the chained actions add up to a significant delay. Although semantically correct, this approach does not keep response times reasonable. Consequently, the work is split into a foreground and a background task, where a database entry represents a promise that will be fulfilled later by a background task, or invalidated on failure. Django-background-tasks is an example of this functionality: registering a background task involves merely decorating a method. Registrations can also specify a schedule in terms of a time period, as well as the queues on which the tasks are stored. Internally, the library maintains its own background-tasks table, which allows retries and error handling.
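The foreground/background split can be sketched with the standard library alone. This is an illustration of the pattern, not the django-background-tasks API: the names `register_background` and `task_rows` are invented for the example, and a list of dicts stands in for the database table whose row acts as the promise.

```python
import queue
import threading

task_rows = []            # stands in for the background-tasks table
_job_queue = queue.Queue()

def register_background(func):
    """Decorator: calling the function enqueues it instead of running it."""
    def enqueue(*args, **kwargs):
        row = {"task": func.__name__, "status": "pending"}
        task_rows.append(row)               # the promise, visible immediately
        _job_queue.put((func, args, kwargs, row))
        return row
    return enqueue

def worker():
    while True:
        func, args, kwargs, row = _job_queue.get()
        try:
            row["result"] = func(*args, **kwargs)
            row["status"] = "done"          # promise fulfilled
        except Exception as exc:
            row["status"] = "failed"        # a real table would schedule a retry
            row["error"] = str(exc)
        finally:
            _job_queue.task_done()

@register_background
def send_welcome_email(user_id):
    return f"sent to {user_id}"

threading.Thread(target=worker, daemon=True).start()
row = send_welcome_email(42)   # returns at once with status "pending"
_job_queue.join()              # wait here only so the demo can inspect the result
print(row["status"], row["result"])
```

In the real library, a separate worker process drains the table, so the web request returns as soon as the row is written.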
2) Async tasks – Using a task library such as django-utils, a method can simply be executed on a separate thread at runtime, without the additional onus of persistence and formal task handling. While the main thread services the user and stays responsive, the async-registered method attempts to parallelize execution without guarantees.
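A fire-and-forget decorator of this kind can be sketched in a few lines with the standard `threading` module (this is an illustration of the technique, not the django-utils API; the decorator name is invented for the example):

```python
import threading

def fire_and_forget(func):
    """Run the wrapped call on a daemon thread: no persistence, no retries,
    and no result reaches the caller."""
    def launch(*args, **kwargs):
        t = threading.Thread(target=func, args=args, kwargs=kwargs, daemon=True)
        t.start()
        return t        # caller may join(), but nothing forces it to
    return launch

results = []

@fire_and_forget
def refresh_cache(key):
    results.append(f"refreshed {key}")   # side effect only

t = refresh_cache("sessions")   # returns immediately; main thread stays responsive
t.join()                        # in a real service the caller would not wait
print(results)
```

The lack of guarantees is visible in the sketch: if the process exits before the daemon thread finishes, the work is silently lost.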
3) Threads and locks – Using concurrent programming, developers (for example, those using Python's concurrent.futures) look for ways to partition tasks or stitch execution threads together based on resource locks or time sharing. This works well to reduce overall execution time by separating isolated actions from shared or dependent ones. The status on each object indicates its state. Typically this state advances progressively, so that errors are minimized and users can be informed of intermediate status and progress. Optimistic concurrency control goes a step further by not requiring locks on those resources at all.
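The partitioning idea can be shown with concurrent.futures: isolated work runs fully in parallel, while a lock serializes only the small shared-state update that records progress (the `progress` dict and task body are illustrative):

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed

progress = {}
progress_lock = threading.Lock()

def process(item):
    result = item * item          # isolated work: no shared state, no lock needed
    with progress_lock:           # shared work: guarded, progressive status update
        progress[item] = "done"
    return result

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(process, n) for n in range(5)]
    totals = sorted(f.result() for f in as_completed(futures))

print(totals)     # [0, 1, 4, 9, 16]
print(progress)   # every item marked "done"
```

Because only the status update is locked, contention stays low; an optimistic scheme would drop the lock entirely and instead detect conflicting writes when committing.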
4) Message broker – Using queues to store jobs, this mechanism allows actions to be completed later, often by workers independent of the original system that queued the task. This is very helpful for separating queues by the processors that handle them, and for scaling out to as many workers as necessary to keep jobs moving through the queue. Additionally, message brokers come with options to maintain transactional behavior, hold dead-letter queues, handle retries, and journal messages. They can also scale beyond a single server to handle any volume of traffic.
5) Django celery – With this library, the onerous tasks associated with a message broker are internalized, and developers are given a clean interface for performing their work asynchronously. Celery is a simple, flexible, reliable, distributed task queue that can process vast numbers of messages, with a focus on real-time processing and support for task scheduling. While Django integration previously required a separate library (django-celery), recent versions of Celery support Django directly.
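A minimal Celery module looks like the following. This is configuration-style code: it assumes a broker (here Redis, at an illustrative URL) is running, and the task body is invented for the example.

```python
# tasks.py: defining a Celery app and a task (assumes Redis is the broker).
from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")

@app.task
def add(x, y):
    return x + y

# In a view or script: .delay() hands the call to the broker and returns
# immediately; a separate worker process executes it.
# result = add.delay(4, 4)
```

A worker is started out of process with `celery -A tasks worker`, so the web request never blocks on the addition.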
Conclusion – Most application developers choose one or more of the strategies above depending on service-level agreements, the kinds of actions involved, the number of actions per request, the number of requests, the size and scale of demand, and other factors. There is a tradeoff between complexity and the layering or organization of tasks beyond plain synchronous programming.
