Monday, November 28, 2022

Application modernization for massively parallel applications

  Part 6 of this article on Application Modernization covered the migration process. This section focuses on specialty applications.

Every organization must determine its own roadmap to application modernization. Fortunately, patterns and best practices continue to provide guidance. This section describes application modernization for a representative case study: applications that do not conform to cookie-cutter web applications.

When we take a specialty application that involves massively parallel, compute-intensive prediction workloads, the initial approach is to treat the model as a black box and work around its dependencies. But the modernization effort need not remain constrained by the technology stack the model depends on. Instead, this is an opportunity to refine the algorithm and describe it with a class and an interface that lend themselves to isolation and testing. This has the added benefit of providing testability beyond what was available until now. The algorithm can also be implemented with design patterns such as the Bridge pattern, so that the abstraction and the implementation can vary independently, or the Strategy pattern, which facilitates a family of interchangeable algorithms.
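
As a minimal sketch of the Strategy approach, assuming a hypothetical prediction model behind an interface (the class and method names below are illustrative, not taken from any particular system), the algorithm can be isolated like this:

```python
from abc import ABC, abstractmethod
from typing import Sequence


class PredictionStrategy(ABC):
    """Interface that isolates the algorithm so it can be swapped and tested."""

    @abstractmethod
    def predict(self, features: Sequence[float]) -> float:
        ...


class LegacyModelStrategy(PredictionStrategy):
    """Wraps the existing model as a black box behind the interface."""

    def predict(self, features: Sequence[float]) -> float:
        # Placeholder for a call into the legacy model and its dependencies.
        return sum(features) / len(features)


class RefinedModelStrategy(PredictionStrategy):
    """A refined algorithm that can be compared against the legacy one."""

    def predict(self, features: Sequence[float]) -> float:
        # Placeholder for the refined algorithm.
        return max(features)


class Predictor:
    """Callers depend only on the interface, so strategies are interchangeable."""

    def __init__(self, strategy: PredictionStrategy) -> None:
        self._strategy = strategy

    def predict(self, features: Sequence[float]) -> float:
        return self._strategy.predict(features)
```

Swapping LegacyModelStrategy for RefinedModelStrategy requires no change to callers, which is what makes side-by-side comparison and isolated testing practical.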

Microservices developed in an agile manner with a Continuous Integration and Continuous Deployment pipeline provide an unprecedented opportunity to compare algorithms and fine-tune them in a standalone manner, where the investments in infrastructure and data preparation need to be made only once.

Algorithms for massively parallel systems often take one of two forms: a batched map-reduce summation or a continuous, one-by-one streaming of records. In either case, the stateless form of the microservice demonstrates better scalability and lower execution time than conventional forms of software applications. The leap from microservices to serverless computing can be taken for lightweight processing, or where the model has already been trained so that it can be hosted with few resources.
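
To illustrate the two forms, the sketch below contrasts a batched map-reduce summation with one-by-one streaming of records in a stateless handler; the record layout and function names are assumptions made for the example.

```python
from functools import reduce
from typing import Iterable, Iterator


def batched_sum(records: Iterable[dict]) -> float:
    """Batched map-reduce form: map each record to a value, then reduce by summation."""
    mapped = (record["value"] for record in records)            # map step
    return reduce(lambda acc, value: acc + value, mapped, 0.0)  # reduce step


def stream_process(records: Iterable[dict]) -> Iterator[dict]:
    """Streaming form: each record is processed one by one with no shared state."""
    for record in records:
        yield {"id": record["id"], "score": record["value"] * 2.0}
```

Because neither function keeps state between invocations, either form can be scaled out by simply running more instances of the microservice.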

Parallel computing works on immense volumes of data, and the considerations for data modernization continue to apply independently of application modernization.

Sunday, November 27, 2022

Part 6: Application Migration process

At least two approaches are frequently encountered for the migration process itself. In some cases, the migration towards a microservices architecture is organized in small increments rather than as one big overall migration project. The migration is then implemented as an iterative and incremental process, sometimes referred to as phased adoption. This has been the practice even for migrations towards Service-Oriented Architecture. In other cases, the migration has a predefined starting point but not necessarily an endpoint defined upfront.

Agility is a very relevant aspect when moving towards a microservices architecture. New functionalities are often added during the migration, which clearly shows that the pre-existing system was hindering development and improvements. New functionalities are added as microservices, and existing functionalities are reimplemented as microservices. The difficulty lies mainly in getting the infrastructure ready for adding microservices. Domain-driven design practices can certainly help here.

Not all of the existing functionality is migrated, and sometimes the data is not migrated either. This does not align with the “hide the internal implementation detail” principle of microservices, nor with the typical MSA characteristic of decentralized data management. If the data is not migrated, it may hinder the evolution of independent services, and both service and data scalability are hindered. If scalability is not a concern, the data migration can be avoided altogether.

The main challenges in architecture transformation are (i) the high level of coupling, (ii) the difficulty of identifying service boundaries, and (iii) system decomposition. There could be more improvement and visibility in this area with the use of architecture recovery tools, so that the services are well-defined at the architectural level.

Good examples of microservices have consistently shown a pattern of following the “model around business concepts” principle.

The general rule of thumb inferred from various microservice migrations continues to be: 1) build and share reusable technical competence and knowledge, which includes (i) kickstarting an MSA and (ii) reusing solutions; 2) check business-IT alignment, which is a key concern during the migration; and 3) monitor the development effort and migrate when it grows too much, since there is a high correlation between migration to microservices and increasingly prohibitive effort to implement new functionalities in the monolith.


Saturday, November 26, 2022

Part 5: Application Modernization and the migration towards Microservices architecture

The path towards a microservice-based architecture is anything but straightforward in many companies. There are plenty of challenges to address from both technical and organizational perspectives. This section covers both the activities performed and the challenges faced during the migration process.

The migration to microservices is sometimes described with the “horseshoe model”, comprising three steps: reverse engineering, architectural transformation, and forward engineering. The system before the migration is the pre-existing system, and the system after the migration is the new system. The transition between them can be described in terms of the pre-existing architecture and the target microservices architecture.

The reverse engineering step analyzes the system by means of code analysis tools or existing documentation and identifies the legacy elements that are candidates for transformation into services. The transformation step restructures the pre-existing architecture into a microservice-based one by reshaping design elements, restructuring the architecture, and altering business models and business strategies. Finally, in the forward engineering step, the design of the new system is finalized.

Many companies will say that they are in the early stages of the migration process because the number and size of legacy elements in their software portfolio remain a challenge to work through. That said, these companies also deploy anywhere from a handful to hundreds of microservices while the migration is still in progress. Some migrations take several months and even a couple of years. Management is usually supportive of migrations, and business-IT alignment, which brings together technical solutions and business strategies, is even more overwhelmingly supportive.

Microservices are implemented as small services by small teams, which suits Amazon’s definition of a two-pizza team. The migration activities begin with an understanding of both the low-level and the high-level sources of information. The source code and test suites comprise the low-level sources. The high-level sources comprise textual documents, architectural documents, data models or schemas, and box-and-line diagrams. Relevant knowledge about the system also resides with people, in some extreme cases as tribal knowledge. Less common but useful sources of information include UML diagrams, contracts with customers, architecture recovery tools for information extraction, and performance data. Very rarely, there are also cases where the pre-existing system is considered so bad that its owners do not look at the source code at all.

Such an understanding can also be used to determine whether it is better to implement new functionalities in the pre-existing system or in the new system. It can also help with improving documentation, or with understanding what to keep and what to discard in the new system.

Friday, November 25, 2022

 

Part 3 discussed microservices. This one focuses on maintainability, performance, and security. The maintainability of microservices is somewhat different from that of conventional software, where the software is handed over to a maintenance team once it is finished. This model is not favored for microservices. Instead, a common practice in microservices development is for the owning team to continue owning the service for its whole lifecycle. This idea is inspired by Amazon’s “you build it, you run it” philosophy. Developers working daily with their software and communicating with their customers create a feedback loop for the improvement of the microservice.

Microservices have a performance weakness in that their communication happens over a network. Microservices often send requests to one another, so performance depends on these external request-response cycles. If a microservice has well-defined bounded contexts, it will experience a smaller performance hit. The issues related to microservice connectivity can be mitigated in two ways: making less frequent, more batched calls, and converting the calls to be asynchronous. With asynchronous calls, requests can be issued in parallel, and the performance hit is that of the slowest call.
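
A minimal sketch of the asynchronous mitigation, assuming the third-party aiohttp client library and placeholder service URLs: requests are issued in parallel, so the overall latency approaches that of the slowest call.

```python
import asyncio

import aiohttp  # third-party HTTP client, assumed to be available


async def fetch(session: aiohttp.ClientSession, url: str) -> dict:
    """Issue one request; awaiting it lets the other requests proceed concurrently."""
    async with session.get(url) as response:
        return await response.json()


async def fetch_all(urls: list[str]) -> list[dict]:
    """Issue all requests in parallel; total time is roughly that of the slowest call."""
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(fetch(session, url) for url in urls))


# Hypothetical usage against placeholder endpoints:
# results = asyncio.run(fetch_all([
#     "http://orders/api/v1/orders/42",
#     "http://inventory/api/v1/stock/42",
# ]))
```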

Microservices have the same security vulnerabilities as any other distributed software. Microservices can always be targeted by denial-of-service attacks, so endpoint protection, rate limits, and retries can be included with the microservices. Requests and responses can be encrypted so that the data is never in the clear. If “east-west” security cannot be guaranteed, at least the edge-facing microservices must be protected with a firewall, a proxy, a load balancer, or some combination of these. East-west security refers to protecting traffic between services inside the deployment, as opposed to north-south traffic that crosses the edge to and from the outside world.

Another significant security concern is that breaking a monolithic application into many microservices can increase the attack surface significantly. It is best to perform threat modeling of each microservice independently, for example with STRIDE. STRIDE is an acronym for the following threats: Spoofing identity, when a user can impersonate another user; Tampering with data, when a user can access resources or modify the contents of security artifacts; Repudiation, when a user can perform an illegal action and later deny it because the microservice cannot prove otherwise; Information disclosure, when, say, a guest user can access resources as if the guest were the owner; Denial of service, when a crucial component in the operations of the microservice is overwhelmed by requests so that others experience an outage; and Elevation of privilege, when a user gains access to components within the trust boundary and the system is therefore compromised.
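
As one example of the endpoint protections mentioned above, a simple token-bucket rate limiter can sit in front of a request handler; the bucket capacity and refill rate below are arbitrary illustrative values, not recommendations.

```python
import time


class TokenBucket:
    """Simple token-bucket rate limiter: refuse requests once the bucket is empty."""

    def __init__(self, capacity: int, refill_per_second: float) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to the elapsed time, never beyond the capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refill_per_second)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


limiter = TokenBucket(capacity=100, refill_per_second=50.0)


def handle_request(request: dict) -> dict:
    """Reject excess traffic before it reaches the rest of the microservice."""
    if not limiter.allow():
        return {"status": 429, "body": "Too Many Requests"}
    return {"status": 200, "body": "ok"}
```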

Migration to microservices comes with three challenges: multitenancy, statefulness, and data consistency. The best way to address these challenges involves removing statefulness from migrated legacy code, implementing multitenancy, and paying increased attention to data consistency.
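
A hedged sketch of the first of those mitigations, removing statefulness by pushing session data into a shared store: the in-memory ExternalStore class below stands in for a real cache or database, and the handler names are illustrative.

```python
# Stateful version: session data lives inside the process, which ties a client
# to one instance and blocks scaling out.
local_sessions: dict[str, dict] = {}


def stateful_add_to_cart(session_id: str, item: str) -> dict:
    cart = local_sessions.setdefault(session_id, {"items": []})
    cart["items"].append(item)
    return cart


class ExternalStore:
    """Stand-in for a shared cache or database reachable by every instance."""

    def __init__(self) -> None:
        self._data: dict[str, dict] = {}

    def get(self, key: str) -> dict:
        return self._data.get(key, {"items": []})

    def put(self, key: str, value: dict) -> None:
        self._data[key] = value


store = ExternalStore()


def stateless_add_to_cart(session_id: str, item: str) -> dict:
    # All state is read from and written back to the external store, so any
    # instance of the service can handle any request.
    cart = store.get(session_id)
    cart["items"].append(item)
    store.put(session_id, cart)
    return cart
```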

Thursday, November 24, 2022

Part 3: The refactoring of old code to new microservices

Part 2 of this article described microservices versus monolithic architecture. With the introduction of microservices, it became easy to host not only a dedicated database but also a dedicated database server instance, and to separate the concerns for each functionality that the user interface comprises. When we use microservices with Mesos-based clusters and shared volumes, we can even have many copies of the server for high availability and failover. This works well for small and segregated data, but larger companies often require massive investments in their data, often standardizing tools, processes, and workflows to manage it better. In such cases, consumers of the data don't talk to the database directly but via a service that may even sit behind a message bus. If the consumers proliferate, they end up creating and sharing many different instances of services for the same data, each with its own view rather than the actual table.

The APIs for these services are domain-based rather than query-friendly interfaces that let callers work directly with the data. As data makes its way from one service to another, it may get translated or massaged. It is possible to have a ring of microservices that takes care of most data processing for business requirements; for such services, the data may be at most one or two fields of an entity along with its identifier. This works very well to alleviate the onus and rigidity that come with the organization of the data, the interactions between the components, and the various chores performed to keep it flexible for changing business needs. The microservices are independent, standing on their own around the data for their respective functionalities. This is already business-friendly because each service can now be modified and tested independently of the others.
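
To make the distinction concrete, the sketch below contrasts query-style access with a domain-based API that exposes only a couple of fields of the entity along with its identifier; the table name, endpoint shape, and the account_service dependency are hypothetical.

```python
# Query-friendly access: the caller works directly with the data and sees the table.
def query_account(connection, account_id: int) -> tuple:
    return connection.execute(
        "SELECT * FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()


# Domain-based access: the caller expresses intent against a service API and sees
# only one or two fields of the entity along with its identifier.
def get_account_balance(account_service, account_id: int) -> dict:
    # e.g. GET /accounts/{id}/balance on the hypothetical account service
    return {"accountId": account_id, "balance": account_service.balance_of(account_id)}
```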

The transition to microservices from legacy monolithic code is not straightforward. The functionalities must be separated beyond components, and in the process of doing so, we cannot risk regression. Tests become a way to pin down behavior at boundaries such as interface and class interactions. Adequate test coverage helps guarantee backward compatibility for the system as it is refactored. The microservices are independently testable, both with unit tests and with end-to-end tests. Services usually have a REST interface, which makes it easy to invoke them from clients and comes with the benefit of browser-based developer tools. The data store does not need to be divided between services; in some cases, only a data access service is required, which the other microservices can call. The choice and design of microservices stem from the minimal functionalities that need to be separated and articulated. If the services don’t need to be refactored at a finer level, they can remain encapsulated in a singleton.
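
As a sketch of pinning behavior at a class boundary, the toy service and unit tests below are illustrative; they are not drawn from any particular system.

```python
import unittest


class InventoryService:
    """Toy service used to illustrate testing at an interface boundary."""

    def __init__(self) -> None:
        self._stock: dict[str, int] = {}

    def add_stock(self, sku: str, quantity: int) -> None:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self._stock[sku] = self._stock.get(sku, 0) + quantity

    def available(self, sku: str) -> int:
        return self._stock.get(sku, 0)


class InventoryServiceTest(unittest.TestCase):
    """Tests pinned at the interface, so refactoring behind it cannot regress behavior."""

    def test_add_and_read_back(self) -> None:
        service = InventoryService()
        service.add_stock("sku-1", 5)
        self.assertEqual(service.available("sku-1"), 5)

    def test_rejects_non_positive_quantity(self) -> None:
        service = InventoryService()
        with self.assertRaises(ValueError):
            service.add_stock("sku-1", 0)


if __name__ == "__main__":
    unittest.main()
```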

The rule of thumb for refactoring the code is to follow the Don’t Repeat Yourself (DRY) principle, defined as “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system”. This calls for every algorithm or piece of logic that has been cut and pasted for different usages to be consolidated at a single point of maintenance. It improves flexibility, because an enhancement such as the use of a new data structure can be made in one place, and it reduces the bugs that creep in when similar changes must be made in several places. The principle also reduces the amount of code after refactoring, especially if the old code had several duplications, and it provides a way to see the minimal skeleton of the microservices when aimed at the appropriate scope and breadth. Even inter-service calls can be reduced with this principle.
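
A small before-and-after sketch of the DRY rule described above; the discount calculation is a made-up stand-in for logic that had been cut and pasted.

```python
# Before: the same discount logic pasted into two call sites.
def invoice_total(amount: float) -> float:
    return round(amount - amount * 0.05, 2)


def quote_total(amount: float) -> float:
    return round(amount - amount * 0.05, 2)


# After: one authoritative representation, maintained at a single point.
def apply_discount(amount: float, rate: float = 0.05) -> float:
    """Single point of maintenance for the discount rule."""
    return round(amount - amount * rate, 2)


def invoice_total_dry(amount: float) -> float:
    return apply_discount(amount)


def quote_total_dry(amount: float) -> float:
    return apply_discount(amount)
```

Changing the rounding rule or the rate now touches one function instead of every pasted copy.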

 

Good microservices are not only easy to discover from their APIs but also easy to read about from their documentation, which can be autogenerated from the code with markdown. Different tools are available for this purpose, and both the microservices approach itself and the enhanced comments describing the APIs provide sufficient information for the documentation.

Wednesday, November 23, 2022

 

Part 1 of this article describes application modernization. This section deals with microservices architecture, which suits application modernization very well. Microservices break away from the monolithic architecture that has been the norm in legacy systems for a while. Monolithic applications tend to grow indefinitely, which also increases their complexity; finding bugs and creating new features take a long time. If a part of the application needs to be updated, the whole application must be restarted, which can mean considerable downtime for large systems. Monolithic applications are also harder to deploy, since different parts of the application might have different requirements: some parts are computationally heavy, and others are memory heavy. The one-size-fits-all environment that satisfies all requirements of the application is usually expensive and suboptimal. Monoliths are not scalable either; a peak in traffic can lead to failures in various components, and increasing the number of instances of the entire application wastes resources. These systems do not evolve fast because they are locked into their technology: the same programming language and framework must be used from the first module to the last.

Microservices are self-sufficient processes that interact with other microservices to form a distributed application. Generally, a ring of microservices is developed: small independent services that have their own isolated environment with operating system, database, and other supporting software. Each microservice might be dedicated to a distinct set of resources that it supports with create, read, update, and delete operations. Microservices often use message passing via web requests to communicate with one another, and each can be built with a different programming language and environment depending on the requirements.
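
As a minimal sketch of such a service, assuming the Flask web framework and an in-memory dictionary standing in for the service's own dedicated database, a microservice that owns a single resource might look like this:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)
notes: dict[int, dict] = {}  # stands in for the service's dedicated database
next_id = 1


@app.route("/notes", methods=["POST"])
def create_note():
    global next_id
    note = {"id": next_id, "text": request.get_json()["text"]}
    notes[next_id] = note
    next_id += 1
    return jsonify(note), 201


@app.route("/notes/<int:note_id>", methods=["GET"])
def read_note(note_id: int):
    note = notes.get(note_id)
    return (jsonify(note), 200) if note else ("", 404)


@app.route("/notes/<int:note_id>", methods=["PUT"])
def update_note(note_id: int):
    if note_id not in notes:
        return "", 404
    notes[note_id]["text"] = request.get_json()["text"]
    return jsonify(notes[note_id]), 200


@app.route("/notes/<int:note_id>", methods=["DELETE"])
def delete_note(note_id: int):
    return ("", 204) if notes.pop(note_id, None) else ("", 404)
```

The service owns its resource and its data end to end, so it can be deployed, scaled, and replaced independently of any other service.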

Microservices facilitate cross-functional team organization based on business capabilities. This leads to faster delivery and higher quality due to testability and focused effort. It avoids the immense cross-team interactions of component-based software development, and it keeps developers from writing logic only in whichever layer is closest to them, be it the user interface, the service, or the database.

Cloud service platforms have made operating and deploying microservices-based applications easier and cheaper. They allow teams to build microservices using continuous integration and continuous delivery, with a pipeline that automates testing, building, deploying, and delivering the microservices. Updates to one microservice do not affect the others. But when a single microservice goes down, it can have a cascading effect on other services because of their high fault density; this is also true for components that grow in size. It is generally overcome by keeping microservices focused and small.

The reliability of microservices depends on the reliability of the communication between them; HTTP and protocol buffers are the communication protocols of choice. Development and deployment are owned by the same team, an idea inspired by Amazon’s “you build it, you run it” philosophy. The transition to microservices from legacy monolithic code is not straightforward. The functionalities must be separated beyond components, and in the process of doing so, we cannot risk regression. Tests become a way to pin down behavior at boundaries such as interface and class interactions. Adequate test coverage helps guarantee backward compatibility for the system as it is refactored. The microservices are independently testable, both with unit tests and with end-to-end tests. The choice and design of microservices stem from the minimal functionalities that need to be separated and articulated. If the services don’t need to be refactored at a finer level, they can remain encapsulated in a singleton.

The rule of thumb for refactoring the code is to follow the Don’t Repeat Yourself (DRY) principle, defined as “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system”. This calls for every algorithm or piece of logic that has been cut and pasted for different usages to be consolidated at a single point of maintenance. It improves flexibility, because an enhancement such as the use of a new data structure can be made in one place, and it reduces the bugs that creep in when similar changes must be made in several places. The principle also reduces the amount of code after refactoring, especially if the old code had several duplications, and it provides a way to see the minimal skeleton of the microservices when aimed at the appropriate scope and breadth. Even inter-service calls can be reduced with this principle.

- courtesy Kristian Tuusjärvi

Tuesday, November 22, 2022

Application Modernization:

Software used by companies is critical to their business and will continue to provide a return on investment; companies will try to maximize this for as long as possible. Some maintenance is required for these software systems to satisfy business and customer needs and to address the technical debt that accrues over time. Maintenance works well for short-term needs, but as time progresses, the systems become increasingly complex and out of date. Eventually, maintenance will no longer be efficient or cost-effective. At this point, modernization is required to improve the system’s maintainability, performance, and business value; it takes much more effort than maintenance. If a software system can no longer be maintained or modernized, it will need to be replaced.

The risks of modernizing legacy systems primarily come from missing documentation. Legacy systems seldom have complete documentation specifying the whole system with all its functions and use cases. In most cases, the documentation is largely missing, which makes it hard to rewrite a system that would function identically to the previous one. Companies also usually couple their legacy software with their business processes, so changing legacy software can cause unpredictable consequences to the business processes that rely on it. Replacing legacy systems with new ones is inherently risky, since the new system can be more expensive on a total-cost-of-ownership basis and there can be problems with its delivery schedule.

There are at least three strategies for dealing with legacy systems: scrap the legacy system, keep maintaining it, or replace the whole system. Companies generally have limited budgets for legacy systems, so they want to get the best return on the investment. Scrapping the system can be an option when its value has diminished sufficiently. Maintenance can be chosen when it is cost-effective, and some improvement is possible by adding new interfaces to make the system easier to maintain. Replacement can be attempted when the support is gone, the maintenance is too expensive, and the cost of the new system is not too high.

Both technical and business perspectives are involved. If a legacy system has low quality and low business value, it should be removed. Systems with low quality but high business value should be maintained or modernized, depending on the expense. Systems with high quality can be left running.
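
That assessment can be read as a small decision matrix; the sketch below only restates the reasoning above and is not a prescribed tool.

```python
def legacy_strategy(quality: str, business_value: str) -> str:
    """Illustrative decision matrix; quality and business_value are 'low' or 'high'."""
    if quality == "low" and business_value == "low":
        return "remove the system"
    if quality == "low" and business_value == "high":
        return "maintain or modernize, depending on expense"
    return "leave the system running"
```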

Modernization is a more extensive process than maintenance because it often incorporates restructuring, functional changes, and new software attributes. Modernization can be either white-box or black-box, depending on the level of abstraction. White-box modernization requires a lot of information about the internals of the legacy system; by contrast, black-box modernization only requires knowledge of the external interfaces and compatibility. Replacement is an option when neither approach works.

Software modernization is also an evolution of systems. White-box modernization is more popular than black-box modernization, which might seem counter-intuitive given that black-box modernization is easier; the tools for white-box methods may simply have become good enough to drive the shift. Legacy systems are harder to integrate. Software integration allows companies to better control their resources, remove duplicate business rules, reuse existing software, and reduce the cost of development. The effort needed to keep legacy systems running often takes resources away from other projects. Legacy systems also suffer from diminishing ownership and knowledge base, which makes changes difficult to make. On the other hand, their business value makes them appear like rare diamonds, even when they cost a lot.