Saturday, September 17, 2022

 

The language of templates 

Introduction: Many infrastructure providers, including the Azure public cloud, favor templates that manifest infrastructure as code. Azure offers a control plane for all resources that can be deployed to the cloud, and services take advantage of it both for themselves and for their customers. While Azure Functions allow extensions via new resources, the Azure Resource provider and ARM APIs provide extensions via existing resources. This eliminates the need to introduce new processes around new resources and is a significant win for reusability and user convenience. Resources and their extensions can be written in Bicep or ARM templates. Bicep provides more concise syntax and improved type safety, but Bicep files compile to ARM templates, which remain the de facto standard for declaring and using Azure resources and are supported by the unified Azure Resource Manager. Bicep is a domain-specific language that was recently developed for authoring ARM templates with an easier syntax, and it is typically used for resource deployments to Azure. Either JSON or Bicep can be used to author ARM templates; while JSON is ubiquitous, Bicep can only be used with Resource Manager templates. Bicep tooling converts Bicep templates into standard JSON ARM templates through a process called transpilation. This conversion happens automatically, but it can also be invoked manually. Bicep's succinctness provides a further incentive, and the use of built-in functions, conditions, and loops for repetitive resources infuses logic into the ARM templates. 
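
As a small illustration of the manual transpilation step, the following Python sketch shells out to the Azure CLI; it assumes the CLI with its Bicep tooling is installed and that a main.bicep file exists in the working directory (both assumptions, not part of the original post):

    import subprocess

    def transpile_bicep(path: str = "main.bicep") -> None:
        # "az bicep build" emits the equivalent ARM JSON template (main.json)
        # next to the Bicep source, the same conversion that otherwise runs
        # automatically at deployment time.
        subprocess.run(["az", "bicep", "build", "--file", path], check=True)

    if __name__ == "__main__":
        transpile_bicep()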

With the standardization of the template, it can bring consistency across services and their resources, with added benefits like policy as code and repeated deployments across clouds and regions. The need for region-agnostic deployments cannot be over-emphasized for foundational services that struggle with limitations. There are many clouds and regions to support, and deployment can carry significant cost when services lack suitable ARM templates. 

Other infrastructure providers like Kubernetes have a language that articulates state so that the control loop can reconcile resources. The resources can be generated and infused with specific configuration and secrets using a configMap generator and a secret generator, respectively. For example, the tooling can take an existing application.properties file and generate a configMap that can be applied to new resources. Kustomization also allows us to override the registry for all images used in the containers of an application. There are two advantages to using it. First, it allows us to configure the individual components of the application without requiring changes in them. Second, it allows us to combine components from different sources and overlay them, or even override certain configurations. The kustomize tool provides this feature. Kustomize can add configMaps and secrets to deployments using their respective generators. Kustomize is a static declaration: it allows adding labels across components, and we can choose groups of Kubernetes resources dynamically using selectors, but they must be declared as YAML. This kustomization YAML is usually stored alongside the manifests and applied over existing components, so it refers to other YAML files. Arguably, YAML is the most succinct template format. 
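
A minimal sketch of what such a kustomization might contain, written here as a Python script that emits the YAML; the resource, image, and label names are hypothetical placeholders:

    import yaml  # PyYAML

    kustomization = {
        "apiVersion": "kustomize.config.k8s.io/v1beta1",
        "kind": "Kustomization",
        "resources": ["deployment.yaml"],  # existing manifests to overlay
        "commonLabels": {"app.kubernetes.io/part-of": "sample-app"},
        "configMapGenerator": [
            # generate a configMap from an existing application.properties file
            {"name": "app-config", "files": ["application.properties"]}
        ],
        "images": [
            # override the registry for an image used by the deployment
            {"name": "sample-app", "newName": "registry.example.com/sample-app"}
        ],
    }

    with open("kustomization.yaml", "w") as f:
        yaml.safe_dump(kustomization, f, sort_keys=False)
    # Apply with: kubectl apply -k .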

Azure Blueprints can be leveraged to allow an engineer or architect to sketch a project’s design parameters and define a repeatable set of resources that implements and adheres to an organization’s standards, patterns, and requirements. It is a declarative way to orchestrate the deployment of various resource templates and other artifacts such as role assignments, policy assignments, ARM templates, and resource groups. Blueprint objects are stored in Cosmos DB and replicated to multiple Azure regions. Since it is designed to set up the environment, it is different from resource provisioning. This package fits nicely into a CI/CD pipeline and handles both what should be deployed and the assignment of what was deployed. 

Yet none of the formats discussed so far make use of the technique popularized by Dockerfiles that represent container images. An image is a collection of layers, where each layer represents a set of file-system differences over a base but is itself stored as folders and files. This makes the building of an image not only incremental, programmatic, and reusable but also interchangeable rather than merely a composite. Perhaps that is the next evolutionary step for templates. 

 

 

 

 

Friday, September 16, 2022

 A note about supportability and tuning of databases 

 

Introduction: This is a continuation of the article titled tuning of databases and linked here. 

 

There is a lot of documentation on performance tuning of a relational database, but this article continues to cover some of it along with supportability for a multitenant application.  

 

Monitoring is a critical aspect for determining how a server is performing. Effective monitoring involves taking periodic snapshots to isolate runaway processes. Optimal performance can be achieved by minimizing response times and maximizing throughput. Efficient network traffic, disk I/O, and CPU usage are key to peak performance. 

 

There is a performance dashboard available for determining whether a database is experiencing a performance bottleneck. If a bottleneck is found, additional diagnostic data can easily be captured to resolve the problem. The dashboard can help with common problems pertaining to:  

  • CPU bottlenecks 

  • I/O bottlenecks  

  • Missing indexes 

  • Blocking 

  • Resource contention 

The performance dashboard also helps to identify expensive queries that may have been executed before, along with several metrics that include CPU, logical writes, logical reads, duration, physical reads, and CLR time. 

The dashboard is divided into the following sections and sub-reports: System CPU utilization, Current waiting requests, Current activity, Historical information, and miscellaneous information. 
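
As a rough equivalent outside the dashboard, the same per-query metrics can be read from the dynamic management views described below; a minimal pyodbc sketch, assuming a placeholder connection string:

    import pyodbc

    EXPENSIVE_QUERIES = """
    SELECT TOP 10
        qs.total_worker_time    AS cpu_time,
        qs.total_logical_reads  AS logical_reads,
        qs.total_logical_writes AS logical_writes,
        qs.total_physical_reads AS physical_reads,
        qs.total_elapsed_time   AS duration,
        qs.total_clr_time       AS clr_time,
        st.text                 AS query_text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    ORDER BY qs.total_worker_time DESC;
    """

    # Placeholder connection string; replace with real server and database names.
    CONNECTION_STRING = "DRIVER={ODBC Driver 18 for SQL Server};SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes"

    with pyodbc.connect(CONNECTION_STRING) as conn:
        for row in conn.cursor().execute(EXPENSIVE_QUERIES):
            print(row.cpu_time, row.duration, row.query_text[:80])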

The database also provides dynamic management views and functions. 

There are server-scoped dynamic management views and functions, which require the VIEW SERVER STATE permission on the server, and database-scoped dynamic management views and functions, which require the VIEW DATABASE STATE permission on the database. 

These dynamic management views include Always On availability groups, Change Data Capture, Change Tracking Related, Common Language Runtime, Database Mirroring, Database related, Execution related, Extended events, Filestream and Filetable, Full-text search and semantic search, Geo replication, Index related, I/O related, PolyBase, Memory-Optimized, Object Related, Query Notifications related, Replication related, Resource Governor related, Security-related, Server-Related, Service Broker, Spatial Data, Azure Synapse analytics and Parallel data warehouse, SQL Server operating system related, stretch database and transaction related.    

Thursday, September 15, 2022

 

A note about performance tuning the database server:

 

There is a lot of documentation on performance tuning of a relational database, but this article covers some of those for a multitenant application.

 

Multitenant databases tend to grow rapidly in number of instances. The purchasing model matters significantly for cloud SQL Databases.

There are two purchasing models: 1) The virtual core (vCore)-based purchasing model – this model provides a choice between a provisioned compute tier and a serverless compute tier. With the provisioned compute tier, the amount of compute resources is always provisioned for the workload. With the serverless compute tier, autoscaling of the compute resources is specified instead. With autoscaling, the databases are paused and resumed, with charges only for storage during periods of inactivity.

2) The database transaction unit (DTU)-based purchasing model – this model provides bundled compute and storage packages balanced for common workloads. The compute sizes are declared in DTUs for single databases and elastic DTUs for elastic pools. The vCore-based model allows us to choose compute and storage resources independently. Customers prefer the DTU-based model for simple, preconfigured resource options and the vCore-based model when they value flexibility, control, and transparency.

 

Storage costs are calculated differently for each purchasing model. In the DTU-based model, storage is included in the price of the DTU. It is possible to add extra storage in the standard and premium tiers of a cloud SQL database.

 

Application characteristics determine both the purchasing model and the performance tuning of existing instances. Chatty applications issue excessive data access operations that are sensitive to network latency. Batching ad-hoc queries or moving queries into stored procedures is an effective remediation in this case, as it reduces the number of data access operations against the database.
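
A minimal sketch of batching with pyodbc, assuming a placeholder connection string and a hypothetical dbo.Items table:

    import pyodbc

    # Placeholder connection string; replace with real server and database names.
    CONNECTION_STRING = "DRIVER={ODBC Driver 18 for SQL Server};SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes"
    rows = [(1, "alpha"), (2, "beta"), (3, "gamma")]

    with pyodbc.connect(CONNECTION_STRING) as conn:
        cursor = conn.cursor()
        cursor.fast_executemany = True  # lets the driver send the batch as one bulk operation
        cursor.executemany("INSERT INTO dbo.Items (Id, Name) VALUES (?, ?)", rows)
        conn.commit()
    # One batched call replaces three separate INSERT round trips.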

Another case where remediation is justified is when databases have an intensive workload that can’t be supported by an entire single machine. Scaling out of the workload is a suitable remediation in this case.

Applications that have suboptimal queries might not respond well to an increased compute size. Such queries are easy to spot in the data access layer and show symptoms such as missing ‘where’ clauses, missing indexes, or outdated statistics. These applications benefit from standard query performance tuning. Besides queries, applications may have inherent data access concurrency issues such as deadlocking and starvation. Reducing roundtrips between the database and the application by caching data on the client side is an effective remediation in this case.

 

Query tuning and the use of hints are standard query optimizer techniques, and they apply to cloud databases as well. Tuning queries might reduce aggregate resource demands. The query optimizer sniffs parameters and evaluates the current value of a parameter to determine whether it can generate a more optimal query plan. When this does not work well, query hints or directives can be specified to override the default parameter-sniffing behavior.
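
A minimal sketch of overriding parameter sniffing with a query hint; the table, columns, and connection string are hypothetical, and OPTIMIZE FOR UNKNOWN is just one of several possible hints:

    import pyodbc

    # Placeholder connection string; replace with real server and database names.
    CONNECTION_STRING = "DRIVER={ODBC Driver 18 for SQL Server};SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes"

    SQL = """
    SELECT OrderId, Total
    FROM dbo.Orders
    WHERE CustomerId = ?
    OPTION (OPTIMIZE FOR UNKNOWN);  -- plan on average statistics rather than the sniffed value
    """

    with pyodbc.connect(CONNECTION_STRING) as conn:
        rows = conn.cursor().execute(SQL, "customer-42").fetchall()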

 

Customers generally discover these limits on a case-by-case basis. In general, they hit capacity limits for individual databases more often. These capacity limits still exist for pooled databases and instance databases, but there are two options for solving problems with large databases.

Cross-database sharding is a technique to spread database operations over multiple databases. The data is split on a single dimension into multiple databases. Splitting different functions of the application into different databases is another such technique. As the application becomes busier, the compute sizes for each function can vary because the load is split across multiple machines.
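
A minimal sketch of routing operations to shards on a single dimension (the tenant); the shard database names and the hashing scheme are assumptions:

    import hashlib

    SHARDS = ["tenantdb-00", "tenantdb-01", "tenantdb-02", "tenantdb-03"]

    def shard_for(tenant_id: str) -> str:
        # A stable hash keeps every operation for a tenant on the same database.
        digest = hashlib.sha1(tenant_id.encode("utf-8")).digest()
        return SHARDS[digest[0] % len(SHARDS)]

    print(shard_for("contoso"))  # always resolves to the same shard for this tenant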

 

Reference: https://1drv.ms/w/s!Ashlm-Nw-wnWhLMfc6pdJbQZ6XiPWA?e=fBoKcN    

 

 

 

Wednesday, September 14, 2022

 

There is a lot of documentation on performance tuning of a relational database, but this article covers some of those for a multitenant application.

Setting the SQL compatibility level to optimize database performance is just one of those.  This will equip the database with the latest optimization features of the Azure SQL database or SQL server. It is particularly relevant for demonstration databases because the default compatibility level is usually old.

Keys are another consideration. When we define keys to improve performance, we must note the time it takes to loop through a subset of records. Different-sized subsets of records will take different times, but performance deteriorates if the subset of records cannot be located or read efficiently. The best selection of keys is one that supports the code that needs to run on the data; choosing such keys maximizes performance.

Data access is another. Data that the client needs goes from the database server through the multitenant application to the client. If the data is cached, it is returned from the cache instead of the database, saving some time and cost. If the data isn’t cached, it is fetched from the database over the network and then saved in the cache. There might be more than one cache – say, a global cache and a local cache, where the global cache serves all users connected to the multitenant solution and the local cache is tenant-specific.
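
A minimal sketch of such a two-level cache, assuming hypothetical loader callbacks for the data that would otherwise be fetched from the database:

    from typing import Any, Callable

    class TwoLevelCache:
        def __init__(self) -> None:
            self._global: dict[str, Any] = {}            # shared by all tenants
            self._local: dict[str, dict[str, Any]] = {}  # one dictionary per tenant

        def get_global(self, key: str, loader: Callable[[], Any]) -> Any:
            if key not in self._global:
                self._global[key] = loader()             # fetched once for everyone
            return self._global[key]

        def get_local(self, tenant: str, key: str, loader: Callable[[], Any]) -> Any:
            bucket = self._local.setdefault(tenant, {})
            if key not in bucket:
                bucket[key] = loader()                   # fetched once per tenant
            return bucket[key]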

Connections to the database are yet another performance consideration. The multitenant solution provider must maintain connection pooling, which can dramatically decrease memory consumption by the multitenant application. Database connection pooling also simplifies deployment of the multitenant application as a three-tier architecture where the tiers are installed independently. Administrators are no longer required to manually create the SPNs and set up the delegation when the client, application, and database server are hosted independently.
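
A minimal sketch of a connection pool, assuming a hypothetical connection factory; in practice the driver’s or platform’s built-in pooling would usually be used instead:

    import queue
    from contextlib import contextmanager

    class ConnectionPool:
        def __init__(self, factory, size: int = 5):
            self._pool = queue.Queue(maxsize=size)
            for _ in range(size):
                self._pool.put(factory())      # connections are created once and reused

        @contextmanager
        def connection(self):
            conn = self._pool.get()            # blocks until a pooled connection is free
            try:
                yield conn
            finally:
                self._pool.put(conn)           # return to the pool instead of closing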

Data read/write performance requires significant investment. Record identifiers and SQL variant columns don’t prevent the use of BULK inserts. Filtering can usually be done with a single predicate. The exceptions in the multitenant application, where filtering does not involve a single predicate, are when the filtering is based on a computed column that requires external logic or when a second predicate is provided.

Adding constraints on a SQL column can also be useful to performance. The name of the constraint is not important if it is not used by another column in the database.

Clients can also be made performant, not just the servers. The hardware of the computer running the client, the choice of browser, and the network bandwidth and latency between the client and the server are all important for boosting client performance.

Tuesday, September 13, 2022

 

Performance for multitenant application

This is a continuation of the articles on multitenancy with the most recent one linked https://1drv.ms/w/s!Ashlm-Nw-wnWhLZnYUBoDUNcjAHNwQ?e=NAo7vM. This article focuses on performance.

 

The multitenant application discussed so far has an application server and a database. Performance is improved by 1. writing efficient pages, 2. writing efficient web services, 3. writing efficient reports, 4. applying performance patterns, 5. efficient data access, 6. testing and validating performance, 7. tuning the development environment, and 8. using the profiler to analyze performance.

 

Efficient pages are written by using patterns that get a page to load faster. These include 1. avoiding unnecessary recalculation, 2. doing less work, and 3. offloading the UI thread. Caching the data and refreshing the cache regularly avoids recalculation and saves time each time the page is loaded. Query objects are notorious for recalculation since they reach the database each time; caching the results from an API works significantly better.

 

Reducing the amount of work also speeds things up. A simple page with few UI elements also eases use and navigation. Removing calculated fields from lists if they aren’t needed, and removing the corresponding field or page extension definitions, improves the loading of pages that list data.

 

Creating dedicated lookup pages instead of normal pages when dropdown-like logic is involved, and removing triggers and fact boxes, will help because a default page renders all controls.

Offloading the UI thread with, say, page background tasks yields a more responsive and faster UI. Custom controls that require heavy-duty logic can also be avoided.

 

Avoiding the exposure of calculated fields, avoiding heavy-duty logic in the pre- and post-handlers for getting records, and refactoring the page and its code so that values are persisted can reduce performance hits. It is not recommended to use temp tables if there are many records. Fetching and inserting each record in a temp table without caching data can be detrimental to performance. If the number of records exceeds a hundred, this antipattern is easy to detect.

 

Parent and child records need not be inserted in parallel. This condition causes locks on parent and integration record tables because parallel calls try to update the same parent record. It is best to do it incrementally by allowing one to finish before another or by putting them in a transaction batch.

 

Deprecated protocols should be avoided. OData version 4 and APIs have the best performance. API queries and pages are faster with newer technology stacks.

 

API pages and API queries are better than exposing UI pages as web service endpoints. If the latter must be implemented, then triggers need to run for all the records returned from the server. If we want OData endpoints that work as data readers, we can use API queries. OData has a few performance callouts, such as limiting the result set with $filter and $top if there’s an expensive $expand, using transaction batches, and using a read-only data access intent.
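
A minimal sketch of limiting an OData result set before an expensive $expand; the endpoint URL and field names are hypothetical:

    import requests

    params = {
        "$filter": "status eq 'Open'",   # narrow the set before expanding
        "$top": "100",
        "$expand": "salesOrderLines",    # the expensive part, now bounded
    }
    response = requests.get("https://api.example.com/v2.0/salesOrders", params=params, timeout=30)
    response.raise_for_status()
    orders = response.json()["value"]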

 

Large volumes of web service calls can cause stability and performance issues. It is important to understand the operational limits and to scale such that the load always falls under the limit. External applications can handle the HTTP Status codes 429 for too many requests and 504 for gateway timeout.

Handling status code 429 requires the client to adopt retry logic while providing a cool-off period. Retry strategies include regular intervals, incremental intervals, exponential backoff, and randomization. Status code 504 requires the client to refactor the long-running request to execute within the time limit by splitting it into multiple requests; the potential 429 codes those requests encounter can then be handled with a backoff strategy. A common pattern is to implement a queue in the external application to flatten spikes in the traffic. If a request gets a 429, it is put back in the queue and one of the retry strategies is applied.
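
A minimal sketch of exponential backoff with randomization for 429 responses; the endpoint and limits are hypothetical:

    import random
    import time
    import requests

    def call_with_backoff(url: str, max_attempts: int = 5) -> requests.Response:
        delay = 1.0
        for _ in range(max_attempts):
            response = requests.get(url, timeout=30)
            if response.status_code != 429:
                return response                           # success, or a non-throttling error
            time.sleep(delay + random.uniform(0, delay))  # cool-off period with jitter
            delay *= 2                                    # exponential backoff
        return response

    orders = call_with_backoff("https://api.example.com/v2.0/salesOrders")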

Testing and validating the multitenant solution for performance is necessary prior to its planned deployment to production environments. Performance unit testing is helpful in this regard and can include unit tests that track the number of SQL statements or rows read; the counts can be taken before and after the code under test, and assert statements can check for normal behavior. Performance toolkits are available to simulate the number of resources that tenants use in realistic scenarios and to compare performance between builds of the solution. Such a toolkit might not measure throughput, but given the time to execute key scenarios and the guidance provided by operational limits, it is possible to determine, say, how many orders are processed per hour. If there are queues involved, advanced analysis can be enabled. Performance telemetry includes database locks, long-running operations, long-running queries, page views, reports, sessions started, and web service requests.
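
A minimal sketch of such a performance unit test; the sql_statement_count hook and the post_order operation are hypothetical stand-ins for whatever counter and code under test the solution exposes, supplied here as test fixtures:

    def test_order_posting_statement_budget(sql_statement_count, post_order):
        before = sql_statement_count()             # snapshot taken before the code under test
        post_order()                               # the scenario being measured
        executed = sql_statement_count() - before
        # Assert that the scenario stays within its SQL-statement budget.
        assert executed <= 25, f"posting an order issued {executed} SQL statements"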

Monday, September 12, 2022

 This is a continuation of the articles on multitenancy with the most recent one linked https://1drv.ms/w/s!Ashlm-Nw-wnWhLZnYUBoDUNcjAHNwQ?e=NAo7vM. This article focuses on performance.

 

Reports can be written efficiently. They can pertain to a single instance of an entity, like an invoice, or they can be more analytical in nature by performing joins across multiple instances. Faster reports can be enabled by using read scale-out to read data from a read-only copy of the database, by using partial records to reduce the data loaded from the database, by using AL queries to optimize the way data is read from the database, and by being aware that Word layouts and RDL layouts can result in slower performance with document reports, especially for actions related to the user interface.

 

If a data lake or a data warehouse is involved, there are typically two types of data extractions: 1. a historical load with all the data from a given point in time and 2. delta loads of what changed since the historical load. The fastest and least disruptive way to get a historical load is usually a SQL BACPAC file that can be restored on a private database server. The fastest and least disruptive way to get delta loads is to set up API queries configured with read scale-out and use of the data audit field LastModifiedOn.

 

Efficient data access is at the heart of performance improvements. There are many parallels with data access in a database using SQL, even where the multitenant application differs.

 

Many performance issues can be tracked down to missing indexes, but index design does not always fall on the tenant or the multitenant solution developer. The indexes must be designed to conform to the way data is accessed.

 

Keys are used to identify rows in a table and can combine one or more columns of the table. Indexes speed up data retrieval. They serve different purposes. With a clustered index on the primary key, records are stored sequentially in ascending order, sorted by the primary key. Records are sorted dynamically, so the data is always structurally correct. Sorting allows faster data access via binary search.

 

Non-clustered columnstore indexes can also be created on tables. This index type can achieve significant performance gains over traditional row-oriented storage. It also achieves significant data compression relative to the uncompressed data size of normal tables.

 

If the number of secondary keys that are marked as active is increased, performance can improve when the data is retrieved in several different sorting sequences because the data is already sorted. But this slows down insertion and deletion because the indexes for each secondary key must be maintained.

If the number of keys is small, performance increases because a minimal number of indexes is maintained, but retrieval slows when secondary keys needed for the appropriate sorting are missing.

 

Schema synchronization errors can be minimized when primary keys are not deleted, the order of primary keys is not changed, and additional unique keys or clustered keys are not added.

SumIndexField Technology or SIFT for short, helps to quickly calculate the sum of numeric data type columns in tables even with thousands of records. Any field that has a numeric data type can be associated with a key as a SumIndexField. With a materialized view per SIFT key, this speeds up retrieval of data.

 

Databases rely heavily on locking and logging. Performance issues that are not due to resource starvation can often be attributed to workers waiting on other workers to release locks on shared objects. Some tips to avoid locking include reading information before starting write operations, limiting the time locks are held, limiting transaction size by dividing a transaction into a smaller number of operations, and making sure indexes are present on ranges that are updated. Multitenant databases generally have less locking contention due to isolation.

 

Testing and validating the multitenant solution for performance is necessary prior to its planned deployment to production environments. Performance unit testing is helpful in this regard and can include unit tests that track the number of SQL statements or rows read; the counts can be taken before and after the code under test, and assert statements can check for normal behavior. Performance toolkits are available to simulate the number of resources that tenants use in realistic scenarios and to compare performance between builds of the solution. Such a toolkit might not measure throughput, but given the time to execute key scenarios and the guidance provided by operational limits, it is possible to determine, say, how many orders are processed per hour. If there are queues involved, advanced analysis can be enabled. Performance telemetry includes database locks, long-running operations, long-running queries, page views, reports, sessions started, and web service requests.

Sunday, September 11, 2022

 

The art and science of software code reviews:

The practice of reviewing software code written by colleagues in an organization is called software code reviews. Code is one of the most important digital assets for any software company and the bearer for all innovations and value offerings that reach the end customer. As software projects vary widely in size and complexity from organization to organization, the practice of reviewing code also varies. Some teams can be lenient with this practice while others are more exacting. This article provides a peek into the world of software professionals as they realize their contributions.

A code review is done just before submitting the code to a pipeline that delivers the improvements to end users via a series of staged progressions through quality gates and environments. It provides an opportunity for colleagues to inspect each other’s work for potential concerns that can be cheaply and effectively addressed before the code enters the pipeline. If bad code gets checked in, software development managers will find their quality-measuring dashboards tank, which effectively brings a freeze to simultaneous contributions from many teams. A code freeze impedes organizational productivity and increases costs due to lost time and increased overhead among engineers. In addition, backing out a bad code change requires detection and remediation investigations and reports that might be more involved than reverting a single commit, which is the unit of code changes made to the master version of the source code in the repository.

It is typical to see dozens of commits a day in an organization that contributes to a source base exceeding several hundred thousand lines of code. Efficiency is increased when the commits are as independent of one another as possible. For these reasons, managers insist on ‘baking’ a feature completely in a branch, behind a feature flag, before it is merged as a single commit into the master. A single commit is also easier to revert than a series of commits spread across folders and their history. The feature flag is an option to disable the feature at runtime in the production environment so that the code becomes inactive, reducing the surface area that would otherwise require investigations, troubleshooting, and workaround education for end users when potential issues crop up. Often the operations engineering team works independently of the software development team, so letting defects introduced into the code make their way to customers is almost like throwing them over a wall for someone else to take care of.

Seasoned engineers in a well-oiled team will usually get a code review done with little change to the initial proposal made by its author. But most teams draw fresh blood to keep their source healthy, and the widely varying practice of code reviews can evoke fear, fatigue from iterations, and even friction for the uninitiated, regardless of their experience and background. When there is a lot of back and forth, it can get overwhelming and hair-raising for anyone. Some manage not to make the same mistake twice, while others try not to contact the same individual for another code review request!

The process of code reviewing is highly structured and heavily invested in by organizations. Most provide one productivity tool or another for individuals to review the code, along with a DevOps pipeline that vets the code by running multiple code analysis tools before a review request is even made. Engineers can view the exact changes to each line before and after an iteration and provide comments during a code review. Experienced reviewers will spot defects and improve code style and organization in ways the tools might miss. Some of the traits they look for include readability, reusability, testability, and refactoring. Unit tests are a particular favorite for both bots and humans to call out during a code review. The most skilled reviewers will often push for tests that discourage incorrect usage of the proposed code changes in the future, in addition to testing the code for both success and failure. Style is another big point of contention between engineers, and code review iterations can go from a couple up to double digits just for style and conformance.

The savvy author of a code review will have done a lot of homework prior to turning it in, such as socializing the proposed changes via a design document and getting general acceptance of the intent behind the form of implementation. Establishing relationships with peers who can smooth and expedite the code review process is another popular technique. Managers tend to take note of the number and quality of code reviews by their direct reports, so individuals help themselves by working on one or two code reviews at a time.

Finally, less is better for any code change and its review. Engineers will vouch for this universally.