Wednesday, December 15, 2021

Azure Maps and GeoFence

This is a continuation of a series of articles on operational engineering aspects of Azure public cloud computing. In this article, we continue the discussion of Azure Maps, a full-fledged, generally available service that provides Service Level Agreements comparable to others in its category.

Azure Maps is a collection of geospatial services and SDKs that fetch the latest geographic data and provide it as context to web and mobile applications. Specifically, it provides REST APIs to render vector and raster maps as overlays, including satellite imagery; Creator services to enable indoor map data publication; search services to locate addresses, places, and points of interest from indoor and outdoor data; various routing options such as point-to-point, multipoint, multipoint optimization, isochrone, electric vehicle, commercial vehicle, traffic-influenced, and matrix routing; traffic flow and incident views for applications that require real-time traffic information; Time Zone and Geolocation services; Elevation services backed by a Digital Elevation Model; a Geofencing service and map data storage, with location information hosted in Azure; and location intelligence through geospatial analytics.

Azure Maps can be helpful for tracking entry into and exit from a geographical location, such as the perimeter of a construction area. Such tracking can be used to generate notifications by email. Geofencing GeoJSON data is uploaded to define the construction area we want to monitor. The Data Upload API is used to upload geofences, as polygon coordinates, to the Azure Maps account. Two logic apps can be written to send email notifications to the construction site operations when, say, a piece of equipment enters or exits the construction site. Azure Event Grid subscribes to the enter and exit events for the Azure Maps geofence, and two webhook event subscriptions call the HTTP endpoints defined in the two logic apps. The Spatial Geofence Get API is used to determine when a piece of equipment enters or exits the geofence areas.

The geofencing GeoJSON data contains a FeatureCollection of two geofences that describe distinct polygonal areas within the construction site. The first has no time expiration or restrictions; the second can only be queried during business hours. The data is uploaded with a POST call to the mapData endpoint along with the subscription key. Once the data is uploaded, its metadata can be retrieved to confirm the created timestamp.
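
The following is a minimal sketch of that upload using Python and the requests library. The endpoint version, query parameters, and the simplified one-polygon FeatureCollection are assumptions for illustration rather than the exact tutorial payload.

import requests

# Assumed Azure Maps Data Upload endpoint and a placeholder subscription key.
SUBSCRIPTION_KEY = "<azure-maps-subscription-key>"
UPLOAD_URL = "https://us.atlas.microsoft.com/mapData"

# A tiny FeatureCollection with a single polygon standing in for one geofence;
# the real data would carry two features, the second with validity-time properties.
geofence = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"geometryId": "constructionArea1"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[
                    [-122.13393, 47.63829],
                    [-122.13393, 47.63376],
                    [-122.13263, 47.63376],
                    [-122.13263, 47.63829],
                    [-122.13393, 47.63829],
                ]],
            },
        }
    ],
}

response = requests.post(
    UPLOAD_URL,
    params={
        "api-version": "2.0",
        "dataFormat": "geojson",
        "subscription-key": SUBSCRIPTION_KEY,
    },
    json=geofence,
)
response.raise_for_status()

# The upload is asynchronous: the Operation-Location header can be polled
# until the udid of the stored geofence (and its created timestamp) is available.
print(response.status_code, response.headers.get("Operation-Location"))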

The Logic App requires a resource group and subscription to be deployed. A common trigger that responds when an HTTP request is received is sufficient for this purpose. Then the Azure Maps event subscription is created; it requires a name, event schema, system topic name, filter on event types, endpoint type, and endpoint. The Spatial Geofence Get API sends out the notifications on entry to and exit from the geofence. Each piece of equipment has a unique device id, so both the entry and the exit can be attributed to it. The GET call also returns the distance from the geofence border; a negative distance implies that the position lies inside the polygon.
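
A hedged sketch of querying the Spatial Geofence Get API for one position report follows; the parameter names mirror the public REST reference, while the udid, device id, and coordinates are placeholders.

import requests

GEOFENCE_URL = "https://atlas.microsoft.com/spatial/geofence/json"

params = {
    "api-version": "1.0",
    "subscription-key": "<azure-maps-subscription-key>",  # placeholder
    "deviceId": "excavator-01",             # unique id per piece of equipment
    "udid": "<udid-from-the-data-upload>",  # geofence data uploaded earlier
    "lat": 47.638,
    "lon": -122.133,
    "searchBuffer": 5,                      # meters around the border to evaluate
    "mode": "EnterAndExit",                 # publish events on both entry and exit
}

result = requests.get(GEOFENCE_URL, params=params).json()

for geometry in result.get("geometries", []):
    # A negative distance means the reported position lies inside the polygon.
    inside = geometry["distance"] < 0
    print(geometry["geometryId"], "inside" if inside else "outside")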


Tuesday, December 14, 2021

 

Azure Maps:

This is a continuation of a series of articles on operational engineering aspects of Azure public cloud computing. In this article, we take a break to discuss a location service named Azure Maps. It is a full-fledged, generally available service that provides Service Level Agreements comparable to others in its category.

Azure Maps is a collection of geospatial services and SDKs that fetch the latest geographic data and provide it as context to web and mobile applications. Specifically, it provides REST APIs to render vector and raster maps as overlays, including satellite imagery; Creator services to enable indoor map data publication; search services to locate addresses, places, and points of interest from indoor and outdoor data; various routing options such as point-to-point, multipoint, multipoint optimization, isochrone, electric vehicle, commercial vehicle, traffic-influenced, and matrix routing; traffic flow and incident views for applications that require real-time traffic information; Time Zone and Geolocation services; Elevation services backed by a Digital Elevation Model; a Geofencing service and map data storage, with location information hosted in Azure; and location intelligence through geospatial analytics.

SDKs are also available in flavors suited for web and mobile applications. Both SDKs are quite powerful and enhance programmability. They allow customization of interactive maps that can render content and imagery specific to the publisher. The interactive map uses a WebGL map control that is known for rendering large datasets with high performance. The SDKs can be used with JavaScript and TypeScript.

Location is a data type. It can be represented either as a point or as a polygon, and each helps answer questions such as finding the top three stores nearest a geographic point or the stores within a region. Since it is a data type, some standardization is available. SQL Server defines not one but two data types for specifying location: the geography data type and the geometry data type. The geography data type stores ellipsoidal data such as GPS latitude and longitude, while the geometry data type stores data in a Euclidean (flat) coordinate system. The point and the polygon above are examples of the geography data type. Both the geography and the geometry data types must reference a spatial reference system, and since there are many of them, each value is associated with a specific one. This is done with a parameter called the Spatial Reference Identifier, or SRID for short. SRID 4326 is the well-known GPS coordinate system that expresses positions as latitude/longitude. Translating an address to a latitude/longitude/SRID tuple is supported with the help of built-in functions that drill down progressively from the overall coordinate span.

A table such as ZipCode could have an identifier, code, state, boundary, and center point with the help of these two data types. The boundary would be the polygon formed by the ZIP code, and the center point its central location. Distances between stores and their membership in a ZIP code can be calculated from this center point. The geography data type also lets us perform clustering analytics, which answers questions such as the number of stores or restaurants satisfying a certain spatial condition and/or matching certain attributes; such queries are served by spatial indexes, for example R-tree-style structures, that support these clustering techniques. The geometry data type supports operations such as area and distance because it translates to coordinates. It has its own rectangular coordinate system that we can use to specify the boundaries, or the ‘bounding box’, that the spatial index covers.
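
As a concrete illustration, here is a hedged sketch of a proximity query against the geography data type from Python with pyodbc; the Stores table, its Location column, and the connection string are illustrative assumptions rather than a reference schema.

import pyodbc

# Assumed local SQL Server instance and a Stores(StoreId, Name, Location geography) table.
connection = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};SERVER=localhost;"
    "DATABASE=Retail;Trusted_Connection=yes;TrustServerCertificate=yes"
)
cursor = connection.cursor()

# SRID 4326 identifies GPS latitude/longitude; STDistance on geography returns meters.
# A membership question ("stores within this ZIP") would instead use boundary.STContains(Location).
query = """
DECLARE @center geography = geography::Point(47.6062, -122.3321, 4326);

SELECT TOP (3) StoreId, Name, Location.STDistance(@center) AS DistanceInMeters
FROM Stores
ORDER BY Location.STDistance(@center);
"""

for store_id, name, distance in cursor.execute(query):
    print(store_id, name, round(distance, 1))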

Mapping spatial data involves rendering the data as a layer on top of images. These overlays enhance the display and give end users visual geographical context. The Azure Maps visual for Power BI provides this functionality to visualize spatial data on top of a map. An Azure Maps account is required and can be created via the Azure Portal.

Thanks

 

Monday, December 13, 2021

 

Azure Object Anchors:

This is a continuation of a series of articles on operational engineering aspects of Azure public cloud computing. In this article, we take a break to discuss a preview feature named Azure Object Anchors. An Azure preview feature is available for customers to use, but it does not necessarily have the same service level as the services that have been released to general availability.

Azure Object Anchors is a service in the mixed-reality category. It detects an object in the physical world using a 3D model. The model cannot be rendered directly onto its physical counterpart without some translation and rotation; this combination is referred to as the pose of the model and is described by six degrees of freedom (6DoF). The service accepts a 3D object model and outputs an Azure Object Anchors model. The generated model can be used alongside a runtime SDK to enable a HoloLens application to load the object model and to detect and track instances of that model in the physical world.

Some example use cases enabled by this model include:

1) Training: create a mixed-reality training experience for workers without the need to place markers or adjust hologram alignments. Object Anchors can also augment Mixed Reality training experiences with automated detection and tracking.

2) Task guidance: a set of tasks can be simplified for workers when they are guided through Mixed Reality.

This is different from object embedding, which finds salient objects in a vector space. Object Anchors is the overlay, or superimposition, of a model on top of live video of the physical world, which requires a specific object that has already been converted into that space. A service that performs automated embedding and overlay together is not available yet.

The Conversion service is involved in transforming a 3D asset into a vector-space model. The asset can come from a Computer-Aided Design diagram or from a scan, and it must be in one of the supported file formats: fbx, ply, obj, glb, or gltf. The unit of measurement for the 3D model must be one of the values of the Azure.MixedReality.ObjectAnchors.Conversion.AssetLengthUnit enumeration, and a gravity vector is provided as an axis. A console application is available in the samples to convert the 3D asset into an Azure Object Anchors model. An upload requires the Account ID GUID, the account domain that qualifies the resource to be uploaded, and an Account Key.

The converted model can also be downloaded and visualized using its mesh. Instead of building a scene to visualize the converted model, we can simply open the “VisualizeScene”, add it to the scene build list, and make sure it is the only scene included in the Build Settings. Next, from the hierarchy panel, we select the Visualize GameObject and then the Play button at the top of the Unity editor. Ensure that the Scene view is selected. With the Scene view’s navigational controls, we can then inspect the Object Anchors model.

After the model has been viewed, it can be copied to a HoloLens device that has the runtime SDK for Unity, which can then assist with detecting physical objects that match the original model.

Thanks

Sunday, December 12, 2021

 

This is a continuation of an article that describes operational considerations for hosting solutions on the Azure public cloud. 

There are several references to best practices throughout the series of articles we have written from the documentation for the Azure public cloud. The previous article focused on the antipatterns to avoid, specifically the noisy neighbor antipattern. This article focuses on performance tuning of Cosmos DB usage.

An example of an application using Cosmos DB is a drone delivery application that runs on Azure Kubernetes Service. When a fleet of drones sends position data in real time to Azure IoT Hub, a functions app receives the events, transforms the data into GeoJSON format, and writes it to Cosmos DB. The geospatial data in Cosmos DB can be indexed for efficient spatial queries, which enables a client application to query all drones within a finite distance of a given location or to find all drones within a certain polygon. Azure Functions is used to write the data to Cosmos DB because the processing is lightweight: there is no need for a full-fledged stream processing engine that joins streams, aggregates data, or processes across time windows, and Cosmos DB can support high write throughput.
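
A minimal sketch of that ingestion function in Python follows, assuming the v1 Azure Functions programming model with an IoT Hub (Event Hub-compatible) trigger and a Cosmos DB output binding named positionDoc declared in function.json; the telemetry field names are illustrative, not the drone delivery reference implementation.

import json

import azure.functions as func


def main(event: func.EventHubEvent, positionDoc: func.Out[func.Document]) -> None:
    # Decode one drone position event forwarded by IoT Hub.
    telemetry = json.loads(event.get_body().decode("utf-8"))

    # Transform the raw position into a GeoJSON Point so Cosmos DB's geospatial
    # index can answer "within distance" and "within polygon" queries efficiently.
    document = {
        "id": f'{telemetry["deviceId"]}-{telemetry["timestamp"]}',
        "deviceId": telemetry["deviceId"],
        "location": {
            "type": "Point",
            "coordinates": [telemetry["longitude"], telemetry["latitude"]],
        },
    }

    # The Cosmos DB output binding writes the document when the function completes.
    positionDoc.set(func.Document.from_dict(document))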

Monitoring data for Cosmos DB can show 429 error codes in responses. Cosmos DB returns this error when it is temporarily throttling requests, usually because the caller is consuming more request units than provisioned. (Attempting to create an item that already exists returns a 409 Conflict rather than a 429.)

When the 429 error code is accompanied by a wait of about 600 ms before the operation is retried, it points to waits without any corresponding activity. A chart of request unit consumption per partition versus provisioned request units per partition helps identify the original cause of the 429 error that preceded the wait; it may show that request unit consumption exceeded the provisioned request units.
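
A minimal sketch of honoring that throttling signal with the azure-cosmos Python SDK follows; the endpoint, key, database, and container names are placeholders, and the SDK already performs its own 429 retries, so the explicit loop is only to make the behavior visible.

import time

from azure.cosmos import CosmosClient, exceptions

client = CosmosClient("<account-endpoint>", credential="<account-key>")
container = client.get_database_client("dronedb").get_container_client("positions")


def upsert_with_backoff(item, attempts=5):
    for _ in range(attempts):
        try:
            return container.upsert_item(item)
        except exceptions.CosmosHttpResponseError as err:
            if err.status_code != 429:
                raise
            # Honor the server-suggested wait (in milliseconds) before retrying.
            wait_ms = int(err.response.headers.get("x-ms-retry-after-ms", 600))
            time.sleep(wait_ms / 1000.0)
    raise RuntimeError("request kept being throttled after retries")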

Another likely cause of Cosmos DB errors is incorrect usage of partition keys. Cross-partition queries result when queries do not include a partition key, and they are quite inefficient; they can even lead to high latency when multiple database partitions are queried serially. On the write side, hot partitions can result when the chosen partition key concentrates writes onto a few values. A partition heat map can assist here because it shows the headroom between provisioned and consumed request units per partition.
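
The difference shows up directly in how a query is issued with the Python SDK; the sketch below assumes a container partitioned on /deviceId, with placeholder names throughout.

from azure.cosmos import CosmosClient

client = CosmosClient("<account-endpoint>", credential="<account-key>")
container = client.get_database_client("dronedb").get_container_client("positions")

# Efficient: supplying the partition key routes the query to a single partition.
single_partition = container.query_items(
    query="SELECT * FROM c WHERE c.deviceId = @id",
    parameters=[{"name": "@id", "value": "drone-42"}],
    partition_key="drone-42",
)

# Inefficient fallback: without a partition key every partition must be scanned.
cross_partition = container.query_items(
    query="SELECT * FROM c WHERE c.status = 'active'",
    enable_cross_partition_query=True,
)

print(len(list(single_partition)), len(list(cross_partition)))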

Cosmos DB provides optimistic concurrency control, so it is important to include a version string with update operations. There is a system-defined _etag property that is automatically generated and updated by the server every time an item is updated. The _etag can be used with the client-supplied if-match request header to let the server decide whether an item can be conditionally updated. Because this value changes on every update, it also serves as the signal for the application to re-read the item, reapply its updates, and retry the original client request.
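
A hedged sketch of such a conditional update with the Python SDK follows; the item id, partition key, and container are placeholders.

from azure.core import MatchConditions
from azure.cosmos import CosmosClient, exceptions

client = CosmosClient("<account-endpoint>", credential="<account-key>")
container = client.get_database_client("dronedb").get_container_client("positions")

item = container.read_item(item="drone-42-latest", partition_key="drone-42")
item["status"] = "delivered"

try:
    container.replace_item(
        item=item["id"],
        body=item,
        etag=item["_etag"],                        # version this client last read
        match_condition=MatchConditions.IfNotModified,
    )
except exceptions.CosmosAccessConditionFailedError:
    # Another writer updated the item first: re-read, reapply the change, retry.
    pass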

Saturday, December 11, 2021

Creating Git Pull Requests (PR):

 


Introduction: This article focuses on some of the advanced techniques used with git pull requests, which are required for reviewing code changes made to a team's source code. The purpose of a pull request is to let reviewers see the differences between the current and the proposed code on a file-by-file and line-by-line basis. Pull requests are so named because they are opened between two different branches: one branch is pulled from another, usually the master branch, and when the request is completed it is merged back into master. When a feature is written, it is checked into the feature branch as the source and merged into the master branch as the target. Throughout this discussion, we will refer to the master and the feature branches.

Technique #1: Merge options

Code is checked into a branch in the form of commits. Each commit represents a point in time, and a sequence of commits forms the linear history of that branch. Code changes overlay one on top of the other. When there is a conflict between the current and the proposed code, the conflicting hunks are marked as the current side (HEAD) or the incoming side, and only one of them is accepted. ‘Rebase’ and ‘merge’ are two techniques by which changes made in master can be pulled into the feature branch. With a merge, the new changes from master arrive through a new merge commit; with a rebase, the feature branch's commits are replayed on top of those changes. Rebase keeps the individual commits and a linear history, while merge creates a new merge commit.

There are four ways to merge the code changes from the feature to the master branch. These include:

Merge (no fast forward) – creates a non-linear history, preserving all commits from both branches.

Squash commit – creates a linear history with only a single commit on the target.

Rebase and fast forward – rebases the source commits onto the target and fast-forwards.

Semi-linear merge – rebases the source commits onto the target and creates a two-parent merge commit.

Prefer the squash commit when merging to master, because the entire feature can then be rolled back as a single commit if the need arises.

Technique #2 Interactive rebase

This allows us to manipulate multiple commits so that the history is modified to reflect only certain commits. When commits are rebased interactively, we can pick the ones we want to keep and squash the ones we want to fold, so that the timeline shows only the history required. A clean history is readable, reflects the order of the commits, and is useful for narrowing down the root cause of bugs, creating a change log, and automatically extracting release notes.

Technique #3: No history

Creating a pull request without history, by creating another branch, enables easier review. If a feature branch has a lot of commits that are hard to rebase, then there is an option to create a PR without history. This is done in two stages:

First, a new branch, say feature_branch_no_history, is selected as the target for the feature_branch merge, and all the code changes are merged into it with the “squash commit” option.

Second, a new PR is created that merges feature_branch_no_history into master.

The steps to completely clear history would be:

# Remove the history from the local repository

rm -rf .git

 

# Recreate the repo from the current content only

git init

git add .

git commit -m "Initial commit"

 

# Push to the GitHub remote repo, ensuring you overwrite its history

git remote add origin git@github.com:<YOUR ACCOUNT>/<YOUR REPOS>.git

git push -u --force origin master

 

A safer approach might be:

git init

git add .

git commit -m 'Initial commit'

git remote add origin [repo_address]

git push --mirror --force

 

Conclusion: Exercising caution with git pull requests and history helps establish a cleaner, more readable, and actionable code review and merge practice.

 

 

Friday, December 10, 2021

 

Azure Blueprint usages 

As a public cloud, Azure provides uniform templates to manage resource provisioning across several services. Azure offers a control plane for all resources that can be deployed to the cloud, and services take advantage of it both for themselves and for their customers. While Azure Functions allow extensions via new resources, Azure Resource Providers and ARM APIs provide extensions via existing resources. This eliminates the need to introduce new processes around new resources and is a significant win for reusability and user convenience. New and existing resources are not the only way to write extensions; there are other options, such as publishing to the Azure Store or using other control planes such as container orchestration frameworks and third-party platforms. This article focuses on Azure Blueprints.

Azure Blueprints can be leveraged to let an engineer or architect sketch a project's design parameters and define a repeatable set of resources that implements and adheres to an organization's standards, patterns, and requirements. It is a declarative way to orchestrate the deployment of various resource templates and other artifacts such as role assignments, policy assignments, ARM templates, and resource groups. Blueprint objects are stored in Cosmos DB and replicated to multiple Azure regions. Since it is designed to set up the environment, it is different from resource provisioning. The package fits nicely into a CI/CD pipeline and handles both what should be deployed and the assignment of what was deployed.

Azure Blueprints differ from ARM templates in that the former helps with environment setup while the latter helps with resource provisioning. A blueprint is a package that comprises artifacts declaring resource groups, policies, role assignments, and ARM template deployments. It can be composed, versioned, and included in continuous integration and continuous delivery pipelines. The components of the package can be assigned to a subscription in a single operation, audited, and tracked. Although the components can be registered individually, the blueprint maintains the relationship to the template as an active connection.

There are two categories within a blueprint: definitions for deployment, which describe what should be deployed, and definitions for assignments, which describe what was deployed. A previous effort to author ARM templates becomes reusable in Azure Blueprints. In this way, a blueprint becomes bigger than just the templates and allows reusing an existing process to manage new resources.

A blueprint focuses on standards, patterns, and requirements. The design can be reused to maintain consistency and compliance. It differs from an Azure Policy in that it supports parameters with policies and initiatives. A policy is a self-contained manifest that governs resource properties during deployment and for already existing resources. With a policy included, resources within a subscription adhere to the requirements and standards. When a blueprint combines resource templates and Azure Policy along with parameters, it becomes a holistic tool for cloud governance.

Thursday, December 9, 2021

Designing a microservices architecture for a service on the public cloud

Microservices are great for allowing the domain to drive the development of a cloud service. The approach fits right into doing “one thing” for the company and comes with a well-defined boundary for that service. Since a microservice fulfills business capabilities, it does not focus on horizontal layers as much as on end-to-end vertical integration. It is cohesive and loosely coupled with other services. Domain-Driven Design provides a framework to build the services and comes in two stages: strategic and tactical. The steps to designing with this framework are: 1. analyzing the domain, 2. defining bounded contexts, 3. defining entities, aggregates, and services, and 4. identifying microservices.

The benefits of this architecture include: it is simple and focuses on end-to-end addition of business capabilities; the services are easy to deploy and manage; there is a clear separation of concerns; the front end is decoupled from the worker using asynchronous messaging; and the front end and the worker can be scaled independently.

Challenges with this architecture include: care must be taken to ensure that the front end and the worker do not become large, monolithic components that are difficult to maintain and update, and sharing data schemas or code modules between the front end and the worker can hide unnecessary dependencies.

Some example uses of microservices include: expanding the backend service portfolio, such as for eCommerce; transactional processing with deep separation of data access; and working with an application gateway, load balancer, and ingress.

Few things to consider when deploying these services include the following:

1.       Availability – event sourcing allows system components to be loosely coupled and deployed independently of one another, and many of the Azure resources are built for availability.

2.       Scalability – Cosmos DB and Service Bus provide fast, predictable performance and scale seamlessly as the application grows. The event-sourcing, microservices-based architecture can also make use of Azure Functions and Azure Container Instances to scale horizontally.

3.       Security – features are available from all Azure resources, and it is also possible to include Azure Monitor and Azure Sentinel.

4.       Resiliency – fault domains and update domains are already handled by the Azure resources, so resiliency comes with the use of these resources, and the architecture can improve the overall order-processing system.

5.       Cost – Azure Advisor provides effective cost estimates and improvements.

These are only a few of the considerations. Some others follow from the choice of technologies and their support in Azure.