Tuesday, February 15, 2022

 

Azure Well-Architected Framework

This is a continuation of a series of articles on Azure services from an operational engineering perspective, with the most recent introduction to this topic linked here. The previous article discussed the Microsoft Graph Data Connect used with Microsoft Graph. This article discusses cloud data governance and the Azure Well-Architected Framework for data workloads.

The Cloud Adoption Framework helps to create an overall cloud adoption plan that guides programs and teams in their digital transformation. The Plan methodology provides templates to create backlogs and plans to build the necessary skills across the teams. It helps rationalize the data estate, prioritize the technical efforts, and identify the data workloads. It's important to adhere to a set of architectural principles that help guide development and optimization of the workloads. The Azure Well-Architected Framework lays down five pillars of architectural excellence:

- Reliability

- Security

- Cost Optimization

- Operational Excellence

- Performance Efficiency

The elements that support these pillars are the Azure Well-Architected Review, Azure Advisor, documentation, partners, support, and services offers, reference architectures, and design principles.

This guidance provides a summary of how these principles apply to the management of the data workloads.

Cost optimization is one of the primary benefits of using the right tool for the right solution. It helps to analyze the spend over time as well as the effects of scale-out and scale-up. Azure Advisor can help improve reusability, enable on-demand scaling, and reduce data duplication, among other benefits.

Performance is usually based on external factors and is very close to customer satisfaction. Continuous telemetry and reactiveness are essential to tuning performance. The shared environment controls for management and monitoring create alerts, dashboards, and notifications specific to the performance of the workload. Performance considerations include storage and compute abstractions, dynamic scaling, partitioning, storage pruning, enhanced drivers, and multilayer cache.
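To illustrate the multilayer cache consideration, here is a minimal read-through cache sketch in C#, assuming the Microsoft.Extensions.Caching.Memory package and a caller-supplied loadFromStore delegate standing in for the slower backing store:

    using System;
    using Microsoft.Extensions.Caching.Memory;

    public class ReadThroughCache
    {
        private readonly IMemoryCache _cache = new MemoryCache(new MemoryCacheOptions());

        public string GetValue(string key, Func<string, string> loadFromStore)
        {
            // Serve from the fast in-memory layer when possible;
            // fall back to the slower store and populate the cache on a miss.
            return _cache.GetOrCreate(key, entry =>
            {
                entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5);
                return loadFromStore(key);
            });
        }
    }

The same shape extends to more layers, with each miss falling through to the next slower tier.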

Operational excellence comes with security and reliability. Security and data management must be built into the system in layers for every application and workload. The data management and analytics scenario focuses on establishing a foundation for security. Although workload-specific solutions might be required, the foundation for security is built with the Azure landing zones and managed independently from the workload. Confidentiality and integrity of data, including privilege management, data privacy, and appropriate controls, must be ensured. Network isolation and end-to-end encryption must be implemented. SSO, MFA, conditional access, and managed service identities help secure authentication. Separation of concerns between the Azure control plane and data plane, as well as role-based access control (RBAC), must be used.
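As an example of the managed service identities mentioned above, the following minimal sketch, assuming a hypothetical Key Vault named myvault, shows how a workload can authenticate without embedding any credentials, using the Azure.Identity and Azure.Security.KeyVault.Secrets libraries:

    using System;
    using Azure.Identity;
    using Azure.Security.KeyVault.Secrets;

    public class SecretReader
    {
        public static string ReadConnectionString()
        {
            // DefaultAzureCredential picks up the managed identity when running in Azure,
            // so no secret or password is embedded in the application itself.
            var client = new SecretClient(
                new Uri("https://myvault.vault.azure.net/"),   // hypothetical vault
                new DefaultAzureCredential());
            return client.GetSecret("storage-connection-string").Value.Value;
        }
    }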

The key considerations for reliability are how to detect change and how quickly the operations can be resumed. The existing environment should also include auditing, monitoring, alerting and a notification framework.

In addition to all the above, some consideration may be given to improving individual service level agreements, redundancy of workload specific architecture, and processes for monitoring and notification beyond what is provided by the cloud operations teams.

Monday, February 14, 2022

Continuous Encoder

BERT is an algorithm for natural language processing that interprets search queries much as humans do, because it tries to understand the context of the words that constitute the query, so results match better than without it. It was proposed by Google and stands for Bidirectional Encoder Representations from Transformers. To understand BERT, we must first understand the meaning of the terms Encoder and Bidirectional. These terms come from machine learning neural network techniques, where encoding and decoding refer to states between words in a sequence. A short introduction to neural networks is that they comprise layers of neurons that calculate weighted probabilities for the inputs, in this case words, against a chosen set of other inputs, also called features. Each feature gets a set of weights as probabilities in terms of how likely it is to appear together with other words chosen as features. A bag of words from the text is run through the neural network and gets transformed into a set of outputs that resemble some form of word associations with other words; in this process, it computes the weighted matrix of words with their features, which are called embeddings. These embeddings are immensely useful because they represent words and their context in terms of the features that frequently co-occur with these words, bringing out the latent meanings of the words. With this additional information from their embeddings, it is possible to find how similar two words are, or what topics the keywords represent, especially when a word may have multiple meanings.
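To make the similarity computation concrete, a common measure over embeddings is cosine similarity. The sketch below is self-contained C# with toy three-dimensional vectors standing in for real embeddings:

    using System;

    public static class EmbeddingSimilarity
    {
        // Cosine similarity = dot(a, b) / (|a| * |b|); values near 1.0 mean similar context.
        public static double Cosine(double[] a, double[] b)
        {
            double dot = 0, normA = 0, normB = 0;
            for (int i = 0; i < a.Length; i++)
            {
                dot += a[i] * b[i];
                normA += a[i] * a[i];
                normB += b[i] * b[i];
            }
            return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
        }

        public static void Main()
        {
            // Toy embeddings; real ones would come from a trained model such as BERT.
            var river = new double[] { 0.9, 0.1, 0.3 };
            var bank  = new double[] { 0.8, 0.2, 0.4 };
            Console.WriteLine(Cosine(river, bank));   // high value: related senses
        }
    }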

In the above example, the transformation was forward only, with associations from the left to the right context for a layer, but the calculations performed in one layer can jointly utilize the learnings from both sides. This is called bidirectional transformation, and since a neural network can have multiple layers with the output of one layer serving as input to another layer, this algorithm can perform the bidirectional transformations for all layers. When the input is not just words but a set of words such as from a sentence, it is called a sequence. Search terms form a sequence. BERT can unambiguously represent a sentence or a pair of sentences in the question/answer form. The state between the constituents of a sequence is encoded in some form that helps to interpret the sequence or to generate a response sequence with the help of decodings. This relationship that is captured between an input and output sequence in the form of encodings and decodings helps to enhance the language modeling and improve the search results.

Natural language processing relies on encoding-decoding to capture and replay state from text. This state is discrete and changes from one set of tokenized input texts to another. As the text is transformed into vectors of predefined feature length, it becomes available to undergo regression and classification. The state representation remains immutable and is decoded to generate new text. Instead, if the encoded state could be accumulated across the subsequent text, it is likely to bring out the topic of the text, provided the state accumulation is progressive. A progress indicator could be the mutual information value of the resulting state. If there is information gain, the state can continue to aggregate, and this can be stored in memory. Otherwise, the pairing state can be discarded. This results in a final state aggregation that continues to be more inclusive of the topic in the text.
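One way to picture this aggregation loop is the sketch below; encodeSegment, informationGain, and combine are hypothetical delegates standing in for an encoder, a mutual-information estimator, and a state merger, since no specific implementation is prescribed here:

    using System;
    using System.Collections.Generic;

    public static class StateAggregator
    {
        // Aggregate encoder states across text segments, keeping a pairing only
        // when it adds information, so the final state converges on the topic.
        public static double[] Aggregate(
            IEnumerable<string> segments,
            Func<string, double[]> encodeSegment,              // hypothetical encoder
            Func<double[], double[], double> informationGain,  // hypothetical MI estimator
            Func<double[], double[], double[]> combine)        // hypothetical state merger
        {
            double[] state = null;
            foreach (var segment in segments)
            {
                var encoded = encodeSegment(segment);
                if (state == null)
                {
                    state = encoded;
                }
                else if (informationGain(state, encoded) > 0)
                {
                    state = combine(state, encoded);  // keep: the pairing adds information
                }
                // otherwise discard the pairing state, as described above
            }
            return state;
        }
    }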

State aggregation is independent of BERT but not removed from it. It is optional and useful towards topic detection. It can also improve the precision and relevance of the text generated in response by ensuring that the F-score remains high when compared to the aggregated state. Without the aggregated state, the scores for the responses were harder to evaluate.

Sunday, February 13, 2022

 Standard enterprise governance guide and multi-cloud adoption

Cloud governance is a journey, not a destination. Cloud governance creates guardrails that keep the company on a safe path throughout the journey of adopting the cloud; along the way there are clear milestones and tangible business benefits. Processes must be put in place to ensure adherence to the stated policies. There are five disciplines of cloud governance which support these corporate policies. Each discipline protects the company from potential pitfalls. These include the Cost Management discipline, Security Baseline discipline, Resource Consistency discipline, Identity Baseline discipline, and Deployment Acceleration discipline.

The actionable governance guide is an incremental approach to the Cloud Adoption Framework governance model. It can be established with an agile approach to cloud governance that will grow to meet the needs of any scenario.

This governance guide serves as a foundation for an organization to quickly and consistently add governance guardrails across their subscriptions. Initially, an organization hierarchy may be created to empower the cloud adoption teams. It will consist of one management group for each type of environment; two subscriptions, one for production workloads and another for non-production workloads; consistent nomenclature applied at each level of this grouping hierarchy; resource groups deployed in a manner that considers their contents' lifecycle; and region selection such that networking, monitoring, and auditing can be in place. These patterns provide room for growth without complicating the hierarchy.

 A set of global policies and RBAC roles will provide a baseline level of governance enforcement. Identifying the policy definitions, creating a blueprint definition, and applying policies and configurations globally are required to meet the policy requirements.

Controls can be added for multi-cloud adoption when customers adopt multiple clouds for specific purposes. All of the IT operations can be run on a different cloud provider.   

In a multi-cloud deployment, identity could be specific to a cloud, or it could be hybrid, facilitated through replication to, say, Azure Active Directory from an on-premises instance of Active Directory. Each cloud may also have its own identity provider, membership directory, and authentication and authorization models. Its operations can be managed by monitoring and related automated processes. Disaster recovery and business continuity can be controlled by recovery services and their vaults. Monitoring security violations and attacks, as well as enforcing governance of the cloud, can be done with the same service. All of the above are used to automate compliance with policy.

The changes required to monitor new corporate policy statements include the following: connecting the networks, consolidating identity providers, adding assets to the recovery services, adding assets for cost management and billing, adding assets to the monitoring services and adopting governance enforcement tools.

Saturday, February 12, 2022

 

Microsoft Graph 

This is a continuation of a series of articles on Azure services from an operational engineering perspective, with the most recent introduction to this topic linked here. The previous article discussed the Microsoft Graph Data Connect used with Microsoft Graph. This article discusses the API. Microsoft Graph enables integration with the best of Microsoft 365, Windows 10, and Enterprise Mobility and Security services in Microsoft 365, using REST APIs and client libraries.

Microsoft Graph provides a unified programmability model by consolidating multiple APIs into one. As Microsoft's cloud services have evolved, the APIs to reference them have also changed. Originally, when cloud services like Exchange Online, SharePoint, OneDrive, and others evolved, APIs to access those services were launched too. The list of SDKs and REST APIs for these services kept growing for developers to access content. Each endpoint also required its own access tokens and returned status codes that were unique to each individual service. Microsoft Graph brought a consistent, simplified way to interact with these services.
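A small illustration of this consolidation: with one credential and one client, the same endpoint reaches mail, files, users, and more. This sketch assumes the Microsoft Graph .NET SDK with an Azure.Identity device-code sign-in; the client and tenant ids are placeholders from an app registration:

    using System;
    using System.Threading.Tasks;
    using Azure.Identity;
    using Microsoft.Graph;

    public class GraphQuickStart
    {
        public static async Task ShowProfileAsync(string clientId, string tenantId)
        {
            // One token and one client cover what used to need service-specific SDKs.
            var credential = new DeviceCodeCredential(
                (code, cancel) => { Console.WriteLine(code.Message); return Task.CompletedTask; },
                tenantId, clientId);
            var graphClient = new GraphServiceClient(credential, new[] { "User.Read" });
            var me = await graphClient.Me.Request().GetAsync();
            Console.WriteLine($"{me.DisplayName} ({me.UserPrincipalName})");
        }
    }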

The data virtualization platform that Microsoft Graph presents also supports querying relationships between:

·        Azure Active Directory

·        Exchange Online – including mail, calendar, and contacts

·        SharePoint Online – including file storage

·        OneDrive

·        OneDrive for Business

·        OneNote, and

·        Planner

As a collaborative app development platform, Microsoft Graph is not alone. Microsoft Teams, Slack, and Google Workspace are applications with collaboration as their essence, designed for the flexibility of hybrid work. For example, the Teams Toolkit for Visual Studio Code lets us use existing web development frameworks to build cross-platform Teams applications against any backend. Microsoft Graph provides both the seamlessness and the data for real-time collaboration.

Connectors and Microsoft Graph Data Connect round out the data transfer mechanisms. Connectors offer a simple and intuitive way to bring content from external services to Microsoft Graph, which enables external data to power Microsoft 365 experiences. They do this with the help of REST APIs that are used to 1. create and manage external data connections, 2. define and register the schema of the external data type(s), 3. ingest external data items into Microsoft Graph, and 4. sync external groups. Microsoft Graph Data Connect augments Microsoft Graph's transactional model with an intelligent way to access rich data at scale. It is ideal for connecting big data and machine learning. It uses Azure Data Factory to copy Microsoft 365 data to the application's storage at configurable intervals. It provides a set of tools to streamline the delivery of this data into Microsoft Azure. It allows us to manage the data, see who is accessing it, and request specific properties of an entity. It enhances the Microsoft Graph model, which grants or denies applications access to entire entities.

 

Sample code for enriching user information:

        using System.Security.Claims;
        using Microsoft.Graph;

        public static class GraphClaimsPrincipalExtensions
        {
            public static void AddUserGraphInfo(this ClaimsPrincipal claimsPrincipal, User user)
            {
                var identity = claimsPrincipal.Identity as ClaimsIdentity;
                if (identity == null) return;   // nothing to enrich

                // Copy selected Microsoft Graph user properties into claims so that
                // downstream code can read them without another Graph round trip.
                identity.AddClaim(
                    new Claim(GraphClaimTypes.DisplayName, user.DisplayName));
                identity.AddClaim(
                    new Claim(GraphClaimTypes.Email,
                        claimsPrincipal.IsPersonalAccount() ? user.UserPrincipalName : user.Mail));
                identity.AddClaim(
                    new Claim(GraphClaimTypes.TimeZone,
                        user.MailboxSettings.TimeZone ?? "UTC"));
                // Fall back to sensible defaults when mailbox settings are empty.
                identity.AddClaim(
                    new Claim(GraphClaimTypes.TimeFormat, user.MailboxSettings.TimeFormat ?? "h:mm tt"));
                identity.AddClaim(
                    new Claim(GraphClaimTypes.DateFormat, user.MailboxSettings.DateFormat ?? "M/d/yyyy"));
            }
        }

 

   Sample delta query for mail folders

        public async Task<IMailFolderDeltaCollectionPage> GetIncrementalChangeInMailFolders()
        {
            // The delta function returns only the folders that changed since the
            // last round trip; the final page carries a deltaLink to resume from.
            IMailFolderDeltaCollectionPage deltaCollection = await _graphClient.Me.MailFolders
                .Delta()
                .Request()
                .GetAsync();
            return deltaCollection;
        }

Friday, February 11, 2022

 

Microsoft Graph 

This is a continuation of a series of articles on Azure services from an operational engineering perspective, with the most recent introduction to this topic linked here. The previous article discussed the connectors used with Microsoft Graph. This article introduces the Microsoft Graph Data Connect. Microsoft Graph enables integration with the best of Microsoft 365, Windows 10, and Enterprise Mobility and Security services in Microsoft 365, using REST APIs and client libraries.

Microsoft Graph provides a unified programmability model and is similar in its utility to a Kusto cluster and database. The Microsoft Graph model allows Microsoft Graph connectors to access data from different data sources and provides a common way to query the data. It is the gateway to data and intelligence in Microsoft 365. It can also act as a source for downstream Azure data stores that require data to be delivered. The Microsoft Graph Data Connect provides a set of tools to streamline secure and scalable delivery of Microsoft Graph data.

The emphasis is on the heterogeneity of data in the form of files, messages, meetings, users, people, devices, groups, calendars, coworkers, insights, chats, teams, and tasks. The unified programming access it provides can reach data in and across Microsoft services, including the cloud, the hybrid cloud, and third-party clouds. A thin aggregation layer is used to route incoming requests to their corresponding destination services. This pattern of data virtualization is not uncommon, but the collection of data and the unified model provide an incredible opportunity for developers.

Microsoft Graph Data Connect augments Microsoft Graph's transactional model with an intelligent way to access rich data at scale. It is ideal for connecting big data and machine learning.

It allows us to develop applications for analytics, intelligence, and business process optimization by extending Microsoft 365 data into Azure. It uses Azure Data Factory to copy Microsoft 365 data to the application's storage at configurable intervals. It provides a set of tools to streamline the delivery of this data into Microsoft Azure. It allows us to manage the data, see who is accessing it, and request specific properties of an entity. It enhances the Microsoft Graph model, which grants or denies applications access to entire entities. The granular data consent model allows applications to access only specific properties in an entity and opens new use cases on the same data without compromising security and isolation. It supports all Azure native capabilities such as encryption, geo-fencing, auditing, and policy enforcement.

 

 

Thursday, February 10, 2022

 

Microsoft Graph

This is a continuation of a series of articles on Azure services from an operational engineering perspective, with the most recent introduction to this topic linked here. This article continues to elaborate on the connectors used with the Microsoft Graph. Microsoft Graph enables integration with the best of Microsoft 365, Windows 10, and Enterprise Mobility and Security services in Microsoft 365, using REST APIs and client libraries.

It uses the concepts of users and groups to elaborate on these functionalities. A user is an individual who uses Microsoft 365 cloud services and, for Microsoft Graph, it is the focus for which the identity is protected and access is well managed. The data associated with this entity, and the opportunities to enrich the context, provide real-time information, and offer deep insights, are what make Microsoft Graph so popular. A group is the fundamental entity that lets users collaborate and integrate with other services, which enables scenarios for task planning, teamwork, education, and more.

Since Microsoft Graph is the data fabric that empowers intelligent experiences, it needs mechanisms to bring content from external services to Microsoft Graph which enables external data to power Microsoft 365 experiences.

Connectors offer a simple and intuitive way to do just that. For example, the data brought in from the organization can appear in Microsoft Search results. This expands the type of content sources that are searchable in Microsoft 365 productivity applications and the broader ecosystem.

There are over a hundred connectors currently available from Microsoft and partners, which include Azure services, Box, ServiceNow, Salesforce, Google services, MediaWiki, and more. An example of writing a custom connector will illustrate the details of how it works.

There is a set of connector REST APIs available from Microsoft Graph. These are used to 1. create and manage external data connections, 2. define and register the schema of the external data type(s), 3. ingest external data items into Microsoft Graph, and 4. sync external groups.

A connection is a logical unit for the external data that can be managed as a single unit. The Connection API provides the connection resource and can be used to create, update, and delete connections in Microsoft Graph. The connection schema determines how the content will be used in various Microsoft 365 experiences. The schema is a flat list of all the properties that can be added to the connection, along with the attributes, labels, and aliases. The schema must be registered before ingesting items into Microsoft Graph. Items that can be added to the Microsoft Search service are represented by the externalItem resource in Microsoft Graph. Items in the external service can be granted or denied access via ACLs for different types of non-Azure Active Directory groups. When the items are ingested into Microsoft Graph, they must honor these ACLs. The External Groups API sets permissions on external items ingested into Microsoft Graph. The connector must be registered as an application in the Azure AD admin center.
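As a sketch of the first step, creating a connection is a POST to the /external/connections endpoint. The connection id, name, and access token below are illustrative placeholders, and the payload is the minimal shape rather than a full definition:

    using System.Net.Http;
    using System.Net.Http.Headers;
    using System.Text;
    using System.Threading.Tasks;

    public class ConnectorSetup
    {
        public static async Task CreateConnectionAsync(string accessToken)
        {
            using var http = new HttpClient();
            http.DefaultRequestHeaders.Authorization =
                new AuthenticationHeaderValue("Bearer", accessToken);

            // Minimal connection payload; id and name are illustrative placeholders.
            var payload = @"{
                ""id"": ""sampletickets"",
                ""name"": ""Sample tickets"",
                ""description"": ""Tickets from an external tracking system""
            }";

            var response = await http.PostAsync(
                "https://graph.microsoft.com/v1.0/external/connections",
                new StringContent(payload, Encoding.UTF8, "application/json"));
            response.EnsureSuccessStatusCode();
        }
    }

Schema registration, item ingestion, and external group sync follow the same pattern against the connection's child resources.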

 

 

 

Wednesday, February 9, 2022

 

Microsoft Graph

This is a continuation of a series of articles on Azure services from an operational engineering perspective, with the most recent introduction to this topic linked here. This article continues to elaborate on the best practices in working with the Microsoft Graph. Microsoft Graph enables integration with the best of Microsoft 365, Windows 10, and Enterprise Mobility and Security services in Microsoft 365, using REST APIs and client libraries. It uses the concepts of users and groups to elaborate on these functionalities. A user is an individual who uses Microsoft 365 cloud services and, for Microsoft Graph, it is the focus for which the identity is protected and access is well managed. The data associated with this entity, and the opportunities to enrich the context, provide real-time information, and offer deep insights, are what make Microsoft Graph so popular. A group is the fundamental entity that lets users collaborate and integrate with other services, which enables scenarios for task planning, teamwork, education, and more.

The Graph Explorer helps to learn the API and is the easiest way to start experimenting with the data available. Proper REST requests can be made, and the responses are representative of those encountered programmatically, which eliminates surprises and errors during implementation. Authentication for Microsoft Graph is made easier using the Microsoft Authentication Library (MSAL), which acquires an access token.
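A minimal sketch of that token acquisition with MSAL.NET follows; the client id, redirect URI, and scope are placeholders for an app registered in Azure AD:

    using System.Threading.Tasks;
    using Microsoft.Identity.Client;

    public class MsalSample
    {
        public static async Task<string> GetTokenAsync()
        {
            // Placeholder client id from the app registration in Azure AD.
            var app = PublicClientApplicationBuilder
                .Create("00000000-0000-0000-0000-000000000000")
                .WithRedirectUri("http://localhost")
                .Build();

            // Interactive sign-in; the returned token authorizes Graph calls.
            var result = await app
                .AcquireTokenInteractive(new[] { "User.Read" })
                .ExecuteAsync();
            return result.AccessToken;
        }
    }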

The best practices include the following:

·        Using least privilege so that the APIs are called only with the permissions that are necessary.

·        Using the correct permission type based on the scenario which is particularly important for delegated permissions. If the code runs without a signed-in user, it can lead to vulnerability.

·        Configuring the application properly for end-user and administrator experiences.

·        Using multi-tenant applications, because customers have various application and consent controls in different states.

·        Using pagination when the responses to the requests made to Microsoft Graph are large and the results must be browsed efficiently (see the paging sketch after this list).

·        Handling expected errors for robustness and user-convenience. Certain errors are retriable while others need to be translated to the user.

·        Adding members to existing enumerations can break applications. Evolvable enumerations provide a better alternative. They have a common sentinel member called unknownFutureValue, which demarcates the known members that were defined in the enum initially from the unknown members that are added subsequently or will be defined in the future. Members of evolvable enums can be referenced by their string values.

·        Making calls to Microsoft Graph for real-time data and storing data locally only when required and the terms of use and privacy policy can be upheld.

·        Getting only the minimum amount of data for improving performance, security and privacy.

·        Choosing only the properties that the application needs and no more.

·        Using webhooks to get push notifications when data changes.

·        Using delta query to efficiently keep data up to date.

·        Using webhooks and delta query together, because if only one is used, the right polling interval must be chosen. Webhook notifications serve as the trigger to make delta query calls.

·        Batching, which enables optimization of the application by combining multiple requests into a single JSON object (see the batching sketch after this list).

·        Combining individual requests into a single batch can save significant network latency and conserve connection requests.

·        Using and honoring TLS to support all capabilities of Microsoft Graph.

·        Opening connections to all advertised DNS answers and generating a unique GUID to send in the HTTP request headers and for logging.
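The paging sketch referenced in the list: the Microsoft Graph .NET SDK provides a PageIterator that follows @odata.nextLink pages automatically. This assumes an already-initialized GraphServiceClient named _graphClient, in the style of the earlier samples:

        public async Task PrintAllSubjectsAsync()
        {
            // First page of messages; subsequent pages follow @odata.nextLink.
            var firstPage = await _graphClient.Me.Messages
                .Request()
                .Top(50)                   // page size
                .GetAsync();

            var iterator = PageIterator<Message>.CreatePageIterator(
                _graphClient,
                firstPage,
                message =>
                {
                    Console.WriteLine(message.Subject);
                    return true;           // true = keep iterating through pages
                });

            await iterator.IterateAsync();
        }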
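And the batching sketch referenced in the list, again assuming an initialized _graphClient; two reads are combined into a single HTTP round trip:

        public async Task ShowBatchAsync()
        {
            // Each step is an ordinary request builder; the returned ids
            // identify the matching responses later.
            var batch = new BatchRequestContent();
            var meId = batch.AddBatchRequestStep(_graphClient.Me.Request());
            var driveId = batch.AddBatchRequestStep(_graphClient.Me.Drive.Request());

            var response = await _graphClient.Batch.Request().PostAsync(batch);

            // Unpack each response by the id captured when the step was added.
            var me = await response.GetResponseByIdAsync<User>(meId);
            var drive = await response.GetResponseByIdAsync<Drive>(driveId);
            Console.WriteLine($"{me.DisplayName} has used {drive.Quota.Used} bytes.");
        }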

These are some of the considerations toward best practices in working with Microsoft Graph.