Friday, February 4, 2022

Microsoft Graph

This is a continuation of a series of articles on operational engineering aspects of Azure public cloud computing that included the most recent discussion on cloud protection. This article describes Microsoft Graph.

 

Microsoft Graph provides a unified programmability model, similar in utility to a Kusto cluster and database. The Microsoft Graph model allows Microsoft Graph connectors to access data from different data sources and provides a common way to query that data. It is the gateway to data and intelligence in Microsoft 365. It can also act as a source for downstream Azure data stores that require data to be delivered. Microsoft Graph Data Connect provides a set of tools to streamline secure and scalable delivery of Microsoft Graph data.

 

The emphasis is on heterogeneity of data in the form of files, messages, meetings, users, people, devices, groups, calendars, coworkers, insights, chats, teams, and tasks. The unified programming access it provides can reach data in and across Microsoft services, including the cloud, the hybrid cloud, and third-party clouds. A thin aggregation layer is used to route incoming requests to their corresponding destination services. This pattern of data virtualization is not uncommon, but the collection of data and the unified model provide an incredible opportunity for developers.

There are three challenges that Microsoft Graph aims to solve. These are:

1. The inability to write an application agnostic of any given environment – previous endpoints were defined per-user/per-tenant and resulted in applications that had to be rebuilt for each customer.

2. The absence of a unified authentication/authorization model for all Office users, requiring developers to build separate applications to serve different populations.

3. The inconsistency in the format of the data returned, which causes difficulty in correlating data across workloads.

 

When the programmability model is unified, the cycle of API development becomes faster: thinking about the scenarios in which the API will be consumed, designing the API to meet those scenarios, designing and implementing authentication and authorization, publishing the API in beta form, publishing the API documentation, and publishing SDKs and supporting the community. Developers find that they can allow the users of their applications to use a single token to access data across all workloads, use an end-to-end environment with resources such as sample queries, sample data, and tools such as Graph Explorer, and explore REST APIs or SDKs to access the platform endpoints to build their applications on Graph.

 

 

 

Thursday, February 3, 2022

 

Microsoft Graph: 

This is a continuation of a series of articles on operational engineering aspects of Azure public cloud computing that included the most recent discussion on cloud protection. This article describes Microsoft Graph.

 

Microsoft Graph provides a unified programmability model, similar in utility to a Kusto cluster and database. The Microsoft Graph model allows Microsoft Graph connectors to access data from different data sources and provides a common way to query that data. It is the gateway to data and intelligence in Microsoft 365. It can also act as a source for downstream Azure data stores that require data to be delivered. Microsoft Graph Data Connect provides a set of tools to streamline secure and scalable delivery of Microsoft Graph data.

There is a single endpoint, https://graph.microsoft.com, that provides access to rich, people-centric data and insights in the Microsoft cloud. REST APIs and SDKs can be used to access the endpoint, and this powers the applications that support Microsoft 365 scenarios spanning productivity, collaboration, education, people, and workplace intelligence. It includes services that manage user and device identity, access, compliance, and security, and helps protect organizations from data leakage or loss.
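
As a minimal sketch of calling that endpoint (using only the Python standard library, with token acquisition assumed to happen elsewhere, for example via MSAL), a client might prepare an authenticated GET request like this:

```python
import urllib.request

GRAPH_ROOT = "https://graph.microsoft.com"

def graph_url(resource, version="v1.0"):
    """Build the full URL for a Graph resource such as 'me' or 'me/messages'."""
    return f"{GRAPH_ROOT}/{version}/{resource.lstrip('/')}"

def build_graph_request(resource, access_token, version="v1.0"):
    """Prepare an authenticated GET request; actually sending it is left out."""
    return urllib.request.Request(
        graph_url(resource, version),
        headers={"Authorization": f"Bearer {access_token}"},
    )
```

Passing the request to urllib.request.urlopen (or issuing the same URL and header with any HTTP client or SDK) would then return the JSON payload for the resource.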

The Microsoft Graph exposes data from Microsoft 365 services, Enterprise Mobility and Security services, Windows 10 services, and Dynamics 365 Business Central. Microsoft 365 core services include Bookings, Calendar, Delve, Excel, compliance eDiscovery, Search, OneDrive, OneNote, Planner, SharePoint, Teams, To Do, and Workplace Analytics. The Enterprise Mobility and Security services include Advanced Threat Analytics, Advanced Threat Protection, Azure Active Directory, Identity Manager, and Intune. Windows 10 services include activities, devices, notifications, and Universal Print. Dynamics 365 Business Central has its own data ecosystem.

The primary use case for Microsoft Graph is to open the Microsoft 365 platform to developers. Graph Explorer helps query and view this data.

Data Connect and Graph APIs provide access to the same underlying data but in different ways. Data Connect works with bulk data, so extracting and moving large amounts of data is easy. Microsoft Graph APIs are more suitable for accessing discrete sets of data in real time. So if we want all of last year's emails, we would run Data Connect, but we would rely on Graph APIs to get specific emails.
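
For the discrete, real-time case, a Graph request is narrowed with OData query options. A small illustrative helper for building such a query against /me/messages (the $filter/$select/$top parameters are standard Graph query options; the helper itself is just a sketch):

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.microsoft.com/v1.0"

def messages_query(filter_expr=None, select=None, top=None):
    """Build an OData query URL against /me/messages.

    filter_expr: an OData $filter expression, e.g. a receivedDateTime range
    select:      list of properties to return ($select)
    top:         page size ($top)
    """
    params = {}
    if filter_expr:
        params["$filter"] = filter_expr
    if select:
        params["$select"] = ",".join(select)
    if top:
        params["$top"] = str(top)
    url = f"{GRAPH_BASE}/me/messages"
    return f"{url}?{urlencode(params)}" if params else url
```

For example, messages_query(filter_expr="receivedDateTime ge 2022-01-01T00:00:00Z", select=["subject", "from"], top=10) asks for just ten recent messages with two properties, rather than a bulk extract.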

Data Connect involves some setup and overhead before the bulk operations on data. This can be about 45 minutes regardless of the data, and all pipelines will take at least that long. That cost might be negligible for large amounts of data, but using Data Connect for something lightweight is not recommended; the Graph APIs are more suitable for that.

The billing for Graph APIs is on a pay-as-you-go basis, and the billing unit is multiples of 1,000 objects, where one object maps to one individual instance of an entity in Microsoft 365, such as an email, file, or message. There are no charges to use the User, MailboxSettings, Manager, and DirectReport datasets.
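
As a toy estimate of that billing model, assuming billable object counts round up to the next 1,000-object unit and that the four datasets named above are free:

```python
import math

# The four datasets described above as free to extract.
FREE_DATASETS = {"User", "MailboxSettings", "Manager", "DirectReport"}

def billable_units(extracted):
    """Estimate billed units for an extraction.

    `extracted` maps dataset name -> number of objects pulled.
    Paid objects are assumed to round up to the next 1,000-object unit.
    """
    paid = sum(count for name, count in extracted.items()
               if name not in FREE_DATASETS)
    return math.ceil(paid / 1000)
```

So pulling 2,500 messages alongside 900 free User objects would bill three units under this assumption.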

Service principals are required for Microsoft Graph Data Connect, which uses them as an identity for gaining authorized access to Microsoft 365 data. Before Data Connect can copy data, an administrator must approve a Privileged Access Management request. Either all the users in the user list must have a Workplace Analytics license or none of them may; there is no mixed-mode user list for Data Connect users.

Wednesday, February 2, 2022

 

Azure ARM Resource provisioning with secrets:

Introduction: Secrets are passwords, certificates, symmetric keys, managed service identities, and other such closely guarded and sensitive information that must be both generated and renewed for accessing resources and services on Azure. One of the requirements for using a secret store is the automation of these routines. Some secrets must be deployed with their own resources. For example, virtual machines and virtual machine scale sets can be designated with the use of certain secrets. Similarly, storage accounts can be provisioned with the use of secrets that can be requested over encrypted web traffic. These must be deployed as part of the PaaS V2 offering for these scenarios. PaaS V2 denotes managed applications involving infrastructure such as virtual machines, virtual machine scale sets (VMSS), Service Fabric, Elastic-AP, Azure Container Service, etc. Integration of the secret management routines with the Compute Resource Provider is necessary to enable this PaaS integration. A use case for using secrets with compute is when a PaaS deployment requires VMSS to span one or more fabric controllers, with each fabric controller managing a fabric tenant. Secrets are serviced by a service that is available regionally for such use cases.
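
As an illustration of deploying a secret with its resource, a virtual machine's ARM template can reference a certificate from a vault in its osProfile so the platform installs it during provisioning. The vault name, certificate name, and `<version>` placeholder below are hypothetical:

```json
{
  "osProfile": {
    "computerName": "vm-example",
    "adminUsername": "azureuser",
    "secrets": [
      {
        "sourceVault": {
          "id": "[resourceId('Microsoft.KeyVault/vaults', 'kv-example')]"
        },
        "vaultCertificates": [
          {
            "certificateUrl": "https://kv-example.vault.azure.net/secrets/example-cert/<version>",
            "certificateStore": "My"
          }
        ]
      }
    ]
  }
}
```

Here sourceVault points at the vault holding the secret, and certificateStore names the Windows certificate store the certificate is installed into on the VM.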

The workflow involves an application deployment using ARM templates, which in turn creates VMs and VMSS using a regional Compute resource provider that relies on, say, fabric controllers. The regional secret provider service pushes to the fabric's secret cache, which improves the scale up/down of resources without relying on the service providing the secrets. Once the resources are created, they can individually poll for updates to their secrets using the endpoint of the service that generates and renews them.

The following components are involved in implementing this workflow.

• A configurations layer, which provides the implementation for registered feature usage. Usage of the secrets provided by the secret management service is supported for subscriptions that have been registered with a specified feature flag. Customers must use an approval workflow to register their subscriptions with this feature flag.

• VM & VMSS controllers, which implement the various APIs that are invoked by ARM. The VM/VMSS validations also happen in this layer. The provisioning of resources with secrets must conform to the provisioning of regular resources; only their ARM templates will differ, because they will contain the reference to the provisioning of a secret. Deployments can be parallelized across locations.

• State reconciliation, which creates the async operations and persists the goal state for a given definition of a VM/VMSS involving the secrets.

• The state reconciliation might involve a composition of states, such as for the VM pipeline and the VMSS pipeline. Each pipeline is further subdivided into multiple blocks responsible for driving the pipeline to its desired state. Blocks can be executed in parallel and synchronized when required. State reconciliation requires a state-seeking engine that implements a graph traversal and state machine workflow.
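
A minimal sketch of such a state-seeking engine, using Python's standard graphlib for the dependency traversal. The block names are invented for illustration, and execution here is serial, whereas a real engine would run independent blocks in parallel and persist the goal state:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def run_pipeline(blocks, dependencies):
    """Drive pipeline blocks toward their goal state in dependency order.

    blocks:       maps block name -> callable that performs the block's work
    dependencies: maps block name -> set of blocks that must finish first
    Blocks whose predecessors are done could run in parallel; this sketch
    runs them serially and returns the order in which they completed.
    """
    completed = []
    for name in TopologicalSorter(dependencies).static_order():
        blocks[name]()          # execute the block's work
        completed.append(name)  # record progress toward the goal state
    return completed
```

Given, say, a hypothetical dependency chain provision_secret → create_vm → attach_secret, the traversal guarantees the secret exists before the VM is created and attached to it.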

 

 

Tuesday, February 1, 2022

 Sovereign clouds continued… 

This is a continuation of a series of articles on operational engineering aspects of Azure public cloud computing that included the most recent discussion on sovereign clouds. This article talks about Azure AD authentication in National clouds.  

National clouds are physically isolated instances of Azure. The difference between the Commercial, GCC, and GCC High Microsoft 365 environments is important to correctly align the compliance needs of businesses. Commercial Microsoft 365 is the standard Microsoft 365 cloud used by enterprise, academic, and even home Office 365 tenants. It has the most features and tools, global availability, and the lowest prices. Since it's the default choice among the clouds, everyone qualifies and there are no validations. Some security and compliance requirements can be met here using tools like Enterprise Mobility and Security, Intune, Compliance Center, Cloud App Security, Azure Information Protection, and the Advanced Threat Protection tools. Some compliance frameworks can also reside in the commercial cloud, and these include HIPAA, NIST 800-53, PCI-DSS, GDPR, CCPA, etc., but not government or defense compliance, because the cloud shares a global infrastructure and workforce. Even some FedRAMP government compliance can be met in the commercial cloud, but it will be heavily augmented with existing tools and will require finding and patching gaps.

Each cloud instance is separate from the others and has its own environment and endpoints. Cloud specific endpoints include OAuth 2.0 endpoints, OpenID Connect token request endpoints and URLs for app management and deployment which means an entire identity framework is local to the cloud instance. There’s even a separate Azure portal for each national cloud instance.

Applications can continue to use modern authentication in the Azure Government cloud but not in GCC High. The identity authority can be either Azure AD Public or Azure AD Government.

Applications can integrate with the Microsoft identity platform in a national cloud, but they are required to register their application separately in each Azure portal specific to the environment.

The workflow for authentication is claims based. A claims challenge is the response sent from an API indicating that an access token sent by a client application has insufficient claims. It could be due to one of many reasons, such as conditional access policies not being met for the API or the access token having been revoked. A claims request is made by the client application to send the user back to the identity provider to retrieve a new token with claims that satisfy the additional requirements that were not met. Applications must declare their client capabilities in their calls to the service. Then they can use enhanced security features and must be able to handle claims challenges. This is usually presented via a www-authenticate header returned by the service API.
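
A hedged sketch of the client side of this workflow: extracting the claims challenge from a www-authenticate header. The exact header shape varies by service, so the parsing below assumes the common form where the claims value is base64-encoded JSON:

```python
import base64
import json
import re

def extract_claims_challenge(www_authenticate):
    """Return the decoded claims challenge from a www-authenticate header,
    or None when the header carries no claims directive.

    Assumes a header shaped like:
    'Bearer error="insufficient_claims", claims="eyJhY2Nlc3MuLi4="'
    """
    match = re.search(r'claims="([^"]+)"', www_authenticate)
    if not match:
        return None
    decoded = base64.b64decode(match.group(1)).decode("utf-8")
    return json.loads(decoded)
```

The decoded JSON would then be passed back to the identity provider on the next token request so the new token satisfies the unmet requirements.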

The MSAL library provides the following sample to communicate the client capabilities:

_clientApp = PublicClientApplicationBuilder.Create(App.ClientId)
    .WithDefaultRedirectUri()
    .WithAuthority(authority)
    .WithClientCapabilities(new [] {"cp1"})
    .Build();

An API implementer can receive information about whether client applications can handle claims challenges using the xms_cc optional claim in the application manifest.
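
For illustration, requesting the xms_cc claim is done through the optionalClaims section of the API's application manifest; a minimal fragment might look like:

```json
{
  "optionalClaims": {
    "accessToken": [
      {
        "name": "xms_cc"
      }
    ]
  }
}
```

With this in place, tokens issued to clients that declared capabilities (such as "cp1" above) carry those capabilities in the xms_cc claim, so the API can decide whether to emit a claims challenge.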


Monday, January 31, 2022

Sovereign clouds continued…

This is a continuation of a series of articles on operational engineering aspects of Azure public cloud computing that included the most recent discussion on sovereign clouds. This article talks about Government Community Cloud.

The difference between the Commercial, GCC, and GCC High Microsoft 365 environments is important to correctly align the compliance needs of businesses. Commercial Microsoft 365 is the standard Microsoft 365 cloud used by enterprise, academic, and even home Office 365 tenants. It has the most features and tools, global availability, and the lowest prices. Since it's the default choice among the clouds, everyone qualifies and there are no validations. Some security and compliance requirements can be met here using tools like Enterprise Mobility and Security, Intune, Compliance Center, Cloud App Security, Azure Information Protection, and the Advanced Threat Protection tools. Some compliance frameworks can also reside in the commercial cloud, and these include HIPAA, NIST 800-53, PCI-DSS, GDPR, CCPA, etc., but not government or defense compliance, because the cloud shares a global infrastructure and workforce. Even some FedRAMP government compliance can be met in the commercial cloud, but it will be heavily augmented with existing tools and will require finding and patching gaps.

The Government Community Cloud is a government-focused copy of the commercial environment. It has many of the same features as the commercial cloud but has datacenters within the continental United States. Compliance frameworks that can be met in the GCC include DFARS 252.204-7012, DoD SRG Level 2, FBI CJIS, and FedRAMP High. It is still insufficient for ITAR, EAR, Controlled Unclassified Information, and Controlled Defense Information handling, because the identity component and network that GCC resides on are in Azure Commercial and are not restricted to US citizens. That said, GCC does have additional employee background checks, such as verification of US citizenship, verification of seven-year employment history, verification of highest degree attained, a seven-year criminal record check, validation against the Department of Treasury list of groups, the Commerce list of individuals, and the Department of State list, and a criminal history and fingerprint background check.

The DoD cloud kicks it up a notch and is only usable by the Department of Defense and federal contractors who meet the stringent cybersecurity and compliance requirements. GCC High is a copy of the DoD cloud, but it exists in its own sovereign environment. GCC High does not compare to the commercial cloud in terms of feature parity, but it does support calling and audio conferencing. Features are added to the GCC High cloud only when they meet the federal approval process, when a dedicated staff is available that has passed the DoD IT-2 adjudication, and only when the features do not have an inherent design that fails to meet the purpose of this cloud.

Applications can continue to use modern authentication in the Azure Government cloud but not in GCC High. The identity authority can be either Azure AD Public or Azure AD Government.


Sunday, January 30, 2022

Sovereign clouds


This is a continuation of a series of articles on operational engineering aspects of Azure public cloud computing that included the most recent discussion on cloud protection. This article talks about sovereign clouds.  

Public clouds are general-purpose compute for all industries and commerce. Most of the service portfolio from the public cloud providers is made available in the public cloud for general acceptance. Some services are also supported in the sovereign cloud. This article discusses the role and purpose of sovereign clouds. Let's begin with a few examples of sovereign clouds. These are 1) the US Government cloud (GCC), 2) the China cloud, and 3) the Office 365 GCC High cloud or US DoD. Clearly, organizations must evaluate which cloud is right for them. The differences between them mostly align with compliance. The Commercial, GCC, and GCC High Microsoft 365 environments must protect their controlled and unclassified data. These clouds offer enclosures within which the data resides and never leaves that boundary. This meets sovereignty and compliance requirements with geographical boundaries for the physical resources, such as datacenters. The individual national clouds and the global Azure cloud are cloud instances. Each instance is separate from the others and has its own environment and endpoints. Cloud-specific endpoints can leverage the same OAuth 2.0 protocol and OpenID Connect to work with the Azure portal, but even the identities must remain contained within that cloud. There is a separate Azure portal for each of these clouds. For example, the portal for Azure Government is https://portal.azure.us and the portal for the China national cloud is https://portal.azure.cn.

The Azure Active Directory instances and tenants are self-contained within these clouds. The corresponding Azure AD authentication endpoints are https://login.microsoftonline.us and https://login.partner.microsoftonline.cn, respectively.
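
The endpoints above can be captured in a small lookup, sketched here in Python. The cloud keys and the helper are illustrative, and in practice a tenant ID or verified domain would replace "common":

```python
# Per-cloud authority and portal endpoints, as listed above.
CLOUDS = {
    "public":        {"authority": "https://login.microsoftonline.com",
                      "portal": "https://portal.azure.com"},
    "us_government": {"authority": "https://login.microsoftonline.us",
                      "portal": "https://portal.azure.us"},
    "china":         {"authority": "https://login.partner.microsoftonline.cn",
                      "portal": "https://portal.azure.cn"},
}

def authority_for(cloud, tenant="common"):
    """Build the Azure AD authority URL for a cloud instance and tenant."""
    return f"{CLOUDS[cloud]['authority']}/{tenant}"
```

An application targeting Azure Government would hand authority_for("us_government", tenant_id) to its authentication library instead of the public-cloud authority.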

The regions within these clouds in which to provision Azure resources also come with unique names that are not shared with regions in any of the other clouds. Since these environments are unique and different, the registering of applications, the acquiring of tokens, and the calls to services such as the Graph API are also different.

Identity models change with the application and the location of the identity. There are three types: on-premises identity, cloud identity, and hybrid identity.

The on-premises identity belongs to the Active Directory hosted on-premises that most customers already use today.

Cloud identities originate, exist and are managed only in the Azure AD within each cloud.

The hybrid identities originate as on-premises identities but become hybrid through directory synchronization to Azure AD. After synchronization, they exist both on-premises and in the cloud, which gives the model its name.

Azure Government applications can use Azure Government identities but can also use Azure AD public identities to authenticate to an application hosted in Azure Government. This is facilitated by the choice of Azure AD Public or the Azure AD Government.


Saturday, January 29, 2022

 

This is a continuation of a series of articles on operational engineering aspects of Azure public cloud computing that included the most recent discussion on controlled folder access. This article talks about cloud protection.

Cloud protection is part of the next-generation portfolio of technologies in Microsoft Defender Antivirus that provides near-instant automated protection against new and emerging threats and vulnerabilities. The definitions are kept up to date in the cloud, but their role does not stop there. The Microsoft Intelligent Security Graph includes large sets of interconnected data as well as powerful artificial intelligence systems driven by advanced machine learning models. It works together with Microsoft Defender Antivirus to deliver accurate, real-time intelligent protection.

Cloud protection consists of the following features:

- Checking against metadata in the cloud

- Cloud protection and sample submission

- Tamper protection enforcement

- Block at first sight

- Emergency signature updates

- Endpoint detection and response in block mode

- Attack surface reduction rules

- Indicators of compromise (IoCs)

These are enabled by default. If for any reason they get turned off, the organization can enforce turning them back on using Windows Management Instrumentation, Group Policy, PowerShell, or MDM configuration service providers.

Fixes for threats and vulnerabilities are delivered in real time with Microsoft Defender Antivirus, rather than waiting for the next scheduled update.

Five billion threats to devices are caught every month. Microsoft Defender Antivirus does this under the hood. It uses multiple engines to detect and stop a wide range of threats and attacker techniques at multiple points, providing industry-best detection and blocking capabilities. Many of these engines are local to the client. If a threat is unknown, the metadata or the file itself is sent to the cloud service. The cloud service is built to be accurate, real-time, and intelligent. While trained models can be hosted anywhere, they run efficiently in the cloud, with inputs and predictions transferred between the client and the cloud. Threats are both common and sophisticated, and some are even designed to slip through protection. The earliest detection of a threat is necessary to ensure that not even a single endpoint is affected. With the models hosted in the cloud, protection is further enriched and made more efficient. The latest strains of malware and attack methods are continuously included in the engines.

These cloud-based engines include:

- Metadata-based ML engine – stacked sets of classifiers evaluate file types, features, sender-specific signatures, and even the files themselves, combining results from these models to make a real-time verdict that allows or blocks files pre-execution.

- Behavior-based ML engine – suspicious behavior sequences and advanced attack techniques are monitored to trigger analysis. The techniques span the attack chain, from exploits, elevation, and persistence all the way through to lateral movement and data exfiltration.

- AMSI-paired ML engine – pairs of client-side and cloud-side models perform advanced analysis of scripting behavior pre- and post-execution to catch advanced threats like fileless and in-memory attacks.

- File-classification ML engine – deep neural networks examine full file contents. Suspicious files are held from running and submitted to the cloud protection service for classification. The predictions determine whether the file should be allowed or blocked from execution.

- Detonation-based ML engine – a sandbox is provided where suspicious files are detonated so that classifiers can analyze the observed behaviors to block attacks.

- Reputation ML engine – utilizes sources with domain-expert reputations and models from across Microsoft to block threats that are linked to malicious URLs, domains, emails, and files.

- Smart rules engine – features expert-written smart rules that identify threats based on researcher expertise and collective knowledge of threats.
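
The stacked-classifier idea behind these engines can be illustrated with a toy verdict function. The engine names, weights, and threshold below are invented for illustration and do not reflect the actual service:

```python
def verdict(scores, weights=None, block_threshold=0.7):
    """Combine per-engine suspicion scores into an allow/block decision.

    scores:  maps engine name -> suspicion in [0.0, 1.0]
    weights: optional per-engine weights (default: equal weighting)
    Returns the decision and the combined weighted score.
    """
    weights = weights or {name: 1.0 for name in scores}
    total = sum(weights[name] for name in scores)
    combined = sum(score * weights[name] for name, score in scores.items()) / total
    return ("block" if combined >= block_threshold else "allow"), combined
```

A file that both the metadata and behavior models score highly would be blocked pre-execution, while uniformly low scores let it run.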

 

These technologies are industry-recognized and proven, with high customer satisfaction.