Sunday, July 18, 2021

One of the most interesting aspects of using Key Vault services is that the client application can treat the vault merely as a resource, with little or no maintenance for the scope and lifetime of that resource. This enables it to integrate with existing applications through an HTTP pipeline:

For example:

public class KeyVaultProxy : HttpPipelinePolicy, IDisposable
{
    private readonly Cache _cache;

    public KeyVaultProxy()
    {
        _cache = new Cache();
    }

    public void Clear() => _cache.Clear();

    // Synchronous override required by HttpPipelinePolicy; it funnels into the same code path.
    public override void Process(HttpMessage message, ReadOnlyMemory<HttpPipelinePolicy> pipeline) =>
        ProcessAsync(false, message, pipeline).GetAwaiter().GetResult();

    public override async ValueTask ProcessAsync(HttpMessage message, ReadOnlyMemory<HttpPipelinePolicy> pipeline) =>
        await ProcessAsync(true, message, pipeline).ConfigureAwait(false);

    private async ValueTask ProcessAsync(bool isAsync, HttpMessage message, ReadOnlyMemory<HttpPipelinePolicy> pipeline)
    {
        Request request = message.Request;
        if (request.Method == RequestMethod.Get)
        {
            string uri = request.Uri.ToUri().GetLeftPart(UriPartial.Path);
            if (IsSupported(uri))
            {
                // Serve cached GET responses for supported Key Vault paths; on a miss,
                // forward the request down the pipeline and cache the resulting response.
                message.Response = await _cache.GetOrAddAsync(isAsync, uri, null, async () =>
                {
                    await ProcessNextAsync(isAsync, message, pipeline).ConfigureAwait(false);
                    return message.Response;
                }).ConfigureAwait(false);

                return;
            }
        }

        await ProcessNextAsync(isAsync, message, pipeline).ConfigureAwait(false);
    }

    // Only cache the resource paths that Key Vault exposes (see the note on path qualifiers below).
    private static bool IsSupported(string uri) =>
        uri.IndexOf("/secrets/", StringComparison.OrdinalIgnoreCase) >= 0 ||
        uri.IndexOf("/keys/", StringComparison.OrdinalIgnoreCase) >= 0 ||
        uri.IndexOf("/certificates/", StringComparison.OrdinalIgnoreCase) >= 0;

    private static async ValueTask ProcessNextAsync(bool isAsync, HttpMessage message, ReadOnlyMemory<HttpPipelinePolicy> pipeline)
    {
        if (isAsync)
        {
            await ProcessNextAsync(message, pipeline).ConfigureAwait(false);
        }
        else
        {
            ProcessNext(message, pipeline);
        }
    }

    /// <inheritdoc/>
    void IDisposable.Dispose()
    {
        _cache.Dispose();
        GC.SuppressFinalize(this);
    }

    // The Cache type, a simple response cache keyed by request URI, is defined elsewhere in the sample.
}

An HttpPipelinePolicy can inspect and mutate both the outgoing request and the received response, which lets it behave much like a middleware handler in Nginx. The pipeline here is the chain of policies that the Azure SDK builds for a client: every request flows through the policies in order on its way to the service, and the response flows back through the same chain. The resources that Key Vault supports are addressed under the "/secrets/", "/keys/", or "/certificates/" path qualifiers, which allows policies to be scoped to those paths.
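A minimal sketch of plugging such a policy into a client's pipeline; the vault URI is illustrative, and SecretClientOptions inherits AddPolicy from the Azure.Core ClientOptions base class:

var options = new SecretClientOptions();
options.AddPolicy(new KeyVaultProxy(), HttpPipelinePosition.PerCall);

var client = new SecretClient(new Uri("https://myvault.vault.azure.net"), new DefaultAzureCredential(), options);
// GET requests under /secrets/, /keys/, and /certificates/ now flow through the proxy's cache.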

Saturday, July 17, 2021

 

 

Since secrets can vary, their scope and lifetime can also vary. A new secret can be used for a granular purpose as long as a naming convention for the secrets is maintained, so that it is easy to locate a secret or use its name to identify the secret and its intended use.

Another way to use a Key Vault secret is in conjunction with monitoring and alerting. Key Vault provides a secure way to store keys, secrets, and certificates in the cloud, so access to it is equally worth monitoring, both from the perspective of whether the vault is functioning properly for its clients and to know whether the clients are accessing it correctly. If the SLA for secrets is not met, the business suffers a disruption because there are numerous usages of each secret.

Monitoring is a very helpful service in many scenarios and deserves its own elaboration, but in this section the emphasis is on Key Vault monitoring. The set of events published by Key Vault includes NewVersionCreated, NearExpiry, and Expired. These events are consumed via Event Grid by Logic Apps, Azure Functions, and Azure Service Bus. Although Key Vault monitoring provides comprehensive coverage of its functionality, it does not integrate with events raised from the hardware layer when the vault is backed by hardware security modules. In the software plane, Key Vault can integrate with almost any cloud service by virtue of REST calls, SDKs, and the command-line interface.
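A minimal sketch, assuming the Event Grid trigger binding for Azure Functions, of consuming these events in code; the function name and the logging action are illustrative, while the event type strings are the ones Key Vault publishes for secrets:

using Azure.Messaging.EventGrid;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.EventGrid;
using Microsoft.Extensions.Logging;

public static class KeyVaultEventHandler
{
    [FunctionName("KeyVaultEventHandler")]
    public static void Run([EventGridTrigger] EventGridEvent eventGridEvent, ILogger log)
    {
        switch (eventGridEvent.EventType)
        {
            case "Microsoft.KeyVault.SecretNewVersionCreated":
            case "Microsoft.KeyVault.SecretNearExpiry":
            case "Microsoft.KeyVault.SecretExpired":
                // The subject identifies the affected secret; raise an alert or trigger rotation here.
                log.LogWarning("Key Vault event {EventType} for {Subject}", eventGridEvent.EventType, eventGridEvent.Subject);
                break;
        }
    }
}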

The Azure Key Vault portal provides the options to set this up with Event Grid and Logic Apps: configure the Event Grid trigger with the subscription parameter set to the subscription where the key vault exists, the resource type set to Microsoft.KeyVault.vaults, and the resource name set to the key vault to be monitored. This can then be seen in the resource group view as an "Event Grid system topic".

There are two recovery features that can be enabled with Azure Key Vault to handle deletions: soft-delete and purge protection. The former is like a recycle bin that can be used to reclaim accidentally deleted keys, secrets, and certificates; if they need to be removed completely, they can be purged. The latter, purge protection, enforces a retention period so that a permanent delete, or purge, cannot occur until the retention period expires.
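A minimal sketch, assuming a vault with soft-delete enabled, of recovering an accidentally deleted secret through the SecretClient; the secret name is illustrative:

SecretClient client = new SecretClient(vaultUri, new DefaultAzureCredential());

// Deleting moves the secret into the soft-deleted state rather than destroying it.
DeleteSecretOperation deleteOperation = await client.StartDeleteSecretAsync("storage-sas-token");
await deleteOperation.WaitForCompletionAsync();

// Recover it from the recycle-bin-like soft-deleted state.
RecoverDeletedSecretOperation recoverOperation = await client.StartRecoverDeletedSecretAsync("storage-sas-token");
await recoverOperation.WaitForCompletionAsync();

// Only if purge protection is NOT enabled could the secret instead be removed permanently:
// await client.PurgeDeletedSecretAsync("storage-sas-token");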

 

Friday, July 16, 2021

 

Using Key-Vault services from Azure Public cloud:

Introduction: The previous article introduced one aspect of using secrets from the Azure public cloud. It showed the use of a proxy for secret management to add whitelists to folders specified by path. With folders specified for different categories and subscriptionIds added to each folder, the whitelisting provided a way to complement role-based access control. This article introduces another aspect of Key Vault: the use of its SecretClient to access the resource directly.

Description: While the DSMSProxy usage shown earlier provided categories for organizing whitelists based on SubscriptionId, ServiceId, and ServiceTreeId, the SecretClient is used primarily for getting and setting secrets in the vault. These secrets can be credentials, passwords, keys, certificates, and other forms of identity that can be persisted safely. A sample of using this client involves the following:

DefaultAzureCredential credential = new DefaultAzureCredential(
    new DefaultAzureCredentialOptions
    {
        ExcludeEnvironmentCredential = true,
        ExcludeManagedIdentityCredential = true,
    });

// vaultUri, options, storageAccountName, sasDefinitionName, and s_cancellationTokenSource
// are assumed to be defined elsewhere in the surrounding sample.
SecretClient secretClient = new SecretClient(vaultUri, credential, options);
KeyVaultSecret sasToken = await secretClient.GetSecretAsync($"{storageAccountName}-{sasDefinitionName}", cancellationToken: s_cancellationTokenSource.Token);

 

Since secrets can vary, their scope and lifetime can also vary. A new secret can be used for a granular purpose as long as a naming convention for the secrets is maintained, so that it is easy to locate a secret or use its name to identify the secret and its intended use.
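A minimal sketch of such a convention in code; the "<service>-<environment>-<purpose>" pattern and the variable names are illustrative, not something Key Vault prescribes:

string secretName = $"{serviceName}-{environmentName}-dbConnectionString";

// Write a new value; Key Vault keeps prior versions of the secret automatically.
await secretClient.SetSecretAsync(secretName, connectionString, s_cancellationTokenSource.Token);

// Later, the same name both locates the secret and conveys its intended use.
KeyVaultSecret secret = await secretClient.GetSecretAsync(secretName);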

 

Thursday, July 15, 2021

 

Azure Secret Management System:

Introduction: Azure Key Vault stores secrets consumed by users, applications, services, and devices without those consumers having to manage the secrets themselves. The documentation on this service offering from the Azure public cloud helps us review some of the features that can be leveraged. This article captures one aspect of its usage that is popular with DevOps but does not get much attention. Secrets are used to safeguard access to resources, and access to those resources must be whitelisted. Depending on the resources, there can be many whitelists, and subscriptions or domains can be whitelisted for root folders.

Description: We begin with a root folder that can be environment-specific and includes Deployment subscriptions and Storage subscriptions. Adding a subscription to this root folder under one of the categories is equivalent to whitelisting that subscription for access to resources. Similarly, there can be many paths granting access and the subscription may need to be added to all these paths. Even new regions can be part of the path and adding a subscription to the new region grants access based on this whitelist.  A whitelist can be followed up with an approval service to complete the addition.

A whitelist can be used together with role-based access control. For example, after setting the Azure login context to the given subscription, we can find the service principal and the role to which the principal needs to be added. The service principal of an app can be added to the storage key operation service role. Similarly, security-group-based role assignments can be created. This completes the access control for the resources.

At the resource level, we can take the example of a storage account, a commonly-used and critical resource for many services.  The secret management system may have a path for all storage accounts and there would be a path specifier specifically for this storage account by name. 

This specific whitelisting then proceeds with the following steps:

Step 1. Determine the rootPath for the storage account and the subscriptionId that needs to be added.

Step 2. Use the DSMSProxy to check if the rootPath folder exists.

Step 3. If Step 2 succeeds, add the subscriptionId to the rootPath folder.

internal bool Whitelist(string rootPath, Guid subscriptionId)
{
    // Only whitelist against folders that already exist in the secret management system.
    if (DSMSProxy.Folders.Exists(rootPath))
    {
        DSMSProxy.Folders.AddToWhitelist(rootPath, subscriptionId);
        return true;
    }

    throw new Exception(string.Format("{0} Not Found", rootPath));
}
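Hypothetical usage of the helper above; the rootPath shown is illustrative only:

// rootPath and subscriptionId come from Step 1.
bool added = Whitelist("/Prod/StorageSubscriptions/myStorageAccount", subscriptionId);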

Thus, we see a ZooKeeper-like strategy of maintaining whitelists based on folder paths for resources, which complements the RBAC control.

 

Wednesday, July 14, 2021

 

Lessons Learned from Region buildouts:

Introduction: This article summarizes some of the lessons learned from building new region capabilities for a public cloud. Many public and private cloud providers expand their geographical presence in terms of datacenters. This is a strategic advantage for them because it draws business from the neighborhood of the new presence. A geographical presence for a public cloud is only somewhat different from that of a private cloud. A public cloud lists regions where the services it offers are hosted. A region may have three availability zones for redundancy and availability, and each zone may host a variety of cloud computing resources, small or big. Each availability zone may have one or more stadium-sized datacenters. Once the infrastructure is established, the process of commissioning services in the new region is referred to as a buildout. This article mentions some of the lessons learned in automating new region buildouts.

Description: First, the automation must involve context switching between the platform and the task of deploying each service to the region. The platform coordinates these services and must maintain ordering, dependencies, and status across the tasks.

Second, the task for each service is itself complicated and requires definitions in terms of region-specific parameters to an otherwise region-agnostic service model.

Third, services must declare their dependencies so that they can be validated and processed correctly. These dependencies may be on other services, on external resources, or on the availability of an artifact or an event from another activity.

Fourth, the service buildouts must be retryable on errors and exceptions; otherwise the platform will require a lot of manual intervention, which increases the cost.

Fifth, the communication between the automated activities and the manual interventions must be captured with the help of a ticket-tracking or incident-management system.

Sixth, the workflow and the state for each activity pertaining to the task must follow standard operating procedures that are defined independently of region and are available for audit.

Seventh, the technologies for platform execution and for the deployments of the services might be different, requiring consolidation and coordination between the two. In such cases, the fewer the context switches between the two, the better.

Eighth, the platform itself must have support for templates, event publishing and subscription, metadata store, onboarding and bootstrapping processes that can be reused.

Ninth, the platform should support parameters for enabling a region to be differentiated from others or for customer satisfaction in terms of features or services available.

Tenth, the progress of the buildout of new regions must be actively tracked with the help of tiles for individual tasks and rows per service.

Conclusion: Together, these are only some of the takeaways from a new region buildout, but they showcase some of the issues to be faced and their mitigations.

 

 

 

Tuesday, July 13, 2021

 

Implementing AAA

Introduction: Implementing authentication, authorization, and auditing for software products.

Description: User interfaces and application programming interfaces must be secured so that only authenticated and authorized users can use the functionality. There are many protocols, techniques, and sign-in experiences, which vary depending on the target audience of the product as much as on the technology stack involved. This article talks about some of the main considerations when designing identity and authentication modules across a variety of products.

WebAPIs often use various authentication methods such as basic authentication, token authentication, or even session authentication. However, applications using the webAPIs may also require sessions to associate resources for the same user using the same user-agent (as defined in RFC 2616). For example, consider a Cart API. To get the items in a user's shopping cart, the application may have to use the GET method on the Cart API as follows:
GET /cart
X-Merchant-ID: client_key
X-Session: <user session identifier>
The application can also add or edit items in the shopping cart or delete them. This example illustrates the need for the application to associate an identifier with the scope of a set of webAPI calls.
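A minimal sketch, with a hypothetical endpoint and header values, of how a client application might scope a set of Cart API calls to one user session:

using var client = new HttpClient { BaseAddress = new Uri("https://shop.example.com/") };
client.DefaultRequestHeaders.Add("X-Merchant-ID", clientKey);        // identifies the calling application
client.DefaultRequestHeaders.Add("X-Session", sessionIdentifier);    // hash of the session cookie

// Every subsequent call carries the same session scope.
HttpResponseMessage cart = await client.GetAsync("api/cart");
HttpResponseMessage added = await client.PostAsync("api/cart/items",
    new StringContent("{ \"sku\": \"ABC-123\", \"quantity\": 1 }", Encoding.UTF8, "application/json"));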
This session identifier is usually obtained as a hash of the session cookie which is provided by the authentication and authorization server. The session identifier can be requested as part of the login process or by a separate API Call to a session endpoint. An active session removes the need to re-authenticate. It provides a familiar end-user experience and functionality. The session can also be used with user-agent features or extensions to assist with authentication such as password-manager or 2-factor device reader.
The session identifier is dynamic in nature, intended for external use, and useful only for identifying something temporary. It can be requested based on predetermined client secrets.
Sessions can time out or be explicitly torn down, either by the user or by the system, forcing re-authentication. Therefore, session management must expire or clear the associated cookie and identifiers.
Sessions need to be protected against CSRF attacks and clickjacking just as other resources are.
Sessions are treated the same as credentials in terms of their lifetime. The user for the session can be looked up as follows:
HttpCookie cookie = HttpContext.Current.Request.Cookies[FormsAuthentication.FormsCookieName];
FormsAuthenticationTicket ticket = FormsAuthentication.Decrypt(cookie.Value);
SampleIdentity id = new SampleIdentity(ticket);      // SampleIdentity wraps the decrypted ticket
GenericPrincipal prin = new GenericPrincipal(id, null);
HttpContext.Current.User = prin;
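A minimal sketch, using the same classic ASP.NET Forms Authentication stack as above, of tearing a session down so that re-authentication is forced; the explicit cookie expiry is an extra precaution:

FormsAuthentication.SignOut();                         // clears the forms-authentication cookie
HttpContext.Current.Session?.Abandon();                // discards server-side session state, if any

// Expire the session cookie explicitly as well.
var expired = new HttpCookie(FormsAuthentication.FormsCookieName) { Expires = DateTime.Now.AddDays(-1) };
HttpContext.Current.Response.Cookies.Add(expired);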

 Micro-services and their popularity are a demonstration of the power of APIs especially when the callers are mobile applications and personal desktops.  Many companies implement microservices as NodeJS or Django web applications with OAuth and SSO. I have not seen the use of CAS in companies, but I have seen it in educational institutions. Django for instance brings the following functionality:

django-social-auth, which makes social authentication simpler.
Pinax, which makes it popular for websites.
django-allauth, which integrates authentication, addressing, registration, and account management, as well as 3rd-party social accounts.
django-userena, which makes user accounts simpler.
django-socialregistration, which combines OpenID, OAuth, and Facebook Connect.
django-registration, which is probably the most widely used registration framework.
django-email-registration, which claims to be very simple to use, and other such packages.
These implementations essentially facilitate user account registration via templated views and a database or other membership-provider backend.

There are other implementations as well, such as EngineAuth, SimpleAuth, and AppEngine-OAuth-Library. EngineAuth does multi-provider authentication and saves the user id to a cookie. SimpleAuth supports OAuth and OpenID. AppEngine-OAuth-Library provides user authentication against third-party websites.


The NodeJS-style implementation even allows the use of providers as strategies, in addition to bringing some of the functionality described above for Django. If we look at a 'passport' implementation, for example, I like the fact that we can easily change the strategy to target the provider of choice. In fact, the interface makes this quite clear.
Methods used look like the following (passport's custom-callback form):

app.get('/login', function(req, res, next) {
  passport.authenticate('AuthBackendOfChoice', function(err, user, info) {
    // ...
  })(req, res, next);
});
Additional methods include:
var passport = require('passport'),
    OAuthStrategy = require('passport-oauth').OAuthStrategy;

passport.use('provider', new OAuthStrategy({
    requestTokenURL: 'https://www.provider.com/oauth/request_token',
    accessTokenURL: 'https://www.provider.com/oauth/access_token',
    userAuthorizationURL: 'https://www.provider.com/oauth/authorize',
    consumerKey: '123-456-789',
    consumerSecret: 'shhh-its-a-secret',
    callbackURL: 'https://www.example.com/auth/provider/callback'
  },
  function(token, tokenSecret, profile, done) {
    User.findOrCreate(..., function(err, user) {
      done(err, user);
    });
  }
));
There seems to be little or no django-passport implementation in the source repositories, or for that matter any python-passport implementation.
Netor Technologies has a mention of something with the same name, and it is also an interesting read.
For example, they create tables to keep the application_info and user_info. The application_info is like the client in the OAuth protocol, in that it keeps track of the applications; the user_info keeps track of usernames and passwords. The user_applications table is the mapping between users and applications.
The authentication is handled using a challenge-response scheme. The server responds with the user's password salt along with a newly generated challenge salt and a challenge id. The client sends back a response with the hash resulting from hash(hash(password + salt) + challenge). These are read by the server and deleted after use; there is no need to keep them.
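A minimal sketch of that computation in C# (the original write-up is in Python); the choice of SHA-256 and the variable names are assumptions for illustration:

using System.Security.Cryptography;
using System.Text;

static string Hash(string value)
{
    using var sha = SHA256.Create();
    return BitConverter.ToString(sha.ComputeHash(Encoding.UTF8.GetBytes(value))).Replace("-", "");
}

// The server stores hash(password + salt); the client proves knowledge of the password
// by hashing that stored verifier together with the one-time challenge.
string storedVerifier = Hash(password + salt);
string response = Hash(storedVerifier + challenge);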
The code to create a user looks like this:

def create(store, user, application=None):
    if application is None:
        application = Application.findByName(unicode('passport'))
    result = UserLogin.findExisting(user.userid, application.applicationid)
    if result is None:
        result = store.add(UserLogin(user, application))
        store.commit()
    return result
and the authentication methods are handled in the controllers: the BaseController has the method to get the user login, and the ServiceController has the method to authenticate via a challenge.
This seems like a clean example of doing basic registration of user accounts and integrating it with the application.

Regarding audit, most membership providers and their corresponding data stores can help turn on data capture and audit trails.

It is also possible to send tags and key-value pairs as parameters to webAPI calls, which significantly enhances any custom logic to be built. For example:

[Fact]
public void TestPost3()
{
    var httpContent = new StringContent("{ \"firstName\": \"foo\" }", Encoding.UTF8, "application/json");

    var client = new HttpClient();
    var result = client.PostAsync("http://localhost:17232/api/Transformation/Post3", httpContent).GetAwaiter().GetResult();
}

[HttpPost]
[ActionName("Post3")]
public void Post3([FromBody]IDictionary<string, string> data)
{
    // do something
}

Conclusion: IAM is not a canned component that can be slapped on all products without providing a rich set of integrations and leveraging the technology stacks available. While some vendors make it easy and seamless to use, the technology stack underneath is complex and implements a variety of design requirements.

 

Monday, July 12, 2021

 

Addendum:

This is a continuation of the article on NuGet packages, their sources, and their resolution.

The recommendation to have a single package source, eliminating sources that carry duplicate packages, can be enforced by an organization to bolster the security and integrity of the packages used to build its source code. The organization can put a list of registries behind one feed that sources both internal and external packages, and enforcing that developers use only that feed consolidates all requests through the controlled feed. This is a desirable pattern and one that alleviates the concern of uncontrolled packages from different sources eventually polluting the source code assets of the organization.
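A minimal sketch of how that single controlled feed might be pinned in a nuget.config checked in at the repository root; the feed name and URL are illustrative:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <!-- Remove any feeds inherited from machine- or user-level configuration. -->
    <clear />
    <!-- The one organization-controlled feed that aggregates internal and vetted external packages. -->
    <add key="OrgControlledFeed" value="https://pkgs.example.com/org/nuget/v3/index.json" />
  </packageSources>
</configuration>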

The developers also have several tools for this purpose. First, the NuGet executable allows listing packages along with their source; the list command can be used to browse the packages in a remote feed.

Similarly, the locals command of the NuGet executable can be used to view all the local caches to which packages were downloaded. This is a very useful mechanism for trial and error: a developer can choose to clear all the packages and reinitiate the download. This helps when trying different package feeds and sources, and it narrows down the problem space when troubleshooting package dependencies.

It is also possible to find the dependency tree for the assemblies referenced via packages. Although this is not directly supported by the tooling that lists packages and their locations, it is easy to walk the dependencies iteratively until all of them have been enumerated; visited dependencies do not need to be traversed again. The site MyGet.org allows these dependencies to be visualized with reference to their feed, but when drawing the dependency tree for a project, neither the compiler nor the NuGet executable provides that option, in contrast to the tooling available for some other languages.

A sample method that relies on built-in functions to eliminate dependencies already visited looks somewhat like this:

static void OutputGraph(LocalPackageRepository repository, IEnumerable<IPackage> packages, int depth)
{
    foreach (IPackage package in packages)
    {
        Console.WriteLine("{0}{1} v{2}", new string(' ', depth), package.Id, package.Version);

        IList<IPackage> dependentPackages = new List<IPackage>();
        foreach (var dependencySet in package.DependencySets)
        {
            foreach (var dependency in dependencySet.Dependencies)
            {
                var dependentPackage = repository.FindPackage(dependency.Id, dependency.VersionSpec, true, true);
                if (dependentPackage != null)
                {
                    dependentPackages.Add(dependentPackage);
                }
            }
        }

        // Indent one level further for this package's own dependencies.
        OutputGraph(repository, dependentPackages, depth + 3);
    }
}

 Courtesy: Stackoverflow.com

If the visited set needs to be tracked by the caller, then the code would follow the conventional depth-first search:

DFS(V, E)
    for each vertex v in V
        v.color = WHITE
        v.d = NIL
    time = 0
    for each vertex v in V
        if v.color == WHITE
            DFS-VISIT(V, E, v)

DFS-VISIT(V, E, u)
    time = time + 1
    u.d = time
    u.color = GRAY
    for each vertex v adjacent to u
        if v.color == WHITE
            DFS-VISIT(V, E, v)
        else if v.color == GRAY
            throw back-edge exception    // a back edge means a cycle in the dependencies
    u.color = BLACK
    time = time + 1
    u.f = time
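A minimal sketch (not from the original post) of the same NuGet traversal with the visited set tracked by the caller, so that shared dependencies are printed and expanded only once; it assumes the same NuGet.Core types used in the earlier snippet plus System.Linq:

static void OutputGraph(LocalPackageRepository repository, IEnumerable<IPackage> packages, int depth, HashSet<string> visited)
{
    foreach (IPackage package in packages)
    {
        // Skip packages already traversed via another parent.
        if (!visited.Add(package.Id + " " + package.Version))
        {
            continue;
        }

        Console.WriteLine("{0}{1} v{2}", new string(' ', depth), package.Id, package.Version);

        var dependentPackages = package.DependencySets
            .SelectMany(set => set.Dependencies)
            .Select(d => repository.FindPackage(d.Id, d.VersionSpec, true, true))
            .Where(p => p != null)
            .ToList();

        OutputGraph(repository, dependentPackages, depth + 3, visited);
    }
}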