Monday, September 7, 2020

A rule of thumb for combining other federated resources such as identity:

Certain functionalities are not core to the stream store in terms of data plane activities and can be offloaded to third-party solutions. For example, an Identity and Access Management (IAM) solution can provide single sign-on across all stream stores when they are connected to the active directory. This ability does not have to be part of individual stream stores. Federated security therefore provides a mechanism with which a client needs to register only once and is then free to use any stream store within the federation.


Even if there is no integrated Authentication, Authorization and Auditing (AAA) provided by the IAM solution, the client can expect a key-secret pair that works with each and every stream store in the federation for a given identity. With a key-secret pair, a stream store can easily make all the permission grants needed for data and control plane activities.
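As a sketch, assuming the federation issues a single key-secret pair at registration, a client might authenticate to any store in the federation by signing each request with the shared secret. The class, signing scheme, and names below are hypothetical illustrations, not from any particular stream store:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch: a client registered once with the federation holds one
// key-secret pair and signs requests the same way for every member store.
public class FederatedClient {
    private final String accessKey;   // public identifier issued by the federation
    private final byte[] secret;      // shared secret issued by the federation

    public FederatedClient(String accessKey, String secret) {
        this.accessKey = accessKey;
        this.secret = secret.getBytes(StandardCharsets.UTF_8);
    }

    // Any store in the federation can recompute this HMAC from the same
    // key-secret pair to authenticate the request, without its own user store.
    public String sign(String storeName, String request) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            byte[] sig = mac.doFinal((storeName + "\n" + request)
                    .getBytes(StandardCharsets.UTF_8));
            return accessKey + ":" + Base64.getEncoder().encodeToString(sig);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The same signature scheme works against every store precisely because the credential is issued by the federation rather than by any individual store.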


The delegation of AAA also helps with key rotations. This is the case when an external key manager is involved. A key manager generates a private key and certificate that can be used to secure the storage containers. Key managers are helpful when the data containers have to be encrypted, but they do not generally participate in identity.


The point is that the form of identity representation is independent of its provisioning: it is provided not by the stream stores or their federation but by an identity federation service that sits within the realm. The forms can be username-passwords, key-secrets, X.509 certificates, one-time passcodes or other options, and they are not mutually exclusive. Their operations can remain exclusive to the stream store, which defers the procurement of the token to the identity federation service. With a valid form of representation of an identity, each participating stream store can then continue with the provisioning of container resources and data traffic.


The federation of services and resources outside the stream store also promotes the functionality of a gateway which enables administrators to allow or deny certain access globally. 

Sunday, September 6, 2020

Object and array inlining

Object and array inlining can significantly tune up the performance of a Java application. Array inlining is explored more in this discussion as compared to related work on both object and array fields. There are three types: fixed array inlining, where the array fields can be handled the same way as object fields because the length is constant; variable array inlining, where the array fields have different but fixed lengths; and dynamic array inlining, where the array field is assigned multiple times.

Inlining works when two conditions are met. First, the parent and child objects must be allocated together, and the field store that places the reference of the child in the parent must happen immediately afterwards, so that the collocation remains true for the lifetime of the data structures. Second, later field stores must not overwrite the field with a new value. Inlining saves one instruction per access: the load of the inlined field. The tradeoff is governed by the length of the interval between an overwrite of the field and the garbage collection that restores the optimized object order. The garbage collector is based on aging and compaction of heap allocations. The young generation is collected frequently since its objects are not used on a prolonged basis. The stop-and-copy collector copies young objects between two alternating spaces and increments an age field on each copy.

Object and array inlining decisions are made on a field-by-field basis at run time. The analysis overhead has to be small and the speedup big enough for the optimization to be worthwhile. Among the approaches to such analysis, a statistic that places a field's access count in the context of the global total over all fields in all loaded classes makes the criterion clearer than local analysis alone.
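The two preconditions can be sketched in Java. The class and field names below are illustrative, not from the paper; the code only exhibits the access patterns the run-time analysis looks for, and whether the JIT actually performs the optimization is up to that analysis:

```java
// Illustrative sketch of the two inlining preconditions described above.
public class Polygon {
    // Candidate for fixed array inlining: the array is allocated together with
    // its parent (condition 1) and the final field is never overwritten
    // (condition 2), so the JIT could address its elements at a fixed offset
    // from 'this' and skip the load of the 'xs' field on every access.
    private final double[] xs = new double[4];

    public void setX(int i, double v) { xs[i] = v; }
    public double getX(int i) { return xs[i]; }

    // A pattern that defeats fixed inlining: the field is reassigned as the
    // collection grows, so at best dynamic array inlining applies.
    private int[] data = new int[2];
    private int size;

    public void add(int v) {
        if (size == data.length) {
            int[] bigger = new int[data.length * 2];
            System.arraycopy(data, 0, bigger, 0, size);
            data = bigger; // overwrite of the array field
        }
        data[size++] = v;
    }
    public int get(int i) { return data[i]; }
}
```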






Saturday, September 5, 2020

Distributed Stream Stores

This is a continuation of the previous post.

Appendix: 

Federation is not a new concept, and its application has varied from federated security via service bus to federated identity via Identity and Access Management solutions. The software industry has yet to formalize a standard for the federated model, and therefore it is also yet to be adopted for true federated scenarios in commercial applications. The benefits are already visible in existing deployments.

Benefits of federation: 

  1. The federation improves the posture of each and every participating store with more streamlined access, where the traffic comes from a known point. 

  2. The federation provides an opportunity to bring consistency to the control plane while the data traffic remains unhindered. 

  3. The federation increases automation capabilities because each and every store behaves the same way for well-scripted workflows. 

  4. The federation provides an opportunity to add intelligence to traffic routing, with side-effects of load balancing, improvements in service level agreements, and annotations across the board. 

  5. The federation is the only way to meet compliance, governance and regulatory standards as set for the industry and adopted by each organization. 

  6. In addition to the advantages mentioned above, federation provides the opportunity to alleviate the corresponding concerns from each store. 

  7. Federation goes beyond centralization. It gives new meaning and nomenclature to the storage containers within the storage products. 

  8. Federation does not dictate vendor lock-in for any one store. It allows those stores to be independent in their technology as long as they support the federation requirements. 

  9. Federation does not help with the patching or maintenance of individual stores. It provides new syntax and semantics for both the data and the communication to the stream stores, so that they can be different from the existing usages of the stream store. 

  10. Federation provides an opportunity for clients to integrate other forms of federation, such as federated identity and federated security, so that the stores can continue to operate without the onus of managing these by default. 

  11. Federation is an opt-in feature. It gives an opportunity to correct the shortcomings of the existing stores. 

 

Benefits of chaining: 

Chaining also provides many of the benefits cited above, even though it does not showcase a centralized entity providing new features to the clients. It is flexible enough to support any architecture overlaid on the chained individual stores, and it allows fast relaying of handoffs with fallback options. Chaining also helps each store remain independent in its operations, without imposing the restrictions or lock-down that federated membership might require.

Friday, September 4, 2020

Distributed stream stores

 Federated and Chained Stream stores:

Introduction: A stream store is a limitless continuous storage for data ingestion traffic from devices. As such, it can scale to large loads with demanding throughput and latency requirements. It is also convenient for an administrator to set up and manage one. But deployments rarely tend towards global solutions. Instead, different departments want to maintain, own and run their own instances. Sometimes a swarm of stores may help split the traffic with some naming or routing convention. These instances must behave in a coordinated manner. The following document proposes two different approaches.

1. Federated deployments: In federated stores, we can have a matrix deployment model wherein the local store is translated to a canonical participation convention and exported, such as with an open public view or a closed federated view. In the latter case, the view, written and maintained by administrators and global in nature, can still be made visible externally with access for specific user groups.

A federated schema is usually built by consensus, and data interchange standards complement it. As an example, XML provides a syntax for both the federated schema and the data conforming to it, even if the stores are in a flat organization under the root element. This category has the support of major IT vendors.

Query processing also requires some features for distributed stores. Data transfer strategies now need to consider communication cost besides CPU cost and I/O cost. Query decomposition and allocation between stores must also be determined. Consider the case when data is at sites 1 and 2 and the query is at site 3. We could send both streams' events to site 3, or send one stream's events to site 2 and then the result to site 3, or send the other stream's events to site 1 and then the result to site 3.
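The comparison among the three strategies can be sketched with a simple communication-cost model. The cost function, event counts and result sizes below are illustrative assumptions, not from the text:

```java
// Illustrative communication-cost model for a join of streams at site 1 and
// site 2 with the query issued at site 3 (all names/costs are assumptions).
public class TransferCost {
    // Strategy A: ship both streams to site 3 and join there.
    public static long shipBoth(long n1, long n2, long costPerEvent) {
        return (n1 + n2) * costPerEvent;
    }

    // Strategy B: ship stream 1's events to site 2, join there, ship the
    // (usually smaller) result to site 3.
    public static long joinAtSite2(long n1, long resultSize, long costPerEvent) {
        return (n1 + resultSize) * costPerEvent;
    }

    // Strategy C: ship stream 2's events to site 1, join there, ship the
    // result to site 3.
    public static long joinAtSite1(long n2, long resultSize, long costPerEvent) {
        return (n2 + resultSize) * costPerEvent;
    }
}
```

With a selective join, strategies B and C beat shipping both raw streams, which is why query decomposition and allocation must weigh communication cost alongside CPU and I/O cost.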

2. Chained deployments: In this case, the deployments are chained so that if processing at one store, say, times out, the request can proceed to another. This model is favored for distributed queries because the stores are linked even if they don't have redundant data. Most interactions between stores are in the form of requests and responses. These may have to traverse multiple layers before they are authoritatively handled by the store and stream. Relays help translate requests and responses between layers. They are necessary for making the request processing logic modular and chained. If the current stream store does not resolve the stream for an event, it is possible to distribute the query to another stream store. The resolver merely needs to forward the queries that it cannot answer to a default pre-registered outbound destination. In a chained stream store, a query can make sense simply out of the naming convention and determine whether a request belongs to it or not. If it does not, it simply forwards the request to another stream store. This is somewhat different from the naming convention technique, which may or may not have an interpretable part that can determine the site to which the object belongs. The linked stream store does not even need to take time to determine that the local instance is indeed the correct recipient. It can merely translate the address, with the help of a registry, to know whether the request belongs to it. This shallow lookup means a request can be forwarded faster to another linked stream store and ultimately to where it may be guaranteed to be found. The linked stream store has no criteria for the stream stores to be similar; as long as the forwarding logic is enabled, any implementation can exist in each stream store for translation, lookup and return.
Unlike the matrix approach, which might require hashes and finding the store based on the access, the client can be blissfully ignorant of where the data resides or which store answers the call. Whether we use routing tables or a static hash table, the networking over the chained stream stores can be its own layer facilitating routing of events to the correct stream store.
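A minimal sketch of such a shallow registry lookup, with forwarding to a pre-registered next hop when the stream is not local, might look like the following. The class, registry shape and method names are hypothetical:

```java
import java.util.Map;

// Hypothetical sketch: each chained store does a shallow registry lookup on
// the stream name; if the stream is not local it relays the request to a
// pre-registered default outbound destination instead of resolving it itself.
public class ChainedResolver {
    private final String storeName;
    private final Map<String, String> registry;  // stream name -> owning store
    private final ChainedResolver nextHop;       // default outbound destination, or null

    public ChainedResolver(String storeName, Map<String, String> registry,
                           ChainedResolver nextHop) {
        this.storeName = storeName;
        this.registry = registry;
        this.nextHop = nextHop;
    }

    // Returns the name of the store that authoritatively handles the stream,
    // or null when the end of the chain is reached without an owner.
    public String resolve(String stream) {
        if (storeName.equals(registry.get(stream))) {
            return storeName;  // local: handle the request here
        }
        return nextHop == null ? null : nextHop.resolve(stream);  // relay onward
    }
}
```

The lookup is deliberately shallow: the store only decides "mine or not mine" and forwards, which keeps relaying fast and places no requirement that the chained stores be similar in implementation.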

Conclusion: Federated and chained deployments are patterns found in many organizational structures. Their application to storage products, and particularly to stream stores, is novel.


Thursday, September 3, 2020

Object and array inlining

There are other global optimizations possible for object inlining. One uses a cheap interprocedural analysis. For example, arrays are only inlined when the length is known and unchanged during compilation. This was implemented in Swift, a Java compiler for the Alpha architecture.

Another approach modifies the object copying order of the garbage collector, which improves cache performance but has little or no effect on field loads. Another, known as online object reordering, optimizes the fields that are accessed frequently. Another, known as cache-conscious data placement, uses run-time counters but does not distinguish between different fields of the same object. A pointer-analysis approach has an algorithm that guesses the possible run-time values of pointers, but it is unsuitable for object and array inlining because it works only with the locations a variable may point to. Such location information is valuable when it is precise, so another approach collects the points-to mapping for all instances. This can be useful to flatten multi-dimensional arrays.

Array inlining is not possible without a global data flow analysis because it can only optimize array fields.

A summary of the above discussion now follows. Object and array inlining can significantly tune up the performance of a Java application. Array inlining is explored more in this discussion as compared to related work on both object and array fields. There are three types: fixed array inlining, where the array fields can be handled the same way as object fields because the length is constant; variable array inlining, where the array fields have different but fixed lengths; and dynamic array inlining, where the array field is assigned multiple times. Inlining works when two conditions are met. First, the parent and child objects must be allocated together, and the field store that places the reference of the child in the parent must happen immediately afterwards, so that the collocation remains true for the lifetime of the data structures. Second, later field stores must not overwrite the field with a new value. Inlining saves one instruction per access: the load of the inlined field. The tradeoff is governed by the length of the interval between an overwrite of the field and the garbage collection that restores the optimized object order. The garbage collector is based on aging and compaction of heap allocations. The young generation is collected frequently since its objects are not used on a prolonged basis. The stop-and-copy collector copies young objects between two alternating spaces and increments an age field on each copy. Object and array inlining decisions are made on a field-by-field basis at run time. The analysis overhead has to be small and the speedup big enough for the optimization to be worthwhile. Among the approaches to such analysis, a statistic that places a field's access count in the context of the global total over all fields in all loaded classes makes the criterion clearer than local analysis alone.




Wednesday, September 2, 2020

Object and array inlining

A brief overview of related work is included below. A static compiler for a variation of C++ was extended earlier to perform automatic object inlining. Its algorithm clones the code of methods that access optimized objects. That approach differs from the one described here in that all objects of a class are inlined when they meet a criterion. Also, that algorithm moves the object headers and pointers to inlined objects, which reduces the object size. It can also convert an array of references to an array of object values. The approach described here is from the paper on Automatic Array Inlining in Java Virtual Machines by Wimmer et al., and it can inline dynamic arrays while the earlier one does not.

Another example is object inlining in the CoSy compiler construction framework, a static compiler for Java, where the replacement of child objects with new ones can be detected. The approach described in this document requires that the objects not be referenced by anything other than the parent object.

Work on object inlining has also identified four different classes of inlinable fields. The related work mentioned above also provides benchmarks, but array fields are not evaluated there. In fact, this approach has differentiated itself in its treatment of arrays with constant length and arrays with variable length. Inlining of array elements is seldom evaluated.

There is a more aggressive approach than object inlining, called object combining. It optimizes unrelated objects if they have the same lifetime, and it allows the garbage collector to free multiple objects simultaneously. The elimination of the pointer accesses is performed separately. This reduces the overhead of memory allocation and deallocation and is preferred for mark-and-sweep garbage collectors. In comparison to this approach, the emphasis has shifted from arrays and instructions to allocations and size.

There are other global optimizations possible for object inlining. One uses a cheap interprocedural analysis. For example, arrays are only inlined when the length is known and unchanged during compilation. This was implemented in Swift, a Java compiler for the Alpha architecture.

Another approach modifies the object copying order of the garbage collector, which improves cache performance but has little or no effect on field loads. Another, known as online object reordering, optimizes the fields that are accessed frequently. Another, known as cache-conscious data placement, uses run-time counters but does not distinguish between different fields of the same object. A pointer-analysis approach has an algorithm that guesses the possible run-time values of pointers, but it is unsuitable for object and array inlining because it works only with the locations a variable may point to. Such location information is valuable when it is precise, so another approach collects the points-to mapping for all instances. This can be useful to flatten multi-dimensional arrays.

Array inlining is not possible without a global data flow analysis because it can only optimize array fields.


Tuesday, September 1, 2020

Object and array inlining

Object and array inlining decisions are made on a field-by-field basis at run time. The analysis overhead has to be small and the speedup big enough for the optimization to be worthwhile. A heuristic for this criterion is not sufficient when it is local to the array in nature. The example discussed earlier, which uses an access count and a threshold, can determine a selection of candidates, but only a statistic that places the count in the context of the global total over all fields in all loaded classes makes the criterion clearer. A count against the global total puts the usefulness of eliminating field loads into perspective by focusing the optimization where it matters most, rather than routinely performing every elimination while the ones that matter go without the optimization.
There are approaches other than statistics as well, including neural networks with a load-embedding layer and classifiers, which can perform even better since each field is isolated. However, the effort to bootstrap, load and accumulate the passes, even when done in batches, may amount to a prohibitive use of the runtime's resources. Dynamic grouping, on the other hand, can be an alternative way to determine the criterion by evaluating groups not with classifiers but with regressors.
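The statistical criterion above can be sketched as a share-of-global-total test. The method name and threshold are illustrative assumptions, not from the text:

```java
// Illustrative sketch: rank inlining candidates by their share of all field
// loads across loaded classes, rather than by a local per-field threshold.
public class InliningHeuristic {
    // Returns true when this field's access count is a large enough fraction
    // of the global total to justify the optimization effort.
    public static boolean worthInlining(long fieldLoads, long globalLoads,
                                        double minShare) {
        return globalLoads > 0 && (double) fieldLoads / globalLoads >= minShare;
    }
}
```

A field with many loads in absolute terms can still fall below the bar when the global total is large, which is exactly the perspective a purely local threshold misses.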
This leads us to a brief overview of related work. A static compiler for a variation of C++ was extended earlier to perform automatic object inlining. Its algorithm clones the code of methods that access optimized objects. That approach differs from the one mentioned above in that all objects of a class are inlined when they meet a criterion. Also, that algorithm moves the object headers and pointers to inlined objects, which reduces the object size. It can also convert an array of references to an array of object values. The approach described here is from the paper on Automatic Array Inlining in Java Virtual Machines by Wimmer et al., and it can inline dynamic arrays while the earlier one does not.