Saturday, August 22, 2020

Object and array inlining

Collections are frequently used in classes and are part of the standard library. This additional layer between the business object and the array causes extra memory loads. Inlining reduces this overhead by placing the objects and arrays together in the heap and by replacing the memory accesses with offset address arithmetic. Objects and arrays both have headers, and these are used for inlining; dynamic arrays are given additional header fields and values. This can increase heap usage, but it reduces execution time.

For example, a Polygon class can be defined as having a java.util.ArrayList of objects which corresponds to a dynamic list of Points. When new elements are added and the capacity of the array does not suffice, the array is resized. The array elements reference the points that store the coordinates. Accessing a point takes two lookups: first the load of the field points, and second the load of the ArrayList's internal elementData array. These two loads, plus the array access, are overhead that inlining removes by combining the objects into a larger group so that a point can be loaded with a single access.
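
A minimal sketch of the shape of such a class, with hypothetical Point and Polygon types, showing the chain of loads that inlining aims to eliminate:

    import java.util.ArrayList;
    import java.util.List;

    class Point {
        int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    class Polygon {
        // The ArrayList itself wraps an internal Object[] named elementData.
        private final List<Point> points = new ArrayList<>();

        void addPoint(Point p) { points.add(p); }

        // Reading a coordinate involves three memory accesses:
        //   1. load the field 'points' (the ArrayList object)
        //   2. load the ArrayList's internal 'elementData' array
        //   3. load the Point reference at the given index
        int getX(int i) { return points.get(i).x; }
    }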

The points reference is initialized once, but elementData can be reassigned many times as the array is resized. Handling this requires an intelligent technique that goes above and beyond the regular inlining of objects whose fields are assigned a fixed value once.

Object and array inlining operates on a group of objects and arrays in the heap that are in a parent-child relationship. The reference to a child can be embedded directly in the parent even if the parent has multiple children, and a child can itself be the parent of further inlined objects, forming a hierarchy of such levels; the only distinction that matters is between object fields and array fields.

The bytecode for loading an array field is the same as for loading an object field, but the inlining of the two differs: the size of an object is known beforehand, while the size of an array is not known until allocation. Hot fields are the most worthwhile to optimize, and the just-in-time compiler is leveraged to find them: it inserts read barriers that increment access counters per field and class.

The inlining works when the following two conditions are met. First, the parent and child objects must be allocated together, and the field store that places the reference of the child in the parent must happen immediately afterwards. Second, later field stores must not overwrite the field with a new value, so that the collocation remains valid for the lifetime of the data structures.
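
As a rough illustration, the following hypothetical class (reusing the Point class from the sketch above) satisfies both conditions: the child is allocated together with the parent in the field initializer, and the final modifier guarantees the field is never overwritten.

    import java.util.ArrayList;
    import java.util.List;

    class InlinablePolygon {
        // Child allocated together with the parent and stored exactly once.
        private final List<Point> points = new ArrayList<>();

        // A setter that reassigned 'points' to a new list would violate the
        // second condition and disqualify the field from inlining; 'final'
        // rules such a setter out at compile time.
        void addPoint(Point p) { points.add(p); }
    }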


Friday, August 21, 2020

Performance improvements using object and array inlining

Both objects and arrays can be inlined. Object inlining has been recognized for a while, and array inlining extends the concept to arrays. When a group of objects and arrays refer to each other, the execution environment has to issue a memory load for every reference it follows. This is wasteful if the objects and arrays can instead be collocated, placed consecutively in memory. The memory loads can then be replaced by address arithmetic, which reduces the cost of field and array accesses.
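
A hand-flattened analogue makes the saving concrete. The hypothetical class below stores coordinates in one flat array so that reading a point needs a single indexed access plus offset arithmetic rather than a chain of reference loads; automatic inlining achieves a similar effect without any change to the source.

    import java.util.Arrays;

    class FlatPolygon {
        private int size;
        private int[] coords = new int[16]; // x at 2*i, y at 2*i+1

        void add(int x, int y) {
            if (2 * size == coords.length) {
                coords = Arrays.copyOf(coords, coords.length * 2);
            }
            coords[2 * size] = x;
            coords[2 * size + 1] = y;
            size++;
        }

        int getX(int i) { return coords[2 * i]; }     // base + offset arithmetic
        int getY(int i) { return coords[2 * i + 1]; } // no intermediate object loads
    }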

The benefit with arrays is that the size of the data structure and the number of elements are known once the array is allocated, even though they vary from array to array. The fields referring to those arrays have to be updated whenever an array gets relocated. All of this can happen under the hood, without the developer having to add any syntax to the language.

A code pattern that detects these changes and allows optimized access to such array fields is therefore a significant innovation for boosting program performance, and some have already been published. Researchers note that the inlining of array element objects into an array is not possible without a global data flow analysis, but the detection pattern can be integrated into the array bounds check during compilation, so a dynamic approach to array inlining adds little or no analysis overhead.


Thursday, August 20, 2020

JDK 11 migration

The deployment stack pertaining to applets, which was marked deprecated in Java 9, has been removed from the JDK. This eliminates the need to support browsers for this stack. JavaFX and Java Mission Control are available separately.

The arrays implementation in Java 11 gained performance from AArch64 intrinsics. This is not so much about offloading to hardware as about inlining instructions so that overall call overhead is reduced. Even mathematical functions such as sin, cos and log are significantly faster with JDK 11 than with JDK 8.
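
A crude way to see the difference is to run the same measurement class under both JDKs; a proper JMH benchmark would be more rigorous, but the sketch below illustrates the idea.

    public class MathBench {
        public static void main(String[] args) {
            double sink = 0; // accumulate so the JIT cannot eliminate the loop
            long start = System.nanoTime();
            for (int i = 1; i <= 50_000_000; i++) {
                sink += Math.sin(i) + Math.cos(i) + Math.log(i);
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("elapsedMs=" + elapsedMs + " sink=" + sink);
        }
    }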

Java 11 provides ‘nests’, an access-control context that works with the nested types of the Java programming language. It allows code that is compiled into different classes to access each other’s private members without requiring bridge methods. Java developers have often used nested classes to group classes in the same file on the assumption that they are part of the same “entity”, and this technique lets those classes avoid access methods for each other.

Before Java 11, the compiler generated these bridge methods automatically. They appear in the outer class as well as the inner class, with names like access$000(). The public methods on the inner class go through such a method to call into the outer class: an invocation of a private member is compiled into instructions invoking the bridge method in the target class, which in turn invokes the intended private method. Compiler-generated bridge methods are not visible to reflection.
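
A small sketch of the nest relationship, using the reflection methods Java 11 adds to java.lang.Class:

    public class Outer {
        private int secret = 42;

        class Inner {
            // Before Java 11 this access compiled to a synthetic access$000()
            // bridge in Outer; with nests the JVM permits it directly.
            int readSecret() { return secret; }
        }

        public static void main(String[] args) {
            System.out.println(Outer.Inner.class.getNestHost());      // class Outer
            System.out.println(new Outer().new Inner().readSecret()); // 42
        }
    }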

The compatibility between JDK 11 and JDK 8 has multiple levels. Typically, only source compatibility is reviewed, but code that compiles might still not be behaviorally equivalent. At the next level, code that is behaviorally equivalent might still not be binary compatible. Once there is binary compatibility, the migration can be called equivalent, yet even that does not guarantee runtime compatibility.

Behavioral compatibility is defined as compatibility that preserves the semantics of the code, whereas source-level compatibility is merely about translating Java code to class files. Binary compatibility is defined in terms of linkage: if one module continues to link with another module after that module has changed, the change is binary compatible. Runtime compatibility is when all the modules load and behave the same way.
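
An illustration of the source/binary distinction, with a hypothetical library class and a client module compiled against it:

    import java.util.ArrayList;
    import java.util.List;

    public class Greeter {
        // Suppose the original signature was:
        //   public ArrayList<String> greetings()
        // Widening it to the version below is source compatible (clients
        // still compile unchanged) but not binary compatible: a client
        // compiled against the old method descriptor fails to link with
        // NoSuchMethodError until it is recompiled.
        public List<String> greetings() {
            return new ArrayList<>(List.of("hello"));
        }
    }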


Wednesday, August 19, 2020

Support for small to large footprint introspection database and query

 Distributed Collection agents

As with any store, not just an introspection store, data can come to the destination from many different sources. Collection agents for each type of source make it convenient to transfer the data to the store.

The collection agents do not themselves need to be monitored. The data they send can be lossy, but it should arrive at the store. The store is considered a singleton local instance; it may not even be on the same system as the one it serves. The rest of the store may be global and shared, but the transfer from a collection agent does not have to go directly to the global shared storage. If it helps for the introspection store to serve as the same local destination for all collection agents, it can be kept outside the global storage. In either case the streams are managed centrally by the stream store, and the storage refers to tier-2 persistence.

Distributed Destinations:

Depending on the mode of deployment, the collection agents can be either lean or bulky. In the latter case, they come configured with their own storage so that all the events are batched within the resource restrictions of the site where the agent is deployed. Those batched events can then be pushed periodically to the introspection store, which is useful when certain components of the system do not share the cluster or host on which the stream store is deployed. The collection agents are usually placed as close to the data source as possible, so designing them to keep going whether or not the rest of the system is reachable is prudent, given that certain sites might even be dark. Under such circumstances, the ability to propagate remotely collected events, as sketched below, is very helpful for administrators to use as and when they like.
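
A minimal sketch of such a bulky agent, with hypothetical names throughout: events are batched locally, and the push is allowed to fail without blocking collection, since the data may be lossy.

    import java.util.ArrayList;
    import java.util.List;

    public class CollectionAgent {
        private static final int BATCH_SIZE = 100;
        private final List<String> batch = new ArrayList<>();

        public synchronized void record(String event) {
            batch.add(event);
            if (batch.size() >= BATCH_SIZE) {
                flush();
            }
        }

        private void flush() {
            List<String> toSend = new ArrayList<>(batch);
            batch.clear();
            try {
                pushToIntrospectionStore(toSend);
            } catch (Exception e) {
                // Lossy by design: the agent keeps collecting even when
                // the introspection store is unreachable.
                System.err.println("push failed, dropped " + toSend.size() + " events");
            }
        }

        // Hypothetical transport to the store (HTTP, stream append, etc.).
        private void pushToIntrospectionStore(List<String> events) throws Exception {
            /* ... */
        }
    }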

Encryption

The operational data from any system are things like logs and metrics. They don’t necessarily contain anything sensitive, and as long as the producers are not publishing sensitive data, very little effort is required to secure the introspection store. But it is also important to ensure that the introspection data remains secure when it is exported from the system. Key-certificate pairs can help encrypt the data so that it is secured at rest and in transit when accessed offline from the system. In all the examples above, the introspection store has been local to the system; if it were distributed and the data propagated across the network, it would be helpful to secure the transit. Exports, too, can be encrypted, and they can be incremental or full depending on the rsync algorithm setting used to publish changes to the store. The export occurs only while the system is online, but exporting frequently whenever possible helps consolidate data in a dedicated instance, so that the introspection stores from all instances can be collected as if they were arriving at a network operations center.
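
A minimal sketch of encrypting an exported file at rest with the standard javax.crypto API, assuming a symmetric key and a hypothetical export.json file; in practice the key itself would be wrapped with a certificate’s public key or managed by a key service.

    import javax.crypto.Cipher;
    import javax.crypto.KeyGenerator;
    import javax.crypto.SecretKey;
    import javax.crypto.spec.GCMParameterSpec;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.security.SecureRandom;

    public class ExportEncryptor {
        public static void main(String[] args) throws Exception {
            SecretKey key = KeyGenerator.getInstance("AES").generateKey();
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);

            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));

            byte[] plain = Files.readAllBytes(Paths.get("export.json"));
            // The IV would be stored alongside the ciphertext for decryption.
            Files.write(Paths.get("export.json.enc"), cipher.doFinal(plain));
        }
    }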


Tuesday, August 18, 2020

Logging configuration with Dropwizard

Dropwizard is a library that web applications compile with as a dependency. It uses the Jetty HTTP library to embed an HTTP server into the application, Jersey for the REST interface that clients use to talk to the web server, and Jackson for the JSON format.

It comes with a variety of configuration options that can be used to set up the properties the server uses to handle requests and responses, logging, and so on. The logging configuration is especially important, since application developers find that this library logs more than they may have intended. Request logging is one such aspect: because requests may be issued frequently within the application, the logs can grow in size by orders of magnitude.

The configurations are described well in the library's online documentation, but the application's dependencies and its behavior under a given configuration cannot really be covered there. A developer has to resort to trial and error to get it to work.

Among the configurations for logging, two stand out for controlling verbosity, especially when the number of entries cannot be controlled by the application: the logging level and the logging filters. The level of the entries follows the same graded convention as in any other framework. The filters, however, are typically implemented by the application and registered as classes, so that the library logs an entry if it meets a criterion and discards it otherwise.
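
A minimal sketch of such an application-implemented filter, assuming a Dropwizard 1.x-style FilterFactory for the request log and logback-access on the classpath; the type name is hypothetical, and the factory must also be registered for discovery (for example via a META-INF/services entry) before the YAML configuration can reference it.

    import ch.qos.logback.access.spi.IAccessEvent;
    import ch.qos.logback.core.filter.Filter;
    import ch.qos.logback.core.spi.FilterReply;
    import com.fasterxml.jackson.annotation.JsonTypeName;
    import io.dropwizard.logging.filter.FilterFactory;

    @JsonTypeName("healthcheck-exclude") // hypothetical type referenced from config
    public class HealthcheckFilterFactory implements FilterFactory<IAccessEvent> {
        @Override
        public Filter<IAccessEvent> build() {
            return new Filter<IAccessEvent>() {
                @Override
                public FilterReply decide(IAccessEvent event) {
                    // Discard noisy internal requests; keep everything else.
                    if (event.getRequestURI().startsWith("/healthcheck")) {
                        return FilterReply.DENY;
                    }
                    return FilterReply.NEUTRAL;
                }
            };
        }
    }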

Some filters are provided out of the box. These include a filter type called URI, which helps with pattern matching on the requests clients make against the server’s interface. All HTTP methods are logged under the request-logging scheme and are sometimes the biggest culprit in log bloat. The URI type used as the logging filter does not require any classes to be implemented by the Java application.

If this were as easy to follow for each and every application, the description of the caveats in this document would not be necessary. The first gotcha among them is that this particular configuration is not available until later versions of the library. Applications may have to upgrade transitive dependencies, or force their versions, even when only one library version is updated. Each such version mismatch, and the consequent trial and error, may surface a new exception when running the application, requiring similar updates to other libraries or the inclusion of new ones.

Even when all the dependencies have been updated to their most recent versions, the library may still fail to recognize certain configurations, including the URI logging filter. This is due to a discrepancy between the label used to describe the logging filter and the implementations available on the classpath. Applications that use build directives to compile and distribute their artifacts may already be familiar with stale jars lingering between builds, particularly when a sub-project is built on its own rather than from the project root folder. Configuration types and their possible values are also not described completely, requiring developers to go through the documentation, GitHub issues, and fixes in order to find the resolutions others have arrived at.

These are some of the steps an application developer can take to reduce the size of the logs.


Monday, August 17, 2020

Programmability using certificates on Kubernetes

 Automations involving certificates on Kubernetes:

The following article describes some of the libraries available to automate workflows involving X.509 certificates on Kubernetes. The certificates are typically used to configure endpoints with Transport Layer Security (TLS), which turns an otherwise http endpoint into an https one. Configuring the TLS parameters for a service involves a key and a certificate, and the generation of certificates, including self-signed ones, has been tricky even for the common use case.

Problems encountered in securing TLS include the following:

1) the format required by a Java application might be keystore/truststore based, while the key and certificate come in the standard formats.

2) the certificates themselves may be self-signed, and accepting the certificate on the client side is a prerequisite.

3) the generation of a certificate involves a certificate signing request, and Kubernetes has core support for creating the request but not the certificate.

4) Libraries such as cert-manager that provide custom resource definitions to make certificates easier to generate require a large number of native Kubernetes resources to be registered, and this is not the straightforward single step in automation that it is on the command line. A list of these resources can be found under https://github.com/jetstack/cert-manager/releases/download/v0.16.1/cert-manager.yaml

5) Each Kubernetes resource, such as a service account, role, custom resource, or secret, requires a corresponding plain old Java object to be declared for use with the automation. This makes the code rather cumbersome to port.

6) The certificates generated often do not work because they have attributes that cannot be used in all cases. For example, the common name and the subject alternative names are often a source of exceptions in setting up TLS.

7) the key-certificate pair has to be generated such that the domain names of the host on the network match what is specified in the certificate.

8) The key-certificate pair has to be guarded as a Kubernetes secret, and a secret cannot be shared across Kubernetes namespaces. This calls for the creation of a key-certificate pair in each namespace used by an application.

9) The use of wildcard characters to support multiple clients or hosts is not always permitted and requires special-case handling.

10) the export of the Kubernetes resources used to create a certificate, and their import as plain old Java objects, is facilitated by the combination of the Kubernetes openapi library and its corresponding gen library, but applications typically prefer such steps to be one-time and manual since the resulting certificate is more portable.

11) Kubernetes makes it easy to change the parameters of a certificate, but doing so results in a new signing request and a corresponding redeployment, which is easy to do from the command line and harder without the convenience of cert-manager.

Applications that are restricted from redistributing cert-manager are left to use ready-made certificates. Fortunately, it is easy to pass certificates around within a namespace, across pods and services, as a Kubernetes secret, as sketched below.
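
A hedged sketch of that last step with the official Java client (io.kubernetes:client-java), storing a ready-made key and certificate as a kubernetes.io/tls secret; the file names and the secret name are hypothetical, and the createNamespacedSecret signature varies across client versions.

    import io.kubernetes.client.openapi.ApiClient;
    import io.kubernetes.client.openapi.apis.CoreV1Api;
    import io.kubernetes.client.openapi.models.V1ObjectMeta;
    import io.kubernetes.client.openapi.models.V1Secret;
    import io.kubernetes.client.util.Config;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class TlsSecretInstaller {
        public static void main(String[] args) throws Exception {
            ApiClient client = Config.defaultClient();
            CoreV1Api api = new CoreV1Api(client);

            V1Secret secret = new V1Secret()
                    .metadata(new V1ObjectMeta().name("demo-tls").namespace("default"))
                    .type("kubernetes.io/tls")
                    .putDataItem("tls.crt", Files.readAllBytes(Paths.get("tls.crt")))
                    .putDataItem("tls.key", Files.readAllBytes(Paths.get("tls.key")));

            // Secrets are namespace-scoped, so this step repeats per namespace.
            api.createNamespacedSecret("default", secret, null, null, null);
        }
    }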



Sunday, August 16, 2020

Migration to Java 11

The following section discusses a few APIs from JDK 11, their benefits, and the preparatory work.

The Arrays API keeps all the methods for manipulating arrays, including sorting and searching, that were available earlier. The sorting algorithm used is the Dual-Pivot Quicksort by Vladimir Yaroslavskiy, Jon Bentley and Joshua Bloch. This algorithm has the same O(n log n) complexity as the rest of the quicksort family but is faster than the traditional variants for a broader range of workloads.

The traditional algorithm uses a single pivot: all elements less than the pivot come before it, all elements greater come after it, and the two sub-arrays are then sorted recursively.

With the dual-pivot variation, there are two pivots, chosen as, say, the first and the last element. The pivots have to be in sorted order, otherwise they are swapped. The range between the pivots is broken down into non-overlapping sub-ranges delimited by indices L, K and G, which move progressively between the far left and the far right:

The sub-range from left+1 to L-1 holds elements less than P1

The sub-range from L to K-1 holds elements >= P1 and <= P2

The sub-range from K to G holds the remaining, not yet examined, elements

The sub-range from G+1 to right-1 holds elements greater than P2

In this way the third sub-range above is shrunk: each of its elements is compared with the two pivots and placed into one of the other three sub-ranges, and L, K and G are advanced as it shrinks.

The first pivot element is then swapped with the last element of the first sub-range above.

The second pivot element is then swapped with the first element of the last sub-range above.

The steps are repeated recursively for each sub-range.
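
A compact sketch of this partitioning scheme follows; the JDK's actual implementation (java.util.DualPivotQuicksort) is considerably more tuned, with pivot selection, insertion sort below a threshold, and special handling for runs and duplicates.

    import java.util.Arrays;

    public class DualPivotQuickSort {

        static void sort(int[] a, int left, int right) {
            if (left >= right) return;

            // Choose the first and last elements as pivots, sorted so p1 <= p2.
            if (a[left] > a[right]) swap(a, left, right);
            int p1 = a[left], p2 = a[right];

            // l, k, g delimit the sub-ranges described above:
            //   [left+1, l-1]  elements < p1
            //   [l, k-1]       elements >= p1 and <= p2
            //   [k, g]         not yet examined
            //   [g+1, right-1] elements > p2
            int l = left + 1, k = left + 1, g = right - 1;
            while (k <= g) {
                if (a[k] < p1) {
                    swap(a, k, l++);
                } else if (a[k] > p2) {
                    while (a[g] > p2 && k < g) g--;
                    swap(a, k, g--);
                    if (a[k] < p1) swap(a, k, l++);
                }
                k++;
            }
            // Move the pivots into their final positions.
            swap(a, left, --l);
            swap(a, right, ++g);

            // Recurse on the three partitions.
            sort(a, left, l - 1);
            sort(a, l + 1, g - 1);
            sort(a, g + 1, right);
        }

        static void swap(int[] a, int i, int j) {
            int t = a[i]; a[i] = a[j]; a[j] = t;
        }

        public static void main(String[] args) {
            int[] data = {24, 8, 42, 75, 29, 77, 38, 57};
            sort(data, 0, data.length - 1);
            System.out.println(Arrays.toString(data)); // [8, 24, 29, 38, 42, 57, 75, 77]
        }
    }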

The key thing to note here is that for relatively large arrays the complexity remains much the same; everyday programmers, however, generally sort small arrays. The authors took the approach that below a threshold array length of 27, insertion sort is preferable to all other sorting methods. In JDK 8, this threshold was set to 47.

Arrays are thus an improvement, but the methods are similar to what was available in Java 8.

On the other hand, there are new methods available as well, for example on the String class. The isBlank() method tests whether a string is empty or contains only whitespace. The lines() method splits a string on line terminators. The repeat() and strip() methods are similar to their counterparts in other languages.

These methods add more than just convenience, because the library performs better than in the previous version.

The asPredicate() method on a compiled Pattern creates a Predicate that tests whether the pattern is found in an input string.
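
A short sketch of the new String methods and asPredicate together; the values are illustrative:

    import java.util.List;
    import java.util.function.Predicate;
    import java.util.regex.Pattern;
    import java.util.stream.Collectors;

    public class Java11Strings {
        public static void main(String[] args) {
            System.out.println("  ".isBlank());            // true: only whitespace
            System.out.println("a\nb\nc".lines().count()); // 3
            System.out.println("ab".repeat(3));            // ababab
            System.out.println(" hi ".strip());            // "hi", Unicode-aware trim

            // Pattern.asPredicate turns a regex into a Predicate<String>.
            Predicate<String> isNumber = Pattern.compile("^\\d+$").asPredicate();
            List<String> numbers = List.of("42", "x", "7").stream()
                    .filter(isNumber)
                    .collect(Collectors.toList());
            System.out.println(numbers); // [42, 7]
        }
    }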

The Java EE and CORBA modules were deprecated in Java 9 and have been removed in Java 11. The Thread methods destroy() and stop(Throwable) have also been removed in Java 11.

Java 10 introduced the local-variable ‘var’ syntax, and Java 11 allows ‘var’ to be used for lambda parameters.
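
A tiny example; ‘var’ on lambda parameters is mainly useful for attaching annotations or modifiers to them:

    import java.util.function.BiFunction;

    public class VarLambda {
        public static void main(String[] args) {
            BiFunction<Integer, Integer, Integer> add = (var a, var b) -> a + b;
            System.out.println(add.apply(2, 3)); // 5
        }
    }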
