Sunday, April 19, 2020

Java versus Kotlin continued...

The Kotlin language has plenty of new syntax that parallels other newer development languages. For example, we can use the var and val keywords, where var declares a mutable property and val a read-only one. Getters and setters are provided by default.
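A minimal sketch (the Point class here is hypothetical, for illustration only):

class Point(val x: Int, var y: Int)

fun main() {
    val p = Point(1, 2)   // p itself cannot be reassigned
    p.y = 5               // a var property is mutable
    println(p.x)          // default getter comes for free
}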

Kotlin classes have primary and secondary constructors. The primary constructor does not contain any code, though it may carry annotations and visibility modifiers; initialization code goes inside initializer blocks. Secondary constructors have to delegate to the primary constructor, directly or indirectly, so the initializer blocks get executed before the body of the secondary constructor.
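A short sketch of a primary constructor, an initializer block and a delegating secondary constructor (the Person class is hypothetical):

class Person(val name: String) {              // primary constructor: no code of its own
    var nickname: String = ""

    init {                                    // initializer block runs during primary construction
        println("Created $name")
    }

    // the secondary constructor delegates to the primary, so the init block runs first
    constructor(name: String, nickname: String) : this(name) {
        this.nickname = nickname
    }
}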

Kotlin allows implementations to be delegated via the delegation pattern, which replaces implementation inheritance with zero boilerplate code. A derived class can implement an interface by delegating all of its public members to a specified object; members overridden in the derived class still take precedence over the delegate.
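A sketch of first-class delegation with the by keyword (the interface and classes are hypothetical):

interface Printer {
    fun print(msg: String)
}

class ConsolePrinter : Printer {
    override fun print(msg: String) = println(msg)
}

// Derived implements Printer by forwarding all members to p: zero boilerplate
class Derived(p: Printer) : Printer by p

fun main() {
    Derived(ConsolePrinter()).print("delegated")
}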

Type inference for variables and property types is automatic, which makes variables very easy to declare and use without spelling out their types.
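For instance, a minimal sketch of inferred types:

val count = 42                          // inferred Int
val greeting = "hello"                  // inferred String
val sum = { a: Int, b: Int -> a + b }   // inferred (Int, Int) -> Int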

A slight modification of a class does not require a new subclass. Instead, we can use object expressions and object declarations. An object expression creates an instance of an anonymous class, usually derived from some type or types, overriding the methods of that type or types. Object declarations are used for singletons, where the declaration is much simpler than in other languages: the object keyword is followed by the name and the implementation.
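A sketch of both forms (Registry is a hypothetical singleton):

// Object expression: an instance of an anonymous class derived from a type
val task = object : Runnable {
    override fun run() = println("running")
}

// Object declaration: a singleton in a single step
object Registry {
    private val entries = mutableListOf<String>()
    fun register(name: String) = entries.add(name)
}

fun main() {
    task.run()
    Registry.register("first")
}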

Saturday, April 18, 2020

Java versus Kotlin:
Kotlin brings a ton of new features over Java such as Lambda expressions, extension functions, smart casts, String templates, primary constructors, first-class delegation, type inference, singletons, range expressions, operator overloading, companion objects and coroutines.
Lambda expressions are function literals. Kotlin functions are first class, which allows them to be passed as parameters. A function that receives such parameters is a higher-order function. A function type can be instantiated with a function literal, either a lambda expression or an anonymous function (which has no name), or with a callable reference.
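A short sketch of a higher-order function taking a lambda and a callable reference (applyTwice and square are hypothetical names):

fun applyTwice(x: Int, f: (Int) -> Int): Int = f(f(x))   // higher-order function

fun square(n: Int) = n * n

fun main() {
    println(applyTwice(3) { it + 1 })    // lambda expression: trailing-lambda syntax
    println(applyTwice(3, ::square))     // callable reference instantiates the function type
}
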
The compiler can infer the function types for variables. A function type can be invoked with the invoke operator, or simply with call syntax. Inline functions copy the function body to the call site, removing the overhead of the call.
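A sketch of the invoke operator and an inline function (measured is a hypothetical helper):

val double: (Int) -> Int = { it * 2 }

inline fun measured(block: () -> Unit) {   // inline: the body is copied to the call site
    val start = System.nanoTime()
    block()
    println("took ${System.nanoTime() - start} ns")
}

fun main() {
    println(double.invoke(4))   // explicit invoke
    println(double(4))          // same call via the invoke convention
    measured { println("work") }
}
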
Together, lambda expressions and inline functions provide highly performant control structures. Next, even a class can be extended without inheriting from it or using a decorator. This is done via extensions. Extension functions are easy to spot: the receiver type is prefixed to the function name, and inside the body 'this' refers to the receiver. They are dispatched statically.
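For example, a minimal sketch of an extension function (shout is hypothetical):

// Extends String without inheriting from it or using a decorator;
// inside the body, 'this' is the receiver. Dispatch is static.
fun String.shout(): String = this.uppercase() + "!"

fun main() {
    println("hello".shout())   // prints HELLO!
}
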
Kotlin also provides the 'is' and 'as' operators for type checking and casts. The former checks whether an object conforms to a given type; after a successful check, the compiler tracks it and smart-casts the value, so an explicit cast is often unnecessary, making the casts a whole lot smarter. The 'as' operator performs an unsafe cast that throws on failure, while its safe variant 'as?' returns null instead.
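A sketch of a smart cast and a safe cast:

fun describe(x: Any): String {
    if (x is String) {
        return "String of length ${x.length}"   // smart cast: x is already a String here
    }
    val n = x as? Int                           // safe cast: yields null instead of throwing
    return if (n != null) "Int $n" else "something else"
}

fun main() {
    println(describe("abc"))
    println(describe(42))
}
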
Type safety for generics is enforced at compile time in Kotlin, while at runtime instances of generic types hold no information about their actual type arguments (type erasure). Consequently, the compiler prohibits type conformance checks where the type arguments have been erased.
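One notable exception, worth a sketch: a reified type parameter on an inline function survives at the call site, so the check becomes legal:

// With reified, T is kept through inlining, so an is-check against it is allowed
inline fun <reified T> isInstance(value: Any): Boolean = value is T

fun main() {
    println(isInstance<String>("abc"))      // true
    // println(listOf("a") is List<String>) // rejected: cannot check an erased type
}
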
String templates are another useful Kotlin feature. A string literal may contain template expressions, pieces of code beginning with a dollar sign that are evaluated and concatenated into the string.
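A minimal sketch:

fun main() {
    val price = 9.99
    println("Total: $price")               // simple $name template
    println("With tax: ${price * 1.1}")    // arbitrary expression inside ${...}
}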

Friday, April 17, 2020

Java versus Kotlin:
Both Kotlin and Java are statically typed languages. Kotlin is newer, with an official release in 2016 as opposed to Java's official release in 1995. Both run on the JVM, and Kotlin can also be compiled to JavaScript. Kotlin requires a plugin and can work with the existing Java stack.
Kotlin offers a number of advantages over Java. It is definitely terser and more readable. It overcomes Java's problems with null references by controlling nullability through the type system. NullPointerExceptions can be eliminated for the most part, with some exceptions such as overt calls (for example the !! operator) and data inconsistency during initialization. Kotlin provides a safe call operator, denoted by '?.', that accesses a member of an instance only when the instance is not null.
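A minimal sketch of the safe call:

fun lengthOf(s: String?): Int? = s?.length   // safe call: evaluates to null when s is null

fun main() {
    println(lengthOf("kotlin"))   // 6
    println(lengthOf(null))       // null, not a NullPointerException
    // val s: String = null       // rejected by the type system at compile time
}
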
Kotlin is designed with Java interoperability in mind and enables smooth calls to all methods and properties by following conventions that cut down code. Since Java objects can be null, all objects originating from Java are treated as platform types, and their safety guarantees are the same as in Java. Annotations help with providing nullability information for type parameters.
Kotlin arrays are invariant, which prevents assigning an array of one type to a variable of a projected supertype. Primitive-type arrays are maintained without boxing overhead.
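A minimal sketch:

fun main() {
    val ints: Array<Int> = arrayOf(1, 2, 3)
    // val anys: Array<Any> = ints                   // rejected: Array<Int> is not Array<Any>
    val primitives: IntArray = intArrayOf(1, 2, 3)   // stored unboxed
    println(primitives.sum())
}
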
It uses a family of function types with a special notation corresponding to the signatures of the functions, that is, their parameters and return values, such as (A, B) -> C. This notation also supports a receiver type, where the function is invoked on a receiver object, and suspending functions. Kotlin supports Single Abstract Method (SAM) conversions for interfaces with a single abstract method: a Kotlin function literal can be automatically converted into an implementation of a Java interface with a single non-default method. This can be used to create instances of SAM interfaces.
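A short sketch of the notation, a receiver type, and a SAM conversion:

val add: (Int, Int) -> Int = { a, b -> a + b }      // (A, B) -> C notation
val describe: Int.() -> String = { "value $this" }  // receiver type: called on an Int

fun main() {
    println(add(2, 3))
    println(42.describe())
    val r: Runnable = Runnable { println("SAM-converted") }  // single abstract method
    r.run()
}
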
Kotlin does not support checked exceptions. Many believe that checked exceptions lead to decreased productivity with no significant improvement to code quality. In fact, some call it an outright mistake.
The above comparison makes it equally easy to enumerate what Java has that Kotlin does not. These include checked exceptions, primitive types that are not classes, static members, non-private fields, wildcard types and ternary operator.
Kotlin brings a ton of new features over Java such as Lambda expressions, extension functions, smart casts, String templates, primary constructors, first-class delegation, type inference, singletons, range expressions, operator overloading, companion objects and coroutines.
A sample implementation is here: https://1drv.ms/w/s!Ashlm-Nw-wnWrwRgdOFj3KLA0XSi

Thursday, April 16, 2020

Continuing from the post below on writing a unified log4j appender to blobs, files and streams: an appender serializes events once and sends them to different destinations simultaneously. This post covers the logback flavor of that implementation.
The implementation of the custom appender involves extending the well-known logback AppenderBase class and overriding the start, stop and append methods, where append takes the data to be written to the target. (AppenderBase itself supplies doAppend, which guards each call before delegating to append.) The start and stop methods initialize the writer to the stream store and perform proper cleanup when the appender is unloaded. In terms of a data structure, this is the equivalent of the initialization of a data structure and the method to add entries to the data structure; the latter is the actual logic of handling an event. In the log4j2 flavor described in the post below, the appender is annotated as a Plugin and the registration method is annotated with the PluginFactory annotation.
The appender is usually a runtime dependency or, if necessary, a compile-time dependency, such as when certain properties can be set on the appender only via code rather than declaration. The bean for the appender describes all the properties required for the start() method to succeed; for example, it defines the stream name, the scope name and the controller URI. These parameters alone are sufficient to instantiate the appender.
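As a sketch in Kotlin, assuming the StreamAppender class from the sample in the entry that follows, with hypothetical values:

import java.net.URI

fun main() {
    // Hypothetical values; in the sample below, start() also reads these same
    // properties from environment variables.
    val appender = StreamAppender("logScope", "logStream", URI.create("tcp://localhost:9090"))
    appender.start()   // initializes the writer to the stream store
    appender.stop()    // closes the writer, client factory and stream manager
}
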
Finally, the append method of the appender invokes the write event on the writer, which makes a write call to the stream store. Sophisticated implementations can add log filters and caches to this appender and can allow asynchronous processing of the entries. The base class for the appender implementation was chosen to be AppenderBase, but it can extend other derived logback appender classes as appropriate, including the ones that help with asynchronous processing.

Tuesday, April 14, 2020

This is a code sample for the suggestion made in the article about writing log4j appenders for stream store:
// Imports assume the logback and Pravega client libraries on the classpath.
import ch.qos.logback.classic.Level;
import ch.qos.logback.classic.spi.ILoggingEvent;
import ch.qos.logback.classic.spi.IThrowableProxy;
import ch.qos.logback.core.AppenderBase;
import ch.qos.logback.core.LogbackException;
import io.pravega.client.ClientConfig;
import io.pravega.client.EventStreamClientFactory;
import io.pravega.client.admin.StreamManager;
import io.pravega.client.stream.EventStreamWriter;
import io.pravega.client.stream.EventWriterConfig;
import io.pravega.client.stream.ScalingPolicy;
import io.pravega.client.stream.StreamConfiguration;
import io.pravega.client.stream.impl.JavaSerializer;

import java.net.URI;
import java.util.Optional;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutionException;

public class StreamAppender extends AppenderBase<ILoggingEvent> {
    private static final String COMPLETION_EXCEPTION_NAME = CompletionException.class.getSimpleName();
    private static final String EXECUTION_EXCEPTION_NAME = ExecutionException.class.getSimpleName();
    private final static String CONTROLLER_URI = "CONTROLLER_URI";
    private final static String SCOPE_NAME = "SCOPE_NAME";
    private final static String STREAM_NAME = "STREAM_NAME";

    public String scope;
    public String streamName;
    public URI controllerURI;
    private StreamManager streamManager;
    private EventStreamClientFactory clientFactory;
    private EventStreamWriter<String> writer;

    public StreamAppender(String scope, String streamName, URI controllerURI) {
        this.scope = scope;
        this.streamName = streamName;
        this.controllerURI = controllerURI;
    }

    @Override
    public void start() {
        // Read the required properties from environment variables, then initialize the writer.
        final String scope = getEnv(SCOPE_NAME);
        final String streamName = getEnv(STREAM_NAME);
        final String uriString = getEnv(CONTROLLER_URI);
        final URI controllerURI = URI.create(uriString);

        this.scope = scope;
        this.streamName = streamName;
        this.controllerURI = controllerURI;
        init();
        super.start();
    }

    @Override
    public void stop() {
        // Release resources in reverse order of creation.
        if (writer != null) writer.close();
        if (clientFactory != null) clientFactory.close();
        if (streamManager != null) streamManager.close();
        super.stop();
    }

    private static String getEnv(String variable) {
        Optional<String> value = Optional.ofNullable(System.getenv(variable));
        return value.orElseThrow( () -> new IllegalStateException(String.format("Missing env variable %s", variable)));
    }

    private void init() {
        // Assign to the field (not a local) so that stop() can close it later.
        streamManager = StreamManager.create(controllerURI);
        // Ensure the scope and stream exist before creating the writer.
        streamManager.createScope(scope);

        StreamConfiguration streamConfig = StreamConfiguration.builder()
            .scalingPolicy(ScalingPolicy.fixed(1))
            .build();
        streamManager.createStream(scope, streamName, streamConfig);
        clientFactory = EventStreamClientFactory.withScope(scope, ClientConfig.builder().controllerURI(controllerURI).build());
        writer = clientFactory.createEventWriter(streamName,
                 new JavaSerializer<String>(),
                 EventWriterConfig.builder().build());
    }

    //region Appender Implementation

    @Override
    public String getName() {
        return "Stream Appender";
    }

    @Override
    public void append(ILoggingEvent event) throws LogbackException {
        // Forward only error- and warn-level entries to the stream.
        if (event.getLevel() == Level.ERROR) {
            recordEvent("error", event);
        } else if (event.getLevel() == Level.WARN) {
            recordEvent("warn", event);
        }
    }

    private void recordEvent(String level, ILoggingEvent event) {
        // Unwrap CompletionException/ExecutionException wrappers to reach the root cause.
        IThrowableProxy p = event.getThrowableProxy();
        while (shouldUnwrap(p)) {
            p = p.getCause();
        }
        // Include the unwrapped cause, if any, in the entry written to the stream.
        String message = (p != null)
                ? event.getFormattedMessage() + ": " + p.getMessage()
                : event.getFormattedMessage();
        if (writer != null) {
            writer.writeEvent(level, message);
        }
    }

    private boolean shouldUnwrap(IThrowableProxy p) {
        return p != null
                && p.getCause() != null
                && (p.getClassName().endsWith(COMPLETION_EXCEPTION_NAME) || p.getClassName().endsWith(EXECUTION_EXCEPTION_NAME));

    }

    //endregion

}

The sample above refers to opening the stream each time an event is written. This may be avoided by doing it once, on the instantiation of the bean.

Monday, April 13, 2020

Data traffic generators:
Storage products require a few tools to test how the products behave under load and duress. These tools require varying types of load to be generated for reads and writes. Standard random string generators can be used to create such data to store in files, blobs or streams, given a specific size of content to be generated. The tool has to decide what kind of payload, aka events, to generate and employ different algorithms to come up with such load.
These algorithms can be enumerated as:
1) Constant size data traffic: The reads and writes are of uniform size and they are generated in burst mode where a number of packets follow in quick succession filling the pipeline between the source and destination.
2) Hybrid size data traffic: Constant-size events are still generated, but there is more than one constant-size generator, each for a different size, and the events from the different generators are serialized to fill the data pipeline between the source and the destination. The generator sizes can be predetermined following a t-shirt size classification.
3) Constant size with latency: A delay is introduced between events so that the data does not arrive at predictable times. The delays need not all be uniform and can be of random duration. While 1) allows spatial distribution of data, 3) adds temporal distribution.
4) Hybrid size with latency: A delay is introduced between events from different generators as they fill the pipeline, leading both the size and the delay to vary randomly and simulating the real-world case for data traffic. While 2) allows spatial distribution of data, 4) adds temporal distribution.
The distribution of size or delay can use a normal distribution, which leads the middle values of the range to occur somewhat more frequently than the outliers, and a comfortable range can be picked for both the size and the delay to vary over. Each event generator implements its strategy, and the generators can be switched independently by the writer so that different loads are generated. The tool may run forever; it does not need to stop unless interrupted.
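A sketch in Kotlin of such a generator, with hypothetical names and normal distributions for both size and delay:

import java.util.Random
import kotlin.math.roundToInt

// Strategy 4, hybrid size with latency: sizes and delays are drawn from
// normal distributions so mid-range values occur more often than outliers.
class HybridLatencyGenerator(
    private val meanSize: Int = 1024,
    private val meanDelayMs: Long = 100
) {
    private val random = Random()

    private fun gaussian(mean: Double, stdDev: Double): Double =
        mean + random.nextGaussian() * stdDev

    // Spatial distribution: a random payload whose size varies around the mean.
    fun nextEvent(): String {
        val size = gaussian(meanSize.toDouble(), meanSize / 4.0)
            .roundToInt().coerceAtLeast(1)
        return String(CharArray(size) { 'a' + random.nextInt(26) })
    }

    // Temporal distribution: a random delay before the next event.
    fun nextDelayMs(): Long =
        gaussian(meanDelayMs.toDouble(), meanDelayMs / 4.0)
            .toLong().coerceAtLeast(0L)
}

fun main() {
    val generator = HybridLatencyGenerator()
    repeat(5) {   // a real tool may loop forever until interrupted
        Thread.sleep(generator.nextDelayMs())
        println("event of size ${generator.nextEvent().length}")
    }
}
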
The marketplace already has quite a few tools for this kind of load generation, referred to as packet generators (such as T-Rex for data traffic) or torture tools for driving certain file system protocols. Most of these tools, however, do not have an independent or offloaded load generator; the generator is tied to the tool and the purpose it is applied to, limiting its usage or portability to other applications.
One of the best advantages of separating event generation into its own library is that it can be used in conjunction with a log appender so that the target can vary at runtime. The target can be the console if the data merely needs to appear on the screen without any persistence, or it can be a file, blob or stream. The appender also allows events to be written simultaneously to different targets, leading to a directory of different-sized files or a bucket full of different-sized objects and so on. This allows other tools to work in tandem with the event generators as upstream and downstream systems. For example, duplicity may take the events generated as input for subsequent data transfer from a source to a destination.
Sample code for the event generator is included here: https://github.com/ravibeta/JavaSamples/tree/master/EventGenerator

Sunday, April 12, 2020

Writing a unified Log4j appender to Blobs, Files and Streams:
Events are preferred to be generated once even if they go to different destinations. This is the principle behind the appender technique used in many software development projects. This technique is popularly applied to logging, where the corresponding library is called log4j. Software components write to the log once regardless of the display or the storage of the logging entries. An appender is simply a serializer of events with the flexibility to send to different destinations simultaneously. These entries are usually sent to the console or a file or both. With the popularity of web-accessible blobs and continuously appended streams, we have new log4j destinations.
This article explains how to write an appender for blobs, files and streams at the same time:
The first step is the setup. This involves specifying a configuration each for blob, stream and file. Each configuration is an appender and a logger. The set of configurations appears as a collection. The appender describes the destination, its name, target and pattern/prefix to be used. It can also include a burst filter that limits the rate. The logger determines the verbosity and characteristics of the log generation. Specifying this configuration for each of blob, file and stream makes up the setup step.
The second step is the implementation of the custom appender, which we may need to pass to the application as a runtime dependency usually or, if necessary, as a compile-time dependency, such as when certain properties might be set on the appender only via code rather than declaration. The custom appender extends the well-known log4j2 AbstractAppender class and implements the method to register itself as an appender along with the method that takes data to be appended to the target. In terms of a data structure, this is the equivalent of the initialization of a data structure and the method to add entries to the data structure; the latter is the actual logic of handling an event. Usually this appender is annotated as a Plugin and the registration method is annotated with the PluginFactory annotation.
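A minimal sketch of such a plugin in Kotlin, with a hypothetical name UnifiedAppender and the actual blob, file and stream writers elided:

import org.apache.logging.log4j.core.Appender
import org.apache.logging.log4j.core.Core
import org.apache.logging.log4j.core.Filter
import org.apache.logging.log4j.core.Layout
import org.apache.logging.log4j.core.LogEvent
import org.apache.logging.log4j.core.appender.AbstractAppender
import org.apache.logging.log4j.core.config.Property
import org.apache.logging.log4j.core.config.plugins.Plugin
import org.apache.logging.log4j.core.config.plugins.PluginAttribute
import org.apache.logging.log4j.core.config.plugins.PluginElement
import org.apache.logging.log4j.core.config.plugins.PluginFactory
import java.io.Serializable

@Plugin(name = "UnifiedAppender", category = Core.CATEGORY_NAME,
        elementType = Appender.ELEMENT_TYPE, printObject = true)
class UnifiedAppender private constructor(
    name: String,
    filter: Filter?,
    layout: Layout<out Serializable>?
) : AbstractAppender(name, filter, layout, true, Property.EMPTY_ARRAY) {

    // The actual logic of handling an event: serialize once, then fan out
    // to each destination (console, file, blob, stream); elided here.
    override fun append(event: LogEvent) {
        val line = layout?.toSerializable(event)?.toString()
            ?: event.message.formattedMessage
        println(line)   // placeholder for the blob, file and stream writers
    }

    companion object {
        // Registration method: log4j2 discovers the plugin by its annotations.
        @JvmStatic
        @PluginFactory
        fun createAppender(
            @PluginAttribute("name") name: String,
            @PluginElement("Filter") filter: Filter?,
            @PluginElement("Layout") layout: Layout<out Serializable>?
        ): UnifiedAppender = UnifiedAppender(name, filter, layout)
    }
}
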
With these two steps, the appender is ready to be used with an application that wants to log to blob, file and stream all at the same time. This application will refer to the appender in its configuration by the same name that the plugin annotation was defined with. The logger defines the logging level to be used with this appender.
The above lists only the basic implementation. The appender can be made production ready by improving reliability with error handling. A custom error handler can be overridden in this case; it surfaces errors from the appender, such as when the appender is used with a logging level lower than the one specified in the configuration.
Finally, the appender should follow the architectural standard set by the Application Block paradigm so that the implementation never interferes with the functionality of the applications generating the events.
Events can easily be generated with: https://github.com/ravibeta/JavaSamples/tree/master/EventGenerator