Friday, August 7, 2020

Support for small to large footprint introspection database and query.

 

Introspection comprises the runtime operational states, metrics, notifications and insights saved from the ongoing activities of the system. Since the system can be deployed in different modes and with different configurations, this dedicated container and its query capability can take different shapes and forms. The smallest footprint may be little more than log files or JSON data, while the largest can even open up querying to different programming languages and tools. Existing hosting infrastructure such as the container orchestration framework and the etcd database may already provide excellent channels of publication that could do away with this store and query altogether, but they rely heavily on network connectivity; as discussed earlier, the introspection datastore does not compete with them but enhances offline production support. 

There are ways to implement introspection suitable to the size of the deployment of the overall system. Mostly these are incremental add-ons that grow with the system, except at the extremes of tiny and very large deployments. For example, the introspection store may save runtime state only periodically for small deployments but continuously for large ones. The runtime information gathered can also become more expansive as the deployment grows, and the gathering can expand to more data sources as they become available in a given deployment mode of the same size. The Kubernetes mode of deployment usually has many Deployments and StatefulSets, and information on those may be available from custom resources as queried from the kube-apiserver and the etcd database. The introspection store is a container within the storage product, so it is elastic and can accommodate the data from various deployment modes and sizes. Only for the tiniest deployments, where repurposing a container for introspection cannot be accommodated, does a change of format for the introspection store become necessary. At the other extreme, the introspection store can include not just the dedicated store container but also snapshots of other runtime data stores as available. A cloud-scale service to virtualize these hybrid data stores for introspection queries could then be provided. These illustrate a sliding scale of options available for different deployment modes and sizes.
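For the smallest footprint, the periodic capture can be as simple as a scheduled task that appends one JSON record of current state to a local file. The following is a minimal sketch under that assumption; the file path, interval and state supplier are illustrative placeholders, not part of the product:

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;
import java.util.stream.Collectors;

public class PeriodicIntrospectionWriter {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    // Appends one JSON line per interval; the supplier is whatever runtime state
    // the host component chooses to expose.
    public void start(Path target, Supplier<Map<String, String>> stateSupplier) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                String body = stateSupplier.get().entrySet().stream()
                        .map(e -> "\"" + e.getKey() + "\":\"" + e.getValue() + "\"")
                        .collect(Collectors.joining(","));
                String json = "{\"ts\":\"" + Instant.now() + "\"" + (body.isEmpty() ? "" : "," + body) + "}";
                Files.write(target, (json + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            } catch (Exception e) {
                // Introspection must never take down the host process; drop the sample on failure.
            }
        }, 0, 60, TimeUnit.SECONDS);
    }

    public void stop() {
        scheduler.shutdownNow();
    }
}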


Introspection scope versus stream.

This section focuses on dedicating different levels of the container hierarchy in the stream store for the purpose of introspection. In the earlier section, the size of the store was put on a sliding scale to suit its needs, but there was no mention of the container. In a stream store, the container is always a stream, yet the choice of using different streams for different event types remains available. If there is only one stream, all the events are of the same type and it is up to the producer and consumer to agree on that type. At the same time, the producers and consumers do not always come from the system itself, since it can include different components and each component can submit its own type of event into the stream. Separating event types by stream implies that the different event types can be treated differently for the purpose of retention and stream truncation. The consumer usually sees only a singleton introspection store, either directly or via an interface, per instance of the product.  
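As an example, a dedicated scope and stream for introspection can be provisioned with its own retention so that truncation of introspection events is independent of the data path. A minimal sketch with the Pravega client follows; the scope and stream names, controller URI and retention size are assumptions for illustration:

import io.pravega.client.admin.StreamManager;
import io.pravega.client.stream.RetentionPolicy;
import io.pravega.client.stream.ScalingPolicy;
import io.pravega.client.stream.StreamConfiguration;
import java.net.URI;

public class IntrospectionStreamSetup {
    public static void main(String[] args) {
        URI controller = URI.create("tcp://localhost:9090");
        try (StreamManager streamManager = StreamManager.create(controller)) {
            streamManager.createScope("introspection");
            // Size-bounded retention keeps the introspection footprint predictable.
            StreamConfiguration config = StreamConfiguration.builder()
                    .scalingPolicy(ScalingPolicy.fixed(1))
                    .retentionPolicy(RetentionPolicy.bySizeBytes(64 * 1024 * 1024L))
                    .build();
            streamManager.createStream("introspection", "events", config);
        }
    }
}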

Troubleshooting.
The introspection store helps with troubleshooting the product, but like any software it may malfunction. The store itself does not provide any ability to troubleshoot the information collection; that has to be discerned from the logs of the publishers. 
Introspection analytics 
A Flink job might be dedicated to performing periodic analysis of introspection data or to collecting information from sensors. The job can also consolidate data from other sources that are internal to the stream store and hidden from users. 
Batching and statistics are some of the improvements the analytics job can bring. Simple aggregate queries per time window for sum(), min() and max() can yield more meaningful events to be persisted in the stream store. 
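As a sketch of such an aggregation, a tumbling window over metric events can reduce raw readings to one maximum per metric per minute before they are written back. The in-line sample events below stand in for whatever source the job actually reads, so the metric names are assumptions:

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class IntrospectionAggregates {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements(
                        Tuple2.of("segmentstore.readLatencyMs", 12L),
                        Tuple2.of("segmentstore.readLatencyMs", 48L),
                        Tuple2.of("controller.requestCount", 7L))
                .returns(Types.TUPLE(Types.STRING, Types.LONG))
                .keyBy(metric -> metric.f0)
                .window(TumblingProcessingTimeWindows.of(Time.minutes(1)))
                .max(1)      // max() per metric name per window; sum()/min() follow the same pattern
                .print();    // a real job would write back to the introspection stream instead

        env.execute("introspection-aggregates");
    }
}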
The Flink job may have network connectivity to read events from external data stores, and these could include events published by sensors. Usually those events make their way to the stream store regardless of whether the current version of the system has an introspection store or an analytics job. In some cases, it is helpful for the analytical jobs to glance at the backlog and the rate of accumulation in the stream store as a measure of overall throughput and latency for all the components taken together. Calculating and persisting such diagnostic events is helpful for trends and later investigations. 
A Flink job dedicated to the introspection store immensely improves its querying capability. Almost all aspects of querying as outlined in Stream Processing with Apache Flink (O'Reilly Media) can be used for this purpose.




Wednesday, August 5, 2020

Stream store with TLS (continued)

A few more exceptions encountered in trying out the change above include:
1) Caused by: java.lang.IllegalStateException: Expected the service ZKGarbageCollector [FAILED] to be RUNNING, but the service has FAILED
        at com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:366)
        at com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:302)
        at io.pravega.controller.store.stream.PravegaTablesStreamMetadataStore.<init>(PravegaTablesStreamMetadataStore.java:77)
        at io.pravega.controller.store.stream.PravegaTablesStreamMetadataStore.<init>(PravegaTablesStreamMetadataStore.java:67)
        at io.pravega.controller.store.stream.StreamStoreFactory.createStore(StreamStoreFactory.java:37)
        at io.pravega.controller.server.ControllerServiceStarter.startUp(ControllerServiceStarter.java:230)
        at com.google.common.util.con
2) Caused by: java.security.cert.CertificateException: found no certificates in input stream
        at io.netty.handler.ssl.PemReader.readCertificates(PemReader.java:98)
        at io.netty.handler.ssl.PemReader.readCertificates(PemReader.java:64)
        at io.netty.handler.ssl.SslContext.toX509Certificates(SslContext.java:1071)
        at io.netty.handler.ssl.SslContextBuilder.trustManager(SslContextBuilder.java:180)
3) java.io.IOException: Invalid keystore format
        at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:658)
        at sun.security.provider.JavaKeyStore$JKS.engineLoad(JavaKeyStore.java:56)
        at sun.security.provider.KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:224)
        at sun.security.provider.JavaKeyStore$DualFormatJKS.engineLoad(JavaKeyStore.java:70)
        at java.security.KeyStore.load(KeyStore.java:1445)
        at io.pravega.segmentstore.storage.impl.bookkeeper.ZooKeeperServiceRunner.getTrustManager(ZooKeeperServiceRunner.java:220)
        at io.pravega.segmentstore.storage.impl.bookkeeper.ZooKeeperServiceRunner.waitForSSLServerUp(ZooKeeperServiceRunner.java:185)
        at io.pravega.segmentstore.storage.impl.bookkeeper.ZooKeeperServiceRunner.waitForServerUp(ZooKeeperServiceRunner.java:164)
        at io.pravega.segmentstore.storage.impl.bookkeeper.ZooKeeperServiceRunner.start(ZooKeeperServiceRunner.java:109)
        at io.pravega.local.InProcPravegaCluster.startLocalZK(InProcPravegaCluster.java:210)
        at io.pravega.local.InProcPravegaCluster.start(InProcPravegaCluster.java:182)
        at io.pravega.local.LocalPravegaEmulator.start(LocalPravegaEmulator.java:153)
        at io.pravega.local.LocalPravegaEmulator.main(LocalPravegaEmulator.java:128)
And the options tried out included:
-Djavax.net.ssl.trustStore=/etc/secret-volume/client1.truststore.jks
-Djavax.net.ssl.trustStorePassword=password
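Since the "Invalid keystore format" error above is exactly what a PEM file passed off as a JKS produces, a quick stand-alone check that the mounted file really is a JKS store the given password opens can save a redeploy. A small sketch using the same path and password as the options above:

import java.io.FileInputStream;
import java.security.KeyStore;
import java.util.Collections;

public class TrustStoreCheck {
    public static void main(String[] args) throws Exception {
        String path = "/etc/secret-volume/client1.truststore.jks";
        char[] password = "password".toCharArray();
        KeyStore trustStore = KeyStore.getInstance("JKS");
        try (FileInputStream in = new FileInputStream(path)) {
            trustStore.load(in, password);   // throws "Invalid keystore format" for non-JKS input
        }
        for (String alias : Collections.list(trustStore.aliases())) {
            System.out.println(alias + " -> " + trustStore.getCertificate(alias).getType());
        }
    }
}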
Finally, a working set of files was mounted and deployed with the operator as an option:
$ kubectl create secret generic controller-tls \
  --from-file=./controller01.pem \
  --from-file=./ca-cert \
  --from-file=./controller01.key.pem \
  --from-file=./controller01.jks \
  --from-file=./password

$ helm install pravega charts/pravega \
  --set zookeeperUri=zookeeper-client:2181 \
  --set bookkeeperUri=bookkeeper-bookie-headless:3181 \
  --set storage.longtermStorage.filesystem.pvc=pravega-tier2 \
  --set controller.security.tls.enable=true \
  --set controller.security.tls.server.certificate.location=/etc/secret-volume/controller01.pem \
  --set controller.security.tls.server.privateKey.location=/etc/secret-volume/controller01.key.pem \
  --set pravegaservice.security.tls.enable=true \
  --set pravegaservice.security.tls.server.certificate.location=/etc/secret-volume/segmentStore01.pem \
  --set pravegaservice.security.tls.server.privateKey.location=/etc/secret-volume/segmentStore01.key.pem \
  --set tls.secret.controller=controller-tls \
  --set tls.secret.segmentStore=segmentstore-tls

Tuesday, August 4, 2020

Stream Store With TLS (continued)

Sample Implementation:
+    public CompletableFuture<Object> setupTLS() {
+            V1ConfigMap configMap = Futures.getThrowingException(k8sClient.getConfigMap(PRAVEGA_CONTROLLER_CONFIG_MAP, NAMESPACE));
+            return k8sClient.deleteConfigMap(PRAVEGA_CONTROLLER_CONFIG_MAP, NAMESPACE)
+                        .thenCompose(v -> k8sClient.createConfigMap(NAMESPACE, addDefaultTlsConfiguration(configMap)));
+    }
+
+    private V1ConfigMap addDefaultTlsConfiguration(V1ConfigMap configMap) {
+            return configMap
+                        .putDataItem("TLS_ENABLED", "true")
+                        .putDataItem("TLS_KEY_FILE", "/etc/secret-volume/controller01.key.pem")
+                        .putDataItem("TLS_CERT_FILE", "/opt/pravega/conf/server-key.key")
+                        .putDataItem("TLS_TRUST_STORE", "/opt/pravega/conf/client.truststore.jks")
+                        .putDataItem("TLS_ENABLED_FOR_SEGMENT_STORE", "true")
+                        .putDataItem("REST_KEYSTORE_FILE_PATH", "/opt/pravega/conf/server.keystore.jks")
+                        .putDataItem("REST_KEYSTORE_PASSWORD_FILE_PATH", "/opt/pravega/conf/server.keystore.jks.passwd");
+
+    }


     /**
+     * Method to get V1ConfigMap
+     * @param name Name of the ConfigMap.
+     * @param namespace Namespace on which the pod(s) reside.
+     * @return Future representing the V1ConfigMap.
+     */
+    @SneakyThrows(ApiException.class)
+    public CompletableFuture<V1ConfigMap> getConfigMap(String name, String namespace) {
+        CoreV1Api api = new CoreV1Api();
+        K8AsyncCallback<V1ConfigMap> callback = new K8AsyncCallback<>("readNamespacedConfigMap");
+        api.readNamespacedConfigMapAsync(name, namespace, PRETTY_PRINT, false, false, callback);
+        return callback.getFuture();
+    }
+
+    /**
+     * Method to create V1ConfigMap
+     * @param namespace Namespace on which the pod(s) reside.
+     * @param configMap V1ConfigMap to create.
+     * @return Future representing the V1ConfigMap.
+     */
"a.txt" 143L, 7941C                                                                                                                        1,1           Top
+     * @return Future representing the V1ConfigMap.
+     */
+    @SneakyThrows(ApiException.class)
+    public CompletableFuture<V1ConfigMap> createConfigMap(String namespace, V1ConfigMap configMap) {
+        CoreV1Api api = new CoreV1Api();
+        K8AsyncCallback<V1ConfigMap> callback = new K8AsyncCallback<>("createNamespacedConfigMap");
+        api.createNamespacedConfigMapAsync(namespace, configMap, PRETTY_PRINT, null, false, callback);
+        return callback.getFuture();
+    }
+
+    /**
+     * Method to delete V1ConfigMap
+     * @param name Name of the ConfigMap.
+     * @param namespace Namespace on which the pod(s) reside.
+     * @return Future representing the V1Status of the deletion.
+     */
+    @SneakyThrows(ApiException.class)
+    public CompletableFuture<V1Status> deleteConfigMap(String name, String namespace) {
+        CoreV1Api api = new CoreV1Api();
+        K8AsyncCallback<V1Status> callback = new K8AsyncCallback<>("deleteNamespacedConfigMap");
+        api.deleteNamespacedConfigMapAsync(name, namespace, PRETTY_PRINT, null, 0, false, null, null, callback);
+        return callback.getFuture();
+    }
+
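A hypothetical call site for the helper above would await the future with the same Futures utility the change already uses and then restart the controller pods so that they re-read the updated config map:

    // Sketch of invoking the helper; setupTLS() and Futures are from the change above.
    Futures.getThrowingException(setupTLS());
    // Restart the controller pods afterwards so they pick up the new TLS values.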

Monday, August 3, 2020

Stream store with TLS

Testing pravega-tls mode:
Pravega is a stream store that comprises a control plane server and a data plane server. The former is called the controller and the latter the segmentstore. Both communicate with the client, but only the controller provisions the REST API.
Both the controller and the segmentstore can enable TLS. The Pravega source includes default certificates, keys, a keystore and a truststore in its config folder. 
TLS is enabled by including the following options in the configuration:
  controller.security.tls.enable: "true"
  controller.security.tls.trustStore.location: /opt/pravega/conf/client.truststore.jks
  controller.security.tls.server.certificate.location: /opt/pravega/conf/server-cert.crt
  controller.security.tls.server.privateKey.location: /opt/pravega/conf/server-key.key
  controller.security.tls.server.keyStore.location: /opt/pravega/conf/server.keystore.jks
  controller.security.tls.server.keyStore.pwd.location: /opt/pravega/conf/server.keystore.jks.passwd
  controller.segmentstore.connect.channel.tls: "true"
where the corresponding files are available in the pravega/pravega:latest image that is loaded on a host container.
This configuration did not work because it was not picked up by the controller server, as shown by the log included here:
2020-08-03 00:04:17,403 1512 [ControllerServiceStarter STARTING] INFO  i.p.c.s.ControllerServiceStarter - Controller serviceConfig = ControllerServiceConfigImpl(threadPoolSize=80, storeClientConfig=StoreClientConfigImpl(storeType=PravegaTable, zkClientConfig=Optional[ZKClientConfigImpl(connectionString: zookeeper-client:2181, namespace: pravega/pravega, initialSleepInterval: 5000, maxRetries: 5, sessionTimeoutMs: 10000, secureConnectionToZooKeeper: false, trustStorePath is unspecified, trustStorePasswordPath is unspecified)]), hostMonitorConfig=io.pravega.controller.store.host.impl.HostMonitorConfigImpl@3b434741, controllerClusterListenerEnabled=true, timeoutServiceConfig=TimeoutServiceConfig(maxLeaseValue=120000), tlsEnabledForSegmentStore=, eventProcessorConfig=Optional[ControllerEventProcessorConfigImpl(scopeName=_system, commitStreamName=_commitStream, commitStreamScalingPolicy=ScalingPolicy(scaleType=FIXED_NUM_SEGMENTS, targetRate=0, scaleFactor=0, minNumSegments=2), abortStreamName=_abortStream, abortStreamScalingPolicy=ScalingPolicy(scaleType=FIXED_NUM_SEGMENTS, targetRate=0, scaleFactor=0, minNumSegments=2), scaleStreamName=_requeststream, scaleStreamScalingPolicy=ScalingPolicy(scaleType=FIXED_NUM_SEGMENTS, targetRate=0, scaleFactor=0, minNumSegments=2), commitReaderGroupName=commitStreamReaders, commitReaderGroupSize=1, abortReaderGroupName=abortStreamReaders, abortReaderGroupSize=1, scaleReaderGroupName=scaleGroup, scaleReaderGroupSize=1, commitCheckpointConfig=CheckpointConfig(type=Periodic, checkpointPeriod=CheckpointConfig.CheckpointPeriod(numEvents=10, numSeconds=10)), abortCheckpointConfig=CheckpointConfig(type=Periodic, checkpointPeriod=CheckpointConfig.CheckpointPeriod(numEvents=10, numSeconds=10)), scaleCheckpointConfig=CheckpointConfig(type=None, checkpointPeriod=null), rebalanceIntervalMillis=120000)], gRPCServerConfig=Optional[GRPCServerConfigImpl(port: 9090, publishedRPCHost: null, publishedRPCPort: 9090, authorizationEnabled: false, userPasswordFile is unspecified, tokenSigningKey is specified, accessTokenTTLInSeconds: 600, tlsEnabled: false, tlsCertFile is unspecified, tlsKeyFile is unspecified, tlsTrustStore is unspecified, replyWithStackTraceOnError: false, requestTracingEnabled: true)], restServerConfig=Optional[RESTServerConfigImpl(host: 0.0.0.0, port: 10080, tlsEnabled: false, keyFilePath is unspecified, keyFilePasswordPath is unspecified)])
When the values were overridden, the following error was encountered:
java.lang.IllegalArgumentException: File does not contain valid certificates: /opt/pravega/conf/client.truststore.jks
        at io.netty.handler.ssl.SslContextBuilder.trustManager(SslContextBuilder.java:182)
        at io.pravega.client.netty.impl.ConnectionPoolImpl.getSslContext(ConnectionPoolImpl.java:280)
        at io.pravega.client.netty.impl.ConnectionPoolImpl.getChannelInitializer(ConnectionPoolImpl.java:237)
        at io.pravega.client.netty.impl.ConnectionPoolImpl.establishConnection(ConnectionPoolImpl.java:194)
        at io.pravega.client.netty.impl.ConnectionPoolImpl.getClientConnection(ConnectionPoolImpl.java:128)
        at io.pravega.client.netty.impl.ConnectionFactoryImpl.establishConnection(ConnectionFactoryImpl.java:62)
        at io.pravega.client.netty.impl.RawClient.<init>(RawClient.java:87)
        at io.pravega.controller.server.SegmentHelper.updateTableEntries(SegmentHelper.java:403)
        at io.pravega.controller.store.stream.PravegaTablesStoreHelper.lambda$addNewEntry$10(PravegaTablesStoreHelper.java:178)
        at io.pravega.controller.store.stream.PravegaTablesStoreHelper.lambda$null$54(PravegaTablesStoreHelper.java:534)
        at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
        at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
        at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.security.cert.CertificateException: found no certificates in input stream
        at io.netty.handler.ssl.PemReader.readCertificates(PemReader.java:98)
        at io.netty.handler.ssl.PemReader.readCertificates(PemReader.java:64)
        at io.netty.handler.ssl.SslContext.toX509Certificates(SslContext.java:1071)
        at io.netty.handler.ssl.SslContextBuilder.trustManager(SslContextBuilder.java:180)
        ... 19 common frames omitted

This required passing a custom certificate, key, truststore and keystore to the controller, in the form of:
        TLS_ENABLED = "true"
        TLS_KEY_FILE = "/etc/secret-volume/controller01.key.pem"
        TLS_CERT_FILE = "/etc/secret-volume/controller01.pem"
        TLS_TRUST_STORE = "/etc/secret-volume/client.truststore.jks"
        TLS_ENABLED_FOR_SEGMENT_STORE = "true"
        REST_KEYSTORE_FILE_PATH = "/etc/secret-volume/server01.keystore.jks"
        REST_KEYSTORE_PASSWORD_FILE_PATH = "/etc/secret-volume/pass-secret-tls"

These options are also hard to pass without the use of environment variables, especially when used in conjunction with other software that installs and configures the Pravega container image. One way to overcome this is to edit the ConfigMap with the overrides above and recreate the pods. 
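Once the controller comes back up with these overrides, a quick end-to-end check is a client configured against the same CA. The sketch below assumes the controller service name and reuses the ca-cert file from the mounted secret; both depend on the actual deployment:

import io.pravega.client.ClientConfig;
import io.pravega.client.admin.StreamManager;
import java.net.URI;

public class TlsSmokeTest {
    public static void main(String[] args) {
        ClientConfig clientConfig = ClientConfig.builder()
                .controllerURI(URI.create("tls://pravega-pravega-controller:9090"))
                .trustStore("/etc/secret-volume/ca-cert")   // CA certificate from the mounted secret
                .validateHostName(false)                    // relaxed here because the sample certs lack SANs
                .build();
        try (StreamManager streamManager = StreamManager.create(clientConfig)) {
            System.out.println("scope created: " + streamManager.createScope("tlsSmoke"));
        }
    }
}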

Sunday, August 2, 2020

Support for small to large footprint introspection database and query

An application can choose to offer a set of packaged queries for the user to pick from a dropdown menu while internalizing the upload of the corresponding programs, their execution and the return of results. One of the restrictions that comes with packaged queries exported via REST APIs is their ability to scale, since they consume significant resources on the backend and continue to run for a long time. These restrictions cannot be relaxed without some reduction in their resource usage. The API must provide a way for consumers to launch several queries with trackers, and the queries should complete reliably even if they are executed one by one. This is facilitated with the help of a reference to the query and a progress indicator. The reference is merely an opaque identifier that only the system issues and uses to look up the status. The indicator could be another API that takes the reference and returns the status. It is relatively easy for the system to separate read-only status information from read-write operations, so the number of times the status indicator is called causes no degradation to the rest of the system. There is a clean separation of the status information, which is usually periodically collected or pushed from the rest of the system. The separation of read-write from read-only also allows the two to be treated differently; for example, the technology for the read-only side can be replaced independently of the technology for read-write, or swapped from one to another for improvements on that side.
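A minimal sketch of such a contract, with hypothetical names, pairs a submit call that returns only the opaque reference with a separate, read-only status call:

import java.util.UUID;

// Hypothetical interface for packaged queries: submission returns only an
// opaque reference; progress is read from a separate, read-only status call.
public interface PackagedQueryService {

    enum QueryState { QUEUED, RUNNING, SUCCEEDED, FAILED }

    final class QueryStatus {
        public final QueryState state;
        public final int percentComplete;
        public QueryStatus(QueryState state, int percentComplete) {
            this.state = state;
            this.percentComplete = percentComplete;
        }
    }

    // POST /queries/{name} -> returns the opaque tracker
    UUID submit(String packagedQueryName);

    // GET /queries/status/{id} -> served from the read-only status store
    QueryStatus status(UUID reference);
}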
The design of all REST APIs generally follows a convention. This practice gives well-recognized URI qualifier patterns, query parameters and methods. Exceptions, errors and logging are typically handled with the best usage of the HTTP protocol. The 1.1 version of that protocol is revised so it helps with the 
The SQL query can also be translated from the REST APIs with the help of engines like Presto or LogParser.
Tools like LogParser allow SQL queries to be executed over enumerables. SQL has supported user-defined operators for a while now. Aggregation operations have the benefit that they can support both batch and streaming modes. These operations can therefore operate on large datasets because they view only a portion at a time. The batch operation can be parallelized across a number of processors. This has generally been the case with Big Data and cluster-mode operations. Both batch and stream processing can operate on unstructured data. 
Presto from Facebook is a distributed SQL query engine that can operate on streams from various data sources, supporting ad hoc queries in near real-time.
Just like the standard query operators of .NET, the Flink SQL layer is merely a convenience over the Table API. Presto, on the other hand, offers to run over any kind of data source, not just Table APIs.
Although Apache Spark query code and Apache Flink query code look very similar, the former treats stream processing as a special case of batch processing while the latter does just the reverse.
Also, Apache Flink provides a SQL abstraction over its Table API.
While Apache Spark provides ".map(...)" and ".reduce(...)" syntax to support batch-oriented processing, Apache Flink provides a Table API with ".groupBy(...)" and ".orderBy(...)" syntax. Flink provides a SQL abstraction and supports stream processing as the norm.
The APIs vary when the engines change, even though each forms an interface through which any virtualized query can be executed, and each has its own query-processing syntax. They vary because the requests and responses vary in size and content. Some APIs are better suited for stream processing and others for batch processing.
Analytical applications that perform queries on data are best visualized via catchy charts and graphs. Most applications built on storage and analytical products have a user interface that lets customers create containers for their data but gives them very little leverage to author streaming queries. Even the option to submit a query is often nothing more than the upload of a compiled-language program artifact. The application reuses a template for requests and responses, but that is very different from the long-running and streaming responses typical for these queries.
Query Interface is not about keywords.
When it comes to a user interface for querying data, people often think of the Google search interface and its search terms, but that is not necessarily the right interface for streaming queries. If anything, it is simplistic and keyword based rather than operator based. The Splunk interface is a much cleaner model since it has matured from search over machine data. Streaming queries are not table queries, and they do not operate only on big data in the Hadoop file system. They do have standard operators that can translate directly to operator keywords and be chained in a Java file the same way pipe operators are chained on the search bar. Translating query operator expressions on the search bar into a Java-language Flink streaming query is a patentable design for a streaming user interface, and one that will prove immensely flexible for direct execution of user queries as compared to packaged queries.
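As an illustrative sketch of that translation, each stage of a pipe expression can be mapped onto a DataStream transformation. The operator grammar and names below are hypothetical, not an existing library:

import org.apache.flink.streaming.api.datastream.DataStream;

// Hypothetical translator: each stage of a Splunk-style pipe expression is
// mapped onto a Flink DataStream transformation.
public class PipeQueryTranslator {

    public static DataStream<String> translate(DataStream<String> events, String pipeExpression) {
        DataStream<String> current = events;
        for (String stage : pipeExpression.split("\\|")) {
            String[] tokens = stage.trim().split("\\s+", 2);
            String operator = tokens[0];
            String argument = tokens.length > 1 ? tokens[1] : "";
            switch (operator) {
                case "search":   // search <term> -> keep events containing the term
                    current = current.filter(e -> e.contains(argument));
                    break;
                case "exclude":  // exclude <term> -> drop events containing the term
                    current = current.filter(e -> !e.contains(argument));
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported operator: " + operator);
            }
        }
        return current;
    }
}

A call like translate(events, "search ERROR | exclude heartbeat") would then chain two filters over the stream.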

 
