Wednesday, September 9, 2020

Exceptions encountered when setting up TLS on the server

 java.lang.IllegalArgumentException: File does not contain valid certificates: /opt/pravega/conf/client.truststore.jks

        at io.netty.handler.ssl.SslContextBuilder.trustManager(SslContextBuilder.java:182)

        at io.pravega.client.netty.impl.ConnectionPoolImpl.getSslContext(ConnectionPoolImpl.java:280)

        at io.pravega.client.netty.impl.ConnectionPoolImpl.getChannelInitializer(ConnectionPoolImpl.java:237)

        at io.pravega.client.netty.impl.ConnectionPoolImpl.establishConnection(ConnectionPoolImpl.java:194)

        at io.pravega.client.netty.impl.ConnectionPoolImpl.getClientConnection(ConnectionPoolImpl.java:128)

        at io.pravega.client.netty.impl.ConnectionFactoryImpl.establishConnection(ConnectionFactoryImpl.java:62)

        at io.pravega.client.netty.impl.RawClient.<init>(RawClient.java:87)

        at io.pravega.controller.server.SegmentHelper.updateTableEntries(SegmentHelper.java:403)

        at io.pravega.controller.store.stream.PravegaTablesStoreHelper.lambda$addNewEntry$10(PravegaTablesStoreHelper.java:178)

        at io.pravega.controller.store.stream.PravegaTablesStoreHelper.lambda$null$54(PravegaTablesStoreHelper.java:534)

        at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)

        at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)

        at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)

        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

        at java.util.concurrent.FutureTask.run(FutureTask.java:266)

        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)

        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)

        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

        at java.lang.Thread.run(Thread.java:748)

Caused by: java.security.cert.CertificateException: found no certificates in input stream

        at io.netty.handler.ssl.PemReader.readCertificates(PemReader.java:98)

        at io.netty.handler.ssl.PemReader.readCertificates(PemReader.java:64)

        at io.netty.handler.ssl.SslContext.toX509Certificates(SslContext.java:1071)

        at io.netty.handler.ssl.SslContextBuilder.trustManager(SslContextBuilder.java:180)

        ... 19 common frames omitted


The configuration options for Kubernetes deployments differ somewhat from the standalone case because Kubernetes recognizes the X.509 and PKCS#8 formats. A tls.crt and tls.key pair is sufficient to configure the TLS server. Since the most convenient way to create the certificate and key pair on Kubernetes is via cert-manager, they will usually be available as a Kubernetes secret, which can then be mounted as a volume at, say, ‘/etc/secret-volume’. The configuration therefore looks like:

        TLS_ENABLED = "true"

        TLS_KEY_FILE = "/etc/secret-volume/tls.key"

        TLS_CERT_FILE = "/etc/secret-volume/tls.crt"

        TLS_TRUST_STORE = "/etc/secret-volume/tls.crt"

        TLS_ENABLED_FOR_SEGMENT_STORE = "true"

        REST_KEYSTORE_FILE_PATH = "/opt/pravega/server.keystore.jks"

        REST_KEYSTORE_PASSWORD_FILE_PATH = "/opt/pravega/server.keystore.jks.passwd"
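Note that TLS_TRUST_STORE above points at the PEM certificate and not at a JKS file. The first stack trace is what Netty's PEM reader reports when the file it is given contains no PEM certificate, which is most likely what happened with client.truststore.jks. The following is a minimal sketch of a check that can be run on a file before it is referenced in the configuration. The function name store_format is hypothetical, and the check only looks for the PEM certificate header and the JKS magic number (0xFEEDFEED), so any other format, such as PKCS#12, is reported as unknown.

```shell
# Classify a certificate or trust store file by its contents.
# Prints "pem", "jks" or "unknown".
store_format() {
    file="$1"
    # PEM certificates are text with a fixed header line.
    if grep -q -e '-----BEGIN CERTIFICATE-----' "$file" 2>/dev/null; then
        echo pem
        return 0
    fi
    # JKS keystores are binary and start with the bytes FE ED FE ED.
    magic=$(head -c 4 "$file" 2>/dev/null | od -An -tx1 | tr -d ' \n')
    if [ "$magic" = "feedfeed" ]; then
        echo jks
    else
        echo unknown
    fi
}
```

For example, store_format /etc/secret-volume/tls.crt should print pem for a certificate issued by cert-manager. A result of jks for a file passed as TLS_TRUST_STORE would explain the "found no certificates in input stream" message.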


The controller and the segment store take their options from configmaps attached to their deployment and statefulset, respectively. These configmaps can be displayed to check that the options are correctly set.
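As a rough sketch of that check, the helper below prints only the TLS-related keys of a configmap. The function name show_tls_options is hypothetical, and the namespace and configmap names are whatever the deployment actually uses; they are not taken from this post.

```shell
# Print the TLS-related entries of one configmap.
# Usage: show_tls_options <namespace> <configmap-name>
show_tls_options() {
    kubectl get configmap "$2" -n "$1" -o yaml | grep -E 'TLS_|REST_KEYSTORE'
}
```

The configmap names themselves can be listed first with kubectl get configmaps -n <namespace>.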

The options were set, but I need to change them because there was a mistake. The Pravega controller and segment store are already running. What do I do?

The Pravega controller and segment store are typical Kubernetes applications. Scale them back to zero replicas, adjust the configuration, and then scale them up again to the desired number of replicas. The server logs will confirm whether the new options were picked up.
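The sequence above can be sketched as a small helper. The function name reconfigure and all of its arguments are hypothetical placeholders, and the sketch assumes the workload can be scaled directly with kubectl; if an operator manages the replicas, the same change would have to go through the operator's resource instead.

```shell
# Scale a workload to zero, edit its configmap, scale it back up and
# show recent log lines so the new options can be confirmed.
# Usage: reconfigure <namespace> <kind/name> <configmap> <replicas>
reconfigure() {
    ns="$1"; workload="$2"; configmap="$3"; replicas="$4"
    kubectl scale "$workload" -n "$ns" --replicas=0 || return 1
    # Opens the configmap in an editor; fix the options here.
    kubectl edit configmap "$configmap" -n "$ns" || return 1
    kubectl scale "$workload" -n "$ns" --replicas="$replicas" || return 1
    # Wait for the pods to come back before reading their logs.
    kubectl rollout status "$workload" -n "$ns" || return 1
    kubectl logs "$workload" -n "$ns" --tail=200
}
```

A call would look like reconfigure <namespace> deployment/<controller-name> <configmap-name> 1, with the real names substituted.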

I see exceptions I don’t recognize in the controller logs. How do I overcome them?

A few more exceptions encountered in trying out the change above include:

1) Caused by: java.lang.IllegalStateException: Expected the service ZKGarbageCollector [FAILED] to be RUNNING, but the service has FAILED

        at com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:366)

        at com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:302)

        at io.pravega.controller.store.stream.PravegaTablesStreamMetadataStore.<init>(PravegaTablesStreamMetadataStore.java:77)

        at io.pravega.controller.store.stream.PravegaTablesStreamMetadataStore.<init>(PravegaTablesStreamMetadataStore.java:67)

        at io.pravega.controller.store.stream.StreamStoreFactory.createStore(StreamStoreFactory.java:37)

        at io.pravega.controller.server.ControllerServiceStarter.startUp(ControllerServiceStarter.java:230)

        at com.google.common.util.con

2) Caused by: java.security.cert.CertificateException: found no certificates in input stream

        at io.netty.handler.ssl.PemReader.readCertificates(PemReader.java:98)

        at io.netty.handler.ssl.PemReader.readCertificates(PemReader.java:64)

        at io.netty.handler.ssl.SslContext.toX509Certificates(SslContext.java:1071)

        at io.netty.handler.ssl.SslContextBuilder.trustManager(SslContextBuilder.java:180)

3) java.io.IOException: Invalid keystore format

        at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:658)

        at sun.security.provider.JavaKeyStore$JKS.engineLoad(JavaKeyStore.java:56)

        at sun.security.provider.KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:224)

        at sun.security.provider.JavaKeyStore$DualFormatJKS.engineLoad(JavaKeyStore.java:70)

        at java.security.KeyStore.load(KeyStore.java:1445)

        at io.pravega.segmentstore.storage.impl.bookkeeper.ZooKeeperServiceRunner.getTrustManager(ZooKeeperServiceRunner.java:220)

        at io.pravega.segmentstore.storage.impl.bookkeeper.ZooKeeperServiceRunner.waitForSSLServerUp(ZooKeeperServiceRunner.java:185)

        at io.pravega.segmentstore.storage.impl.bookkeeper.ZooKeeperServiceRunner.waitForServerUp(ZooKeeperServiceRunner.java:164)

        at io.pravega.segmentstore.storage.impl.bookkeeper.ZooKeeperServiceRunner.start(ZooKeeperServiceRunner.java:109)

        at io.pravega.local.InProcPravegaCluster.startLocalZK(InProcPravegaCluster.java:210)

        at io.pravega.local.InProcPravegaCluster.start(InProcPravegaCluster.java:182)

        at io.pravega.local.LocalPravegaEmulator.start(LocalPravegaEmulator.java:153)

        at io.pravega.local.LocalPravegaEmulator.main(LocalPravegaEmulator.java:128)

In these cases, look for the certificate exception, which is often on a "Caused by" line. It normally carries a clear message that points to the fix. Certificate exceptions are very common, so plenty of resources are available online, and each one can be addressed individually.
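Since the stack traces are long, it can help to filter the log down to the lines that name the certificate or keystore problem. The filter below is only a sketch: the function name cert_errors is hypothetical, and the patterns are simply the messages seen in the traces above, not a complete list.

```shell
# Read log text on stdin and print only the lines that name a
# certificate or keystore problem.
cert_errors() {
    grep -E 'CertificateException|Invalid keystore format|does not contain valid certificates'
}
```

It can be used as kubectl logs <pod> -n <namespace> | cert_errors.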

For example, a frequently encountered exception is a certificate verification failure, as shown below.

DEBUG [2020-09-03 15:04:50.374] [grpc-default-worker-ELG-1-1] i.n.h.s.ReferenceCountedOpenSslContext: verification of certificate failed

java.security.cert.CertificateException: No subject alternative DNS name matching nautilus-pravega-controller.nautilus-pravega.svc.cluster.local found.

        at sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:214)

        at sun.security.util.HostnameChecker.match(HostnameChecker.java:96)

        at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:455)

        at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:436)

        at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:252)

        at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:136)

        at io.netty.handler.ssl.OpenSslTlsv13X509ExtendedTrustManager.checkServerTrusted(OpenSslTlsv13X509ExtendedTrustManager.java:223)

        at io.netty.handler.ssl.ReferenceCountedOpenSslClientContext$ExtendedTrustManagerVerifyCallback.verify(ReferenceCountedOpenSslClientContext.java:255)

        at io.netty.handler.ssl.ReferenceCountedOpenSslContext$AbstractCertificateVerifier.verify(ReferenceCountedOpenSslContext.java:701)

        at io.netty.internal.tcnative.SSL.readFromSSL(Native Method)

        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.readPlaintextData(ReferenceCountedOpenSslEngine.java:596)

        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1181)

        at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1298)

        at io.netty.handler.ssl.SslHandler$SslEngineType$1.unwrap(SslHandler.java:201)

        at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1372)

        at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1267)

        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1314)

        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501)

        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440)

        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)

        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)

        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)

        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)

        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)

        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)

        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)

        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)

        at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:792)

        at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475)

        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)

        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)

        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)

        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)

        at java.lang.Thread.run(Thread.java:748)

command terminated with exit code 137


In this case, set the name in the certificate to the one expected for the controller server, as shown in the certificate specification below. You can also override the expected name or selectively relax checks with:

grpc.ssl_target_name_override=pravega-pravega-controller.default

com.sun.net.ssl.checkRevocation=false

grpc.default_authority=pravega-pravega-controller.default
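When choosing the names for the certificate, keep in mind that Java's hostname check treats a wildcard as exactly one label, in line with RFC 6125. The name in the trace above has two labels in front of svc.cluster.local, so an entry such as '*.svc.cluster.local' would not cover it. The following is a minimal sketch of that matching rule, not the JDK's implementation: the function name san_matches is hypothetical, names are assumed to be lowercase, and only a wildcard that makes up the whole leftmost label is handled.

```shell
# Return 0 if a certificate DNS name covers the host name.
# A leading "*." stands for exactly one label.
# Usage: san_matches <dns-name-from-certificate> <host-name>
san_matches() {
    pattern="$1"; host="$2"
    case "$pattern" in
        '*.'*)
            suffix="${pattern#\*}"      # e.g. ".svc.cluster.local"
            first="${host%%.*}"         # leftmost label of the host
            rest="${host#"$first"}"     # remainder, with its leading dot
            [ -n "$first" ] && [ "$rest" = "$suffix" ]
            ;;
        *)
            [ "$pattern" = "$host" ]
            ;;
    esac
}
```

With this rule, san_matches '*.svc.cluster.local' nautilus-pravega-controller.nautilus-pravega.svc.cluster.local fails, while a pattern of '*.nautilus-pravega.svc.cluster.local' or the full name itself succeeds.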

Finally, remember that certificates and keys can be verified offline before their deployment using standard command line tools as shown below:


openssl rsa -noout -modulus -in tls.key | openssl md5

openssl x509 -noout -modulus -in tls.crt | openssl md5

openssl x509 -x509toreq -in tls.crt -out tls.csr -signkey tls.key

openssl req -text -noout -verify -in tls.csr

curl -i -k --cacert conf/cert.pem -u admin:1111_aaaa https://0.0.0.0:9091/v1/scopes -vvv

Always convert the tls.key that cert-manager generates to PKCS#8 format before using it. For example:

openssl pkcs8 -in tls.key -inform PEM -out tls.8.key -outform PEM -topk8 -nocrypt
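Whether a key file is already in PKCS#8 form can be read off its PEM header line: an unencrypted PKCS#8 key starts with "BEGIN PRIVATE KEY", while a PKCS#1 RSA key starts with "BEGIN RSA PRIVATE KEY". The helper below is a small sketch of that check; the function name key_format is hypothetical and only these three headers are recognized.

```shell
# Print the PEM flavour of a private key file based on its header line.
# Usage: key_format <key-file>
key_format() {
    if grep -q -e '-----BEGIN PRIVATE KEY-----' "$1"; then
        echo pkcs8
    elif grep -q -e '-----BEGIN ENCRYPTED PRIVATE KEY-----' "$1"; then
        echo pkcs8-encrypted
    elif grep -q -e '-----BEGIN RSA PRIVATE KEY-----' "$1"; then
        echo pkcs1
    else
        echo unknown
    fi
}
```

After the conversion above, key_format tls.8.key should print pkcs8.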


A sample cert-manager specification is also included below:

apiVersion: certmanager.k8s.io/v1alpha1

kind: Issuer

metadata:

  name: sdp-tls-issuer

  namespace: nautilus-pravega

spec:

  selfSigned: {}

---

apiVersion: certmanager.k8s.io/v1alpha1

#apiVersion: certificates.certmanager.k8s.io/v1beta1

kind: Certificate

metadata:

  name: sdp-tls-cert

  namespace: nautilus-pravega

spec:

  secretName: sdp-tls-cert-tls

  commonName: 'nautilus-pravega-controller.nautilus-pravega'

  dnsNames:

    - '*.cluster.local'

    - '*.svc.cluster.local'

    - pravega-pravega-controller.nautilus-pravega

    - pravega-pravega-segmentstore.nautilus-pravega

    - pravega-pravega-controller

    - pravega-pravega-segmentstore

  issuerRef:

    name: sdp-tls-issuer

