Tuesday, September 17, 2019

Data Storage and connectors:
The nature of data storage is that it accumulates data over time for analytics. The S3 APIs are a popular example of programmatic access for storing data in the cloud over the web. A connector works similarly, but it does not necessarily require the data storage to be remote or code to be written for the data transfer. In fact, one of the most common asks from a storage product is that it facilitate data transfer using standard shell commands.
The purpose of the connector is to move data quickly and steadily between source and destination regardless of their types or the kind of data to be transferred. The connector therefore needs to be specific to the destination storage. It merely automates the steps to organize and fill the containers in the data storage, sending data in fragments if necessary. The difference made by a connector is enormous in terms of the convenience of stashing data and the reusability of the same automation for different sources. The traditional approach of layering command-line storage tools over programmable interfaces, which allows automation beyond purely programmatic access, is not lost. However, requiring customers to write wrappers around their own command-line utility to send data is tedious and avoidable.
In addition to programmatic access for receiving data, data stores need to accommodate the input of data from different contexts such as protocols, bridging technologies such as message queues, and even the eccentricities of the sender. It is in these contexts that a no-code, ready-made tool is preferred. Data transfers may also need to be chained, requiring data to be relayed between systems much like piping operations in a shell environment. A new connector may also wrap an existing connector.
One of the most common examples of a connector is a TCP-based data connector. The data is simply sent by opening a network socket to make a connection. This can be executed with standard command-line tools as follows:
cat logfile | nc ipaddress port
The inclusion of such a data connector in a storage product is probably the most convenient form of data transfer. Even if the storage product requires programmatic access, wrapping the access APIs to provide a TCP connector like the one above will immensely benefit those who would otherwise have to write code to send data to storage.
With the automation for a TCP connector written once, there will not be a need to repeat the effort elsewhere or to reinvent the wheel.
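To make this concrete, the receiving side of such a TCP connector can be just as simple. The following sketch assumes a hypothetical host, port and bucket, and uses the AWS CLI's ability to stream from stdin; it is an illustration rather than a prescription:

# receiver: listen on a port and stream whatever arrives straight into object storage
# (the -p flag depends on the netcat variant in use)
nc -l -p 9000 | aws s3 cp - s3://example-bucket/logs/incoming.log
# sender: push a local file to that listener, exactly as above
cat logfile | nc receiver.example.com 9000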


Monday, September 16, 2019

Kubernetes provides webhooks as a way to interact with all system-generated events. This is the equivalent of HTTP handlers and modules in ASP.NET in terms of the ability to intercept and change requests and responses. The webhooks, however, are an opportunity to work on system-generated resources such as pod creation requests and so on.
There are two stages where webhooks can run, and they are correspondingly named mutating and validating webhooks. The first is an opportunity to change requests on the Kubernetes core v1 resources. The second is an opportunity to add validations to requests and responses.
Since these span a large number of requests to the API server, the webhooks are invoked frequently. Therefore they must be selective about the requests they modify, to avoid touching requests that were not intended.
In addition to selectors, the power of webhooks is best demonstrated when they select all requests of a particular type to act on. For example, this is an opportunity for security to raise the baseline by allowing or denying all resources of a particular kind. The execution of privileged pods, for instance, can be disabled in the cluster with the help of webhooks.
The webhooks are lightweight to run and serve, similar to nginx HTTP parameter modifiers. A number of them may be allowed to run side by side in a cluster.
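As a concrete sketch, the registration below selects only pod CREATE requests for a validating webhook; the names, namespace and CA placeholder are assumptions, and the backing service that answers on /validate is not shown:

cat <<'EOF' | kubectl apply -f -
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: deny-privileged-pods
webhooks:
- name: deny-privileged.example.com
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]
  clientConfig:
    service:
      namespace: webhook-system
      name: pod-policy
      path: /validate
    caBundle: "<base64-encoded CA certificate>"
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Fail
EOF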

Sunday, September 15, 2019

The difference between log forwarding and event forwarding becomes clear when the command-line options of the kube-apiserver are considered. For example, the audit-log-path option dumps the audit events to a log file that cannot be accessed from within the Kubernetes runtime environment inside the cluster. Therefore this option cannot be used with FluentD, because that is a containerized workload. On the other hand, the audit-webhook option allows a service to listen for callbacks from the Kubernetes control plane on the arrival of audit events. The service listening on this webhook endpoint, such as Falco, runs in its own container as a Kubernetes service. The control plane makes only one web request per audit event, and since the events are forwarded over HTTP, the Falco service can efficiently handle the rate and latency of the traffic.
The performance considerations of the two options are also notable. Log forwarding is the equivalent of running the tail command on the log file and forwarding its output over TCP with the netcat command. This transfers the same amount of data and uses a TCP connection, although it does not traverse as many layers as the webhook. It is also suitable for a syslog drain, which enables further performance improvements.
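As a rough sketch, and assuming a hypothetical collector host and port, the log forwarding path amounts to the following, using the audit log path configured on the kube-apiserver:

tail -F /var/vcap/sys/log/kube-apiserver/audit.log | nc logcollector.example.com 514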
The webhook is a push mechanism and requires packing and unpacking of data as it traverses up and down the network layers. There is no buffering involved on the service side, so there is a chance that some data will be lost if the service goes down. The connectivity is also more subject to faults than the syslog drain. However, HTTP is best suited for message broker intake, which facilitates filtering and processing that can significantly improve performance.
The ability to transform events is not necessarily restricted to the audit container-based service or to services specific to the audit framework. The audit data is rather sensitive, which is why its access is restricted. The transformation of events can occur even during analysis. This lets the event queries be simpler when the events are transformed. Streaming analysis enables a holistic and continuous view of the data from its origin. With the help of windows over the data, the transformations are efficient.
Transformations can also be persisted where the computations are costly. This helps pay those costs once rather than every time the data needs to be analyzed. Persisted transformations help with reuse and sharing, which makes it convenient and efficient to use a single source of truth. Transformations can also be chained between operators, and they serve to form a pipeline. This makes it easier to diagnose and troubleshoot, and it improves the separation of concerns.
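For illustration, and assuming the audit log contains one JSON event per line, a costly transformation can be computed once with a filter such as jq and persisted to a narrow, reusable file; the field selection and file names here are only examples:

# keep only delete operations, projected to a small schema, and persist the result for reuse
jq -c 'select(.verb == "delete") | {user: .user.username, resource: .objectRef.resource, ts: .requestReceivedTimestamp}' audit.log > deleted-resources.jsonl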


Friday, September 13, 2019


The architecture of Kubernetes has its control plane layered over the network and storage made available by the infrastructure providers.

The components above are facilitated with the use of Pivotal Container Service (PKS), which helps us migrate the same production stack across core infrastructure. Consequently, the security aspects of the production stack depend on PKS and Kubernetes features, and we have to reach out to the Kubernetes apiserver for auditing information from the containerized workloads.
The architecture is standard for reviewing any workloads hosted on Kubernetes. In particular, let us note the use of a distributed key-value database within the Kubernetes control plane. This database is 'etcd' and it is used to maintain the cluster state. 'etcd' is written in Go and uses the Raft consensus algorithm to manage a highly available replicated log.
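For illustration, the cluster state that etcd maintains can be inspected directly with etcdctl; the endpoint and certificate paths below are assumptions that vary by installation:

ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  get /registry --prefix --keys-only | head   # list a few of the keys that describe the cluster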
Any distributed key-value database could do, and there may even be benefits if the database can be offloaded from the control plane. If this cluster database were object storage, it would continue to provide durability and reliability while bringing along some storage best practices.
The database is internal to the Kubernetes control plane, so it does not really fall within the scope of this document. However, the events from the Kubernetes execution environment do pass through these layers. K8s events are noted for their format, labels and content. They help with monitoring, troubleshooting and subsequent analysis from storage.
The native k8s events can also be transformed into custom events to suit the needs of any other event processing engine. Typically, organizations have their own event gateway and event stores, which make these events proprietary, such as for the use of dial-home, a network operations center, and remote diagnostic sessions. This ability to transform events then lets us do without reserving large storage, as long as some buffering is possible at the source.
It is this notion that can be extended to Extract-Transform-Load operations suited to different downstream systems.
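A minimal sketch of such a transformation pulls the native events and reshapes them into a custom schema before handing them to a downstream gateway; the target schema and output file here are hypothetical:

kubectl get events --all-namespaces -o json \
  | jq -c '.items[] | {source: "k8s", kind: .involvedObject.kind, reason: .reason, message: .message, at: .lastTimestamp}' \
  > custom-events.jsonl   # ready to be loaded into a proprietary event store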

Thursday, September 12, 2019

Audit events originate from the Kube-apiserver usually running on the master VM in the PKS Kubernetes cluster.

There are essentially only two considerations:
First, we define the audit policy and the webhook, which are passed as YAML file locations to the kube-apiserver in the form of command-line arguments. [These command-line options are explained here: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/]. We can also include these options in the kube-apiserver configuration.
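For reference, the relevant options on the kube-apiserver command line look roughly as follows; the webhook configuration path is an assumption and the remaining flags are left unchanged:

kube-apiserver \
  --audit-policy-file=/var/vcap/jobs/kube-apiserver/config/audit-policy.yaml \
  --audit-webhook-config-file=/var/vcap/jobs/kube-apiserver/config/audit-webhook-config.yaml \
  ...   # all other apiserver flags remain as configured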

Second, we restart the kube-apiserver to use the specified policy and webhook. Changing the configuration file automatically restarts the kube-apiserver.

The steps to set up auditing so that events can be analyzed later include the following:

1) ssh admin@<pks-apiserver> # such as "ssh ubuntu@opsman.environment.local"

2) ubuntu@opsmanager-2-5:~$ sudo -i # pks and bosh commands are run from an elevated privilege account

3) pks login -a pks-api.environment.local -u <username> -p <password> -k  # this lets us view and use the PKS cluster

4) pks cluster <cluster_name> | grep UUID # this lets us get the UUID for the cluster. The convention for naming the service instance is usually service-instance_UUID. Substitute the service instance name with whatever naming format your deployment uses.

5) bosh vms -d service-instance_874b838b-6391-4c62-991b-3e1528a4b37e # this lets us use the service instance  to display the vms. Usually there will be only one master. The kube-apiserver runs on this master.

6) bosh -d service-instance_874b838b-6391-4c62-991b-3e1528a4b37e scp audit-policy.yaml master/b9a8aa9f-0e31-4579-8e4b-685c55a80f0e:/var/vcap/jobs/kube-apiserver/config/audit-policy.yaml # we copy the audit policy file from the local machine to the VM where the kube-apiserver runs (a minimal example policy is sketched after these steps).

7) bosh -d service-instance_874b838b-6391-4c62-991b-3e1528a4b37e ssh master/b9a8aa9f-0e31-4579-8e4b-685c55a80f0e -c 'echo "--audit-policy-file=/var/vcap/jobs/kube-apiserver/config/audit-policy.yaml" >> /var/vcap/jobs/kube-apiserver/config/kube-apiserver.yaml' # here we update the configuration of the kube-apiserver with the policy file path. This is the input to the auditing system.

8) bosh -d service-instance_874b838b-6391-4c62-991b-3e1528a4b37e ssh master/b9a8aa9f-0e31-4579-8e4b-685c55a80f0e -c 'echo "--audit-log-path=/var/vcap/sys/log/kube-apiserver/audit.log" >> /var/vcap/jobs/kube-apiserver/config/kube-apiserver.yaml' # here we update the configuration of the kube-apiserver with the log path. This is the output of the auditing system.
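The audit-policy.yaml referenced in step 6 can be as small as the following sketch; a single Metadata-level rule is an assumption here, and real policies are usually more selective:

cat > audit-policy.yaml <<'EOF'
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # log request metadata (user, verb, resource, timestamp) for every request
  - level: Metadata
EOF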

Wednesday, September 11, 2019

We were discussing Falco for auditing.
The actions of uploading the policies and the webhook and requiring the kube-apiserver to restart vary from site to site depending on where the Kubernetes cluster is hosted. If the cluster is hosted on PKS, the actions taken are different from those for minikube. In this case the BOSH CLI is used.

This CLI can be used when we have the IP address of the coordinator and the credentials to use it are set via environment variables.
The commands are:
1. ssh ubuntu@<pks-api-server>
2. pks login -a <pks-api-server> -u <admin> -p <password> --skip-ssl-validation
3. pks cluster <cluster-name> # which gives the k8s cluster ID
4. bosh vms -d service-instance_<k8s cluster ID> # which gives the vms for the cluster
5. bosh -d service-instance_<k8s cluster ID> ssh <VM CID> # corresponding to the VM for the master

6. Then we run the bosh scp command to upload the audit policy
7. the bosh scp command to upload the webhook declaration (a sketch of this file appears at the end of this entry)
8. The bosh scp command to copy the apiserver-config.patch.sh script file, which causes the kube-apiserver to restart when the audit-webhook-config-file and audit-dynamic-configuration settings change in /etc/kubernetes/manifests/kube-apiserver.yaml

The audit logs then become available at /var/lib/k8s-audit/audit.log
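The webhook declaration in step 7 is a kubeconfig-style file that tells the kube-apiserver where to post audit events. A minimal sketch, assuming a Falco service reachable from the control plane on its default audit port and endpoint, looks like this:

cat > audit-webhook-config.yaml <<'EOF'
apiVersion: v1
kind: Config
clusters:
- name: falco
  cluster:
    server: http://falco.falco.svc.cluster.local:8765/k8s-audit
contexts:
- name: default-context
  context:
    cluster: falco
    user: ""
current-context: default-context
preferences: {}
users: []
EOF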