Wednesday, September 11, 2019

We were discussing Falco for auditing.
The actions of uploading the audit policy and webhook configuration and restarting the kube-apiserver vary depending on where the Kubernetes cluster is hosted. If the cluster is hosted on PKS, the actions taken are different from those for minikube. In this case the BOSH CLI is used.

This CLI can be used when we have the IP address of the BOSH Director and the credentials to use it are set via environment variables.
The commands are:
1. ssh ubuntu@<pks-api-server>
2. pks login -a <pks-api-server> -u <admin> -p <password> --skip-ssl-validation
3. pks cluster <cluster-name> # which gives the k8s cluster ID
4. bosh vms -d service-instance_<k8s cluster ID> # which gives the vms for the cluster
5. bosh -d service-instance_<k8s cluster ID> ssh <VM CID> # where <VM CID> corresponds to the VM for the master node

6. Then we run a bosh scp command to upload the audit policy
7. A bosh scp command to upload the webhook configuration
8. A bosh scp command to copy the apiserver-config.patch.sh script, which causes the kube-apiserver to restart when the audit-webhook-config-file and audit-dynamic-configuration settings change in /etc/kubernetes/manifests/kube-apiserver.yaml (see the sketch after this list)
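
A minimal sketch of steps 6 through 8, assuming the master VM belongs to an instance group named master and the files are in the current working directory; instance names, cluster IDs, and target paths are placeholders to adjust:

bosh -d service-instance_<k8s cluster ID> scp audit-policy.yaml master/0:/tmp/audit-policy.yaml
bosh -d service-instance_<k8s cluster ID> scp audit-webhook-kubeconfig master/0:/tmp/audit-webhook-kubeconfig
bosh -d service-instance_<k8s cluster ID> scp apiserver-config.patch.sh master/0:/tmp/apiserver-config.patch.sh

The uploaded files can then be moved under /etc/kubernetes from a bosh ssh session on the master before running the patch script.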

The audit logs then become available at /var/lib/k8s-audit/audit.log





Tuesday, September 10, 2019

We describe the steps taken to use Falco for auditing on Kubernetes:
1) Deploy Falco to your Kubernetes cluster
2) Define your audit policy and webhook configuration
3) Restart the API Server to enable Audit Logging
4) Observe Kubernetes audit events at Falco

1) can be done with the help of the chart from stable/falco:
helm install --name my-release stable/falco
Note that RBAC is enabled with:

kubectl create -f k8s-with-rbac/falco-account.yaml
serviceaccount "falco-account" created
clusterrole "falco-cluster-role" created
clusterrolebinding "falco-cluster-role-binding" created

A service is created that allows other services to reach the embedded webserver in Falco via port 8765:
k8s-using-daemonset$ kubectl create -f k8s-with-rbac/falco-service.yaml
service/falco-service created
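
A minimal sketch of what such a Service definition might look like, assuming the Falco pods carry the label app: falco (the selector must match whatever labels the daemonset actually applies):

apiVersion: v1
kind: Service
metadata:
  name: falco-service
spec:
  selector:
    app: falco
  ports:
  - protocol: TCP
    port: 8765
    targetPort: 8765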


2) The Falco side of the webhook is the embedded webserver, which is enabled in falco.yaml with:
webserver:
   enabled: true
   listen_port: 8765
   k8s_audit_endpoint: /k8s_audit
   ssl_enabled: false
   ssl_certificate: /etc/falco/falco.pem

The Kubernetes audit webhook is configured with a kubeconfig file that points the kube-apiserver at the Falco endpoint:
cat <<EOF > /etc/kubernetes/audit-webhook-kubeconfig
apiVersion: v1
kind: Config
clusters:
- cluster:
    server: http://<ip_of_falco>:8765/k8s_audit
  name: falco
contexts:
- context:
    cluster: falco
    user: ""
  name: default-context
current-context: default-context
preferences: {}
users: []
EOF
And the kube-apiserver is launched with the following flags:
--audit-policy-file=/etc/kubernetes/audit-policy.yaml --audit-webhook-config-file=/etc/kubernetes/audit-webhook-kubeconfig
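
The audit policy referenced above decides which requests are recorded and at what level of detail. A minimal illustrative policy that records every request with its request and response bodies would be the following (a production policy is typically more selective):

cat <<EOF > /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: RequestResponse
EOF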

Rules can be customized in the k8s_audit_rules.yaml file.

3)
A script, enable-k8s-audit.sh, performs the necessary steps to enable dynamic audit support for the apiserver by modifying the apiserver command line to add arguments such as `--audit-dynamic-configuration` and `--feature-gates=DynamicAuditing=true`.
The same script can be modified to add a default log backend with the command-line arguments --audit-log-path, --audit-log-format, --audit-log-truncate-enabled and --audit-policy-file, as sketched below.
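
A minimal sketch of how these flags might appear under the kube-apiserver command in /etc/kubernetes/manifests/kube-apiserver.yaml once the script has run; the runtime-config line and the log-backend values are assumptions to adjust per cluster:

    - kube-apiserver
    - --audit-dynamic-configuration
    - --feature-gates=DynamicAuditing=true
    - --runtime-config=auditregistration.k8s.io/v1alpha1=true
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    - --audit-webhook-config-file=/etc/kubernetes/audit-webhook-kubeconfig
    - --audit-log-path=/var/lib/k8s-audit/audit.log
    - --audit-log-format=json
    - --audit-log-truncate-enabled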


4) Kubernetes audit events will then be routed to the Falco daemonset within the cluster.
Verify that Falco was set up correctly with:
kubectl logs -l app=falco
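
To check this end to end, one way (a sketch; the resource name is only illustrative) is to generate some API activity and look for the corresponding audit events in the Falco output:

kubectl create configmap audit-test --from-literal=foo=bar
kubectl logs -l app=falco

If an audit rule matches the activity, its output appears in these logs.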

Monday, September 9, 2019

This is a summary of the book “An Elegant Puzzle” by Will Larson.
The author has a blog that is also a good read. This book is about the discipline of engineering management, presented with an engineering-driven approach to solving some of the management problems. The emphasis on engineering here is not about formula or theory so much as a bias for action, implementation and re-assessment. This approach includes analysis and metrics, though not always in that order. It proposes fairness in every step of improvement to foster an inclusive environment.
The author presents two common threads that run through every chapter of the book. First, a framework is used to capture the insights presented. This framework can be customized, instrumented, quantitatively evaluated and iterated for continuous improvement in a variety of contexts. Second, the book provides readers with minimal process improvements that do not interfere with their day to day activities. Tools and techniques proposed in this book serve this purpose.
There are seven chapters including Organizations, Tools, Approaches, Culture and Careers.  The chapters are intended for a broad audience.
The Organizations chapter deals with walking the tightrope between managing risks and sustaining productivity given the winds of change.
Tools is a chapter dedicated to applying systems thinking, a veritable theory with entire books written on the topic. It involves a model-based approach to describing and solving engineering management problems and benefits the practitioner with incremental progress toward a chosen goal.
The Approaches chapter is dedicated to implementing the frameworks described in the Tools chapter. It brings up the differences between management and engineering thinking by suggesting that management is in fact a moral profession because it is an opportunity and an obligation toward the empowerment of those we work with and those who work for us. It goes on to suggest that the practice of management has a lasting effect on people.
The chapter on Culture presents culture as a double-edged sword that can vary from mild to extreme indulgence. Some aspects of engineering culture, such as innovation and accountability, are painstakingly cultivated and preserved. The absence of culture is portrayed as the absence of scaffolding, which can result in pitfalls.
The continuous education of the workforce is recommended, and the representation of diversity in the discipline is sought to be improved with nuanced, results-oriented processes.
The book is careful not to sell any of the proposals as silver bullets and encourages the readers to track the progress of an application over time.
The final chapter, on careers, is about establishing a ladder with continuous replenishment at the base and the opportunity to foster growth by way of change that can be influenced.
As I write this summary, I have to disclose that I have never been a manager and have no idea what the formal designation brings as opposed to practicing without it. The book offers guidance, and readers can hope to benefit immediately from a results-oriented action plan.

Sunday, September 8, 2019

Deployment models for software on Kubernetes
Each deployment model has its advantages and disadvantages. The on-premise model is focused on deploying distributed containers to on-premise servers while securing storage centrally. The cloud-based model is focused on deploying the instance that can meet the monitoring and logging requirements of the cloud environment.
In this section, we cite the challenges and considerations associated with each deployment model for running workloads in production. In the following sections, we describe container security, Kubernetes deployments, and network security. Each layer of the deployment may be either self-managed or a fully managed service, and each comes with its own best practices.
1. Standalone On-Premise model:
The standalone deployment of any software on Kubernetes comes with a requirement to automate the initialization of the Kubernetes cluster required to host the deployed instance. This installer comes with documentation that describes the software included in the package, its versions, and the minimal steps needed to get the instance up and running for the first time. The installation is already secure out of the box in adhering to the product security guidelines and has been analyzed for container image security, web application security, and network intrusion security, among others. The administrator can then secure the product-specific features listed in this guide for both the resource requirements of the application and the Kubernetes execution environment of the cluster hosting the application, in a two-layer stack of application over cluster.
2. Cloud-based security model:
The cloud-based deployment of the same software on Kubernetes also comes with a requirement to automate the deployment of the application, but on a PKS cluster so that it can be deployed the same way to a public or private cloud. PKS allows us to use the same automation to initialize the cluster regardless of the site where it is deployed. However, there are quite a few differences between a cloud-based deployment and a standalone deployment that span layers and shared components, and these require administrator involvement to customize the configuration before first use.
Both deployments require configuration of the application with administrator involvement before users can begin analyzing streams. This calls out activities such as artifact repository configuration, metrics and monitoring setup, diagnosability and logging configuration as well as settings for scaling the capacity to meet the forecasted demand from the deployed instance.

Saturday, September 7, 2019

Let us review the sink architecture in PKS. This consists of a log sink for monitoring the cluster and namespace logs and a metric sink for monitoring the cluster metrics. The log sink and metric sink therefore serve different purposes, although the data may appear in a common JSON format. These resources have to be enabled using the observability manager.
The log architecture forwards the logs to a common log destination. The forwarding is done with the help of Fluent Bit, which runs as a daemonset pod on each node and aggregates the events. In addition to the logs thus collected, an event collector collects Kubernetes API events and a sink collector handles CRD events pertaining to the fluent-bit configmaps. The event collector and sink collector are hosted independently. All aggregated events are then forwarded to the common log destination.
The metrics architecture is similar, with kubelets producing metrics, but differs in two aspects. First, instead of Fluent Bit forwarding the aggregated events to a common log destination, a plugin is required to forward them to the common metrics destination. Second, there is no sink collector for metrics: the CRD events are handled by the metrics controller, and only Telegraf is responsible for forwarding metrics.
The sink architecture in PKS is merely for automation. It does not prevent direct access to the logs and configuration for the clusters. If the cluster logs were to be downloaded directly, the following steps would be necessary.
We gather the credential and IP address information for the BOSH Director, SSH into the Ops Manager VM, and use the BOSH CLI v2+ to log in to the BOSH Director from the Ops Manager VM. We mention the name of the deployment and list all of its virtual machines. We choose a virtual machine and download the logs from it by specifying the “logs” command-line argument.
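A minimal sketch of that sequence with the BOSH CLI v2 (the environment alias, deployment name, and instance name are placeholders):

bosh -e MY-ENV login
bosh -e MY-ENV deployments
bosh -e MY-ENV -d service-instance_<cluster ID> vms
bosh -e MY-ENV -d service-instance_<cluster ID> logs master/0

The last command downloads a tarball of logs from the chosen VM.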
The sink architecture is also useful for monitoring. The logs and events are combined in a shared format, which provides operators with a robust set of monitoring and filtering options. All the entries of the sink data are timestamped, contain the host ID, and are annotated with the namespace, pod ID and container name. The logs are distinguished by the App-Name field.
The Kubernetes API event entries are distinguished by “k8s.event” in the App-Name field. Strings like “Error: ErrImagePull”, “Back-off restarting failed container”, and “Started container” help query the events to determine the cause of a failure or the time of a success.
A sink resource enables PKS users to configure destinations for logs transported following the Syslog Protocol defined in RFC 5424. This resource is dependent on the IaaS infrastructure. The sink resource needs to be enabled because it is not on by default. As with service brokers, sinks are created for cluster and namespace scopes. They do not use a namespace, bucket, or resource hierarchy. The “create-sink” command is used to create a sink for a given cluster.
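For example, a hypothetical invocation might look like the following (a sketch only; the exact arguments and supported schemes vary by PKS version, so consult pks create-sink --help):

pks create-sink my-cluster syslog://logs.example.com:514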

Friday, September 6, 2019

PKS can also be monitored with sinks. A sink sends logs to a destination over TCP following the Syslog Protocol defined in RFC 5424. Logs as well as events can use a shared format. The Kubernetes API events are denoted by the string “k8s.event” in their “APP-NAME” field. A typical Kubernetes API event includes the host ID of the BOSH VM, the namespace and the Pod ID as well. Failure to retrieve containers from a registry is indicated with the string “Error: ErrImagePull”. Malfunctioning containers are denoted with “Back-off restarting failed container” in their events. Successful scheduling of containers has “Started container” in their events.
The logs for any cluster can also be downloaded from the PKS VM using the BOSH CLI command such as “logs pks/0”
Let us review the sink architecture in PKS. This consists of a log sink for monitoring the cluster and namespace logs and a metric sink for monitoring the cluster metrics. The log sink and metric sink therefore serve different purposes, although the data may appear in a common JSON format. These resources have to be enabled using the observability manager.
The log architecture forwards the logs to a common log destination. The forwarding is done with the help of Fluent Bit, which runs as a daemonset pod on each node and aggregates the events. In addition to the logs thus collected, an event collector collects Kubernetes API events and a sink collector handles CRD events pertaining to the fluent-bit configmaps. The event collector and sink collector are hosted independently. All aggregated events are then forwarded to the common log destination.
The metrics architecture is similar, with kubelets producing metrics, but differs in two aspects. First, instead of Fluent Bit forwarding the aggregated events to a common log destination, a plugin is required to forward them to the common metrics destination. Second, there is no sink collector for metrics: the CRD events are handled by the metrics controller, and only Telegraf is responsible for forwarding metrics.

Thursday, September 5, 2019

PKS can also be monitored with sinks. A sink sends logs to a destination over TCP following the Syslog Protocol defined in RFC 5424. Logs as well as events can use a shared format. The Kubernetes API events are denoted by the string “k8s.event” in their “APP-NAME” field. A typical Kubernetes API event includes the host ID of the BOSH VM, the namespace and the Pod ID as well. Failure to retrieve containers from a registry is indicated with the string “Error: ErrImagePull”. Malfunctioning containers are denoted with “Back-off restarting failed container” in their events. Successful scheduling of containers has “Started container” in their events.
The logs for any cluster can also be downloaded from the PKS VM using the BOSH CLI command such as “logs pks/0”
Kubernetes master node VMs also run etcd, an open source distributed key-value store that Kubernetes uses for service discovery and configuration sharing. etcd also exposes metrics that help with cluster health monitoring.
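As an illustration only, and assuming etcd serves TLS on its default client port with certificates at placeholder paths, those metrics can be scraped from the master VM with:

curl --cacert /path/to/etcd-ca.crt --cert /path/to/etcd-client.crt --key /path/to/etcd-client.key https://127.0.0.1:2379/metrics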
Overall, VMware Enterprise PKS has a multi-layer security model. The layers are the application layer, the container management layer, the platform layer, and the infrastructure layer. IAM and monitoring span all these layers. All aspects of AAA apply to each of these layers and are handled with the help of IAM and monitoring.

Application layer visibility is provided with the help of auditing. PKS integrates well with VMware and leverages the monitoring of containerized applications and log events.
Platform layer security is provided by PKS identity and access management, which is handled primarily by a service called User Account and Authentication (UAA).
The container management layer is secured with the help of a private image registry, flexible multi-tenancy, and vulnerability scanning. PKS uses Clair, an open source project, to statically analyze containers while importing information about vulnerabilities from a variety of sources. Signed container images provide content trust.
Infrastructure security is provided by micro-segmentation, a unified network policy layer and operational tools including those for troubleshooting.