Friday, September 6, 2019

PKS can also be monitored with sinks. A sink is a resource that tells PKS where to send logs: PKS forwards them to the sink's destination over TCP in the syslog format described by RFC 5424. Logs and events share this format. Kubernetes API events are identified by the string “k8s.event” in their “APP-NAME” field. A typical Kubernetes API event also includes the host ID of the BOSH VM, the namespace and the Pod-ID. A failure to pull a container image from the registry is identified by the string “Error: ErrImagePull”. Malfunctioning containers have “Back-off restarting failed container” in their events. Containers that were scheduled and started successfully have “Started container” in their events.
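The identifying strings above lend themselves to a simple filter on the receiving side. The following is a minimal sketch in Python, not something PKS ships: it assumes each log line follows the RFC 5424 header order (PRI and version, timestamp, hostname, APP-NAME, PROCID, MSGID, then structured data and message). The function name, the category labels and the sample line are all made up for illustration.

```python
# Marker strings quoted in the text above. The category names on the
# right are hypothetical labels chosen for this sketch.
EVENT_MARKERS = {
    "Error: ErrImagePull": "image_pull_failure",
    "Back-off restarting failed container": "container_malfunction",
    "Started container": "container_started",
}


def classify_event(line):
    """Classify one syslog line; return None if it is not a Kubernetes API event.

    Assumes the RFC 5424 header layout:
    <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG
    """
    # Split off the first six header fields; everything after MSGID
    # (structured data and message) stays together in the last element.
    parts = line.split(" ", 6)
    if len(parts) < 7:
        return None  # not enough fields to be a complete header
    app_name, rest = parts[3], parts[6]
    if "k8s.event" not in app_name:
        return None  # an ordinary log line, not an API event
    for marker, category in EVENT_MARKERS.items():
        if marker in rest:
            return category
    return "other_event"


# Hypothetical sample line; the host, namespace and pod names are invented.
sample = "<14>1 2019-09-06T10:00:00Z vm-1234 k8s.event/default/web-0 - - - Error: ErrImagePull"
print(classify_event(sample))
```

A filter like this only inspects the APP-NAME field and the message text, so it does not need to parse the structured-data section at all.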
The logs for any cluster can also be downloaded from the PKS VM using a BOSH CLI command such as “logs pks/0”.
Let us review the sink architecture in PKS. It consists of a log sink for monitoring cluster and namespace logs and a metric sink for monitoring cluster metrics. The log sink and the metric sink therefore serve different purposes, although the data for both may appear in a common JSON format. These resources have to be enabled using the observability manager.
The log architecture forwards logs to a common log destination. The forwarding is done with Fluent Bit, which runs as a daemon in a pod on each node and aggregates the events. In addition to the logs collected this way, an event collector gathers Kubernetes API events and a sink collector handles CRD events pertaining to the Fluent Bit configmaps. The event collector and the sink collector are hosted independently. All aggregated events are then forwarded to the common log destination.
The metrics architecture is similar, with kubelets producing the metrics, but it differs in two respects. First, instead of Fluent Bit forwarding the aggregated events to a common log destination, a plugin is required to forward metrics to the common metrics destination. Second, there is no sink collector for metrics: the CRD events are handled by the metrics controller, and only Telegraf is responsible for forwarding metrics.
