Monday, September 30, 2019

This is a continuation of the previous post to enumerate funny software engineering practice:

Build a product with little or no documentation on those hidden features that only the maker knows

Build a product which cause impedance mismatch between the maker and the buyer in terms of what the product is expected to do

Build a product that breaks existing usage and compatibility in the next revision.

Build a product with a revision that makes partner’s integration investments a total loss.

Build a product without compatibility features that require significant re-investment in downstream systems.

Build a product without automation capabilities and have customers try to reinvent the wheel.

Build a product with the focus on development and leaving the test on the other side of the wall.

Build a product that forces consumers to adopt a technology stack or a cloud vendor so that the subscriptions may accrue income for the maker

Build a product that works weird on the handheld devices

Build a product that does not customize user experience or carry over their profile.

Build a product that makes the user repeatedly sign-in on the same device

Build a product that makes users jump through login interface for each and every site they visit

Build a product that breaks on different browsers or clients with no workarounds for functionality.

Sunday, September 29, 2019

This is a continuation of the previous post to enumerate funny software engineering practice:
  1. Build a product that requires different handlers in different roles driving up costs 
  1. Build a product that propagates pain and frustration through partners and ecosystem as they grapple with integration 
  1. Build a product that requires significant coding for automations and end users driving down adoption 
  1. Build a product that generates more support calls to do even the ordinary  
  1. Build a product that generates a lot of sprawl in disk and network usage requiring infrastructure 24x7 watch  
  1. Build a product that cannot be monitored because it is a black box 
  1. Build a product that says one thing but does another thing 
  1. Build a product that requires administrator intervention for authorizing each and every end user activity 
  1. Build a product that has an overwhelming size for code churn and improving technical debt 
  1. Build a product so big that changes in one component takes hours to integrate with other components 
  1. Build a product so big that it involves hierarchical permission requests before corresponding changes can be made in another component 
  1. Build a product with significant developer productivity but at the cost of business and market requirements. 
  1. Build a product to satisfy the technical managers but fail the product management. 
  1. Build a product with little or no documentation on those hidden features that only the maker knows 
  1. Build a product which cause impedance mismatch between the maker and the buyer in terms of what the product is expected to do 
  1. Build a product that breaks existing usage and compatibility in the next revision. 

Saturday, September 28, 2019

Humour behind Software engineering practice
This is an outline for a picture book, possibly with caricatures, to illustrate frequently encountered humour in the software engineering industry:
1) Build a Chimera because different parts evolve differently
2) Build a car with the hood first rather than the engine
3) Build a product with dysfunctional parts or with annoying and incredibly funny messages
4) Build a product with clunky parts that break down on the customer site
5) Build a product that forces you to twist your arms as you work with it
6) Build a product that makes you want to buy another from the same maker
7) Build a product that highly priced to justify the cost of development and have enterprise foot the bill
8) Build a product that features emerging trends and showcases quirky development rather than the use cases
9) Build a product that requires more manuals to be written but never read
10) Build a product that requires frequent updates with patches for security
11) Build a product that causes outages from frequent updates
12) Build a product that snoops on user activity or collects and send information to the maker
13) Build a product that won’t play well with others to the detriment of the customer buying the software
14) Build a product that says so much in legal language that the it amounts to extortion
15) Build a product that requires the customer to pay for servicing.
16) Build a product that splashes on the newspapers when a security flaw is found.
17) Build a product that finds a niche space and continues to grow by assimilating other products and results in huge debt towards blending the mix
18) Build a product with hard to find defects or those that involve significant costs and pain
19) Build a product with ambitious goals and then retrofit short term realizations
20) Build a product with poorly defined specification that causes immense code churn between releases causing ripples to the customer
21) Build a product with overly designed specification that goes obsolete by the time it hits the shelf.
22) Build a product that fails to comply with regional and global requirements and enforcements causing trouble for customers to work with each other
23) Build a product that creates a stack and ecosystem dedicated to its own growth and with significant impairment for partners and competitors to collaborate
24) Build a product that impairs interoperability to drive business
Software engineers frequently share images that illustrate their ideas, feelings and mechanisms during their discussions in building the product. Some of these images resonate across the industry but no effort has yet been undertaken to unify a comic them and present it as a comic strip. Surprisingly, there is no dearth of such comic books and illustrations in parenting and those involving kids.

Friday, September 27, 2019

Performance Tuning considerations in Stream queries:
Query language and query operators have made writing business logic extremely easy and independent of the data source. This suffices for the most part but there are a few cases when the status quo is just not enough. Enter real-time processing needs and high priority queries, the size of the data, the complexity of the computation and the latency of the response begins to become a concern.
Databases have had a long and cherished history in encountering and mitigating query execution responses. However, relational databases pose a significantly different domain of considerations as opposed to NoSQL storage primarily due to the layered and interconnected data requiring scale up rather than scale out technologies. Both have their independent performance tuning considerations.
Stream storage is no different to suffer from performance issues with disparate queries ranging from small to big. The compounded effect of append only data and stream requiring to be evaluated in windows makes iterations difficult. The processing of the streams is also exported out of the storage and this causes significant round trip time and back and forth.
Apache stack has significantly improved the offerings on the stream processing. Apache Kafka and Flink are both able to execute with stateful processing. They can persist the states to allow the processing to pick up where it left off. The states also help with fault tolerance. This persistence of state protects against failures including data loss. The consistency of the states can also be independently validated with a checkpointing mechanism also available from Flink. The checkpointing can persist the local state to a remote store.  Stream processing applications often take in the incoming events from an event log.  This event log therefore stores and distributes event streams which are written to durable append only log on tier 2 storage where they remain sequential by time. Flink can recover a stateful streaming application by restoring its state from a previous checkpoint. It will adjust the read position on the event log to match the state from the checkpoint. Stateful stream processing is therefore not only suited for fault tolerance but also reentrant processing and improved robustness with the ability to make corrections. Stateful stream processing has become the norm for event-driven applications, data pipeline applications and data analytics application.
Persistence of streams for intermediate executions helps with reusability and improves the pipelining of operations so the query operators are small and can be executed by independent actors. If we have equivalent of lambda processing on persisted streams, the pipelining can significantly improve performance earlier where the logic was monolithic and proved slow from progressing window to window. There is no distinct thumb rule but the fine-grained operators have proven to be effective since they can be studied.
Streams that articulate the intermediary result also help determine what goes into each stage of the pipeline. Watermarks and savepoints are similarly helpful This kind of persistence proves to be a win-win situation for parallelizing as well as subsequent processing while disk access used to be costly in dedicated systems.  There is no limit to the number of operators and their scale to be applied on streams so proper planning mitigates the efforts need to choreograph a bulky search operation.
These are some of the considerations for the performance improvement of stream processing.

Thursday, September 26, 2019

Since Event is a core Kubernetes resource much as the same as others, it not only has the metadata with it just like other resources, but also can created, updated and deleted just like others via APIs. 
func recordEvent(sink EventSink, event *v1.Event, patch []byte, updateExistingEvent bool, eventCorrelator *EventCorrelator) bool { 
: 
newEvent, err = sink.Create(event) 
: 
} 
Events conform to append only stream storage due to the sequential nature of the events. Events are also processed in windows making a stream processor such as Flink extremely suitable for events. Stream processors benefit from stream storage and such a storage can be overlaid on any Tier-2 storage. In particular, object storage unlike file storage can come very useful for this purpose since the data also becomes web accessible for other analysis stacks. 

As compute, network and storage are overlapping to expand the possibilities in each frontier at cloud scale, message passing has become a ubiquitous functionality. While libraries like protocol buffers and solutions like RabbitMQ are becoming popular, Flows and their queues can be given native support in unstructured storage. Messages are also time-stamped and can be treated as events  
Although stream storage is best for events, any time-series database could also work. However, they are not web-accessible unless they are in an object store. Their need for storage is not very different from applications requiring object storage that facilitate store and access. However, as object storage makes inwards into vectorized execution, the data transfers become increasingly fragmented and continuous. At this junction it is important to facilitate data transfer between objects and Event and it is in this space that Events and object store find suitability. Search, browse and query operations are facilitated in a web service using a web-accessible store. 

Query in Event Stream is easier with Flink than using own web service to enumerate the DataSet:
package org.apache.flink.examples.java.eventcount;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.utils.ParameterTool;
import org.apache.flink.examples.java.wordcount.util.WordCountData;
import org.apache.flink.util.Collector;

public static void main(String[] args) throws Exception {

final ParameterTool params = ParameterTool.fromArgs(args);
final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
env.getConfig().setGlobalJobParameters(params);

DataSet<String> text;
text = env.readTextFile(params.get("input"));

DataSet<Tuple2<String, Integer>> counts =
text.flatMap(new Tokenizer())
.groupBy(0)
.sum(1);
counts.print();
}

public static final class Tokenizer implements FlatMapFunction<String, Tuple2<String, Integer>> {
@Override
public void flatMap(String value, Collector<Tuple2<String, Integer>> out) {
String hours = getSubstring(entry, "before");
out.collect(new Tuple2<>(hours, 1));
}
}

Wednesday, September 25, 2019

Logs  may also optionally be converted to events which can then be forwarded to event gateway. Products like badger db are able to retain the events for a certain duration.

Components will then be able to raise events like so:
c.recorder.Event(component, corev1.EventTypeNormal, successReason, successMessage)

An audit event can be similarly raised as shown below:
        ev := &auditinternal.Event{
                RequestReceivedTimestamp: metav1.NewMicroTime(time.Now()),
                Verb:                     attribs.GetVerb(),
                RequestURI:               req.URL.RequestURI(),
                UserAgent:                maybeTruncateUserAgent(req),
                Level:                    level,
        }
With the additional data as:
        if attribs.IsResourceRequest() {
                ev.ObjectRef = &auditinternal.ObjectReference{
                        Namespace:   attribs.GetNamespace(),
                        Name:        attribs.GetName(),
                        Resource:    attribs.GetResource(),
                        Subresource: attribs.GetSubresource(),
                        APIGroup:    attribs.GetAPIGroup(),
                        APIVersion:  attribs.GetAPIVersion(),
                }
        }

Since Event is a core Kubernetes resource much as the same as others, it not only has the metadata with it just like other resources, but also can created, updated and deleted just like others via APIs.

func recordEvent(sink EventSink, event *v1.Event, patch []byte, updateExistingEvent bool, eventCorrelator *EventCorrelator) bool {
:
newEvent, err = sink.Create(event)
:
}
Events conform to append only stream storage due to the sequential nature of the events. Events are also processed in windows making a stream processor such as Flink extremely suitable for events. Stream processors benefit from stream storage and such a storage can be overlaid on any Tier-2 storage. In particular, object storage unlike file storage can come very useful for this purpose since the data also becomes web accessible for other analysis stacks.

As compute, network and storage are overlapping to expand the possibilities in each frontier at cloud scale, message passing has become a ubiquitous functionality. While libraries like protocol buffers and solutions like RabbitMQ are becoming popular, Flows and their queues can be given native support in unstructured storage. Messages are also time-stamped and can be treated as events
Although stream storage is best for events, any time-series database could also work. However, they are not web-accessible unless they are in an object store. Their need for storage is not very different from applications requiring object storage that facilitate store and access. However, as object storage makes inwards into vectorized execution, the data transfers become increasingly fragmented and continuous. At this junction it is important to facilitate data transfer between objects and Event and it is in this space that Events and object store find suitability. Search, browse and query operations are facilitated in a web service using a web-accessible store.


Tuesday, September 24, 2019

We wee discussing how to implement the kunernetes sink destination as a service within the cluster, in this regard, the service merely needs to listen on a port.

public class PravegaGateway {
    private static final Logger logger = Logger.getLogger(PravegaGateway.class.getName());

    private Server server;

    private void start() throws IOException {
        int port = CommonParams.getListenPort();
        server = NettyServerBuilder.forPort(port)
                .permitKeepAliveTime(1, TimeUnit.SECONDS)
                .permitKeepAliveWithoutCalls(true)
                .addService(new PravegaServerImpl())
                .build()
                .start();
        logger.info("Server started, listening on " + port);
        logger.info("Pravega controller is " + CommonParams.getControllerURI());
        Runtime.getRuntime().addShutdownHook(new Thread() {
            @Override
            public void run() {
                // Use stderr here since the logger may have been reset by its JVM shutdown hook.
                System.err.println("*** shutting down gRPC server since JVM is shutting down");
                PravegaGateway.this.stop();
                System.err.println("*** server shut down");
            }
        });
    }

Monday, September 23, 2019

We continue with our discussion on logging in Kubernetes. In the example provided, a log service is assumed to be made available. This service accessible at a predefined host and port.
Implementing the log-service within the cluster and listening on a tcp port will help with moving the log destination inside the cluster as stream storage and yet another application within the cluster. A service listening on a port typical for a Java application. The application will read from the socket and write to a pre-determined stream defined by the following parameters:
Stream name
Stream scope name
Stream reading timeout
Stream credential username
Stream credential password

public void write(String scope, String streamName, URI controllerURI, String routingKey, String message) {
        StreamManager streamManager = StreamManager.create(controllerURI);
        final boolean scopeIsNew = streamManager.createScope(scope);

        StreamConfiguration streamConfig = StreamConfiguration.builder()
                .scalingPolicy(ScalingPolicy.fixed(1))
                .build();
        final boolean streamIsNew = streamManager.createStream(scope, streamName, streamConfig);

        try (ClientFactory clientFactory = ClientFactory.withScope(scope, controllerURI);
             EventStreamWriter<String> writer = clientFactory.createEventWriter(streamName,
                                                                                 new JavaSerializer<String>(),
                                                                                 EventWriterConfig.builder().build())) {
           
            System.out.format("Writing message: '%s' with routing-key: '%s' to stream '%s / %s'%n",
                    message, routingKey, scope, streamName);
            final CompletableFuture writeFuture = writer.writeEvent(routingKey, message);
        }
}

public void readAndTransform(String scope, String streamName, URI controllerURI) {
        StreamManager streamManager = StreamManager.create(controllerURI);
       
        final boolean scopeIsNew = streamManager.createScope(scope);
        StreamConfiguration streamConfig = StreamConfiguration.builder()
                .scalingPolicy(ScalingPolicy.fixed(1))
                .build();
        final boolean streamIsNew = streamManager.createStream(scope, streamName, streamConfig);

        final String readerGroup = UUID.randomUUID().toString().replace("-", "");
        final ReaderGroupConfig readerGroupConfig = ReaderGroupConfig.builder()
                .stream(Stream.of(scope, streamName))
                .build();
        try (ReaderGroupManager readerGroupManager = ReaderGroupManager.withScope(scope, controllerURI)) {
            readerGroupManager.createReaderGroup(readerGroup, readerGroupConfig);
        }

        try (ClientFactory clientFactory = ClientFactory.withScope(scope, controllerURI);
             EventStreamReader<String> reader = clientFactory.createReader("reader",
                                                                           readerGroup,
                                                                           new JavaSerializer<String>(),
                                                                           ReaderConfig.builder().build())) {
            System.out.format("Reading all the events from %s/%s%n", scope, streamName);
            EventRead<String> event = null;
            do {
                try {
                    event = reader.readNextEvent(READER_TIMEOUT_MS);
                    if (event.getEvent() != null) {
                        System.out.format("Read event '%s'%n", event.getEvent());
String message = transform(event.getEvent().toString());
write("srsScope", "srsStream", "tcp://127.0.0.1:9090", "srsRoutingKey", message);
                    }
                } catch (ReinitializationRequiredException e) {
                    //There are certain circumstances where the reader needs to be reinitialized
                    e.printStackTrace();
                }
            } while (event.getEvent() != null);
            System.out.format("No more events from %s/%s%n", scope, streamName);
        }
    }
}

Sunday, September 22, 2019

Logging in Kubernetes illustrated:
Kubernetes is a container orchestration framework that hosts applications. The logs from the applications are saved local to the container. Hosts are nodes of a cluster and can be scaled out to as many as required for the application workload. A forwarding agent forwards the logs from the container to a sink. A sink is a destination where all the logs forwarded can be redirected to a log service that can better handle the storage and analysis of the logs. The sink is usually external to the cluster so it can scale independently and may be configured to receive via a well known protocol called syslog. A continuously running process forwards the logs from the container to the node before drains to the sink via the forwarding agent. This process is specified with a syntax referred to as a Daemonset which enables the same process to be started on a set of nodes which is usually all the nodes in the cluster.

 The log destination may be internal to the Kubernetes if the application providing log service hosted within the cluster.
The following are the configurations to setup the above logging mechanism:
1. Registering a sink controller:
apiVersion: apps.pivotal.io/v1beta1
kind: Sink
metadata:
  name: YOUR-SINK
  namespace: YOUR-NAMESPACE
spec:
  type: syslog
  host: YOUR-LOG-DESTINATION
  port: YOUR-LOG-DESTINATION-PORT
  enable_tls: true

2. Deploying a forwarding agent
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: fluentd
  namespace: kube-system
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-system
3. Deploying the Daemonset
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-logging
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:elasticsearch
        env:
          - name:  FLUENT_ELASTICSEARCH_HOST
            value:  “sample.elasticsearch.com”
          - name:  FLUENT_ELASTICSEARCH_PORT
            value: "30216"
          - name: FLUENT_ELASTICSEARCH_SCHEME
            value: "https"
          - name: FLUENT_UID
            value: "0"
          # X-Pack Authentication
          # =====================
          - name: FLUENT_ELASTICSEARCH_USER
            value: "someusername"
          - name: FLUENT_ELASTICSEARCH_PASSWORD
            value: "somepassword"
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers


Saturday, September 21, 2019

Aspects of monitoring feature in a containerization framework.
This article describes the salient features of tools used for monitoring containers hosted on an orchestration framework.  Some products are dedicated for this purpose and strive to make it easier for administrators to help with this use-case. They usually fall in two categories – one for a limited set of built-in metrics that help with, say, the master to manage the pods and another that gives access to custom metrics which helps with, say, horizontal scaling of resources.
Most of the metrics products such as Prometheus for Kubernetes orchestration framework fall in the second category. The first category is rather lightweight and served over the http using resource APIs
The use of metrics is a common theme and the metrics are defined in a JSON format with key value pairs and timestamps. They might also carry additional descriptive information that might aid the handling of metrics.  Metrics are evaluated by expressions that usually look at a time-based window which gives slices of data points. These data points allow the use of calculator functions such as sum, min, average and so on.
Metrics often follow their own route to a separate sink. In some deployments of the Kubernetes orchestration framework, the sink refers to an external entity that know how to store and query metrics. The collection of metrics at the source and the forwarding of metrics to the destination follow conventional mechanisms that are similar to logs and audit events both of which have their own sinks.
As with all agents and services on a container, a secret or an account is required to control the accesses to resources for their activities. This role-based access control, namespace and global naming conventions is a prerequisite for any agent.
The agent has to run continuously forwarding data with little or no disruption. Some orchestrators facilitate this with the help of concepts similar to a Daemonset that run endlessly. The deployment is verified to be working correctly when a standard command produces an output same as a pre-defined output. Verification of monitoring capabilities becomes part of the installation feature.
The metrics comes helpful to be evaluated against thresholds that trigger alerts. This mechanism is used to complete the monitoring framework which allows rules to be written with expressions involving thresholds that then raise suitable alerts. Alerts may be delivered via messages or email or any other form of notification services. Dashboards and mitigation tools may be provided from the product providing full monitoring solution.
Almost all activities of any resource in the orchestrator framework can be measured. These include the core system resources which may spew the data to logs or to audit event stream. The option to combine metrics, audit and logs effectively rests with the administrator rather than the product designed around one or more of these.
Specific queries help with the charts for monitoring dashboards. These are well-prepared and part of a standard portfolio that help with the analysis of the health of the containers.

Friday, September 20, 2019

We looked at a few examples of applying audit data for the overall product. In all these cases, the audit dashboards validate the security and integrity of the data.

  • The incident review audit dashboard provides an overview of the incidents associated with users. 
  • The Suppression audit dashboard provides an overview of notable event suppression activity. 
  • The Per-Panel Filter Audit dashboard provides information about the filters currently in use in the deployment.
  • The Adaptive Response Action Center dashboard provides an overview of the response actions initiated by adaptive response actions, including notable event creation and risk scoring.
  • The Threat Intelligence Audit dashboard tracks and displays the current status of all threat and generic intelligence sources. 
  • The product configuration health dashboard is used to compare the latest installed version of the product to prior releases and identity configuration anomalies. 

The Data model audit dashboard displays the information about the state of data model accelerations in the environment. Acceleration here refers to a speed up of data models that represent extremely large datasets where certain operations such as pivots become faster with the use of data-summary backed methods.
The connectors audit report on hosts forwarding data to the product. This audit is an example for all other components that participate in data handling
The data protection dashboard reports on the status of the data integrity controls.
Audit dashboards provide a significant opportunity to show complete, rich, start-to-finish user session activity data in real-time. These include all access attempts, session commands, data accessed, resources used, and many more. Dashboards can also be compelling and intuitive for Administrator intervention to user experience. Security information and event management can be combined from dedicated systems as well as application audit. This helps to quickly and easily resolve security incidents. The data collection can be considered as tamper-proof which makes the dashboard the source of truth.

Thursday, September 19, 2019


Let us look at a few examples of applying audit data for the overall product. In all these cases, the audit dashboards validate the security and integrity of the data. The audit data must be forwarded and the data should not be tampered with.
The incident review audit dashboard provides an overview of the incidents associated with users. It displays how many incidents are associated with a specific user. The incidents may be selected based on different criteria such by status, by user or by kind or other forms of activity. Recent activities also help determine the relevance of the incidents.
The Suppression audit dashboard provides an overview of notable event suppression activity. This dashboard shows how many events are being suppressed, and by whom, so that notable event suppression can be audited and reported on. Suppression is just as important as an audit of access to resources.
The Per-Panel Filter Audit dashboard provides information about the filters currently in use in the deployment.
The Adaptive Response Action Center dashboard provides an overview of the response actions initiated by adaptive response actions, including notable event creation and risk scoring.
The Threat Intelligence Audit dashboard tracks and displays the current status of all threat and generic intelligence sources. As an analyst, you can review this dashboard to determine if threat and generic intelligence sources are current, and troubleshoot issues connecting to threat and generic intelligence sources.
The ES configuration health dashboard is used to compare the latest installed version of the product to prior releases and identity configuration anomalies. The dashboard can be made to review against specific past versions.
The Data model audit dashboard displays the information about the state of data model accelerations in the environment. Acceleration here refers to a speed up of data models that represent extremely large datasets where certain operations such as pivots become faster with the use of data-summary backed methods.
The connectors audit report on hosts forwarding data to the product. This audit is an example for all other components that participate in data handling
The data protection dashboard reports on the status of the data integrity controls.

Wednesday, September 18, 2019

There are a few caveats to observe when processing events as opposed to other resources with webhooks. 

Events can come as a cascade, we cannot cause undue delay in the processing of any one of them. Events can cause the generation of other events. This would require stopping additional events from generating if the webhook is responsible for transforming and creating new events that cause those additional events. The same event can occur again, if the transformed event triggers the original event or if the transformed events ends up again in the webhook because the transformed event is also of the same kind as the original event. The webhook has to skip the processing of the event it has already handled. 

The core resources of Kubernetes all are subject to the same verbs of Get, create, delete, etc. However, events are far more numerous and occur at a fast clip than some of the other resources. This calls for handling and robustness equivalent to the considerations in a networking protocol or a messaging queue. The webhook is not designed to handle such rate or latency even though it might skip a lot of the events. Consequently, selector labels and processing criteria become all the more important. 

The performance consideration of the event processing by webhooks aside, a transformation of an event by making an http request to an external service also suffers from outbound side considerations such as the timeouts and retries on making the http request. Since performance of the webhooks of events has been called out above as significantly different from the few and far in between processing for other resources, the webhook might have to make do with lossy transformation as opposed to a persistence-based buffering.  

None of the above criteria should matter for the usage of this kind of webhook for the purpose of diagnostics which is most relevant to troubleshooting analysis. Diagnostics have no requirement on performance from webhook. When the events are already filtered to audit events, and when the conversion of those events is even more selective, the impact on diagnostics is little or none from changes in the rate and delay of incoming events.