Wednesday, October 14, 2020

Network engineering continued ...

This is a continuation of the earlier posts starting with this one: http://ravinote.blogspot.com/2020/09/best-practice-from-networking.html  

  1. A peer-to-peer (P2P) network is popular for file-sharing and data-sharing activities because data accrues based on popularity, not on utility.

  2. A P2P network introduces a network-first design where peers are autonomous agents and a protocol enables them to negotiate contracts, transfer data, verify the integrity and availability of remote data, and reward one another with payments. It provides tools for all of these interactions. It enables a file to be distributed across the network as shards, and those shards can be located using a distributed hash table. The shards themselves need not be stored in the hash table; rather, the distributed network and its messaging can supply the location information.

    1. P2P networks can be structured or unstructured.

    2. In a structured topology, the P2P overlay is tightly controlled, usually with the help of a distributed hash table (DHT). The location of a data object is deterministic because peers are chosen with identifiers corresponding to the data object's unique key. Content goes to specified locations, which makes subsequent queries easier; a minimal lookup sketch appears after this list.

    3. Unstructured P2P networks are composed of peers that join based on some rules, usually without any knowledge of the topology. A query is broadcast, and peers that have matching content return the data to the originating peer. This works well for highly replicated items but is not appropriate for rare ones; peers become readily overloaded, and the system does not scale when there is a high rate of aggregate queries.
      A P2P network is considered top-heavy, meaning an inverted pyramid of layers where the bottom layer is the network layer. This is the substrate that connects the different peers. The overlay node management layer above it handles these peers in terms of routing, location lookup, and resource discovery. The layer on top of that is the features management layer, which involves security management, resource management, reliability, and fault resilience.
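
To make the structured lookup concrete, here is a minimal sketch of deterministic shard placement using consistent hashing, one common way a DHT maps a data object's key to a peer. The peer names and shard key are illustrative, and a real DHT such as Chord or Kademlia would also handle routing, churn, and replication:

```python
import hashlib
from bisect import bisect_right

def ring_hash(value: str) -> int:
    # Hash a name or key to a point on the ring.
    return int(hashlib.sha1(value.encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, peers):
        # Each peer gets an identifier on the ring derived from its name.
        self._ring = sorted((ring_hash(p), p) for p in peers)

    def locate(self, shard_key: str) -> str:
        # A shard lives on the first peer whose identifier follows the
        # shard key's hash on the ring (wrapping around at the end).
        points = [h for h, _ in self._ring]
        idx = bisect_right(points, ring_hash(shard_key)) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["peer-a", "peer-b", "peer-c"])
print(ring.locate("file.bin/shard-0"))  # deterministic peer for this shard
```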

Tuesday, October 13, 2020

Network engineering continued ...

This is a continuation of the earlier posts starting with this one: http://ravinote.blogspot.com/2020/09/best-practice-from-networking.html  

Message passing between agents is required in a distributed environment, and almost any protocol can serve. Some networking systems adopt open-source messaging libraries for this, while others build on message queuing.
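
As a minimal sketch of such message passing, the following uses an in-process queue between two agents; in a real deployment the queue would be a broker such as RabbitMQ or Kafka, and the agent roles here are illustrative:

```python
import queue
import threading

inbox = queue.Queue()

def producer():
    # One agent enqueues messages for the other.
    for i in range(3):
        inbox.put({"seq": i, "payload": f"order-{i}"})
    inbox.put(None)  # sentinel to signal shutdown

def consumer():
    # The other agent drains the queue until the sentinel arrives.
    while (msg := inbox.get()) is not None:
        print("processed", msg["payload"])
        inbox.task_done()

threading.Thread(target=producer).start()
consumer()
```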

Every network prepares for fault tolerance. Since faults can occur within a domain, temporarily or permanently, each component must determine which activities it can still perform and how to work around what is unavailable.

A fault domain groups known faults in isolation, yet some faults occur in combination. It is best to give names to recurring patterns of faults so that they can be included in the design of components.
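
As an illustration of naming fault patterns, the sketch below declares fault domains as an enumeration that components can reference in their design; the domain names and the component are hypothetical:

```python
from enum import Enum, auto

class FaultDomain(Enum):
    NODE_CRASH = auto()
    NETWORK_PARTITION = auto()
    DISK_FAILURE = auto()
    RACK_POWER_LOSS = auto()  # a named combination: every node in a rack at once

class Component:
    def __init__(self, name, tolerates):
        self.name = name
        self.tolerates = set(tolerates)  # fault patterns designed for

    def can_survive(self, fault: FaultDomain) -> bool:
        return fault in self.tolerates

broker = Component("queue-broker", {FaultDomain.NODE_CRASH, FaultDomain.DISK_FAILURE})
print(broker.can_survive(FaultDomain.NETWORK_PARTITION))  # False: not designed for
```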

Data-driven computing has required changes in networking products. While online transactional activities were previously read-write intensive and synchronous, today most processing, including orders and payments, is done asynchronously on data-driven frameworks, usually employing a message queue. Networking products do better with improved caching for this kind of processing.

The latency allowed for the execution of an order, and the prevention of repeats in order processing, has not been relaxed. A gift card run through a store register for renewal must make its way to the networking backend so that it is not charged again when run through a second time before the prior execution has completed. A networking product such as a message queue should not require a database with strong ACID guarantees, but it should support serializable reads against all read-write operations.
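
The sketch below illustrates one way to prevent such repeats: process each message under a deduplication key, so a replayed gift-card renewal is not charged twice. The in-memory store and key format are illustrative stand-ins for a queue consumer's state:

```python
processed = {}  # dedup key -> result of the first execution

def handle_renewal(dedup_key: str, amount: int) -> str:
    if dedup_key in processed:
        # Repeat delivery: return the prior result without charging again.
        return processed[dedup_key]
    result = f"charged {amount}"  # the real charge happens exactly once
    processed[dedup_key] = result
    return result

print(handle_renewal("card-42:2020-10-13", 25))  # charges the card
print(handle_renewal("card-42:2020-10-13", 25))  # replay is a no-op
```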

Monday, October 12, 2020

Network engineering continued ...

   This is a continuation of the earlier posts starting with this one: http://ravinote.blogspot.com/2020/09/best-practice-from-networking.html  

The speed of data transfer is important for social networking applications, where a large number of concurrent messages, on the order of millions per second, must be processed. A messaging platform such as WhatsApp's, written in Erlang, may be more performant than servers built on extensive inter-process communication, whether in the form of gRPC or REST.

Social networking applications also have heavy load-balancing requirements and need a large number of servers provisioned to handle their load. Bandwidth and latency requirements can be met when more servers are available, and additional servers also ease the practical trade-offs surfaced by the CAP theorem.

The eight fallacies of distributed computing can be listed as:

The network is reliable 

There is no latency 

The bandwidth is infinite 

The network is secure 

Topology does not change 

There is only one administrator 

Transport cost is zero 

The network is homogeneous 

Networking, more than storage, is what allows distributed hash tables and message queues to scale to social networking applications. WhatsApp's Erlang and FreeBSD architecture has shown unmatched symmetric multiprocessing (SMP) scalability.

Sunday, October 11, 2020

Network engineering continued ...

  This is a continuation of the earlier posts starting with this one: http://ravinote.blogspot.com/2020/09/best-practice-from-networking.html 


Data by type: Networks usually don't interpret data in transit; the data appears as a sequence of bits, and only the header information of the packet helps classify it. But typing is not limited to networking protocols. Data types can also travel with the data to enrich it, and many message libraries such as protobuf make use of this. It helps with the interpretation and validation of data at either end.
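
As a sketch of the idea, the following tags each payload with a type so the receiver can interpret and validate it, in the spirit of schema-carrying formats like protobuf. The wire format here (a one-byte tag, a four-byte length, then a JSON body) is illustrative, not any real library's encoding:

```python
import json
import struct

TYPE_ORDER, TYPE_PAYMENT = 1, 2

def encode(type_tag: int, payload: dict) -> bytes:
    # Prefix the body with its type and length so the receiver can frame it.
    body = json.dumps(payload).encode()
    return struct.pack("!BI", type_tag, len(body)) + body

def decode(frame: bytes):
    type_tag, length = struct.unpack("!BI", frame[:5])
    payload = json.loads(frame[5:5 + length])
    # The type tag enables validation at the receiving end.
    if type_tag == TYPE_ORDER and "sku" not in payload:
        raise ValueError("order payload failed validation")
    return type_tag, payload

print(decode(encode(TYPE_ORDER, {"sku": "gift-card", "qty": 1})))
```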


Intrusions are detected by activity or policy violations. Intrusion detection can occur at the network level or at the host level; in either case, deviations from typical activity are detected and reported. At the network level, the monitoring compares network traffic against a known set of attacks. At the host level, the monitoring watches important files that may be opened and used, or the packets visible to that host alone. Systems with response capabilities are referred to as intrusion prevention systems.


The methods of intrusion detection vary between signatures and anomalies. Signature-based intrusion detection checks for patterns or sequences used by malware. Anomaly-based intrusion detection catches new or unknown attacks by comparing activity to a model of normal behavior. Since there may be a lot of noise, precision and recall are tuned by adjusting the model's parameters, as in the sketch below.
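
Here is a minimal sketch of anomaly-based detection: flag traffic whose request rate deviates from a baseline model by more than a threshold. The baseline window and threshold are illustrative tuning parameters of the kind that trade precision against recall:

```python
from statistics import mean, stdev

baseline = [95, 102, 99, 101, 97, 103, 100]  # requests/sec under normal load
mu, sigma = mean(baseline), stdev(baseline)
THRESHOLD = 3.0  # higher -> fewer false positives, lower recall

def is_anomalous(rate: float) -> bool:
    # Deviation from the model of normal activity, in standard deviations.
    return abs(rate - mu) / sigma > THRESHOLD

print(is_anomalous(101))  # False: within normal variation
print(is_anomalous(450))  # True: possible flood / DDoS
```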


Intrusion prevention mechanisms can monitor the entire network, a wireless network, traffic for anomalies such as a distributed denial-of-service attack, or a single host. The detection methods involved include signature-based detection, statistical anomaly-based detection, and stateful protocol analysis.


Open-source intrusion detection has become popular with projects such as OSSEC.

Saturday, October 10, 2020

Network engineering continued ...

 This is a continuation of the earlier posts starting with this one: http://ravinote.blogspot.com/2020/09/best-practice-from-networking.html 

Leveraging monitoring of the host: When an application is deployed to a Platform-as-a-Service, it no longer carries the burden of maintaining its own monitoring. The same applies to a networking service sold as a product, depending on where it is deployed. The deeper we go in the stack, including the fabric below the networking server, the more amenable each layer is to monitoring the layer above it.

Driver verification framework: Not all network actions are fire-and-forget. Some are state-driven, and since there can be an arbitrary delay between transitions, some form of driver verifier framework consistently proves useful in network software development. Testing with such a verifier helps exhaust the protocol exchanges, as sketched below.
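
The sketch below shows the verifier idea in miniature: drive every event from every state of a protocol state machine and assert that no transition lands somewhere undefined. The states, events, and transition table are illustrative:

```python
from itertools import product

STATES = {"idle", "connecting", "established", "closed"}
EVENTS = {"connect", "ack", "close", "timeout"}
TRANSITIONS = {
    ("idle", "connect"): "connecting",
    ("connecting", "ack"): "established",
    ("connecting", "timeout"): "idle",
    ("established", "close"): "closed",
    ("established", "timeout"): "closed",
}

def step(state: str, event: str) -> str:
    # Undefined (state, event) pairs are treated as explicit no-ops here;
    # a stricter verifier would flag them instead.
    return TRANSITIONS.get((state, event), state)

# Exhaust every protocol exchange: all states crossed with all events.
for state, event in product(STATES, EVENTS):
    nxt = step(state, event)
    assert nxt in STATES, f"undefined transition from {state} on {event}"
print("all", len(STATES) * len(EVENTS), "transitions verified")
```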

There are periods of peak workload for any networking product: annual holiday sales, specific anniversaries, and pre-planned high demand. Utilization of the product under such activity is unusually high. While capacity may be planned to meet the demand, there are also ways to tune the existing system to extract more performance. These efforts include switching from disk-intensive work to more in-memory computation and pairing the networking server with software such as Memcached; a caching sketch follows.
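
For example, hot reads can be offloaded to an in-memory cache. The sketch below assumes the pymemcache client library and a memcached instance on localhost:11211; fetch_from_disk is a hypothetical stand-in for the slow path:

```python
from pymemcache.client.base import Client

cache = Client(("localhost", 11211))

def fetch_from_disk(key: str) -> bytes:
    # Hypothetical disk-intensive path we want to avoid under peak load.
    return b"expensive result for " + key.encode()

def get(key: str) -> bytes:
    value = cache.get(key)            # in-memory fast path
    if value is None:
        value = fetch_from_disk(key)  # slow path, then populate the cache
        cache.set(key, value, expire=300)
    return value
```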

When the load is high, it is difficult to turn on a profiler to study bottlenecks. This can be done safely in advance in performance labs, but an even easier strategy is to selectively turn off components that are not required and to scale out the component that is under duress. A prioritized list of mitigation steps can then be curated before periods of heavy load.


Friday, October 9, 2020

Network engineering continued ...

This is a continuation of the earlier posts starting with this one: http://ravinote.blogspot.com/2020/09/best-practice-from-networking.html 

  • In-memory server: If the networking server is stateless, it can run entirely in memory and require no state persistence. A server that runs entirely in memory with little or no disk access can improve performance by orders of magnitude; a minimal sketch appears after this list.

  • Distributed ledger: This is gaining popularity where there is no central ledger and no requirement for central ownership to verify grants and revocations. It mitigates tampering. It is a great store for network data and works for products that do not belong to a single organization or cloud.

  • Footprint: The code for a networking server can run on any device. Java, for example, runs on billions of devices, and a networking server written in Java can run even on pocket devices. If the storage is flash and the entire server runs only on flash, it makes a great candidate for usability.

  • Editions: Just as the code for a networking service can be made suitable for different devices, it can ship as different editions. One way to determine the editions is to base them on where customers demand them. Although many perspectives weigh into these decisions, the product ultimately serves the customer.
     
  • Standalone mode: Most networking products offer capabilities in a standalone mode. This makes it easy to try out the product without dependencies. One-box deployments also work this way. Removing the dependency on specific hardware makes the product easier to study and its features easier to try out.
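
As a minimal sketch of a stateless, entirely in-memory server, here is a TCP echo service that keeps nothing on disk; the host and port are illustrative:

```python
import socketserver

class EchoHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Everything stays in memory: read lines, transform, write back.
        for line in self.rfile:
            self.wfile.write(line.upper())

if __name__ == "__main__":
    with socketserver.TCPServer(("localhost", 9000), EchoHandler) as srv:
        srv.serve_forever()
```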

Thursday, October 8, 2020

Network engineering continued ...

This is a continuation of the earlier posts starting with this one: http://ravinote.blogspot.com/2020/09/best-practice-from-networking.html 

    1. Event-based programming is harder to coordinate and diagnose than sequential programming, yet it is fondly used in many storage drivers and even in user-mode components that do not need to be highly responsive or where there may be a significant delay between action triggers. It requires a driver verifier to analyze all the code paths. Often, synchronous execution suffices instead, with an object-oriented design for better organization and easier troubleshooting. While it is possible to mix the two, keeping the execution aligned with the timeline that the logs show for the activities performed by the storage product helps reduce the overall cost of maintenance.


    2. Stack traces: When the layers of a software product are traversed by shared data structures such as login contexts, it is helpful to capture and accumulate stack traces at the layer boundaries for troubleshooting. A ring buffer of stack traces provides instant history for the data structure; a sketch appears after this list.


    3. Wrapping: This is done not just for packets, certificates, encryption, or exceptions. Wrapping works for any artifact that is refreshed or renewed where we want to preserve the old along with the new. It may apply even to the headers and metadata of data structures in the control path.


    4. Boundaries: The boundaries in a continuous stream of data usually depend on one of two things: fixed-length segments or variable lengths determined by application logic. Here the application refers to the component and the scope performing the demarcation on the data. Boundaries can therefore be nested, non-overlapping, and frequently require translation; a framing sketch appears after this list.


    5. Virtual timeline: Most entities on the network rely on NTP for the actual timeline. With messages passed between distributed components, there is a provision for using sequence numbers as a virtual timeline. Together, boundaries and the virtual timeline enable spatial and temporal capture of the changes to the data, suitable for resolving conflicts. Recording the virtual time with a counter and using it for comparisons is one thing; reconstructing the event sequence is another. A sketch closes this list.
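
The ring buffer of stack traces mentioned in item 2 might look like the following sketch; the buffer size and the boundary-crossing hook are illustrative:

```python
import traceback
from collections import deque

class TracedContext:
    def __init__(self, history: int = 16):
        # Old entries fall off the ring, keeping only recent history.
        self.traces = deque(maxlen=history)

    def crossed_boundary(self):
        # Record where in the code this context is being handed off.
        self.traces.append("".join(traceback.format_stack(limit=5)))

ctx = TracedContext()
ctx.crossed_boundary()  # e.g. entering the auth layer
ctx.crossed_boundary()  # e.g. entering the transport layer
print(f"{len(ctx.traces)} boundary crossings recorded")
```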
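
For the boundaries in item 4, here is a sketch of variable-length demarcation using a length prefix; the four-byte big-endian prefix is an illustrative convention:

```python
import struct

def frame(payload: bytes) -> bytes:
    # Mark the boundary by prefixing each segment with its length.
    return struct.pack("!I", len(payload)) + payload

def deframe(stream: bytes):
    # Recover segment boundaries from the continuous byte stream.
    offset, messages = 0, []
    while offset < len(stream):
        (length,) = struct.unpack_from("!I", stream, offset)
        offset += 4
        messages.append(stream[offset:offset + length])
        offset += length
    return messages

wire = frame(b"hello") + frame(b"segmented world")
print(deframe(wire))  # [b'hello', b'segmented world']
```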
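
And for the virtual timeline in item 5, the sketch below keeps a sequence-number counter in the style of a Lamport clock, one standard way to order events between components without NTP:

```python
class VirtualClock:
    def __init__(self):
        self.time = 0

    def tick(self) -> int:
        # A local event advances the virtual timeline.
        self.time += 1
        return self.time

    def receive(self, sender_time: int) -> int:
        # Jump ahead of the sender's clock so causality is preserved.
        self.time = max(self.time, sender_time) + 1
        return self.time

a, b = VirtualClock(), VirtualClock()
t_send = a.tick()           # a sends a message stamped with t_send
t_recv = b.receive(t_send)  # b's timeline moves past the sender's
assert t_recv > t_send      # the virtual timeline orders the two events
```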