Tuesday, March 5, 2019

Today we continue discussing best practices from storage engineering:

539) From supercomputers to large-scale clusters, the size of compute, storage and network can vary quite a bit. However, the need to own or manage such capability diminishes significantly once it is commoditized and outsourced.

540) Some tasks are high priority and are usually far fewer in number than the general class of tasks. If their arrival is uncontrolled, the cost can be significant. Most storage products try to control the upstream workload for which they are designed. For example, if the high-priority tasks can be clearly distinguished from the rest, they can be handled to advantage.

541) The scheduling policies for tasks can vary from scheduler to scheduler. Usually a simple policy scales much better than a complicated one. For example, if every task holds a slice of a pie that represents the scheduler's capacity, it is simpler to expand the pie than to re-adjust the slices dynamically to accommodate new tasks.
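
A minimal sketch of the expand-the-pie policy, assuming a hypothetical scheduler in which every task receives one fixed-size slice: admitting a task grows the total capacity instead of shrinking anyone else's share.

using System.Collections.Generic;

// Hypothetical fixed-share scheduler: every task gets one fixed slice,
// and admitting a task grows the pie instead of re-dividing it.
class FixedSharePie
{
    const int SliceMs = 10;                                   // fixed quantum per task
    readonly List<string> tasks = new List<string>();

    public int TotalCapacityMs => tasks.Count * SliceMs;      // the pie

    public void Admit(string task) => tasks.Add(task);        // pie expands by one slice

    public int SliceFor(string task) => tasks.Contains(task) ? SliceMs : 0;
}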

542) The weights associated with tasks are set statically and then used in computations to determine the scheduling of the tasks. Scheduling can be measured in quanta of time, and a task that takes more than its expected quantum is called a quantum thief. A scheduler uses tallying to find a quantum thief and make it yield to other tasks.
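
A sketch of such tallying, with hypothetical names and thresholds: the elapsed time of each turn is compared against the granted quantum, and a task whose accumulated overrun crosses a threshold is flagged so the scheduler can make it yield.

using System;
using System.Collections.Generic;
using System.Diagnostics;

// Tally actual run time against the granted quantum and flag repeat offenders.
class QuantumTally
{
    readonly Dictionary<string, long> overrunsMs = new Dictionary<string, long>();

    // Runs one scheduling turn and records how far the task exceeded its grant.
    public void RunTurn(string task, Action body, long quantumMs)
    {
        var watch = Stopwatch.StartNew();
        body();
        long overrun = watch.ElapsedMilliseconds - quantumMs;
        if (overrun > 0)
            overrunsMs[task] = overrunsMs.GetValueOrDefault(task) + overrun;
    }

    // A quantum thief is any task whose tallied overrun crosses the threshold;
    // the scheduler can then shrink its next grant so that it yields sooner.
    public bool IsQuantumThief(string task, long thresholdMs = 50) =>
        overrunsMs.GetValueOrDefault(task) > thresholdMs;
}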

543) Bookkeeping is essential for both the scheduler and the allocator, not only to keep track of grants but also for analysis and diagnostics.

544) A scheduler and an allocator can each have their own manager, which separates the concerns of management from the work itself.

545) The more general purpose the scheduler and allocator become, the easier it is to use them in different components. Commodity implementations win hands down against specialized ones because they scale.

546) Requests for remote resources are expected to take longer than local operations. If they incur timeouts, the quantum grants may need to stretch to cover them.

547) A timeout must expand to include the timeouts of nested operations.
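
One way to realize this, sketched with a hypothetical deadline object that is passed down the call chain so that nested operations draw on the outer operation's remaining budget rather than carrying independent timeouts:

using System;

// Pass a single deadline down the call chain so nested operations
// share, and never exceed, the outer operation's remaining budget.
class Deadline
{
    readonly DateTime expiresAt;

    public Deadline(TimeSpan budget) { expiresAt = DateTime.UtcNow + budget; }

    public TimeSpan Remaining
    {
        get
        {
            var left = expiresAt - DateTime.UtcNow;
            return left > TimeSpan.Zero ? left : TimeSpan.Zero;
        }
    }
}

// Usage with a hypothetical remote client: the outer call and its nested
// verification both consume the same 30-second budget.
// var deadline = new Deadline(TimeSpan.FromSeconds(30));
// remoteClient.Read(key, timeout: deadline.Remaining);
// remoteClient.Verify(key, timeout: deadline.Remaining);   // gets only what is left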


Monday, March 4, 2019

Today we continue discussing best practices from storage engineering:

537) The number of times the network is traversed also matters in the overall cost of data. The lowest cost for data is when it is at rest rather than in transit.

538) The choice between a faster processor, larger storage, or both is a flexible one if the dollar value is the same. In such cases, the strategy can be sequential, streaming or batched. Once the strategy is in place, the dollar TCO increases significantly when business needs change.

539) From supercomputers to large-scale clusters, the size of compute, storage and network can vary quite a bit. However, the need to own or manage such capability diminishes significantly once it is commoditized and outsourced.

540) Some tasks are high priority and are usually far fewer in number than the general class of tasks. If their arrival is uncontrolled, the cost can be significant. Most storage products try to control the upstream workload for which they are designed. For example, if the high-priority tasks can be clearly distinguished from the rest, they can be handled to advantage.

541) The scheduling policies for tasks can vary from scheduler to scheduler. Usually a simple policy scales much better than a complicated one. For example, if every task holds a slice of a pie that represents the scheduler's capacity, it is simpler to expand the pie than to re-adjust the slices dynamically to accommodate new tasks.

542) The weights associated with tasks are set statically and then used in computations to determine the scheduling of the tasks. Scheduling can be measured in quanta of time, and a task that takes more than its expected quantum is called a quantum thief. A scheduler uses tallying to find a quantum thief and make it yield to other tasks.

543) Bookkeeping is essential for both the scheduler and the allocator, not only to keep track of grants but also for analysis and diagnostics.

544) A scheduler and an allocator can each have their own manager, which separates the concerns of management from the work itself.

545) The more general purpose the scheduler and allocator become, the easier it is to use them in different components. Commodity implementations win hands down against specialized ones because they scale.

Sunday, March 3, 2019

We were discussing the implementation of a ledger in our first and second posts. We conclude it today.
There have been two modes to improve sequential access. The first is batch processing, which allows data to be partitioned into batches on parallel threads. This works very well when the partial results can behave as if they were data for the same calculations and the data can be abstracted into batches. This is called summation form. Second, batches can be avoided if the partitions are tiled over. This is called streaming access, and it uses a window over the partitions to make calculations and adjust them as the window slides over the data in a continuous manner. This works well for data that is viewed as continuous and limitless, such as from a pipeline.
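
A minimal sketch of both modes, with illustrative helpers: in summation form the per-batch partial sums behave like data and combine into the total, while the streaming form adjusts a running aggregate as a window slides over a continuous feed.

using System;
using System.Linq;

static class SequentialAccess
{
    // Summation form: per-batch partial results combine into the total.
    public static long BatchedSum(long[][] batches) =>
        batches.AsParallel().Select(b => b.Sum()).Sum();

    // Streaming form: a window slides over a continuous feed, adjusting the
    // running aggregate instead of recomputing it from scratch.
    public static void SlidingSums(long[] feed, int window, Action<long> emit)
    {
        long running = 0;
        for (int i = 0; i < feed.Length; i++)
        {
            running += feed[i];
            if (i >= window) running -= feed[i - window];   // drop the element leaving the window
            if (i >= window - 1) emit(running);
        }
    }
}
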
Operations on the writer side can also be streamlined when the ledger has to scale to large volumes. Some form of parallelization is used here as well, after the load is split into groups of incoming requests. To facilitate faster and better ledger writes, entries are written once and in as much detail as possible, which avoids conflicts with others and enables more operations to be read-only. This separation of read-write and read-only activities not only improves the ledger but also lets it remain the source of truth. Finally, ledgers have grown to be distributed, even while most organizations continue to keep the ledger in-house and open it up only for troubleshooting, inspection, auditing and compliance.
Translations are among the most frequently performed background operations. An example of a translation is one where two different entries are reconciled into one uniform entry so that the calculations can be simpler.
Some of these background operations involve forward-only scanning of a table or list with no skipping. They achieve this with the help of a progress marker that keeps track of the sequence number of the last entry they completed their actions on. This works well when the listing order remains unchanged.
Let us consider a case where this progressive scan may skip a range. Such a case might arise when the listing is ordered but not continuous. There are breaks in the table as it gets fragmented between writes, and the scanner does not see the writes between the reads. There are two ways to handle this. The first is to prevent the write between the reads. This can be enforced with a simple sealing of the table prior to reading so that the writes cascade to a new page. The second is to revisit the range, see whether the count of processed table entries matches the sequence, and redo it when they don't agree. Since the range is finite, the retries are not very expensive and require no alteration of the storage. Both approaches stamp the progress marker at the end of the last processed range. Typically there is only one progress marker, which moves from the end of one range to the next.
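
A sketch of the second approach, with hypothetical entry and processing types: the scanner compares the count of processed entries against the sequence numbers for the range, redoes the range when they disagree, and then stamps the single progress marker at the end.

using System.Collections.Generic;
using System.Linq;

// Forward-only scanner that advances a single progress marker and redoes a
// range whose processed count does not agree with its sequence numbers.
class ProgressiveScanner
{
    public long LastProcessedSeq { get; private set; }        // the single progress marker

    // Entries carry contiguous sequence numbers and a payload; listing order is unchanged.
    public void ScanRange(IReadOnlyList<(long Seq, string Payload)> range)
    {
        if (range.Count == 0) return;

        int processed = range.Count(e => Process(e.Payload));
        long expected = range[range.Count - 1].Seq - LastProcessedSeq;
        if (processed != expected)
        {
            // A write slipped in between reads; the range is finite, so redo it
            // (a real scanner would re-fetch the range before redoing).
            processed = range.Count(e => Process(e.Payload));
        }

        LastProcessedSeq = range[range.Count - 1].Seq;        // stamp the marker at the end
    }

    bool Process(string payload) => true;                     // placeholder for the real action
}
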
Sometimes it is helpful to check that the table is stable and serving, even for analysis. A very brief lock and release is sufficient in this regard.
#codingexercise
int GetPermutations(int n, int k) {
    // P(n, k) = n! / (n - k)!
    if (k > n) return 0;
    if (k == 0) return 1;
    return Factorial(n) / Factorial(n - k);
}

int Factorial(int n) {
    return n <= 1 ? 1 : n * Factorial(n - 1);
}

Saturday, March 2, 2019

Today we continue discussing best practices from storage engineering:


528) Live updates versus backup traffic, for instance, qualify for separate products. Aging and tiering of data also qualify for separate storage. Data for reporting can similarly be separated into its own stack. Generated data that drains into logs can similarly feed diagnostic stacks.

529) The number of processors or resources assigned to a specific stack is generally earmarked with T-shirt sizing. This is helpful in cases where the increment or decrement of resources does not have to be done notch by notch.
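
For illustration only, a hypothetical sizing table with made-up numbers: resources step between named sizes rather than moving one unit at a time.

using System.Collections.Generic;

// Hypothetical T-shirt sizes for a stack; the figures are illustrative.
static class TShirtSizing
{
    public static readonly Dictionary<string, (int Cpus, int MemoryGb)> Sizes =
        new Dictionary<string, (int Cpus, int MemoryGb)>
        {
            ["S"]  = (2, 8),
            ["M"]  = (4, 16),
            ["L"]  = (8, 32),
            ["XL"] = (16, 64),
        };
}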

530) Public cloud and hybrid cloud storage are discussed at length on many forums. The hybrid storage provider is focused on letting the public cloud appear as the front end to harness the traffic from users while following storage best practices for the on-premises data.

531) Data can be pushed or pulled from source to destination. If it is possible to pull, it helps offload the work to another process.

532) Lower-level data transfers are favored over higher-level data transfers involving, say, HTTP.

533) The smaller the data transfers, the larger their number, which results in chattier and potentially more fault-prone traffic. We are talking about a very small amount of data per request.

534) Large reads and writes are best served by multipart transfers as opposed to long-running requests with frequent restarts.
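
A sketch of the multipart idea, with a hypothetical uploadPart callback standing in for the actual transfer: the payload is split into fixed-size parts so that a failed part is retried on its own instead of restarting the whole request.

using System;
using System.IO;

static class MultipartTransfer
{
    // Splits a large payload into parts; a failed part is retried on its own
    // instead of restarting the whole long-running request.
    public static void Upload(Stream source, int partSize, Action<int, byte[]> uploadPart)
    {
        var buffer = new byte[partSize];
        int partNumber = 0;
        int read;
        while ((read = source.Read(buffer, 0, partSize)) > 0)
        {
            var part = new byte[read];
            Array.Copy(buffer, part, read);
            Retry(() => uploadPart(partNumber, part));   // retry only this part
            partNumber++;
        }
    }

    static void Retry(Action action, int attempts = 3)
    {
        for (int i = 0; ; i++)
        {
            try { action(); return; }
            catch when (i < attempts - 1) { }            // swallow and retry this part
        }
    }
}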

535) The up and down traversal of the layers of the stack is an expensive operation and needs to be curtailed.

        // Pascal's rule: C(n, k) = C(n-1, k-1) + C(n-1, k)
        static int GetNChooseKDP(int n, int k)
        {
            if (k > n)
                return 0;
            if (k == 0 || k == n)
                return 1;
            return GetNChooseKDP(n - 1, k - 1) + GetNChooseKDP(n - 1, k);
        }

Friday, March 1, 2019

Today we continue discussing best practices from storage engineering:

517) Customers also prefer the ability to switch products and stacks. They are willing to try out new solutions but have become increasingly wary of tying themselves to any one product or of the encumbrances that come with it.

518) Customers have a genuine problem with data being sticky. They cannot keep up with data transfers.

519) Customers want the expedient solution first, but they are not willing to pay for re-architectures.

520) Customers need to evaluate even the cost of data transfer over the network. Their own priorities and severities matter most to them.

521) Customers have concerns with the $/resource, whether it is network, compute or storage. They have to secure ownership of data and yet have it spread out between geographical regions. This means their trade-offs come from the business perspective rather than the technical perspective.

522) Sometimes the trade-offs come not even from the business but from the compliance and regulatory considerations around housing and securing data. The public cloud is great for harnessing traffic to the data stores, but there are considerations when data has to be on-premises.

523) Customers have a genuine problem with anticipating growth and planning for capacity. The success of an implementation done right enables future prospects, but implementations don't always follow the design, and it is also hard to get the design right.

524) Similarly, customers cannot predict what technology will hold up and what won’t in the near and long term. They are more concerned about the investments they make and the choices they have to face.

525) Traffic, usage and patterns are good indicators for prediction once the implementation is ready to scale.

Thursday, February 28, 2019

Today we continue discussing best practices from storage engineering:

515) Most storage products have embraced APIs in one form or another. Their usage for protocols with external agents, internal diagnostics and manageability is valuable as online tools, and they merit the same if not better appreciation than scripts and offline tools.

516) Storage products solve one piece of the puzzle, and customers don't always have boilerplate problems. Consequently, there needs to be bridging somewhere.

517) Customers also prefer the ability to switch products and stacks. They are willing to try out new solutions but have become increasingly wary of tying themselves to any one product or of the encumbrances that come with it.

518) Customers have a genuine problem with data being sticky. They cannot keep up with data transfers.

519) Customers want the expedient solution first, but they are not willing to pay for re-architectures.

520) Customers need to evaluate even the cost of data transfer over the network. Their own priorities and severities matter most to them.

Wednesday, February 27, 2019

Today we continue discussing best practices from storage engineering:

510) The storage product, just like any other software product, is a culmination of efforts from a forum of roles and the people playing those roles. The recognition and acceptance of the software is their only true feedback.

511) Almost every entry of user data in the storage is sandwiched between a header and a footer in some container, and the data segments are read with an offset and a length. This mechanism is repeated at various layers and becomes all the more useful when data is encrypted.
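
A minimal sketch of this framing, with an illustrative header and footer: the payload is written between the two, and only the data segment is read back using its offset and length.

using System;
using System.IO;
using System.Text;

static class FramedEntry
{
    // Sandwich the payload between a small header and footer and return the
    // (offset, length) at which the payload itself was written.
    public static (long Offset, int Length) Append(Stream container, byte[] payload)
    {
        var header = BitConverter.GetBytes(payload.Length);   // length-prefix header
        var footer = Encoding.ASCII.GetBytes("END!");         // trailing marker
        container.Write(header, 0, header.Length);
        long offset = container.Position;
        container.Write(payload, 0, payload.Length);
        container.Write(footer, 0, footer.Length);
        return (offset, payload.Length);
    }

    // Reads just the data segment back using its offset and length.
    public static byte[] ReadSegment(Stream container, long offset, int length)
    {
        var data = new byte[length];
        container.Seek(offset, SeekOrigin.Begin);
        container.Read(data, 0, length);
        return data;
    }
}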

512) Similarly, entries of data are interspersed with routine markers and indicators added from packaging and processing perspectives. Many background jobs frequently stamp what is relevant to them in between data segments so that they can continue their processing in a progressive manner.

513) It must be noted that the packaging of data inside the storage product has many artifacts that are internal to the product and certainly not readable in raw form. Therefore, an offline command-line tool to dump and parse the contents can prove very helpful.

514) The argument above also holds true for message passing between shared libraries inside the storage product. While logs help capture the conversations, their entries may end up truncated. An offline tool to fully record, replay and interpret large messages would be helpful for troubleshooting.

515) Most storage products have embraced APIs in one form or another. Their usage for protocols with external agents, internal diagnostics and manageability is valuable as online tools, and they merit the same if not better appreciation than scripts and offline tools.