Cluster computing

Saturday, March 30, 2019

Today we continue discussing best practices from storage engineering:

648) If the listing is distributed, it helps to run a map-reduce over the listing, so that each partition is summarized where it resides and only the partial results are combined.
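A minimal sketch of a map-reduce over a partitioned listing, written in plain Python. The partition layout, the entry shape `(name, size)`, and the function names are assumptions made for illustration; a real deployment would run the map step on the nodes that hold each partition.

```python
from collections import Counter
from functools import reduce

# Hypothetical listing: each partition holds (object_name, size_in_bytes)
# entries kept by one storage node.
partitions = [
    [("logs/a.txt", 120), ("logs/b.txt", 80), ("img/c.png", 300)],
    [("img/d.png", 500), ("logs/e.txt", 40)],
    [("docs/f.md", 10), ("img/g.png", 200)],
]

def map_partition(entries):
    """Map step: summarize one partition as total bytes per top-level prefix."""
    totals = Counter()
    for name, size in entries:
        prefix = name.split("/", 1)[0]
        totals[prefix] += size
    return totals

def reduce_totals(left, right):
    """Reduce step: merge two partial summaries into one."""
    return left + right

def bytes_per_prefix(parts):
    """Run the map step on every partition, then fold the partial results."""
    return reduce(reduce_totals, map(map_partition, parts), Counter())

result = bytes_per_prefix(partitions)
```

Because the reduce step only sees small per-partition summaries, the full listing never has to be gathered in one place.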

649) If the entries vary widely and change the overall results at a high rate, it is easier to absorb the changes on the compute side while allowing the storage of the listings to be progressive. This way, tasks can communicate changes in their respective sort orders to the scheduler, which can then adjust the overall sort order.
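One way to read this point, sketched under assumptions: each task keeps its own sorted run and reports it to a scheduler whenever it changes, and the scheduler merges the latest runs to obtain the overall order. The `Scheduler` class and its method names are hypothetical and not taken from any particular framework.

```python
import heapq

class Scheduler:
    """Hypothetical scheduler that keeps the latest sorted run from each task."""

    def __init__(self):
        self.runs = {}  # task_id -> sorted list of (key, entry)

    def report(self, task_id, sorted_run):
        """A task communicates its current (already sorted) run."""
        self.runs[task_id] = list(sorted_run)

    def overall_order(self):
        """Merge the per-task runs into one overall sort order."""
        return list(heapq.merge(*self.runs.values()))

scheduler = Scheduler()
scheduler.report("task-1", [(1, "a"), (4, "d")])
scheduler.report("task-2", [(2, "b"), (3, "c")])
first = scheduler.overall_order()

# task-2 sees new entries and reports its changed run; only that run is replaced.
scheduler.report("task-2", [(0, "z"), (3, "c")])
second = scheduler.overall_order()
```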

650) If the listing is a stream, processing the stream works the same way as a cursor on a database: the rankings gathered so far are adjusted for each entry as it is encountered.
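The cursor-like behavior can be illustrated with a running top-k, sketched here under the assumption that each stream entry is a `(name, score)` pair. The function name `running_top_k` is made up for this example.

```python
import heapq

def running_top_k(stream, k):
    """Yield the current top-k (highest score first) after each entry is read."""
    heap = []  # min-heap of (score, name); heap[0] is the weakest kept entry
    for name, score in stream:
        if len(heap) < k:
            heapq.heappush(heap, (score, name))
        elif (score, name) > heap[0]:
            # The new entry outranks the weakest kept entry, so swap it in.
            heapq.heapreplace(heap, (score, name))
        yield sorted(heap, reverse=True)

# The stream is consumed one entry at a time, like rows behind a cursor.
entries = iter([("a", 5), ("b", 9), ("c", 1), ("d", 7)])
snapshots = list(running_top_k(entries, k=2))
```

Only the k best entries seen so far are held in memory, so the ranking can be maintained over a stream of any length.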

651) Stream processing is facilitated by compute packages from Microsoft and Apache, for example. These kinds of packages highlight the stream processing techniques that can be applied to streams from a variety of storage.

652) The query and the algorithm, be it data mining or machine learning, can be externalized. This can work effectively across storage just as much as it applies to specific data.

653) The algorithms vary widely in their duration and convergence, even for the same data. There is usually no specific rule to follow when comparing algorithms in the same category.

654) The usual technique in the above case is to use a strategy pattern that interchanges the algorithms and evaluates them on a trial-and-error basis.
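A small sketch of that strategy pattern, using two root-finding routines as stand-ins for interchangeable algorithms in the same category. The choice of bisection and secant, the step count as the measure of convergence, and all names here are assumptions for illustration only.

```python
def bisection(f, lo, hi, tol):
    """Strategy 1: halve the bracket [lo, hi] until it is narrower than tol."""
    steps = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
        steps += 1
    return (lo + hi) / 2, steps

def secant(f, lo, hi, tol):
    """Strategy 2: secant iterations started from the two end points."""
    x0, x1, steps = lo, hi, 0
    while abs(x1 - x0) > tol and steps < 100:
        f0, f1 = f(x0), f(x1)
        if f1 == f0:  # flat secant line, cannot continue
            break
        x0, x1 = x1, x1 - f1 * (x1 - x0) / (f1 - f0)
        steps += 1
    return x1, steps

def pick_strategy(strategies, f, lo, hi, tol=1e-9):
    """Try every strategy on the same problem and keep the one with fewest steps."""
    results = {name: solve(f, lo, hi, tol) for name, solve in strategies.items()}
    best = min(results, key=lambda name: results[name][1])
    return best, results

strategies = {"bisection": bisection, "secant": secant}
best, results = pick_strategy(strategies, lambda x: x * x - 2, 1.0, 2.0)
```

Since every strategy shares one call signature, a new algorithm can be added to the dictionary and evaluated without touching the caller.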
