Best practices from storage engineering:
Introduction: Storage is one of the three pillars of any commercial software, alongside compute and networking. These three show up as products used directly to build solutions, as components inside other products, and as perspectives on the implementation details of a feature within a product. Any algorithm must account for all three to be efficient and correct: we cannot think of distributed or parallel algorithms without the network, efficiency without storage, or convergence without compute. Each discipline therefore brings its own best practices from the industry.
We list a few in this article from the storage engineering perspective:
1) Not a singleton – Storage vendors know that data is precious: it must not be lost or corrupted. They therefore go to great lengths to keep data safe at rest by eliminating single points of failure, such as a single disk that can crash. Once data is written to a store, it is kept available through copies or archived as a backup.
2) Protection against loss – Stored data may get corrupted. To detect such changes, and to recover from them, we need to keep additional information alongside the data. This is called erasure coding: with this redundant information, we can not only validate the existing data but may also be able to recreate the original data after a certain amount of loss. How we store the data and the erasure code also determines the level of redundancy we can use.
3) Hot, warm, cold – Data is treated differently based on how it is accessed. Hot data is actively read and written. Warm and cold indicate progressively less activity on the data. Each label allows different leeway in how the data is treated and in what its storage costs.
4) Organizational unit of data – Data is often written in one of several units of organization depending on the producer. For example, we may have blob, file, and block-level storage. These need not be handled the same way, and each organizational unit even comes with its own software stack to facilitate the storage.
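To make the erasure coding idea in item 2 concrete, here is a minimal sketch of its simplest form: a single XOR parity block kept alongside equal-sized data blocks. The class and method names (XorParityDemo, parity, recover) are hypothetical and chosen only for illustration. Production systems generally use stronger codes that tolerate more than one lost block, so this is a sketch under those assumptions rather than a description of any particular product.

```java
import java.util.Arrays;

// Hypothetical illustration: single-parity XOR over equal-sized blocks.
// It tolerates the loss of exactly one block.
final class XorParityDemo {

    // Computes the parity block as the byte-wise XOR of all data blocks.
    static byte[] parity(byte[][] blocks) {
        byte[] p = new byte[blocks[0].length];
        for (byte[] block : blocks) {
            for (int i = 0; i < p.length; i++) {
                p[i] ^= block[i];
            }
        }
        return p;
    }

    // Rebuilds the block at index 'lost' from the surviving blocks and the parity.
    // The contents of blocks[lost] are never read.
    static byte[] recover(byte[][] blocks, int lost, byte[] parity) {
        byte[] rebuilt = Arrays.copyOf(parity, parity.length);
        for (int b = 0; b < blocks.length; b++) {
            if (b == lost) {
                continue;
            }
            for (int i = 0; i < rebuilt.length; i++) {
                rebuilt[i] ^= blocks[b][i];
            }
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        byte[][] blocks = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9} };
        byte[] p = parity(blocks);
        byte[] rebuilt = recover(blocks, 1, p);
        System.out.println(Arrays.equals(rebuilt, blocks[1])); // prints true
    }
}
```

The cost here is one extra block for every three data blocks, compared with a full extra copy per block under plain replication. That trade-off is what the level of redundancy refers to.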
#codingexercise
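import com.google.common.base.Predicate;
import com.google.common.collect.Collections2;
import java.util.ArrayList;
import java.util.List;

// Predicate to select positive integer sequences from the enumerated combinations.
// Assumes Guava's Collections2 and Predicate; combinations and isPositive are defined elsewhere.
List<List<Integer>> result = new ArrayList<>(Collections2.filter(combinations, new Predicate<List<Integer>>() {
    @Override
    public boolean apply(List<Integer> sequence) {
        return isPositive(sequence);
    }
}));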
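The snippet above relies on combinations and isPositive, neither of which is defined in this post. Below is a self-contained sketch of the same filtering step using only the JDK (java.util.function.Predicate and streams) instead of a third-party predicate type. The definition of isPositive here is an assumption: a sequence counts as positive when it is non-empty and every element is greater than zero. The class name PositiveSequenceFilter and the method selectPositive are likewise hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class PositiveSequenceFilter {

    // Assumption: "positive" means non-empty with every element > 0.
    static boolean isPositive(List<Integer> sequence) {
        return !sequence.isEmpty() && sequence.stream().allMatch(x -> x > 0);
    }

    // Keeps only the sequences accepted by the predicate, preserving input order.
    static List<List<Integer>> selectPositive(List<List<Integer>> combinations) {
        Predicate<List<Integer>> positive = PositiveSequenceFilter::isPositive;
        return combinations.stream().filter(positive).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Integer>> combinations = Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(-1, 2),
                Arrays.asList(4, 0),
                Arrays.asList(5));
        System.out.println(selectPositive(combinations)); // prints [[1, 2, 3], [5]]
    }
}
```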
 