Sunday, March 31, 2019

Today we continue discussing the best practice from storage engineering:

649) If the entries vary widely, affecting the overall results at a high rate, it is easier to take on the changes on the compute side but allow the storage of the listings to be progressive. This way tasks can communicate the changes to their respective sort orders to the scheduler, which can then adjust the overall sort order.

650) If the listing is a stream, processing on a stream works the same way as a cursor on a database, adjusting the rankings gathered so far for each and every entry as it is encountered.
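
A minimal sketch of this cursor-like processing, assuming a hypothetical entry identified by an id and a numeric score (System.Collections.Generic and System.Linq assumed); the top-k rankings gathered so far are adjusted as each entry is encountered on the stream:

// Sketch: maintain top-k rankings incrementally, one streamed entry at a time.
public class StreamingRanker
{
    private readonly int k;
    private readonly SortedSet<(double Score, long Id)> topK = new SortedSet<(double Score, long Id)>();

    public StreamingRanker(int k) { this.k = k; }

    // Called once per entry encountered on the stream, like a cursor over a database.
    public void OnEntry(long id, double score)
    {
        topK.Add((score, id));
        if (topK.Count > k)
            topK.Remove(topK.Min);   // evict the lowest-ranked entry gathered so far
    }

    public IEnumerable<(double Score, long Id)> Rankings() => topK.Reverse();
}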

651) The processing of streams is facilitated with compute packages from Microsoft and Apache, for example. These kinds of packages highlight the stream processing techniques that can be applied to streams from a variety of storage.

652) Both the query and the algorithm, be it mining or machine learning, can be externalized. This can work effectively across storage just as much as it is applicable to specific data.

653) The algorithms vary widely in their duration and convergence even for the same data. There is usually no specific rule to follow when comparing algorithms in the same category

654) The usual technique in the above case is to use a strategy pattern that interchanges algorithms and evaluates them on a trial-and-error basis.
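
A hedged sketch of such a strategy pattern; the interface and evaluator names are illustrative, and the trial simply keeps whichever interchangeable algorithm finishes fastest on the same data:

// Sketch: interchangeable algorithms behind one interface, picked by trial and error.
public interface IRankingStrategy
{
    string Name { get; }
    IList<int> Apply(IList<int> data);
}

public static class StrategyEvaluator
{
    // Run each strategy on the same data and keep the one that completes fastest.
    public static IRankingStrategy PickBest(IEnumerable<IRankingStrategy> strategies, IList<int> data)
    {
        IRankingStrategy best = null;
        var bestTime = TimeSpan.MaxValue;
        foreach (var strategy in strategies)
        {
            var watch = System.Diagnostics.Stopwatch.StartNew();
            strategy.Apply(data);
            watch.Stop();
            if (watch.Elapsed < bestTime) { bestTime = watch.Elapsed; best = strategy; }
        }
        return best;
    }
}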

655) Storage services can take advantage of containerization just like other applications. Provisioning the service over the containers while allowing the nodes to remain part of the cluster is both scalable and portable.

Saturday, March 30, 2019

Today we continue discussing the best practice from storage engineering

648) If the listing is distributed, it helps to have a map-reduce on the listing
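
A small sketch of a map-reduce over a partitioned listing (System.Linq assumed); the partitions and the count-by-key aggregation are illustrative placeholders for the real listing and its statistics:

// Sketch: map each partition of the listing locally, then reduce the partial results.
public static Dictionary<string, int> MapReduceListing(IEnumerable<IEnumerable<string>> partitions)
{
    // Map: each partition produces its own partial counts, in parallel.
    var partials = partitions
        .AsParallel()
        .Select(p => p.GroupBy(e => e).ToDictionary(g => g.Key, g => g.Count()))
        .ToList();

    // Reduce: merge the partial counts into a single result.
    var result = new Dictionary<string, int>();
    foreach (var partial in partials)
        foreach (var kv in partial)
            result[kv.Key] = result.TryGetValue(kv.Key, out var c) ? c + kv.Value : kv.Value;
    return result;
}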

649) If the entries vary widely, affecting the overall results at a high rate, it is easier to take on the changes on the compute side but allow the storage of the listings to be progressive. This way tasks can communicate the changes to their respective sort orders to the scheduler, which can then adjust the overall sort order.

650) If the listing is a stream, processing on a stream works the same way as a cursor on a database, adjusting the rankings gathered so far for each and every entry as it is encountered.

651) The processing of streams is facilitated with compute packages from Microsoft and Apache, for example. These kinds of packages highlight the stream processing techniques that can be applied to streams from a variety of storage.

652) Both the query and the algorithm, be it mining or machine learning, can be externalized. This can work effectively across storage just as much as it is applicable to specific data.

653) The algorithms vary widely in their duration and convergence even for the same data. There is usually no specific rule to follow when comparing algorithms in the same category

654) The usual technique in the above case is to use a strategy pattern that interchanges algorithms and evaluates them on a trial-and-error basis.

Friday, March 29, 2019

Today we continue discussing the best practice from storage engineering:

639) If there are multiple registrations that need to be kept in sync, they get harder to maintain. It is easier if the lists can be combined or there is a one to one mapping between the lists

640) Failed tasks may require new tasks to be added, in which case it is better to track the failed tasks separately from the otherwise new tasks.

641) When the tasks are constantly replenished, it is helpful to keep track of in versus out.

642) The tasks that are out are candidates for cleanup.

643) The tasks that are in are either existing or new. They are mutually exclusive so it is easy to tell the new ones from the old.
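
A minimal sketch of this differencing, assuming task identifiers as strings; the "in" set splits into existing and new, and the "out" set becomes the cleanup candidates:

// Sketch: difference the current listing against the known set of tasks.
public static void Reconcile(ISet<string> known, ISet<string> current,
                             out ISet<string> added, out ISet<string> removed)
{
    var newTasks = new HashSet<string>(current);
    newTasks.ExceptWith(known);       // new tasks: need initialization before inclusion
    var outTasks = new HashSet<string>(known);
    outTasks.ExceptWith(current);     // tasks that are out: candidates for cleanup
    added = newTasks;
    removed = outTasks;
}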

644) The tasks that are new will need things setup for them to execute. It involves initialization so that they can be included in the list

645) The tasks that run long need to indicate progress in some way so that the scheduler knows that this task is still active and not stuck.
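
One possible sketch of such a progress indication, assuming the work is a Task and the scheduler accepts a heartbeat callback; the 30-second interval is arbitrary (System.Threading and System.Threading.Tasks assumed):

// Sketch: a long-running task reports a heartbeat so the scheduler can tell active from stuck.
public static async Task RunWithHeartbeat(Func<CancellationToken, Task> work,
                                          Action<DateTime> reportProgress,
                                          CancellationToken token)
{
    var workTask = work(token);
    while (!workTask.IsCompleted)
    {
        reportProgress(DateTime.UtcNow);   // heartbeat: this task is still active, not stuck
        await Task.WhenAny(workTask, Task.Delay(TimeSpan.FromSeconds(30), token));
    }
    await workTask;                        // observe the result or surface any failure
}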

646) When the tasks have to sort the results, the sorting order might change as the listing changes. It is helpful to refresh the listing before sorting.

647) If the listing is large, it is not easy to refresh without taking a cost on the overall query time. In such cases, it helps to have a progressive listing, where changes are made to one end of the listing while the other end remains as is. As entries are added to the tail, the stats from the unchanged portion can be reused for the new entries.
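
A rough sketch of progressive stats, assuming a listing that only grows at the tail; only the newly appended entries are folded into the running aggregates:

// Sketch: keep running stats for the unchanged head and fold in only the appended tail.
public class ProgressiveStats
{
    private long count;
    private double sum;
    private int processedUpTo;                 // index of the last entry folded in

    public void Refresh(IReadOnlyList<double> listing)
    {
        for (int i = processedUpTo; i < listing.Count; i++)   // only the new tail entries
        {
            sum += listing[i];
            count++;
        }
        processedUpTo = listing.Count;
    }

    public double Average => count == 0 ? 0 : sum / count;
}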

Thursday, March 28, 2019

Today we continue discussing the best practice from storage engineering :

633) The state of an object is authoritative. If it weren’t the source of truth, the entries themselves cannot be relied on without involving validation logic across entries. There is no problem performing validations but doing them over and over again not only introduces delays but can be avoided altogether with clean state.

634) The states are also representative and unique. The entries are not supposed to be in two or more states at once. It is true that a bitmask can be used to denote conjunctive status but a forward-only, discrete, singular state is preferable.
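
A small sketch of a forward-only, singular state; the state names are illustrative, and the transition check refuses conjunctive or backward moves:

// Sketch: a discrete, forward-only state for an entry.
public enum EntryState { Created = 0, Active = 1, Sealed = 2, Deleted = 3 }

public class Entry
{
    public EntryState State { get; private set; } = EntryState.Created;

    // An entry is never in two states at once and never goes back.
    public bool TryAdvance(EntryState next)
    {
        if (next <= State) return false;
        State = next;
        return true;
    }
}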

635) The attributes in an entry are often added on a case by case basis since it is expedient to add a new attribute without affecting others. However, the accessors of the entry should not proliferate the attributes. If the normalization of the attribute can serve more than one accessor, it will provide consistency across accesses.

636) Background tasks may be run or canceled. Frequently these tasks need to be canceled. If they don’t do proper cleanup, they can leave their results in a bad state. The shutdown helps release the resources properly.

637) The list of background tasks may need to include and exclude the tasks as they appear or disappear. This is in addition to start and stop on each task. If the start and registration are combined, the stop and deregistration must also be combined.

638) As tasks appear and disappear, it is sometimes too tedious to perform all the chores for each task. In such cases, we merely difference the new tasks and add them to the list. This avoids the cleanup on each job as it leaves. A large-scale global shutdown may suffice later.

639) If there are multiple registrations that need to be kept in sync, they get harder to maintain. It is easier if the lists can be combined or there is a one to one mapping between the lists

640) Failed tasks may require new tasks to be added, in which case it is better to track the failed tasks separately from the otherwise new tasks.


Wednesday, March 27, 2019

Today we continue discussing the best practice from storage engineering:

631) Listing entry values are particularly interesting. In addition to the type of attributes in an entry, we can take advantage of the range of values that these attributes can take. For example, we can reserve boundary values and extremely tiny values that will not be encountered in the real world at least for the majority of cases.

632) When the values describe the size of an associated object, the size itself can be arbitrary and it is not always possible to rule out a size for a user object no matter how unlikely it seems. However, when used together with other attributes such as status, they become usable as representative of some object state that is otherwise not easily found.

633) The state of an object is authoritative. If it weren’t the source of truth, the entries themselves cannot be relied on without involving validation logic across entries. There is no problem performing validations but doing them over and over again not only introduces delays but can be avoided altogether with clean state.

634) The states are also representative and unique. The entries are not supposed to be in two or more states at once. It is true that a bitmask can be used to denote conjunctive status but a forward-only, discrete, singular state is preferable.

635) The attributes in an entry are often added on a case by case basis since it is expedient to add a new attribute without affecting others. However, the accessors of the entry should not proliferate the attributes. If the normalization of the attribute can serve more than one accessor, it will provide consistency across accesses.

636) Background tasks may be run or canceled. Frequently these tasks need to be canceled. If they don’t do proper cleanup, they can leave their results in a bad state. The shutdown helps release the resources properly.

Tuesday, March 26, 2019

Today we continue discussing the best practice from storage engineering:

626) Listings are great to use when they are in a single location. However, they are often scoped to a parent container. If the parent containers are distributed, the listings tend to be multiple. In such cases the effort is repeated.

627) When the listings are separated by locations, the results from the search may be fewer than the expected total if only one of the locations is searched. This has often been encountered in deployments.

628) The listings do not need to be aggregated across locations in all cases. Sometimes, only the location is relevant and the listing and the search can be scoped to it.

629) Iterating the listings has proved tedious in most cases both for the system and for the user. Consequently, either an identifier is used to go directly to the entry in the listing or a listing is reserved so that only that listing is accessed.

630) The listing can be cleaned up as well. There is no need to keep it growing with outdated entries and then archived by age. The cleaning can happen in the background so that list iterations skip over entries or do not see the entries that appear as removed.

631) Listing entry values are particularly interesting. In addition to the type of attributes in an entry, we can take advantage of the range of values that these attributes can take. For example, we can reserve boundary values and extremely tiny values that will not be encountered in the real world at least for the majority of cases.

Monday, March 25, 2019

Today we continue discussing the best practice from storage engineering:

620) When a new key is added, it may not impact existing keys but it does affect the overall space consumption of the listing depending on the size and number.

621) The keys can have as many fields as necessary. However, the lookups are faster when there are only a few keys to compare.

622) Key comparison can be partial or full. Partial keys are useful to match duplicates. The number of keys that share the same subkeys can be many. This form of comparison is very helpful to group entries.
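
A minimal sketch of partial-key grouping, assuming '/'-delimited keys (System.Linq assumed); the prefix length in subkeys decides which entries fall into the same group:

// Sketch: group entries by a subkey prefix rather than the full key.
public static IEnumerable<IGrouping<string, string>> GroupByPrefix(
    IEnumerable<string> keys, int prefixParts)
{
    return keys.GroupBy(k => string.Join("/", k.Split('/').Take(prefixParts)));
}

// Usage: GroupByPrefix(new[] { "tenant1/bucketA/obj1", "tenant1/bucketA/obj2" }, 2)
// puts both entries under the group "tenant1/bucketA".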

623) Grouping of entries also helps with entries that span groups based on subkeys. These work across groups.

624) The number of entries may run to a large order but the prefix could be more inclusive of subkeys to narrow the search. This makes it efficient to run on these listings.

625) The number of entries also doesn’t matter relative to the number of keys in each entry as long as the prefix uses a small set of subkeys.

626) Listings are great to use when they are in a single location. However, they are often scoped to a parent container. If the parent containers are distributed, the listings tend to be multiple. In such cases the effort is repeated.

#codingexercise
Find paths in a matrix
int GetPaths(int x, int y)
{
    if (x <= 0 ||  y <= 0)
        return 1;

    return GetPaths(x - 1, y) +
               GetPaths(x - 1, y - 1) +
               GetPaths (x, y - 1); // for the three possible directions
}


Saturday, March 23, 2019

Today we continue discussing the best practice from storage engineering:

611) Background tasks may sometimes need to catch up with the current activities. In order to accommodate the delay, they may either be run upfront so that changes to be processed are incremental or they can increase in number to divide up the work.

612) The results from the background tasks mentioned above might also take a long time to accumulate. They can be made available as they appear or batched.

613) The load balancer works very well to enable background tasks to catch up by not overloading a single task and distributing the online activities to ensure that the background task has light load

614) The number of background tasks or their type should not affect online activities. However, systems have been known to be impacted when the tasks consume memory or delay garbage collection.

615) There is no specific mitigation for one or more background tasks that take plenty of shared resources but generally they are written to be fault-tolerant so that they can pick up from where they left off.

Friday, March 22, 2019

Today we continue discussing the best practice from storage engineering :

606) We use data structures to keep the information we want to access in a convenient form. When this is persisted, it mitigates faults in the processing. However each such artifact brings in additional chores and maintenance. On the other hand, it is cheaper to execute the logic and the logic can be versioned. Therefore when there is a trade-off between compute and storage for numerous small and cheap artifacts, it is better to generate them dynamically 

607) The above has far-reaching impact when there are a number of layers involved and a cost incurred in the lower layer bubbles up to the top layer.

608) Compute tends to be distributed in nature while storage tends to be local. They can be mutually exclusive in this regard.

609) Compute oriented processing can scale up or out while storage has to scale out.

610) Compute oriented processing can get priority but storage tends to remain in a class 

611) Background tasks may sometimes need to catch up with the current activities. In order to accommodate the delay, they may either be run upfront so that changes to be processed are incremental or they can increase in number to divide up the work.

612) The results from the background tasks mentioned above might also take a long time to accumulate. They can be made available as they appear or batched.

613) The load balancer works very well to enable background tasks to catch up by not overloading a single task and distributing the online activities to ensure that the background task has light load

Thursday, March 21, 2019

Today we continue discussing the best practice from storage engineering

600) As with any product, a storage product also qualifies for the Specific-Measurable-Attainable-Realistic-Timely aka SMART process where improvements can be measured and the feedback used to improve the process and the product.

601) As with each and every process, there is some onus but the rewards generally outweigh the costs when it is reasoned and accepted by all. The six sigma process for example sets a high bar for quality because it eliminates errors progressively.

602) The iterations for six sigma are high, so it takes longer and the results are not always available in the interim. The agile development processes allowed results to be incremental.

603) The agile methodology improved the iterations over the features in such a way that it did not impact the rest of the product. This enables faster feature development

604) The continuous integration and continuous deployment model made the individual feature improvements available for use because the changes were built, tested and deployed in lock step with development.

605) Together with the process to make improvements one change at a time and to have it built, tested and deployed gives great benefit to the overall product.

606) We use data structures to keep the information we want to access in a convenient form. When this is persisted, it mitigates faults in the processing. However, each such artifact brings in additional chores and maintenance. On the other hand, it is cheaper to execute the logic and the logic can be versioned. Therefore when there is a trade-off between compute and storage for numerous small and cheap artifacts, it is better to generate them dynamically.

Wednesday, March 20, 2019

We were discussing the S3 API:

Virtually all storage providers in cloud and on-premise storage solutions support the S3 Application Programming Interface. With the help of this API, applications can send and receive data without having to worry about the storage best practice. They are also able to switch from one storage appliance to another, one on-premise cluster to another, one cloud provider to another and so on. The API was introduced by Amazon but has become the industry standard and accepted by many storage providers. Even competitors like Azure provide an S3 Proxy to allow applications to access their storage with this API.

S3 like any other cloud-based service has been developed with the Representational State Transfer (REST) best practice. However, it does not involve all the provisions of HTTP 2.0 (released) or 3.0 (in-progress). Neither does it provide a protocol-like abstraction where layers can be added above or below in a storage stack. A networking stack on the other hand has dedicated protocols for each of its layers. A storage stack may comprise, say, at the very least, an active workload versus a backup workload layer where the active remains as the higher layer and can support say HTTP and the backup remains as the lower layer and supports S3. Perhaps this distinction has been somewhat obfuscated where object storage can expand its capabilities to both layers.

The S3 API offers developers no affordance for how object storage can be positioned as an object queue, an object cache, an object query, a gateway, a log index store, and many other such capabilities. API best practice enables automated monitoring, logging, auditing, and many more.

If a new storage class is added and the functionalities are not on par with the regular S3 storage, then it would have an entirely new set of APIs and these would preferably have a prefix to differentiate the APIs from the rest.

If the storage stack is layered from the active regular S3 storage on the top to the less frequently used storage classes at a lower level than the regular and finally the glacier or least used data as the last layer, then aging of data alone is sufficient to migrate from the top layer all the way to the bottom without any involvement of the API. That said, the API could provide visibility to the users on the contents of each storage class along with the additional functionality of direct placement of objects in those classes or their eviction. Since the nature of the storage class differentiates the API set and we decided to use prefix-based API naming conventions to indicate the differentiation, each storage class adds a new set to the existing APIs. On the other hand, policies common to all three storage classes or the functionality that stripes across layers will be provided either with request attributes targeting that layer and its lower layers or with the help of parameters.

Functionalities such as deduplication, rsync for incremental backups, compression and management will require new APIs and these do not have to be limited to objects in any one storage class. APIs that automate the workflow of calling more than one API can also be written as a coarse-granularity API. These wrapped APIs can collate functionalities for a single layer or across layers. They can also include automation not specific to the control or data path of the storage stack. Together the new functionality and wrapped APIs can become one whole set.

The S3 API can become a protocol for all storage functionalities. They can be organized as a flat list of features or by resource-path-qualified functionalities where the resource path may pertain to storage classes. These APIs could also support discoverability. And these APIs could support nuances specific to file protocols and content addressability.

https://1drv.ms/w/s!Ashlm-Nw-wnWuT-u1f7DRjBRuvD4

Tuesday, March 19, 2019

We were discussing the S3 API from previous post.

The S3 API offers developers no affordance for how object storage can be positioned as an object queue, an object cache, an object query, a gateway, a log index store, and many other such capabilities. API best practice enables automated monitoring, logging, auditing, and many more.
This makes the S3 API more useful and mash-able than what it is today.  It highlights the notion that the storage is no more an appliance, product, or cloud but a layer that simplifies and solves most application storage needs without any involvement. Perhaps then it could be called a simpler storage service.
Let us take a closer look at how the API could be structured to facilitate calls generally to the storage stack, calls individually to a storage class, or calls that bridge lateral functionalities to the same class.
The S3 command guide provides core functionalities to buckets and objects, the listing, creating, updating and deleting of objects, the setting of access control lists, the setting of bucket policy, the expiration period, or the life-cycle, and the multi-part uploads.
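
As a rough illustration of the listing call only, the sketch below issues an S3 ListObjectsV2 request (a GET with the list-type=2 query parameter) against a placeholder endpoint; request signing (SigV4) and error handling are deliberately omitted, so this is not a working client:

// Illustrative only: list a bucket with the S3 ListObjectsV2 REST call.
// Authentication/signing is omitted and the endpoint is a placeholder.
public static async Task<string> ListBucketAsync(string endpoint, string bucket, string prefix)
{
    using var client = new HttpClient();
    var uri = $"{endpoint}/{bucket}?list-type=2&prefix={Uri.EscapeDataString(prefix)}";
    var response = await client.GetAsync(uri);            // a real call must carry SigV4 headers
    response.EnsureSuccessStatusCode();
    return await response.Content.ReadAsStringAsync();    // XML listing of keys
}
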
If a new storage class is added and the functionalities are not on par with the regular S3 storage, then it would have an entirely new set of APIs and these would preferably have a prefix to differentiate the APIs from the rest.
If the storage stack is layered from the active regular S3 storage on the top to the less frequently used storage classes at a lower level than the regular and finally the glacier or least used data as the last layer, then aging of data alone is sufficient to migrate from the top layer all the way to the bottom without any involvement of the API. That said, the API could provide visibility to the users on the contents of each storage class along with the additional functionality of direct placement of objects in those classes or their eviction. Since the nature of the storage class differentiates the API set and we decided to use prefix-based API naming conventions to indicate the differentiation, each storage class adds a new set to the existing APIs. On the other hand, policies common to all three storage classes or the functionality that stripes across layers will be provided either with request attributes targeting that layer and its lower layers or with the help of parameters.

Monday, March 18, 2019

The Simple Storage Service (S3) API
Virtually all storage providers in cloud and on-premise storage solutions support the S3 Application Programming Interface. With the help of this API, applications can send and receive data without having to worry about the storage best practice. They are also able to switch from one storage appliance to another, one on-premise cluster to another, one cloud provider to another and so on. The API was introduced by Amazon but has become the industry standard and accepted by many storage providers. Even competitors like Azure provide an S3 Proxy to allow applications to access their storage with this API.
This API has widespread popularity for storing binary large objects, also called blobs for short, but that has not stopped providers from making an S3 façade over other forms of storage. They get the audience with S3 since the application changes are minimal and retrofit the storage solutions to support the most used aspect of S3.
S3 however is neither a restriction on the storage type nor true to its roots in object storage. It provides the ability to navigate the buckets and objects and adds improvements such as multi-part upload to the way data is sent into the object storage. It supports various headers to make the most of the object storage which includes sending options along with the payload as well as retrieving information on the stored products.
S3 like any other cloud-based service has been developed with the Representational State Transfer (REST) best practice. However, it does not involve all the provisions of HTTP 2.0 (released) or 3.0 (in-progress). Neither does it provide a protocol-like abstraction where layers can be added above or below in a storage stack. A networking stack on the other hand has dedicated protocols for each of its layers. A storage stack may comprise, say, at the very least, an active workload versus a backup workload layer where the active remains as the higher layer and can support say HTTP and the backup remains as the lower layer and supports S3. Perhaps this distinction has been somewhat obfuscated where object storage can expand its capabilities to both layers.
Object storage can also support the notions of storage classes, versioning, retention, data aging, deduplication and compression, and none of these are really featured very well by S3 without a header here or a prefix there or a tag or a path, and this leads to incredible inconsistency not just between the features but also in their usages by the customers.

Sunday, March 17, 2019

The operation on inactive entries
Recently I came across an unusual problem of maintaining active and inactive entries. It was unusual because there were two sets and they were updated differently and at different times. Both the sets were invoked separately and there was no sign whether the sets were mutually exclusive. Although we could assume the invocations of operations on the sets were from the same component, they were invoked in different iterations.  This meant that the operations taken on the set would be repeated several times.
The component only took actions on the active set. It needed to take an action on the entries that were inactive. This action was added subsequent to the action on the active set. However, the sets were not mutually exclusive so they had to be differentiated to see what was available in one but not the other. Instead this was overcome by delegating the sets to be separated at the source. This made it a lot easier to work with the actions on the sets because they would be fetched each time with some confidence that they would be mutually exclusive. There was no confirmation that the source was indeed giving up to date sets. This called for a validation on the entries in the set prior to taking the action. The validation was merely to check if the entry was active or not.
However, an entry does not remain in the same set forever. It could move from the active set to the inactive set and back. The active set and the inactive set would always correspond to their respective actions. This meant that the actions needed to be inverted between the entries so that they could flip their state between the two rounds of processing.
There were four cases for trying this out. The first case was when the active set was called twice. The second case was when the inactive set was called twice. The third case was when the active set was followed by the inactive set. The fourth case was when the inactive set was followed by the active set.
With these four cases, the active and the inactive set could have the same operations taken deterministically no matter how many times they were repeated and in what order.
The only task that remained now was to ensure that the sets returned from the source were good to begin with. The source was merely subscribed to events that added entries to the sets. However, the events could be called in any order and an arbitrary number of times. The event handling did not all exercise the same logic so the entries did not appear final in all the cases. This contributed to the invalid entries in the set. When the methods used to retrieve the active and inactive set were made consistent, deterministic, robust and correct, it became easier to work with the operations on the set in the calling component.
This concluded the cleaning up of the logic to handle the active and inactive sets.
We now follow up with the improvements to the source when possible:
There are different lists maintained for active, inactive, failover, scanned collections. They could all be part of the same collection and synchronized so that all accesses are serialized. Attributes on the entries can describe the state of the entry. If the entries need to be on separate collections, they could be synchronized with the same lock.
The operations on the entries may be made available as simpler full-service methods where validation and updates are included. When the source is rewritten, it must make sure that all the existing behavior is unchanged.
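
A hedged sketch of the cleaned-up calling component, assuming a hypothetical TrackedEntry; the sets arrive already separated at the source, each entry is validated before acting, and both actions are idempotent, so any ordering or repetition of the four cases converges to the same result:

// Sketch: TrackedEntry and its actions are illustrative stand-ins for the real entries.
public class TrackedEntry
{
    public bool IsActive { get; set; }
    public void ApplyActiveAction() { /* action for the active set */ }
    public void ApplyInactiveAction() { /* inverse action for the inactive set */ }
}

public static void Process(IEnumerable<TrackedEntry> active, IEnumerable<TrackedEntry> inactive)
{
    foreach (var e in active.Where(e => e.IsActive))       // validate before acting
        e.ApplyActiveAction();                             // idempotent, so repeats are harmless
    foreach (var e in inactive.Where(e => !e.IsActive))
        e.ApplyInactiveAction();                           // the inverse action for the flipped state
}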

Saturday, March 16, 2019

Today we continue discussing the best practice from storage engineering:

595) Virtually every service utilized from the infrastructure is a candidate for standardization and consistency so that one component/vendor in the infrastructure may be replaced with another with little or no disruption.

596) There are a number of stack frames that a developer has to traverse in order to find the code path taken by the execution thread, and they don’t always pertain to layers. If the stack frames can get simpler for the developer, the storage product on the whole improves tremendously. This is not a rule but just a rule of thumb: the simpler, the better.

597) As with all one-point maintenance code, there is bloating and complexity to handle different use cases from the same code. Unfortunately, developers don’t have the luxury to rewrite core components without significant investment of time and effort. Therefore version 1 of the product must always strive for building it right from the get go.

598) As use cases increase and the business improves, the product management pays a lot of attention to sustainable growth in the face of business needs. It is at this cusp of technology and business that the system architecture plays its best.

599) There are very few cases where the process goes wrong. On the other hand, there is a lot of advantage to trusting the process. Consequently, the product must be improved with sound processes.

600) As with any product, a storage product also qualifies for the Specific-Measurable-Attainable-Realistic-Timely aka SMART process where improvements can be measured and the feedback used to improve the process and the product.

Friday, March 15, 2019

Today we continue discussing the best practice from storage engineering:

588) There are notions of SaaS, PaaS and, IaaS with clear separation of concerns in the cloud. The same applies to a storage layer in terms of delegating to dedicated products.

589) The organization in the cloud does not limit the number and type of services available from the cloud. The same holds true for the feature as services within a storage product.

590) The benefits that come with the cloud can also come from a storage product.

591) There are times when the storage product will have an imbalanced load. It will need to be load balanced. Since this is an ongoing activity, it can be periodically scheduled or triggered when thresholds are crossed.

592) When the layers of infrastructure and storage services are clearly differentiated, the upper layer may utilize the alerting from the lower layers for health checks and to take corrective actions.

593) There are a number of ways to monitor a system whether it is for performance, statistics or health checks. A system center management system can consolidate and unify the operations management. The storage product merely needs to publish to a system center.

594) There are several formats of metrics and monitoring data and generally they are proprietary. Utilizing an external stack for these purposes via APIs helps alleviate the concerns from the storage service.

595) Virtually every service utilized from the infrastructure is a candidate for standardization and consistency so that one component/vendor in the infrastructure may be replaced with another with little or no disruption.

Thursday, March 14, 2019

Today we continue discussing the best practice from storage engineering:

583) Storage products have a tendency to accumulate user artifacts such as rules, containers and settings. It should be easy to migrate and upgrade them.

584) The migration mentioned above is preferably done via a user-friendly mechanism because the artifacts matter more to the user than to the system.

585) There are several times that customers will run into issues with upgrade and migration. Unfortunately, there is usually no dry run for the instance. One of the best techniques is to plan the upgrade.

586) Storage products embrace compute as much as the services are needed over the raw storage but the line of separation between compute and storage remains clear in solutions that use storage. The purer the compute over the storage, the better for the storage.

587) The dependency of storage on healthy nodes is maintained with the help of detection and remedial measures. If the administrator does not have to rush to replace a bad unit, it saves time and cost.

588) There are notions of SaaS, PaaS and, IaaS with clear separation of concerns in the cloud. The same applies to a storage layer in terms of delegating to dedicated products.

589) The organization in the cloud does not limit the number and type of services available from the cloud. The same holds true for the feature as services within a storage product.

590) The benefits that come with the cloud can also come from a storage product.

Wednesday, March 13, 2019

Today we continue discussing the best practice from storage engineering:

578) This export of logic is very helpful in overcoming the limitations of static  configuration and reload of service. Regardless of the need for a runtime to execute the logic, even listenable config values can help with changes to rules.

579) The rules can be flat conjunctive filters or expression trees. Their evaluation is in program order.
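
A minimal sketch of such flat conjunctive rules, with the resource modeled as a plain attribute dictionary (System.Linq assumed); the first rule whose conditions all hold decides the outcome:

// Sketch: flat conjunctive filters evaluated in program order.
public class Rule
{
    public string Outcome;                                              // the treatment given to the resource
    public List<Func<IDictionary<string, string>, bool>> Conditions =
        new List<Func<IDictionary<string, string>, bool>>();
}

public static string Evaluate(IEnumerable<Rule> rules, IDictionary<string, string> resource, string defaultOutcome)
{
    foreach (var rule in rules)                          // program order
        if (rule.Conditions.All(c => c(resource)))       // conjunctive: every filter must match
            return rule.Outcome;
    return defaultOutcome;
}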

580) The outcome of the processing of rules is the treatment given to the resource. There can be classes in outcome.

581) The number of rules is generally not a concern to compute and it is also not a concern of storage. However, the maintenance of rules is a significant onus and it is preferable to avoid it in the first place.

582) The type of rules and the classes of outcome generally don’t change even in most heavily used filters. IPSec for example has a lot of attributes to secure the network but its type of rules and outcomes are well-known. Rules can therefore be rewritten periodically to make them more efficient.

583) Storage products have a tendency to accumulate user artifacts such as rules, containers and settings. It should be easy to migrate and upgrade them.

584) The migration mentioned above is preferably done via a user-friendly mechanism because the artifacts matter more to the user than to the system.

585) There are several times that customers will run into issues with upgrade and migration. Unfortunately, there is usually no dry run for the instance. One of the best techniques is to plan the upgrade.

Tuesday, March 12, 2019

Today we continue discussing the best practice from storage engineering:

574) If the range of sequences can be limited to a window, the user and application can take on much of the processing relieving the compute requirements from storage. Such intensive scripts can run anywhere the user wants as long as the data is available.

575) If logic pertains specifically to some data and applicable only to that data, it is possible to register logic and load a runtime to execute that logic specific to data just as it is possible to externalize query processing over an iterative data set.

576) There are several containers for logic usually packaged as modules and they can be invoked by a common runtime. However, at its simplest form, this logic is merely a set of rules.

577) The rules are scoped to the artifacts they secure. For system wide resources, there is only a singleton. For user resources, they can be dynamically fetched and executed as long as they are registered.

578) This export of logic is very helpful in overcoming the limitations of static configuration and reload of service. Regardless of the need for a runtime to execute the logic, even listenable config values can help with changes to rules.



Monday, March 11, 2019

Today we continue discussing the best practice from storage engineering:

571) Storage products often make use of a bitmap index to store sequences efficiently when they are rather sparse. Bitmaps also help with conjunctive filters and this is useful in sequences with repeating members.

572) The sequences can be queried more efficiently than with standard query operators if the predicates are pushed down closer to the storage.

573) Sequences work well with bloom filters, which test whether a member is part of the sequence or not. Sometimes it is enough to establish that a member is not part of the set.
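
A toy bloom filter sketch to illustrate the point; the hashing here is illustrative and not production grade, but the property holds: a negative answer definitively rules the member out, while a positive answer may be a false positive:

// Sketch: a tiny bloom filter over strings.
public class BloomFilter
{
    private readonly bool[] bits;
    private readonly int hashes;

    public BloomFilter(int size, int hashes) { bits = new bool[size]; this.hashes = hashes; }

    private int Index(string item, int seed)
    {
        unchecked
        {
            int h = item.GetHashCode() * 31 + seed * 0x1F3F1F;   // toy hash mixing
            return Math.Abs(h % bits.Length);
        }
    }

    public void Add(string item)
    {
        for (int i = 0; i < hashes; i++) bits[Index(item, i)] = true;
    }

    // False positives possible; a false result means the member is definitely not present.
    public bool MightContain(string item)
    {
        for (int i = 0; i < hashes; i++)
            if (!bits[Index(item, i)]) return false;
        return true;
    }
}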

574) If the range of sequences can be limited to a window, the user and application can take on much of the processing relieving the compute requirements from storage. Such intensive scripts can run anywhere the user wants as long as the data is available.

575) If logic pertains specifically to some data and applicable only to that data, it is possible to register logic and load a runtime to execute that logic specific to data just as it is possible to externalize query processing over an iterative data set.


#codingexercise
GetCombinations with repetitions for r items among n
int GetCombinations (int n, int r) {
return GetNChooseK ( (n+r-1) , r ) ;
}

We put n objects in k bins with (n-1) Choose  (k-1)
int getGroups ( int n, int k) {
      return GetNChooseK (n-1, k-1);
}

We can do permutations with n!/(n-r)!

Sunday, March 10, 2019

Today we continue discussing the best practice from storage engineering:

565) The focus on business value does not remain confined to the people on the border with the customers. It comes from deep within product development and engineering.

566) The storage product can relieve compute altogether where results are computed once and saved for all subsequent usages. This works well for data that does not change over time.

567) When the data changes frequently, it helps to organize it in a way such that those that don’t change are on one side and those that do are on the other side. This helps in making incremental results from the data.

568) Data will inevitably have patterns with reuse. We can call them sequences. While most data might be stored with a general-purpose B-tree, the sequences call for more efficient data structures such as a radix tree. These make inserting and looking up sequences easier.

569) Sequences are more efficiently stored if they are sorted. This canonicalizes them. It also allows lookups to use binary search.
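
A small sketch of the canonical, sorted form with a binary-search lookup:

// Sketch: canonicalize a sequence by sorting it, then look up a member in O(log n).
public static bool ContainsMember(int[] sequence, int member)
{
    var canonical = (int[])sequence.Clone();
    Array.Sort(canonical);                              // canonical, sorted form
    return Array.BinarySearch(canonical, member) >= 0;  // binary search lookup
}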

570) The number of sequences might become very large. In such a case, it might be better not to make them part of the same tree and to use other data structures that are better suited to navigating shards.

Saturday, March 9, 2019

Today we continue discussing the best practice from storage engineering:

560) The number of applications using the same storage is usually not a concern. The ability to serve them with storage classes is noteworthy

561) When an application wants to change the workload on the storage, architects prefer to swap the storage product with something more suitable. However, a performance engineer can circumvent the approach with optimizations that leverage the existing product. It is always a good practice to give this a try.

562) System architecture holds in favor of changing business needs from the smallest components to the overall product. However, it is rather centralized and sometimes using another instance of the product with customizations can mitigate the urgency while giving ample time for consolidation.

563) The use of storage product also depends on the developer community. Many products such as time series databases and graph databases have generated greater acceptance by endearing the product to developers.

564) Sales and support need to be armed with the latest information and remain current on all features from the customers. They need to have those features work exactly as they say it would.

565) The focus on business value does not remain confined to the people on the border with the customers. It comes from deep within product development and engineering.

#codingexercise

When we have to select groups as well as individuals, we use the stars and bars method.
We put n objects in k bins with (n-1) Choose  (k-1)
int getGroups ( int n, int k) {
      return GetNChooseK (n-1, k-1);
}

double GetNChooseK(double n, double k)
{
if (k < 0 || k > n || n == 0) return 0;
if ( k == 0 || k == n) return 1;
return Factorial(n) / (Factorial(n-k) * Factorial(k));
}
Alternatively,
 static int GetNChooseKDP(int n, int k)
        {
            if (k < 0 || k > n || n == 0)
                return 0;
            if (k == 0 || k == n)
                return 1;
            return GetNChooseKDP(n - 1, k - 1) + GetNChooseKDP(n - 1, k);
        }

Friday, March 8, 2019

Today we continue discussing the best practice from storage engineering:

551) Adding and dropping containers is an easy way to address cleanup.

552) The number of replication groups is determined by the data that needs to be replicated.

553) Some containers can remain open all the time. Some of these can even be reserved for System purposes.

554) When containers are split, they contribute individually to shared statistics. Such stats do not differentiate between containers. Consequently either the statistics must be differentiated or the origin registered with the collector

555) The statistics may themselves be stored in a container belonging to the system. Since the system containers are treated differently from the user containers, they will need to be serviced separately.

556) System and shared notions go well together. They don’t have the isolations required for user containers. System only adds privilege and ownership to otherwise merely shared containers. The elevation to system may not be required in all cases

557) Application and system both publish statistics. They may both need to be the source of truth for their data

558) When the same container is replicated in different zones, there is a notion of local and remote. Only one of them is designated as primary. The remote is usually secondary

559) With primary and secondary containers for a replicated container, they become four when  the replication group is split


Thursday, March 7, 2019

Today we continue discussing the best practice from storage engineering:

539) From supercomputers to large scale clusters, the size of compute, storage and network can be made to vary quite a bit. However, the need to own or manage such capability reduces significantly once it is commoditized and outsourced.

540) Some tasks are high priority and are usually smaller in number than the general class of tasks. If they arrive out of control, it can incur significant cost. Most storage products try to control the upstream workload for which they are designed. For example, if the tasks can be contrasted significantly, it can be advantageous.

541) The scheduling policies for tasks can vary from scheduler to scheduler. Usually a simple policy scales much better than complicated policies. For example, if all the tasks have a share in a pie representing the scheduler, then it is simpler to expand the pie rather than re-adjusting the pie slices dynamically to accommodate the tasks.

542) The weights associated with tasks are set statically and then used in computations to determine the scheduling of the tasks. This can be measured in quanta of time and if a task takes more than what is expected, it is called a quantum thief. A scheduler uses tallying to find and make a quantum thief yield to other tasks.
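
A minimal sketch of such tallying, assuming task names as keys; a task whose accumulated time exceeds its quantum is asked to yield:

// Sketch: tally consumed time per task and detect quantum thieves.
public class QuantumAccounting
{
    private readonly Dictionary<string, TimeSpan> used = new Dictionary<string, TimeSpan>();

    public void Charge(string task, TimeSpan elapsed)
    {
        used[task] = used.TryGetValue(task, out var t) ? t + elapsed : elapsed;
    }

    public bool ShouldYield(string task, TimeSpan quantum) =>
        used.TryGetValue(task, out var t) && t >= quantum;   // quantum thief: yield to other tasks

    public void ResetRound() => used.Clear();                // start a new scheduling round
}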

543) Book-keeping is essential for both scheduler and allocator not only to keep track of grants but also for analysis and diagnostics.

544) A scheduler and allocator can each have their own manager that separates the concerns of management from their work

545) The more general purpose the scheduler and allocator become, the easier it is to use them in different components. Commodity implementations win hands down against specialized ones because they scale.

546) The requests for remote resources are expected to take longer than local operations. If they incur timeouts, the quantum grants may need to stretch over.

547)  Timeout must expand to include timeouts from nested operations.



548) Some event notification schemes are helpful for handling events at the appropriate scope.

549) A recovery state machine can help with global event handling for outages and recovery.

550) The number of steps taken to recover from outages can be reduced by dropping scoped containers in favor of standby.

551) Adding and dropping containers is an easy way to address cleanup.

552) The number of replication groups is determined by the data that needs to be replicated. Generally there is very little data.

Wednesday, March 6, 2019

The operation on inactive entries
Recently I came across an unusual problem of maintaining active and inactive entries. It was unusual because there were two sets and they were updated differently and at different times. Both the sets were invoked separately and there was no sign whether the sets were mutually exclusive. Although we could assume the invocations of operations on the sets were from the same component, they were invoked in different iterations.  This meant that the operations taken on the set would be repeated several times.
The component only took actions on the active set. It needed to take an action on the entries that were inactive. This action was added subsequent to the action on the active set. However, the sets were not mutually exclusive so they had to be differentiated to see what was available in one but not the other. Instead this was overcome by delegating the sets to be separated at the source. This made it a lot easier to work with the actions on the sets because they would be fetched each time with some confidence that they would be mutually exclusive. There was no confirmation that the source was indeed giving up to date sets. This called for a validation on the entries in the set prior to taking the action. The validation was merely to check if the entry was active or not.
However, an entry does not remain in the same set forever. It could move from the active set to the inactive set and back. The active set and the inactive set would always correspond to their respective actions. This meant that the actions needed to be inverted between the entries so that they could flip their state between the two rounds of processing.
There were four cases for trying this out. The first case was when the active set was called twice. The second case was when the inactive set was called twice. The third case was when the active set was followed by the inactive set. The fourth case was when the inactive set was followed by the active set.
With these four cases, the active and the inactive set could have the same operations taken deterministically no matter how many times they were repeated and in what order.
The only task that remained now was to ensure that the sets returned from the source were good to begin with. The source was merely subscribed to events that added entries to the sets. However, the events could be called in any order and an arbitrary number of times. The event handling did not all exercise the same logic so the entries did not appear final in all the cases. This contributed to the invalid entries in the set. When the methods used to retrieve the active and inactive set were made consistent, deterministic, robust and correct, it became easier to work with the operations on the set in the calling component.
This concluded the cleaning up of the logic to handle the active and inactive sets.
#codingexercise
Selecting four from a set of n:
double GetNChooseK(double n, double k)
{
if (k < 0 || k > n || n == 0) return 0;
if ( k == 0 || k == n) return 1;
return Factorial(n) / (Factorial(n-k) * Factorial(k));
}
 
GetNChooseK (n, 4);

Tuesday, March 5, 2019

Today we continue discussing the best practice from storage engineering:

539) From supercomputers to large scale clusters, the size of compute, storage and network can be made to vary quite a bit. However, the need to own or manage such capability reduces significantly once it is commoditized and outsourced.

540) Some tasks are high priority and are usually smaller in number than the general class of tasks. If they arrive out of control, it can incur significant cost. Most storage products try to control the upstream workload for which they are designed. For example, if the tasks can be contrasted significantly, it can be advantageous.

541) The scheduling policies for tasks can vary from scheduler to scheduler. Usually a simple policy scales much better than complicated policies. For example, if all the tasks have a share in a pie representing the scheduler, then it is simpler to expand the pie rather than re-adjusting the pie slices dynamically to accommodate the tasks.

542) The weights associated with tasks are set statically and then used in computations to determine the scheduling of the tasks. This can be measured in quanta of time and if a task takes more than what is expected, it is called a quantum thief. A scheduler uses tallying to find and make a quantum thief yield to other tasks.

543) Book-keeping is essential for both scheduler and allocator not only to keep track of grants but also for analysis and diagnostics.

544) A scheduler and allocator can each have their own manager that separates the concerns of management from their work

545) The more general purpose the scheduler and allocator become, the easier it is to use them in different components. Commodity implementations win hands down against specialized ones because they scale.

546) The requests for remote resources are expected to take longer than local operations. If they incur timeouts, the quantum grants may need to stretch over.

547) Timeout must expand to include timeouts from nested operations.


Monday, March 4, 2019

Today we continue discussing the best practice from storage engineering:

537) The number of times a network is traversed also matters in the overall cost for data. The best cost for data is when data is at rest rather than in transit.

538) The choice between a faster processor or a large storage or both is a flexible choice if the dollar value is the same.  In such cases, the strategy can be sequential, streaming or batched. Once the strategy is in place, the dollar TCO significantly increases when business needs change.

539) From supercomputers to large scale clusters, the size of compute, storage and network can be made to vary quite a bit. However, the need to own or manage such capability reduces significantly once it is commoditized and outsourced.

540) Some tasks are high priority and are usually smaller in number than the general class of tasks. If they arrive out of control, it can incur significant cost. Most storage products try to control the upstream workload for which they are designed. For example, if the tasks can be contrasted significantly, it can be advantageous.

541) The scheduling policies for tasks can vary from scheduler to scheduler. Usually a simple policy scales much better than complicated policies. For example, if all the tasks have a share in a pie representing the scheduler, then it is simpler to expand the pie rather than re-adjusting the pie slices dynamically to accommodate the tasks.

542) The weights associated with tasks are set statically and then used in computations to determine the scheduling of the tasks. This can be measured in quanta of time and if a task takes more than what is expected, it is called a quantum thief. A scheduler uses tallying to find and make a quantum thief yield to other tasks.

543) Book-keeping is essential for both scheduler and allocator not only to keep track of grants but also for analysis and diagnostics.

544) A scheduler and allocator can each have their own manager that separates the concerns of management from their work

545) The more general purpose the scheduler and allocator become, the easier it is to use them in different components. Commodity implementations win hands down against specialized ones because they scale.

Sunday, March 3, 2019

We were discussing the implementation of a ledger with our first and second posts. We end it today.
There have been two modes to improve sequential access. First is the batch processing which allows data to be partitioned in batches on parallel threads. This works very well for data where the results can also behave as if they were data for the same calculations and the data can be abstracted into batches. This is called summation form. Second, the batches can be avoided if the partitions are tiled over. This is called streaming access and it uses a window over partitions to make calculations and adjust them accordingly as the window slides over the data in a continuous manner. This works well for data which is viewed as continuous and limitless such as from a pipeline.
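
A small sketch of the two modes, assuming simple numeric data (System.Linq assumed): batched partial sums whose results combine like data (summation form), and a sliding window whose running sum is adjusted as the window moves:

// Summation form: partial results behave like data for the same calculation.
public static long BatchedSum(IEnumerable<long[]> batches) =>
    batches.AsParallel().Select(b => b.Sum()).Sum();

// Streaming access: a window over the data, adjusted as it slides.
public static IEnumerable<long> SlidingWindowSums(long[] stream, int window)
{
    long sum = 0;
    for (int i = 0; i < stream.Length; i++)
    {
        sum += stream[i];
        if (i >= window) sum -= stream[i - window];   // drop the entry leaving the window
        if (i >= window - 1) yield return sum;
    }
}
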
Operations on the writer side too can be streamlined when they have to scale to large volumes. Some form of parallelization is also used here after the load is split into groups of incoming requests. To facilitate faster and better ledger writes, they are written once and as detailed as possible to avoid conflicts with others and enable more operations to be read-only. This separation of read-write and read-only activities on the ledger improves not only the ledger but also lets it remain the source of truth. Finally, ledgers have grown to be distributed even while most organizations continue to keep the ledger in-house and open up only for troubleshooting, inspection, auditing and compliance.
Translations are one of the most frequently performed operations in the background. An example of translation is one where two different entries are to be reconciled as one uniform entry. These entries are translated so that the calculations can be simpler.
Some of these background operations involve forward only scanning of a table or list with no skipping. They achieve this with the help of a progress marker for themselves where they keep track of the sequence number that they last completed their actions on. This works well in the case where the listing order remains unchanged.
Let us consider a case where this progressive scan may skip a range. Such a case might arise when the listing is ordered but not continuous. There are breaks in the table as it gets fragmented between writes and the scanner does not see the writes between the reads. There are two ways to handle this. The first way to handle it is to prevent the write between the reads. This can be enforced with a simple sealing of the table prior to reading so that the writes cascade to a new page. The second way is to revisit the range and see if the count of processed table entries matches the sequence and redo it when it doesn’t agree. Since the range is finite, the retries are not very expensive and require no alteration of the storage. Both approaches will stamp the progress marker at the end of the last processed range. Typically there is only one progress marker which moves from the end of one range to the next.
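
A hedged sketch of the second way, assuming a hypothetical readRange delegate that returns up to a fixed number of sequence numbers past the marker; the finite range is redone when the recheck count does not agree:

// Sketch: forward-only scan from the progress marker with a recheck-and-redo on mismatch.
public static long AdvanceMarker(Func<long, int, IList<long>> readRange,   // hypothetical range reader
                                 Action<long> process,
                                 long marker, int rangeSize)
{
    while (true)
    {
        var range = readRange(marker, rangeSize);
        if (range.Count == 0) return marker;                  // nothing new to process
        foreach (var seq in range) process(seq);              // forward-only, no skipping
        if (readRange(marker, rangeSize).Count == range.Count)
            return range[range.Count - 1];                    // stamp the marker at the end of the range
        // Otherwise the range changed between the read and the processing; redo the finite range.
    }
}
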
Sometimes it is helpful to take actions to check that the table is stable and serving even for analysis. A very brief lock and release is sufficient in this regard.
#codingexercise
int GetPermutations (int n, int k) {
            if (n == 0 || k > n) return 0;
            if (k == 0)
                return 1;
            return Factorial (n) / Factorial (n - k);
}

Saturday, March 2, 2019

Today we continue discussing the best practice from storage engineering:


528) Live updates versus backup traffic, for instance, qualify for separate products. Aging and tiering of data also qualify for separate storage. Data for reporting can similarly be separated into its own stack. Generated data that drains into logs can similarly feed diagnostic stacks.

529) The number of processors or resources assigned to a specific stack is generally earmarked with T-shirt sizing. This is helpful for cases where the increment or decrement of resources doesn’t have to be done notch by notch.

530) Public cloud and hybrid cloud storage discussions are elaborated on many forums. The hybrid storage provider is focused on letting the public cloud appear as front-end to harness the traffic from the users while allowing storage best practice for the on-premise data.

531) Data can be pushed or pulled from source to destination. If it’s possible to pull, it helps relieve the workload by shifting it to another process.

532) Lower level data transfers are favored over higher level data transfers involving say HTTP.

533) The smaller the data transfers, the larger their number, which results in more chatty and potentially fault-prone traffic. We are talking about very small amounts of data per request.

534) The larger-size reads and writes are best served by multiple parts as opposed to long-running requests with frequent restarts.

535) The up and down traversal of the layers of the stack are expensive operations. These need to be curtailed.

        static int GetNChooseKDP(int n, int k)
        {
            if ( n == 0 || k > n)
                return 0;
            if (k == 0 || k == n)
                return 1;
            return GetNChooseKDP(n - 1, k - 1) + GetNChooseKDP(n - 1, k);
        }

Friday, March 1, 2019

Today we continue discussing the best practice from storage engineering :

517) Customers also prefer the ability to switch products and stacks. They are willing to try out new solutions but have become increasingly wary of being tied to any one product or its increasing encumbrances.

518) Customers have a genuine problem  with data being sticky. They cannot keep up with data transfers

519) Customers want the expedient solution first but they are not willing to pay for re-architectures.

520) Customers need to evaluate the cost of even data transfer over the network. Their priority and severity is most important to them.

521) Customers have concerns with the $/resource whether it is network, compute or storage. They have to secure ownership of data and yet have it spread out between geographical regions. This means they have trade-offs from the business perspectives rather than the technical perspectives

522) Sometimes the trade-offs are not even from the business but more so from the compliance and regulatory considerations around housing and securing data. Public cloud is great to harness traffic to the data stores but there are considerations when data has to be on-premise.

523) Customers have a genuine problem with anticipating growth and planning for capacity. The success of an implementation done right enables future prospects but implementations don’t always follow the design and it is also hard to get the design right.

524) Similarly, customers cannot predict what technology will hold up and what won’t in the near and long term. They are more concerned about the investments they make and the choices they have to face.

525) Traffic, usage and patterns are good indicators for prediction once the implementation is ready to scale.