Friday, April 14, 2017

Archiving 
As new assets are added to an inventory and old ones are retired, the inventory grows very large. The active assets may be only a fraction of the total, and they may be scattered all over the table that lists the inventory. Consequently, software engineers use a technique called archiving, which moves older and unused records to a secondary table. The technique is robust and involves some interesting considerations.
For example, assets are continuously added and retired. Therefore there is no fixed set to work on, and the records may have to be scanned again and again. Fortunately, the assets awaiting archival do not accumulate forever, provided the archival keeps up with the rate of retirement.
Also, the retirement policy may depend not just on age but on several other attributes of the assets. Therefore the archival may keep the policy as separate logic against which each asset is evaluated. Since the archival is expected to run over and over again, it is convenient to revisit each asset with these criteria to see whether it can now be retired.
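To make the idea of a separately stored policy concrete, here is a minimal sketch in Python. The Asset fields, the one-year idle threshold and the function names are all assumptions made for illustration; a real inventory would have its own attributes and rules.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Asset:
    # Hypothetical attributes; a real inventory will have its own.
    asset_id: int
    last_used: date
    in_service: bool
    open_tickets: int


def is_retirable(asset: Asset, today: date, max_idle_days: int = 365) -> bool:
    """Hypothetical policy: idle too long, out of service, nothing pending."""
    idle = today - asset.last_used
    return (
        idle > timedelta(days=max_idle_days)
        and not asset.in_service
        and asset.open_tickets == 0
    )


def select_for_archival(assets, today):
    # Evaluated afresh on every run, so an asset that fails the policy
    # today can still be picked up by a later run.
    return [a for a in assets if is_retirable(a, today)]
```

Because the policy lives in one function, the archival loop does not need to change when the criteria do.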
Moreover, the archival action may fail, and the source and destination must remain clean, without any duplicate or partial entries. Consider the case where the asset is removed from the source but not added to the destination. If the archival fails before the retired asset makes it to the destination table, the asset may be lost forever. Similarly, if the asset has already been copied to the destination table, a second entry must not be created when the archival runs again and finds the original entry lingering in the source table.
This leads to a policy where the selection of the asset, the insertion into the destination and the removal from the source are done in a transaction, which guarantees that either all parts of the operation succeed or they are rolled back to the state just before the operation started. This can be relaxed by adding checks to each statement so that each operation can be repeated on the same asset, with the asset only ever moving forward from the source to the destination. This is often referred to as reentrant (or idempotent) logic, and it helps take action on the assets in a failsafe manner without requiring locks and logs for the overall set of select, insert and delete.
Lastly, the set of three actions mentioned above works on only one asset at a time. This is a prudent and mature choice, because the storage required and the exposure to changes in the asset are minimized when we work on one asset at a time instead of several. Consider the case where an asset is mistakenly marked as ready for retirement and then reverted. If it were part of a batch of, say, ten assets being archived, the other nine might have to be rolled back and the actions repeated without it. On the other hand, if we work with only one asset at a time, the rest of the inventory is untouched.
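A rough sketch of the one-asset-at-a-time approach, again using Python's sqlite3 module with hypothetical table and function names. Each asset gets its own small transaction, so a problem with one asset rolls back only that asset and leaves the others alone.

```python
import sqlite3


def make_tables(conn):
    # Hypothetical schema used only for this illustration.
    conn.executescript("""
        CREATE TABLE inventory (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE inventory_archive (id INTEGER PRIMARY KEY, name TEXT);
    """)


def archive_each(conn, asset_ids):
    """Move assets one by one; return the ids that could not be moved."""
    failed = []
    for asset_id in asset_ids:
        try:
            # 'with conn' commits on success and rolls back on an exception,
            # so each asset is its own unit of work.
            with conn:
                conn.execute(
                    "INSERT INTO inventory_archive (id, name) "
                    "SELECT id, name FROM inventory WHERE id = ?",
                    (asset_id,),
                )
                conn.execute("DELETE FROM inventory WHERE id = ?", (asset_id,))
        except sqlite3.Error:
            # Only this asset is rolled back; earlier moves stay committed.
            failed.append(asset_id)
    return failed
```

The ids returned by archive_each can simply be retried on the next run, which fits the expectation that the archival scans the inventory repeatedly.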
Thus the technique for archival has many considerations that can be built into the design upfront, before the policy is implemented and executed.
#codingexercise
Find the length of the longest common subsequence of two strings:
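// Returns the LCS length of the first m characters of X and the first n characters of Y.
int GetLCS(string X, string Y, int m, int n)
{
    // An empty prefix has no common subsequence.
    if (m == 0 || n == 0)
        return 0;
    // Matching last characters extend the LCS of the shorter prefixes.
    if (X[m-1] == Y[n-1])
        return 1 + GetLCS(X, Y, m-1, n-1);
    // Otherwise drop the last character of one string and keep the better result.
    return Math.Max(GetLCS(X, Y, m, n-1), GetLCS(X, Y, m-1, n));
}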
