Namespaces, Buckets, Objects and their use with Querying
A file system does not lend itself to the same querying capabilities
and performance as database tables do. Directories and files in a file system
are enumerated by iteration, whereas database tables have indexes that allow
faster access than a sequential scan. Nothing works quite like a database for
efficient querying, both for historical reasons and for a physical one: local
data access is far more efficient than remote data access, especially when the
data is organized at its finest granularity and managed with metadata and query
caching. We have realized cloud databases where remote access does not affect
the service level agreement for business transactions, but we have yet to
realize database-like queries over an object store.
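To make the contrast between iteration and indexed access concrete, here is a minimal sketch in Python. The file listing, the owner attribute and the function names are all hypothetical and chosen only for illustration; a real file system or database would be far more involved.

```python
from collections import defaultdict

# Hypothetical listing of files as (path, owner) pairs; the names are made up.
FILES = [
    ("/home/a/report.txt", "alice"),
    ("/home/b/data.csv", "bob"),
    ("/home/a/notes.txt", "alice"),
    ("/srv/logs/app.log", "carol"),
]


def scan_by_owner(files, owner):
    """Sequential scan: every entry is visited, as in a directory walk."""
    return [path for path, file_owner in files if file_owner == owner]


def build_owner_index(files):
    """Build the index once; each later lookup is a single dictionary access."""
    index = defaultdict(list)
    for path, file_owner in files:
        index[file_owner].append(path)
    return index


OWNER_INDEX = build_owner_index(FILES)

# Both approaches return the same paths; only the work per query differs.
print(scan_by_owner(FILES, "alice"))
print(OWNER_INDEX["alice"])
```

The scan does work proportional to the number of files on every query, while the index pays that cost once up front and then answers each lookup directly.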
We are adding compute to storage. The possibilities are wide open once we take advantage of the virtualization that some object stores enable. Even without compute, the storage offered by such object stores satisfies immense and diverse requirements. With compute and data processing capabilities offered by the platform, operations expand beyond create, update and delete to standard query operations that can support a dashboard of charts and graphs, participate in streaming queries, or improve the metadata of the objects. For example, the usage statistics of an object may now be part of its metadata.
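As a rough illustration of usage statistics living in object metadata, the sketch below uses a hypothetical in-memory store written in Python. The class names, the metadata keys (size, read_count) and the two query methods are assumptions made for this example, not the API of any real object store.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)


class TinyObjectStore:
    """Hypothetical in-memory store keyed by (namespace, bucket, key)."""

    def __init__(self):
        self._objects = {}

    def put(self, namespace, bucket, key, data):
        # Record the size and start a read counter in the metadata.
        meta = {"size": len(data), "read_count": 0}
        self._objects[(namespace, bucket, key)] = StoredObject(data, meta)

    def get(self, namespace, bucket, key):
        obj = self._objects[(namespace, bucket, key)]
        obj.metadata["read_count"] += 1  # usage statistic kept with the object
        return obj.data

    def bytes_per_bucket(self):
        """Aggregate query that a dashboard chart could be drawn from."""
        totals = {}
        for (namespace, bucket, _), obj in self._objects.items():
            bucket_id = (namespace, bucket)
            totals[bucket_id] = totals.get(bucket_id, 0) + obj.metadata["size"]
        return totals

    def most_read(self, n=1):
        """Return the n object identifiers with the highest read count."""
        ranked = sorted(
            self._objects.items(),
            key=lambda item: item[1].metadata["read_count"],
            reverse=True,
        )
        return [identifier for identifier, _ in ranked[:n]]


store = TinyObjectStore()
store.put("ns1", "logs", "a.txt", b"hello")
store.put("ns1", "logs", "b.txt", b"hi")
store.put("ns1", "images", "c.png", b"\x89PNG")
store.get("ns1", "logs", "a.txt")
store.get("ns1", "logs", "a.txt")
```

Here the read counter is updated on every get, so queries such as most_read need no separate log and can be answered from metadata alone.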
When we iterate over namespaces, buckets and objects, we often have to visit each one sequentially. There is no centralized data structure that speeds up the traversal, nor are the items kept in sorted order as the entries of an index are. These S3 artifacts may, however, be stored over a layer that provides data structures to speed up lookups. One example is a B-plus tree, a data structure that organizes records into ranges by their keys. Another is a skip list, a data structure that keeps links not only between adjacent records but also links that skip over several records, usually a power of two. Such techniques improve lookup because they avoid having to visit each element one after the other.
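A skip list of this kind can be sketched in a few dozen lines of Python. This is a minimal illustration, not a production structure: the class and method names, the maximum level of 16 and the "namespace/bucket/object" key format are all assumptions made for the example. Levels are chosen by coin flips, so the links at level i skip roughly 2**i records on average rather than exactly.

```python
import random


class _Node:
    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        self.forward = [None] * level  # forward[i] is the next node at level i


class SkipList:
    MAX_LEVEL = 16  # assumed cap, ample for a small example

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._head = _Node(None, None, self.MAX_LEVEL)
        self._level = 1  # number of levels currently in use

    def _random_level(self):
        # Each extra level is kept with probability 1/2.
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < 0.5:
            level += 1
        return level

    def _predecessors(self, key):
        """Return, per level, the last node whose key is smaller than key."""
        update = [self._head] * self.MAX_LEVEL
        node = self._head
        for i in range(self._level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]  # follow the long links first
            update[i] = node
        return update

    def insert(self, key, value):
        update = self._predecessors(key)
        candidate = update[0].forward[0]
        if candidate is not None and candidate.key == key:
            candidate.value = value  # overwrite an existing key
            return
        level = self._random_level()
        self._level = max(self._level, level)
        node = _Node(key, value, level)
        for i in range(level):
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node

    def get(self, key):
        candidate = self._predecessors(key)[0].forward[0]
        if candidate is not None and candidate.key == key:
            return candidate.value
        return None

    def items_between(self, low, high):
        """Return (key, value) pairs with low <= key < high, in key order."""
        node = self._predecessors(low)[0].forward[0]
        result = []
        while node is not None and node.key < high:
            result.append((node.key, node.value))
            node = node.forward[0]
        return result


# Hypothetical object keys mapped to object sizes in bytes.
index = SkipList(seed=42)
index.insert("ns1/bucketB/obj2", 10)
index.insert("ns1/bucketA/obj1", 20)
index.insert("ns2/bucketC/obj9", 30)
index.insert("ns1/bucketA/obj7", 40)

# List everything under the prefix "ns1/bucketA/" without a full scan.
# "0" is the character right after "/", so it bounds the prefix from above.
print(index.items_between("ns1/bucketA/", "ns1/bucketA0"))
```

Because the keys are kept in sorted order, a prefix listing becomes a range query: the search descends to the first matching key and then walks only the matching records.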
#codingexercise
Find the count of matching elements between two arrays, once both are sorted, before the first mismatch
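// Requires: using System.Collections.Generic; using System.Diagnostics;
int GetCountMoves(List<int> A, List<int> B)
{
    Debug.Assert(A != null);
    Debug.Assert(B != null);
    Debug.Assert(A.Count == B.Count);
    // Sort both lists in place so they are compared in ascending order.
    A.Sort();
    B.Sort();
    int result = 0;
    for (int i = 0; i < A.Count; i++)
    {
        // Count matching positions and stop at the first mismatch.
        if (A[i] == B[i]) result++;
        else break;
    }
    return result;
}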