How do we search the object store?
There have been a couple of mentions of this in my previous post but
not a solution. Let us face it, the S3 APIs only permit search on the
prefix-included name of objects. And even that is limited to simple prefix
matching, not a regular expression search. Consequently, a lot of time may be
spent coming up with naming conventions. The trouble with a naming convention
is that it is static and may even require renaming objects when an attribute
changes or a conflict arises.
On the other hand, metadata for each object is available on
an iterator basis. This means that we can iterate over the objects one after
another and match each object's metadata against the query. For example, if we
want to find all objects created by a certain owner, then we scan all the
objects and compare each one's owner or created_by field with the value in the
query.
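
The scan described above can be sketched in a few lines of Python. This is only an illustration: the listing is a hypothetical in-memory stand-in for whatever iterator the object store client provides, and find_by_metadata is a name I made up for the example.

```python
from typing import Dict, Iterable, Iterator, Tuple

# (object key, metadata) pairs; a hypothetical shape for one listing entry.
ObjectEntry = Tuple[str, Dict[str, str]]


def find_by_metadata(entries: Iterable[ObjectEntry], field: str, value: str) -> Iterator[str]:
    """Yield the keys of objects whose metadata[field] equals value (linear scan)."""
    for key, metadata in entries:
        if metadata.get(field) == value:
            yield key


# Hypothetical listing; a real store would hand these out one at a time.
listing = [
    ("logs/2017/app.log", {"created_by": "alice"}),
    ("logs/2017/db.log", {"created_by": "bob"}),
    ("images/logo.png", {"created_by": "alice"}),
]

matches = list(find_by_metadata(listing, "created_by", "alice"))
```

Every query touches every object, which is exactly why the scan gets expensive as the store grows.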
Arguably, the name and the metadata of an object are smaller
in size than the average object. In other words, we could keep a mirror of
the object store with an empty file for each corresponding object in the
object store. We would then have the name and metadata of each object without
its contents.
Another way to do this would be to create an index over the
prefix names. A SQL table with the object identifier and its metadata attributes
as columns, or with a relation to a key-value table, would suffice. The table
would have an index on the prefix and even on the metadata values for those
fields that are common to all objects.
With an index on the prefix name, searching and sorting in
T-SQL becomes far easier. A clustered sequential index on the name or
object key would even help reduce disk access.
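
Here is a minimal sketch of such a table. I am assuming SQLite through Python's sqlite3 module instead of T-SQL so that it runs anywhere, and the table name, column names and sample rows are all made up for illustration. In SQLite a WITHOUT ROWID table is stored in primary key order, which plays the role of the clustered index on the object key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Rows are stored in object_key order (WITHOUT ROWID), similar to a clustered index.
conn.execute("""
    CREATE TABLE object_index (
        object_key TEXT PRIMARY KEY,
        created_by TEXT,
        size_bytes INTEGER
    ) WITHOUT ROWID
""")
# Secondary index on a metadata field that every object is assumed to carry.
conn.execute("CREATE INDEX idx_created_by ON object_index (created_by)")

conn.executemany(
    "INSERT INTO object_index VALUES (?, ?, ?)",
    [
        ("logs/2017/app.log", "alice", 1024),
        ("logs/2017/db.log", "bob", 2048),
        ("images/logo.png", "alice", 512),
    ],
)


def keys_with_prefix(conn: sqlite3.Connection, prefix: str) -> list:
    """Return keys starting with prefix as a sorted range scan on the primary key."""
    # Upper bound: the prefix followed by the highest code point. A key that is
    # exactly this bound is excluded, which is acceptable for a sketch.
    upper = prefix + "\U0010ffff"
    cur = conn.execute(
        "SELECT object_key FROM object_index "
        "WHERE object_key >= ? AND object_key < ? ORDER BY object_key",
        (prefix, upper),
    )
    return [row[0] for row in cur]


log_keys = keys_with_prefix(conn, "logs/")
alice_keys = [
    row[0]
    for row in conn.execute(
        "SELECT object_key FROM object_index WHERE created_by = ? ORDER BY object_key",
        ("alice",),
    )
]
```

The prefix lookup is written as a range comparison rather than LIKE so that it can use the primary key ordering directly.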
Moreover, adding a table for the name and metadata lends
itself to standard query operators. Operators like Select, Join, Union,
Intersect, Except, Distinct, Range, SequenceEqual, Skip, SkipWhile, Where, etc.
can be applied seamlessly to the object keys, which makes it easier to come up
with a final result set of the objects of interest. Aggregate operations
can also be performed, in addition to different kinds of positional access.
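
To make the operators concrete, here is a rough sketch of a few of them over sets of object keys. I am using plain Python sets and itertools as stand-ins for the query operators named above, and the two key sets are hypothetical results of earlier lookups.

```python
from itertools import islice

# Hypothetical results of two earlier lookups against the index.
owned_by_alice = {"images/logo.png", "logs/2017/app.log", "docs/readme.txt"}
under_logs = {"logs/2017/app.log", "logs/2017/db.log"}

alice_logs = sorted(owned_by_alice & under_logs)      # Intersect
alice_non_logs = sorted(owned_by_alice - under_logs)  # Except
either = sorted(owned_by_alice | under_logs)          # Union, distinct by construction

page = list(islice(either, 1, 3))  # Skip one key, then take the next two
total = len(either)                # a simple aggregate (count)
```

Each intermediate result is just another collection of keys, so the operators compose until only the objects of interest remain.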
Lastly, the object store allows such an index to be an object
itself in the object store, so we don’t need to keep another database for this
purpose.
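
A minimal sketch of that idea, assuming the index is small enough to serialize as one JSON document. The dict stands in for the object store, and the reserved key and function names are hypothetical.

```python
import json
from typing import Dict

# Hypothetical reserved key under which the index object is stored.
INDEX_KEY = ".index/metadata.json"


def save_index(store: Dict[str, bytes], index: Dict[str, Dict[str, str]]) -> None:
    """Serialize the key-to-metadata index and store it as an ordinary object."""
    store[INDEX_KEY] = json.dumps(index, sort_keys=True).encode("utf-8")


def load_index(store: Dict[str, bytes]) -> Dict[str, Dict[str, str]]:
    """Read the index object back from the store."""
    return json.loads(store[INDEX_KEY].decode("utf-8"))


# A dict stands in for the object store: key -> object bytes.
store = {"logs/2017/app.log": b"log data", "images/logo.png": b"image data"}
index = {
    "logs/2017/app.log": {"created_by": "alice"},
    "images/logo.png": {"created_by": "bob"},
}

save_index(store, index)
restored = load_index(store)
```

The index then has to be rewritten whenever objects are added, removed or retagged, which is the price of not running a separate database.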
#codingexercise
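// Returns the node with the largest key in a binary search tree, or null if the tree is empty.
node GetMax (node root) {
    if (root == null) return null;
    // The maximum is the rightmost node, so keep following right links.
    while (root.right != null)
        root = root.right;
    return root;
}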