Sunday, February 21, 2016

National databases – SQL or NoSQL a comparision
National databases have to integrate data from a variety of sources mostly regional in nature. What are some of the challenges and how are they overcome ? We take a look in this study. Further we will explore what could be an integrator for various upstream databases. We will also compare SQL and NoSQL solutions in this regard.
Integration:
Regional databases evolve different from the National database. They are the primary source of truth for their region. They need not have the same convention, syntax and semantics as the national one.
It’s also likely that they have overlapping information about the same entity as that entity moves from region to region. Consider name-address pair as an example. This varies for individuals as they go out of state. Fortunately such information has a schema and the data is often indexed.
Consequently, data with schema is extracted , transformed and loaded into central databases with or without code. As the data flows through this process, there is no straightforward way of knowing that origin or the final destination completely  therefore staging tables are used to consolidate, scrub and translate data.  Periodic  work flows are required to add data as and when differences arise. This is a fairly complicated process because changes in the data affect this process.
If we look at the interface to plug in data sources of various regions, they look like the standard query operators on the C# LINQ pattern. But essentially we just need an iterator that can get one record at a time with state information from one to the other. Sure some of the attributes may be different or unrecognizable or even missing but the they can be filled in a large sparse table where columns are mapped to the attributes. A connecting string including the credentials is required for each data source. This may not be the final destination table but it can be used to feed the final destination
On the other hand, NoSQL databases enable a homogenous document model across heterogeneous data. Here each regional database can have any kind of information. Since they are documents or key value pairs they are easily combined or collected together. Searches over these fairly large and possibly redundant data is made better with parallelization techniques such as map-reduce algorithm.
Considerations for NoSQL databases include :
Searrch by key, by cell or for values in column families.
Limiting search to key/column ranges
Retaining the last N historical values.
If it supports bit operations, set operations, list operations, hashes and sorted sets.
If it  has  scriptability
If it supports transactions and publisher subscriber messaging
If the dataset size can be predicted.
Etc
The operations on public databases are mostly read only. Therefore transaction semantics for create-update-delete operations and isolation levels are  not as important as retrieving some or all records. Consequently a nosql option and parallelization  techniques are favored.
#merlin qa
In the previous code exercise problem, we showed an approach based on permutations that we can exploit to cover all possible orderings of spells before taking the one that yields maximum materials. I called this the naive approach. An alternative would be to have a score for each spell and then order the spells by that score in a strictly increasing materials. The score can be based on the maximum material produced if any or zero or a combination of the materials produced. 

Saturday, February 20, 2016

Today we continue to discuss jigsaw.Efficient, low effort mashup isolation by James Mickens and Matthew Finifter. Mashups are web applications that contain code from different principals.
Jigsaw lets mashups be isolated with its framework  and lets them selectively expose private state. We saw how Jigsaw improves the language and runtime for its purpose
We also saw that jigsaw has robust encapsulation tecniques for cross principal sharing. The execution context for the children are determined with the resource permissions and this is strictly dropping. All data is accessible by reference which is a performance gain.
Jigsaw introduces restrictions on the Javascript language. For example surrogate objects don't expose their prototype. The prototype of global builtin objects cannot be tampered. But Jigsaw supports many Javascript features and even improves them such as function objects, object literals, and pass by reference for objects shared across domains.
The compiler parses the Jigsaw code using ANTLR toolchain. The library implements the runtime security checks.
#coding exercise

We discussed Merlin QA in the previous message:
We discussed how to generate permutations
We calculate the benefit this way
Int getBenefit (List <Spell> spells)
{
Int benefit = 0;
Foreach (var spell in spells)
{
benefit += (new list <int> ( 0, spell [0], spell[0]+ spell [1]).max (); // the first two materials may suffice for determining relative cost between spells otherwise we can cumulate all materials for each spell.
The order in which we iterate is sufficient to find the cumulative across this order of spells.
}
Return benefits;
}

Friday, February 19, 2016

Today we continue to discuss jigsaw.Efficient, low effort mashup isolation by James Mickens and Matthew Finifter. Mashups are web applications that contain code from different principals.
Jigsaw lets mashups be isolated with its framework  and lets them selectively expose private state. We saw how Jigsaw improves the language and runtime for its purpose. Specifically we saw the public private access modifiers and visibility modifiers, that can be used for all variables and properties, cross box event listeners and scrubbing of private references, surrogate objects, predefined objects etc.
We saw that Jigsaw.getParentPrincipals () and getSameDomainPrincipals() allow a child to access it's parents resources and a parent to access it's children principal objects. Privileges are restricted down the hierarchy and this is strictly decreasing order. Privileges apply to network communication, visual fields and local storage such as DOM.
This we see that jigsaw has robust encapsulation tecniques for cross principal sharing. The execution context for the children are determined with the resource permissions and this is strictly dropping. All data is accessible by reference which is a performance gain.
#Google code jam - merlin QA
This is  a problem about a quality assurance engineer in Merlin Inc who has to test N spells. Each spell takes a certain amount of material and converts it to another. There are M materials in all.She has no materials to begin with but she can take as much as she wants from the Merlin inventory. She gets to keep all the materials left behind from her tests. Each Material has a certain value so she wants to profit most by picking an order of the N spells to test that yields most.
Solution:
The naiive solution would be to permute the N spells and for each permutation determine the value of byproducts left behind. The one with the max value is the solution  and the chosen order. Permutation of N spells is equivalent to the permit code below :

Permutations of  a string: 
Void Permute (String a, StringBuilder b, bool[] used) 
{ 
  If ( b.Length == a.Length { print b.ToString(); return;} 
  For (int I = 0; I < a.Length ; i++) 
  { 
    If (used[i]) continue; 
     used[i] = true; 
     B += A[i]; 
     Permute(a, b, used); 
     B [B.Length – 1] = ‘/0’; 
     used[i] = false; 
  } 
} 
As we list each order of the spells above, we find the value of the outcomes and pick the one with the maximum value.

Wednesday, February 17, 2016

Today we continue to discuss jigsaw.Efficient, low effort mashup isolation by James Mickens and Matthew Finifter. Mashups are web applications that contain code from different principals.
Jigsaw lets mashups be isolated with its framework  and lets them selectively expose private state. We were reviewing boxes and the choice of encapsul]ation syntax for principal objects.
This syntax can also be used not just as access modifiers but also visibility modifiers. They can be applied on property and anywhere on variables.
Jigsaw checks whether the statement in which public or private appears has visibility settings for the relevant prototype object. If there's a public method on 'obj', the surrogate wraps the this and binds it to the obj preventing attacks where the 'this' pointer can be changed. Jigsaw also does lazy evaluation because the objects graph can be big with most objects lightly used.
Surrogates automatically protect cross box data exchanges.

Jigsaw also virtualizes the DOM tree in each box. It lives inside each box's Javascript namespace. This is also an example of a set of predefined Javascript objects that jigsaw manages. As we can see, jigsaw improves the language and usage for its purpose.

Crossbox events are also supported. If a box X wants to register an event handler in a different box Y, it passes the handler to Y via Y's public interface. Y can then register the handler with the browser's event engine in the standard way. The difference is that the event object may not contain Y's references so its scrubbed before its passed externally. This prevents information leakage via foreign event handlers. The handler executes in the Javascript context of the box that created it.

Access from child to parents namespace is facilitated with Jigsaw.getParentPrincipal() and the access to the principal objects of its own children is facilitated with Jigsaw.principals Jigsaw.getSameDomainPrincipals. Jigsaw associates each principal with a unique id, and parent can restrict a child's access to a subset of the principal ids.

Similarly networking privileges and visual fields can be further restricted or relinquished but this has to be progressive down the hierarchy. This means that privileges decrease strictly down the hierarchy.

#codingexercise
uint factorial(uint n)
{
 if (n == 1) return 1;
 return n * factorial(n-1);
}
#codingexercise
uint GetEven (int n)
{
 If (n%2 != 0) return n+1;
Return n;
}

Today we continue to discuss the paper on Jigsaw
Efficient, low effort mashup isolation by James Mickens and Matthew Finifter. Mashups are web applications that contain code from different principals.
Jigsaw lets mashups be isolated with its framework  and lets them selectively expose private state. We were reviewing how boxes were different from iframes. We saw the use of principal objects and DOM tree.the latter is constrained by the parent. A child may update its visual field and receive GUI events by modifying the DOM. A parent can dynamically change a child's visual field by writing to it. Jigsaw will not permit a child to do the same for a parent unless the visual field is delegated. Overlapping visual fields from different boxes are treated as opaque. The box with the highest z-order prevails. Even with these protections, a principal  object must rely on its hierarchy. Like visual field, network access is also secured hierarchically with increasinglyrics restrictive scope. The top level box may communicate with all servers using http requests while the children may be allowed to communicate with select domains. Special tokens such as self and parent may be used to represent the origins of the respective boxes. Like network resources, local storage is also secured but here Jigsaw partitions the key value client side database with separate DOM storage. Cross principal access to this is allowed only via corresponding public operations  on the principal objects. We will shortly see why the public private syntax is better for securing objects than existing encapsulation syntax from the Javascript language. The language provides prototype as a means for inheritance. The objects are merely key value dictionaries. A prototype is an exemplar that defines property names and default values for other instances of these objects.  Javascript has a permissive reflective interface that allows properties or methods of all objects to be enumerated.
By setting the __proto__ property, an object can dynamically become an instance of the exemplars class. Properties may be dynamically added to any object. Even the __proto__ property van be assigned dynamically which makes an object change its inheritance dynamically. Thus jigsaw has to add some syntax to provide cross domain encapsulation  and borrows it from more object oriented languages.

Tuesday, February 16, 2016

A scientist recorded the temperatures with some unit on regular intervals. She has N data points and she proceeded to calculate the averages for a smoothing window of size K. This means she took K consecutive readings and calculated their average on a sliding scale. The next K data points are chosen starting from the second. And so on  this gives n-k+1 averages and she writes them down as just the sums omitting the division by k. But she loses the original data set. Assuming that the original data points were all integers, How do we find out the minimum difference between possible lowest and highest temperatures.
Solution we know the consecutive sums give us the last individual element added and the first element removed. Consequently we write down these n-k+1 successive differences. Now for each of the k positions within a window say i, we do the following  operations :
We  initialize an array b of length (n-1-i)/k and fill them with the values of the last element of a in the corresponding k sized jumps  then for these elements of b we add up the adjacent pairs  successively
and append a zero. Then we take the difference between the min and max of b.
Since we do this for each position in k, we get the highest among all such differences as the least possible

The central idea here is to collate the differences starting at I and every k step size because the last element participates in the difference every k step size. In other word after every k the last element becomes the first element. Consequently these collated differences can then give an idea of  the differences between individual original elements. Therefore when we aggregate them towards the right pair wise we build up the maximum difference between the elements . With the cycles that we iterate we are sure to find the maximum of the  differences between individual elements by considering them to participate in first and last differences. We only have to refine the answer by 1 if we be conservative due to the potential exclusion by taking lower bounds.
Int GetMinDifference (int n, int k, list <int> sums){

Var a = new List <int> (n-k);
Var r = new List <int> (k);
Var q = new List <int> (sums);
For ( int I =1; I < n-k+1; I ++)
A [i-1] = sums [i-1] -sums [i];
For ( int I = 0; I < k: i++)
{
Var b = new List <>(n);
For (int x = 0; x < (n-1-i)/k; x++)
b [x] = a [I + k*x];
For (int c=1; c < b.Count: c++)
{b [c] = b [c-1] + b [c];
}
b.add(0);
Int d = Math.abs (b.max ()-b.min ());
r [i] = d;
q [i] = b.max ();
}
Var m = r.max ();
Var rem = (sums [0]-q.sum ())%k;
Var n = m * k -r.sum ();
If ( n >=rem )
   Return m;
Else 
    Return m+1;
}
Courtesy: codejam solutions.
The original problem statement suggests you to come up with an integer data point distribution for each of the averages and then overlap them 

Monday, February 15, 2016

Today we continue to discuss the paper on Jigsaw
Efficient, low effort mashup isolation by James Mickens and Matthew Finifter. Mashups are web applications that contain code from different principals.
Jigsaw lets mashups be isolated with its framework  and lets them selectively expose private state. We saw how iFrames differ from boxes and why boxes are helpful with the Jigsaw design goals. A Jigsaw  application can have multiple boxes, but all of the boxes live in the same frame. Jigsaw code uses access modifiers to expose state to external domains. The Jigsaw runtime uses these modifiers to validate cross-domain operations. A single Jigsaw application may contain multiple  principals. Jigsaw places each of these principals in a separate box.  As with the DOM, a principal can access its parent principal object or the root.Principal objects can synchronize their writes to DOM storage. Each Jigsaw box can potentially contain a DOM tree and an associated visual field. The updates to these are constrained by the parent. A visual field consists of width height and location within the parents visual field and a z-order.  A parent can set these for the child. If a child wants to modify these, it can try but it has to be permitted by the parent. The child fires a  Jigsaw event on any modification and the parent must have a registered handler for the same. Visual fields can overlap but boxes cannot request overlapping  visual region. With the hierarchy, there  is fine control of the visual field and storage but a principal must still trust in its ancestors. DOM storage obstruction allows each origin to maintain a client-side key/value database. Jigsaw partitions this DOM storage.

#google codejam Riverflow problem:
A river receives water from tributaries that can be diverted to farms by farmers. Farmers choose an integer power of 2 from 1 to D both inclusive and toggle their usage everytime that many number of days has passed. Given N days history of water flow in the river, determine if the farmers are cheaters or the smallest number of farmers who could be diverting from the river.

int GetMinFarmers(int D, int N, List<int> flows)
{
bool cheaters = false;
int net = 0;
int farmers = 0;
for (int d = D; d >= 1; d /= 2)
{
    for (int i = 0; i < d; i++) {
         int net = flows[i+d]-flows[i] - flows[i+d-1] + flows[(i+2*d-1)%(2*d)];
                                                                 // i+d-1 stands for previous day to be reduced by next cycle
                                                                                          // (i+2*d-1)%(2*d) stands for the next cycle
         if (net%2) cheaters = true;
         net /= 2;
         if (net > 0){
                     farmers += net;
                     for(int j = i+d; j <= i+2*d;j++){
                           flows[j%(2*d)] -= net;
                     }
         }else{
                     farmers -= net;
                     for(int j = i; j <= i+d;j++){
                           flows[j%(2*d)] += net;
                     }
         }
    }
}
if (cheaters || flows[0] < 0)
   Console.WriteLine("cheaters");
return farmers;
}
#courtesy: Gnurdux