Thursday, February 25, 2016

Today we continue to discuss jigsaw.Efficient, low effort mashup isolation by James Mickens and Matthew Finifter. Mashups are web applications that contain code from different principals.
Jigsaw lets mashups be isolated with its framework  and lets them selectively expose private state. We were discussing the porting effort involved with Jigsaw and performance improvements. We now review end to end performance study of Jigsaw. Jigsaw performs checks on property accesses. It does bookkeeping of all objects be setting their box id and object id. It checks access to virtualized browser resources for compliance to security policies. All this might affect the end to end performance. Consequently three different libraries were ported over to Jigsaw and their performance evaluated. These libraries include  JSON-RPC a library that performs RPC over Ajax connections.  A local host server is chosen over a remote server to avoid network delays in performance evaluation. A DOM-SQL library that uses SQL as the storage for DOM was another library used. The AES encryption function I'm the standard crypto library was also used for the study. Mousemove is another simple benchmark library that registers handlers on mode moves. In each case, the library code was put in a box separate from that of the integrator. All cross box communication with the integrator took place through virtualized resources or via a principal object.
#codejam exercise  Campinatorics 
A park ranger has to distribute families to tents in a N×N grid subject to the following rules:
Families can be 1 or 2 or 3 members 
The number of 3 member families is X
Thereally are plenty of 1 or 2 member families as fillers 
A cell can be occupied by only one tent
The number of members in any row or column must be exactly 3
The number of tents in any row or column cannot exceed two.
Find how many arrangements are possible if two arrangements are different if any of their cells are different or if the members in the same cell are different.
Solution : we are looking at different arrangements of 0, 1, 2 or 3 in an N×N grid subject to the above combinations. Therefore one way to do this would be to combine those values for each N×N cell . After all the cells have been given a number, we can quickly reject or accept an arrangement by running the checks above. We return the count of all accepted arrangements.
If we list the cells row wise as a string then the ith element belongs to the i/N row and i%N column.
Void arrange (StringBuilder a, int start)
{
  If (start == N×N){ print a.toString (); return; }
  For ( int m = 0; m <= 3; m++)
  {

      a [start] = m;
      arrange (a, start +1);
      a [start] = '/0';
}
}

Bool isValid (string s, int N, int X) // representation for 
{
Int [,] matrix = s.toMatrix ();
Int threes = 0;
For ( int I = 0; I < N; i++)
{
Int sum = 0;
Int count = 0;
For (int j=0; j < N; j++)
{
If (matrix [i][j] != 0) count++;
If (count >2) return False;
If (matrix [i][j] == 3) threes++;
sum += matrix [i][j];
}
If (sum != 3) return False;
}
For ( int j = 0; j< N; j++)
{
Int sum = 0;
Int count = 0;
For (int i=0; i< N; i++)
{
If (matrix [i][j] != 0) count++;
If (count >2) return False;
If (matrix [i][j] == 3) threes++;
sum += matrix [i][j];
}
If (sum != 3) return False;
}
If (threes != X) return False;
Return True;
}

Wednesday, February 24, 2016

Today we continue to discuss jigsaw.Efficient, low effort mashup isolation by James Mickens and Matthew Finifter. Mashups are web applications that contain code from different principals.
Jigsaw lets mashups be isolated with its framework  and lets them selectively expose private state. We were discussing the porting effort involved with Jigsaw and how the surrogate log and fail stop exceptions on private property access made it easy to identify legacy code properties that needed to be marked public. PortMash and Caja were other two software examples cited for comparison with jigsaw. Unlike PortMash, jigsaw does not have to change asynchronous calls and unlike Caja jigsaw does not have to do object sanitization on boundary crossing. Jigsaw automatically wraps objects with a proxy in such cases. Moreover jigsaw enables pass by reference for all objects and this significantly reduces the marshaling otherwise. When large object trees have to be passed across isolation barriers, jigsaw creates surrogates lazily on first access to a sharing object. Initially only the root of the object tree gets a surrogate by this example.
In pass by value the entire object tree is recreated  that said, the surrogates introduce getters and setters that have more overhead than property access. However, this is acceptable because of three reasons:
First, this cost is only paid on access to external objects and not paid otherwise.
Second, jigsaw's initial sharing cost is a constant with respect to the size of the graph  thus is orders of magnitude better than pass by value. It is in fact very common to have large object graphs.
Third, jigsaw allows mashups to communicate synchronously with pass by reference as opposed to the asynchronous required in large object graph pass by value that may have significant latencies.
#codejam
In yesterday's post, we were solving a code jam problem
This involved a graph with N vertices and M edges where guards traverse from a source to destination node and we can obstruct upto K vertices. Each edge takes a unit time to traverse and each obstruction takes an additional unit time to clear. What is the minimum time for the guards to reach destination. ?
We discussed the solution. Today we look at a few steps 
First the breadth first traversal:
 Void bfs ()
{
  // initialize a list q for nodes to be visited and add the source 
// initialize a list for distances and set the first to be zero and all else unreachable.
While we have nodes to visit in q:
       Pick a node x and for each edge associated with x
         If distance has not been set and node y is not blocked :
             Add y to q;
             Distance of y = distance of x +1;
             Parent of y = x;
              
}
Void getTime ()
{
  bfs ();
  Int time = distance of destination node d [n-1]
  K--;
  While (k)
   {
        bfs ();
        If  d [n-1]  > time or n-1 is unreachable
           return time+2;
        Int z = n-1; 
        While  (z){
              If (Z!= n-1)
                  Blocked [z] = true;
              Z = parent of z;
         }
        K--;
   }
   bfs ();
   If  d [n-1]  > time or n-1 is unreachable
           return time+2;
   Return time +1;
}

Tuesday, February 23, 2016

Today we continue to discuss jigsaw.Efficient, low effort mashup isolation by James Mickens and Matthew Finifter. Mashups are web applications that contain code from different principals.
Jigsaw lets mashups be isolated with its framework  and lets them selectively expose private state. 
Many preexisting Javascript libraries shared a notion of public and private priperties. Jigsaw makes these implicit visibility properties explicit. These libraries also used a root object for their purpose. The jigsaw runtime was instrumented to find such root objects and make them principal objects and expose the properties as public. The public properties of all other library objects were marked next by modifying the runtime.If the library used any surrogate objects that crossed the boc boundary, the runtime logged the public and private properties of the backing object.  Also the runtime was modified to throw an exception and not returned an undefined for private accessors. With logging and failstop behaviour, all preexisting libraries could be ported to Jigsaw because now all the properties that needed to be marked public could be found.
#code jam exercise
Yesterday we were discussing code jam solution to costly binary search. Here's how the solution looks like:

Int getCost (List <int> a)

{
Int N = a.Count ();
var dpL = new int [190, 1000010];
Var dpR = new int [190, 1000010];
For (int i=0; I < 190; i++)
For (int j=0; j < N+1; j++)
{
dpL[i][j] = j;
dpR[i][j] = j;
}
 
 for(int cost=1;cost<190;cost++){
    For ( int i = 0; I < N; i++)
if(a[i] <= cost){
   int L = dpL[cost-a[i]][i];
   int R = dpR[cost-a[i]][i+1];
   dpL[cost][R] = min(dpL[cost][R], L);
   dpR[cost][L] = max(dpR[cost][L], R);
  }
  for(i=1;i<=N;i++) dpR[cost][i] = max(dpR[cost][i], dpR[cost][i-1]);
  for(i=N-1;i>=0;i--) dpL[cost][i] = min(dpL[cost][i], dpL[cost][i+1]);
 }
 
 for(i=0;i <190;i++) if(dpR[i][0] == N) return i;
 Return 190;
}

Courtesy:rng..58

We discuss another code jam problem today. 
This one is about a maze represented as a graph with N vertices  and M edges where  guards run from an entrance to a destination vertex. They take a unit time to cross each edge. We have to delay their pursuit. We can mark upto k vertex as obstructed  each of which costs one more additional unit of time. How long will the guards take ?
Solution: the graph is traversed breadth wise. This let's us keep track of distance and parent. The distance is -1 for unconnected and incremented otherwise.
First we traverse the graph ourselves to find the k vertices to obstruct. We traverse fir each of the k times we obstruct. Then the guards traverse it to find the total time taken. We know the k to begin with and decrement it one by one as we mark the vertices as obstructed. The destination vertex is the nth vertex and if it is reachable then the time taken is the computed distance so far plus the entrance and destination obstruction. For each of the k-2 vertices in between the entrance and destination in each iteration, we obstruct the destination vertex and successive parents all the way to the entrance to compute the time taken.
 
the choice of breadth first search over depth first search is appropriate here because the guards will take unit time to cover each edge.
 

Monday, February 22, 2016

Today we continue to discuss jigsaw.Efficient, low effort mashup isolation by James Mickens and Matthew Finifter. Mashups are web applications that contain code from different principals.
Jigsaw lets mashups be isolated with its framework  and lets them selectively expose private state. We saw how Jigsaw improves the language and runtime for its purpose. It supports robust encapsulation technique using a Jigsaw to Javascript compiler and a client side library. The latter introduces box management interface and the rewriter adds an integer object id to every object created. This ensures that at most one proxy is created for every object. In addition, jigsaw includes the id of the creating box with each object so that it can determine if the surrogate object is executing within the same box. If so, it in wraps the object  and executes methods. Jigsaw adds a per object map of visibility Metadata by listing properties that are declared public and private and this map is immutable which secures the object from an untrusted box. Jigsaw also creates new boxes using the eval () statement  the boxes are created by the function that adds aliases to the global objects. This adds a security layer when out of box calls are made. This later enforces security policies and prevents virtual operations to be applied to their real counterparts.

#code jam exercise
We have to insert an element into a sorted array of N elements no tworries of which are duplicates ever. Typically we do binary search to find the position where we insert. In this modified problem, each comparison has a varying cost between 1 and 9 inclusive. What will be the total cost of this binary search in the worst case ?

 Here the cost of a comparison can vary. Therefore for each comparison there may be a better pick that achieves the same result of narrowing down the range. The final position of insertion is only one where the elements to the left are smaller and the elements to the right are bigger. This can be one of N+1 positions. The worst case is when we have exhausted all these positions to arrive at this one.
Consequently we create two sufficiently large matrices  where the rows cover the cumulative costs (arbitrarily large) and the columns span from 0 to N+1. These matrices keep track of the final position as columns against each possible cost as rows with the values initially ranging from 0 to N+1 aND iNC reading fRon left to right. We need two because one for left and another for right of the range we are considering. We update this matrix to reflect the monimum position for the given cost and  keep the values sorted before each update. Finally we read the row index of the right matrix as the answer where the first column equals N as the positions considered.

Sunday, February 21, 2016

National databases – SQL or NoSQL a comparision
National databases have to integrate data from a variety of sources mostly regional in nature. What are some of the challenges and how are they overcome ? We take a look in this study. Further we will explore what could be an integrator for various upstream databases. We will also compare SQL and NoSQL solutions in this regard.
Integration:
Regional databases evolve different from the National database. They are the primary source of truth for their region. They need not have the same convention, syntax and semantics as the national one.
It’s also likely that they have overlapping information about the same entity as that entity moves from region to region. Consider name-address pair as an example. This varies for individuals as they go out of state. Fortunately such information has a schema and the data is often indexed.
Consequently, data with schema is extracted , transformed and loaded into central databases with or without code. As the data flows through this process, there is no straightforward way of knowing that origin or the final destination completely  therefore staging tables are used to consolidate, scrub and translate data.  Periodic  work flows are required to add data as and when differences arise. This is a fairly complicated process because changes in the data affect this process.
If we look at the interface to plug in data sources of various regions, they look like the standard query operators on the C# LINQ pattern. But essentially we just need an iterator that can get one record at a time with state information from one to the other. Sure some of the attributes may be different or unrecognizable or even missing but the they can be filled in a large sparse table where columns are mapped to the attributes. A connecting string including the credentials is required for each data source. This may not be the final destination table but it can be used to feed the final destination
On the other hand, NoSQL databases enable a homogenous document model across heterogeneous data. Here each regional database can have any kind of information. Since they are documents or key value pairs they are easily combined or collected together. Searches over these fairly large and possibly redundant data is made better with parallelization techniques such as map-reduce algorithm.
Considerations for NoSQL databases include :
Searrch by key, by cell or for values in column families.
Limiting search to key/column ranges
Retaining the last N historical values.
If it supports bit operations, set operations, list operations, hashes and sorted sets.
If it  has  scriptability
If it supports transactions and publisher subscriber messaging
If the dataset size can be predicted.
Etc
The operations on public databases are mostly read only. Therefore transaction semantics for create-update-delete operations and isolation levels are  not as important as retrieving some or all records. Consequently a nosql option and parallelization  techniques are favored.
#merlin qa
In the previous code exercise problem, we showed an approach based on permutations that we can exploit to cover all possible orderings of spells before taking the one that yields maximum materials. I called this the naive approach. An alternative would be to have a score for each spell and then order the spells by that score in a strictly increasing materials. The score can be based on the maximum material produced if any or zero or a combination of the materials produced. 

Saturday, February 20, 2016

Today we continue to discuss jigsaw.Efficient, low effort mashup isolation by James Mickens and Matthew Finifter. Mashups are web applications that contain code from different principals.
Jigsaw lets mashups be isolated with its framework  and lets them selectively expose private state. We saw how Jigsaw improves the language and runtime for its purpose
We also saw that jigsaw has robust encapsulation tecniques for cross principal sharing. The execution context for the children are determined with the resource permissions and this is strictly dropping. All data is accessible by reference which is a performance gain.
Jigsaw introduces restrictions on the Javascript language. For example surrogate objects don't expose their prototype. The prototype of global builtin objects cannot be tampered. But Jigsaw supports many Javascript features and even improves them such as function objects, object literals, and pass by reference for objects shared across domains.
The compiler parses the Jigsaw code using ANTLR toolchain. The library implements the runtime security checks.
#coding exercise

We discussed Merlin QA in the previous message:
We discussed how to generate permutations
We calculate the benefit this way
Int getBenefit (List <Spell> spells)
{
Int benefit = 0;
Foreach (var spell in spells)
{
benefit += (new list <int> ( 0, spell [0], spell[0]+ spell [1]).max (); // the first two materials may suffice for determining relative cost between spells otherwise we can cumulate all materials for each spell.
The order in which we iterate is sufficient to find the cumulative across this order of spells.
}
Return benefits;
}

Friday, February 19, 2016

Today we continue to discuss jigsaw.Efficient, low effort mashup isolation by James Mickens and Matthew Finifter. Mashups are web applications that contain code from different principals.
Jigsaw lets mashups be isolated with its framework  and lets them selectively expose private state. We saw how Jigsaw improves the language and runtime for its purpose. Specifically we saw the public private access modifiers and visibility modifiers, that can be used for all variables and properties, cross box event listeners and scrubbing of private references, surrogate objects, predefined objects etc.
We saw that Jigsaw.getParentPrincipals () and getSameDomainPrincipals() allow a child to access it's parents resources and a parent to access it's children principal objects. Privileges are restricted down the hierarchy and this is strictly decreasing order. Privileges apply to network communication, visual fields and local storage such as DOM.
This we see that jigsaw has robust encapsulation tecniques for cross principal sharing. The execution context for the children are determined with the resource permissions and this is strictly dropping. All data is accessible by reference which is a performance gain.
#Google code jam - merlin QA
This is  a problem about a quality assurance engineer in Merlin Inc who has to test N spells. Each spell takes a certain amount of material and converts it to another. There are M materials in all.She has no materials to begin with but she can take as much as she wants from the Merlin inventory. She gets to keep all the materials left behind from her tests. Each Material has a certain value so she wants to profit most by picking an order of the N spells to test that yields most.
Solution:
The naiive solution would be to permute the N spells and for each permutation determine the value of byproducts left behind. The one with the max value is the solution  and the chosen order. Permutation of N spells is equivalent to the permit code below :

Permutations of  a string: 
Void Permute (String a, StringBuilder b, bool[] used) 
{ 
  If ( b.Length == a.Length { print b.ToString(); return;} 
  For (int I = 0; I < a.Length ; i++) 
  { 
    If (used[i]) continue; 
     used[i] = true; 
     B += A[i]; 
     Permute(a, b, used); 
     B [B.Length – 1] = ‘/0’; 
     used[i] = false; 
  } 
} 
As we list each order of the spells above, we find the value of the outcomes and pick the one with the maximum value.