Friday, October 28, 2016

Today we continue our discussion on natural language processing with a particular word ontology called FrameNet. We see how it differs from other resources such as WordNet, PropBank and AMR (Abstract Meaning Representation). AMR had extensive documentation. It used PropBank predicate-argument semantics. It had named entities and value entities with entity linking. It captured coreference among entities and events.
It included modality, negation and questions. It had relations between nominals. The content words were canonicalized: wherever possible, adverbs were converted to adjectives, adjectives to nouns, and nouns to verbs. Moreover, it represented all of this in a single graph. AMR's assets included a snazzy annotation tool and a released corpus of 15,000 AMRs, roughly 270K words, with another 5,000 AMRs or 150K words annotated and in the pipeline. AMR differed from FrameNet in that its purpose was cost-effective annotation together with NLP, as opposed to FrameNet's lexicography. FrameNet frames were about rich semantic groupings, while AMR retained PropBank's lexicalized rolesets. FrameNet annotated by using labeled spans in sentences. AMR was represented by a graph in which sentences and words did not directly translate to nodes. In FrameNet, a sentence does not have any explicit relationships across frame annotations, whereas AMR comprised predicates with shared arguments.
We note here the functionality of FrameNet and the advantages of the form of AMR. In general, there are many ways to come up with an ontology or pseudo-ontology for different purposes; however, a graph-based data structure whose nodes are not necessarily terms is helpful in many cases. Further, a lookup from each term to such graph nodes lets models be expressed directly over the graph nodes instead of over terms, with more static, pre-discovered relationships available for incorporation in the models.
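To make the idea of predicates with shared arguments and a term-to-node lookup concrete, here is a minimal sketch in C. The type names (Node, Edge, TermLink) and the helper functions are hypothetical and made up for illustration; they are not part of any AMR or FrameNet tooling. The example encodes the sentence "The boy wants to go", where want-01 and go-01 both point at the same "boy" node.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, minimal graph types for illustration only. */
typedef struct { const char *var; const char *concept; } Node;
typedef struct { int from; const char *role; int to; } Edge;
typedef struct { const char *term; int node; } TermLink;

/* Nodes are concepts, not surface words. */
static const Node nodes[] = {
    { "w", "want-01" },  /* index 0 */
    { "b", "boy"     },  /* index 1 */
    { "g", "go-01"   },  /* index 2 */
};

/* Labeled edges; node 1 ("boy") is an argument of both predicates. */
static const Edge edges[] = {
    { 0, "ARG0", 1 },  /* want-01 :ARG0 boy   */
    { 0, "ARG1", 2 },  /* want-01 :ARG1 go-01 */
    { 2, "ARG0", 1 },  /* go-01   :ARG0 boy   */
};

/* Lookup from terms in the sentence to graph nodes. */
static const TermLink lookup[] = {
    { "boy", 1 }, { "wants", 0 }, { "go", 2 },
};

/* Returns the node index for a term, or -1 if the term is not linked. */
int node_for_term(const char *term)
{
    for (size_t i = 0; i < sizeof lookup / sizeof lookup[0]; i++)
        if (strcmp(lookup[i].term, term) == 0)
            return lookup[i].node;
    return -1;
}

/* Counts the edges that point at the given node. */
int incoming_edges(int node)
{
    int count = 0;
    for (size_t i = 0; i < sizeof edges / sizeof edges[0]; i++)
        if (edges[i].to == node)
            count++;
    return count;
}
```

Here incoming_edges(node_for_term("boy")) counts two edges, one from each predicate, which is the kind of cross-predicate relationship that labeled spans alone do not expose.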

#puzzle
Prove that for every prime p greater than 3, the value p^2 - 1 is a multiple of 24.
p^2 - 1 can be written as (p-1)(p+1).
Since p is a prime greater than 3, it is odd, so both factors above are even. Each is therefore divisible by 2.
Further, they are consecutive even numbers, so exactly one of them is divisible by 4. Their product is therefore divisible by 2 x 4 = 8, which gives us the factors 2 x 2 x 2.
Next, p-1, p and p+1 are three consecutive numbers, so exactly one of them is divisible by 3. We know p is not, since it is a prime greater than 3, so it must be p-1 or p+1. This gives us 3 as a factor.
Since 8 and 3 have no common factor, (p-1)(p+1) is divisible by 2 x 2 x 2 x 3 = 24 for every prime p > 3.
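As a sanity check on the argument above (not a replacement for the proof), the claim can be tested by brute force for small primes. The sketch below is in C; the function names is_prime and count_counterexamples are made up for illustration, and the limit is assumed to be small enough that p * p fits in an unsigned int.

```c
#include <stdbool.h>

/* Trial-division primality test; fine for small n. */
static bool is_prime(unsigned int n)
{
    if (n < 2)
        return false;
    for (unsigned int d = 2; d * d <= n; d++)
        if (n % d == 0)
            return false;
    return true;
}

/* Counts primes p with 3 < p <= limit for which p*p - 1 is NOT a
   multiple of 24. Assumes limit is small enough that p*p does not
   overflow an unsigned int. */
unsigned int count_counterexamples(unsigned int limit)
{
    unsigned int bad = 0;
    for (unsigned int p = 5; p <= limit; p++)
        if (is_prime(p) && (p * p - 1) % 24 != 0)
            bad++;
    return bad;
}
```

If the proof is right, count_counterexamples should return 0 for any limit within that range.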

#codingexercise
Compute power(x, n), that is x raised to the power n, with logarithmic time complexity and constant extra storage.
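int power(int x, unsigned int n)
{
    int result = 1;
    while (n > 0)
    {
        if (n & 1)            // lowest bit of n is set: fold the current power of x into the result
            result = result * x;
        n = n >> 1;           // reduce n by half
        if (n > 0)            // square only when another pass follows, to avoid a needless overflow
            x = x * x;
    }
    return result;
}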



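This can be made recursive as well, but it won't be as storage efficient: the recursion needs a stack frame for each halving of n, so it uses O(log n) space instead of constant space.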
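For comparison, a recursive version might look like the sketch below. The name power_recursive is made up for illustration. Each call halves n, so the recursion depth grows with the number of bits in n, and every level keeps its own frame on the stack.

```c
/* Recursive exponentiation by squaring (illustrative sketch).
   Recursion depth is about log2(n) + 1 calls. */
int power_recursive(int x, unsigned int n)
{
    if (n == 0)
        return 1;                          /* x^0 = 1 */
    int half = power_recursive(x, n / 2);  /* x^(n/2), rounded down */
    if (n & 1)
        return half * half * x;            /* odd n: one extra factor of x */
    return half * half;                    /* even n */
}
```

For example, power_recursive(2, 10) returns 1024 after recursing through n = 10, 5, 2, 1, 0.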
