Wednesday, October 26, 2016

Today we continue our discussion of natural language processing with a particular lexical ontology called FrameNet, and we look at how it differs from other ontologies. FrameNet differs from WordNet in its lexical information and is, in fact, complementary in many ways. FrameNet has far fewer nouns, verbs, adjectives, and adverbs than WordNet, because FrameNet has very little to say about common nouns, whereas WordNet has noun hierarchies that encompass them. WordNet has little syntagmatic information, whereas FrameNet has a lot.
WordNet maintains separate hierarchies for nouns, verbs, adjectives, and adverbs; each frame, on the other hand, can cover words of any part of speech. WordNet has hypernym and hyponym relations between synsets. Lexical units come close to synsets, but there are no relations per se between lexical units; there are, however, several types of frame relations. There are also content differences between the two: FrameNet has annotated examples showing the syntax and semantics of each lexical unit (LU), and it describes the frame elements, or roles, of each frame. FrameNet frame hierarchies provide better generalizations than WordNet synset hierarchies.
FrameNet also differs from PropBank (the Proposition Bank, discussed by Palmer et al., 2005). PropBank began by labeling verbs and their arguments, using Penn POS tags and Penn Treebank parse structures, and later added nouns along with roles carried over from associated verbs. PropBank supports efficient semantic role labeling, but it has no equivalent of FrameNet's frames. It has two levels of role labels: those that are completely general and those that are specific to a lexical unit. Mappings can be drawn between PropBank's verb-specific role labels and FrameNet's frame element (FE) names; for example, FrameNet assigns names such as Communicator, Evaluee, and Reason to the corresponding PropBank role labels.
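To make that role-label correspondence concrete, here is a small sketch in Python. The alignment below is illustrative for a verb of judgment such as "blame"; the exact argument numbering and frame element inventory for any given verb are assumptions, not the official PropBank or FrameNet entries.

```python
# Hypothetical alignment for a judgment verb such as "blame".
# PropBank numbers arguments per verb; FrameNet names roles per frame.
propbank_to_framenet = {
    "Arg0": "Communicator",  # the one who blames
    "Arg1": "Evaluee",       # the one being blamed
    "Arg2": "Reason",        # what they are blamed for
}

def to_frame_elements(pb_args):
    """Rename PropBank argument labels to FrameNet frame element names."""
    return {propbank_to_framenet.get(k, k): v for k, v in pb_args.items()}

print(to_frame_elements({"Arg0": "the critics", "Arg1": "the director"}))
# {'Communicator': 'the critics', 'Evaluee': 'the director'}
```

Labels that have no mapping are passed through unchanged, so the function degrades gracefully on unseen arguments.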
There have been other efforts to represent semantic information, such as Abstract Meaning Representation (Banarescu et al., LAW 2013). AMR is a graph-based representation of the lexical concepts denoted by an English sentence and the typed relations between those concepts. It integrates several aspects of lexical and relational meaning, abstracting away from grammatical details, in a single structure. This structure specifically helps with rapid corpus annotation and data-driven NLP. However, the lexical semantics in AMR are much shallower than in FrameNet.
Courtesy: Colin Baker, FCSI
In general, it seems helpful to keep a hash from a term to its corresponding ontology identifier.
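As a minimal sketch of this idea, a plain dictionary can serve as such a hash. The identifier formats below are made up for illustration, not the official ones used by the respective projects.

```python
# Minimal sketch: map surface terms to ontology identifiers.
# The identifier strings below are illustrative, not official.
term_to_ontology_id = {
    "judge": "framenet:Judgment",   # frame evoked by the verb "judge"
    "dog": "wordnet:dog.n.01",      # WordNet-style synset name
    "give": "propbank:give.01",     # PropBank-style roleset id
}

def lookup(term):
    """Return the ontology identifier for a term, or None if unknown."""
    return term_to_ontology_id.get(term.lower())

print(lookup("Dog"))  # wordnet:dog.n.01
```

Normalizing with `lower()` keeps the lookup robust to surface capitalization; a production system would also want lemmatization before the lookup.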
#codingexercise
Insert a node into a sorted circular linked list when we are given a random node from the list rather than the head.
// start is a random node in the sorted circular list; target is the node to insert
void InsertCircular(ref Node start, Node target)
{
    if (start == null)
    {
        // empty list: target becomes a one-node circle
        start = target;
        target.next = target;
        return;
    }
    Node curr = start;
    while (true)
    {
        Node next = curr.next;
        // Case 1: target fits between two adjacent nodes.
        if (curr.data <= target.data && target.data <= next.data)
            break;
        // Case 2: we are at the seam (largest -> smallest) and target
        // is a new maximum or a new minimum.
        if (curr.data > next.data &&
            (target.data >= curr.data || target.data <= next.data))
            break;
        curr = next;
        // Case 3: we went all the way around (all values equal);
        // insert anywhere.
        if (curr == start)
            break;
    }
    target.next = curr.next;
    curr.next = target;
}
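The same insertion logic can be sketched and exercised in Python, assuming a minimal `Node` class with `data` and `next` fields:

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def insert_circular(start, target):
    """Insert target into a sorted circular list, given any node of the list."""
    if start is None:
        target.next = target  # one-node circle
        return target
    curr = start
    while True:
        nxt = curr.next
        if curr.data <= target.data <= nxt.data:
            break             # fits between curr and nxt
        if curr.data > nxt.data and (target.data >= curr.data
                                     or target.data <= nxt.data):
            break             # at the seam: new maximum or new minimum
        curr = nxt
        if curr is start:
            break             # all values equal; insert anywhere
    target.next = curr.next
    curr.next = target
    return start

# Build the circle 1 -> 3 -> 4 -> 1, then insert 2 starting from the node
# holding 3 (i.e., not the head).
a, b, c = Node(1), Node(3), Node(4)
a.next, b.next, c.next = b, c, a
insert_circular(b, Node(2))
vals, n = [], a
for _ in range(4):
    vals.append(n.data)
    n = n.next
print(vals)  # [1, 2, 3, 4]
```

Starting the walk from a non-head node still works because the seam test (case 2) detects the wrap-around point wherever the traversal begins.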
// Returns the index of the row with the most 1s in a binary matrix whose
// rows are sorted (all 0s before all 1s), or -1 if the matrix has no 1s.
int RowWithMaximum1sInSortedMatrix(int[,] m)
{
    int max = 0;
    int index = -1;
    for (int i = 0; i < m.GetLength(0); i++)
    {
        // Binary search for the first 1 in row i; -1 if the row has none.
        int col = FindFirst(m, i, 0, m.GetLength(1) - 1, 1);
        if (col != -1 && m.GetLength(1) - col > max)
        {
            max = m.GetLength(1) - col;
            index = i;
        }
    }
    return index;
}
The FindFirst method is a binary search that returns the index of the first occurrence of a value in a sorted row, or -1 if the value is absent.
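That binary search, together with the row scan, can be sketched in Python as follows (assuming each row is sorted with all 0s before all 1s):

```python
def find_first(row, lo, hi, key):
    """Binary search for the first index of key in a sorted row; -1 if absent."""
    result = -1
    while lo <= hi:
        mid = (lo + hi) // 2
        if row[mid] >= key:
            if row[mid] == key:
                result = mid      # candidate; keep looking to the left
            hi = mid - 1
        else:
            lo = mid + 1
    return result

def row_with_maximum_1s(matrix):
    """Index of the row containing the most 1s, or -1 for an all-zero matrix."""
    best_count, best_row = 0, -1
    for i, row in enumerate(matrix):
        col = find_first(row, 0, len(row) - 1, 1)
        if col != -1 and len(row) - col > best_count:
            best_count, best_row = len(row) - col, i
    return best_row

print(row_with_maximum_1s([[0, 0, 1], [0, 1, 1], [0, 0, 0]]))  # 1
```

Because each row's search is O(log n), the whole scan runs in O(m log n) for an m-by-n matrix, versus O(mn) for counting 1s row by row.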
