Today we continue our discussion on natural language processing with a particular word ontology called FrameNet. We see how it differs from other lexical resources such as WordNet, PropBank, and AMR. While AMR introduced a graphical model, it did not have the rich semantic grouping that FrameNet has. Now let us look at a specific example of a model that makes use of terms in a graphical data structure, to see what FrameNet might bring to it. To do that, we consider how a classifier could be built that creates lexical units for FrameNet frames and maps terms or senses to their corresponding frames.
Let us first look at one such model that does POS tagging with the help of graphs. It relies on the assumption that similar words have similar part-of-speech tags. A similarity graph is constructed and labels are propagated across it. Both labeled and unlabeled data are represented by vertices in the graph. Graph edges link vertices that are likely to have the same label, and edge weights indicate how strongly the labels agree. Usually the number of labels is small, and the number of composite labels is smaller still. The number of atomic labels may be small, but there can be a much larger number of ways to combine them; one way to combine them is to structure them as a tree.
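As a concrete illustration of constructing such a similarity graph, here is a minimal sketch. The similarity measure is an assumption on my part: it uses cosine similarity over each word's context-word counts, with immediate neighbors as the context window; the function name and threshold are hypothetical choices, not part of any particular published model.

```python
from collections import Counter
from itertools import combinations
import math

def build_similarity_graph(sentences, threshold=0.1):
    """Build a word-similarity graph: vertices are words, and weighted
    edges connect words whose context distributions are similar."""
    # Collect, for each word, a count of the words appearing next to it.
    profiles = {}
    for sent in sentences:
        for i, word in enumerate(sent):
            neighbors = sent[max(0, i - 1):i] + sent[i + 1:i + 2]
            profiles.setdefault(word, Counter()).update(neighbors)

    def cosine(a, b):
        dot = sum(a[k] * b[k] for k in a if k in b)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    # Keep an edge only when the similarity clears the cutoff;
    # the weight records how strongly the two words' contexts agree.
    edges = {}
    for u, v in combinations(profiles, 2):
        w = cosine(profiles[u], profiles[v])
        if w > threshold:
            edges[(u, v)] = w
    return edges
```

Words that occur in the same contexts, such as "cat" and "dog" in "the cat sat" and "the dog sat", end up linked by a high-weight edge, which is exactly where the similar-words-similar-tags assumption does its work.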
A classifier is trained on prelabeled data and then used to decode the labels for the target domain. The classifier, which here can also be called a tagger, can be based on any suitable model. For example, it can be based on word matrices; however, computations on large matrices do not scale well, especially if inversions are involved. Conditional random fields, on the other hand, are better suited to this purpose. They scale well because they use only conditional probability distributions. Standard inference and learning techniques, as well as standard graph-propagation techniques, are also scalable, and the building blocks of CRF inference contribute to this efficiency.
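To make the train-then-decode step concrete, here is a deliberately simple stand-in for the tagger: a most-frequent-tag baseline rather than a CRF (a real CRF needs feature functions and dynamic-programming inference). The function names and the "NN" fallback are my own illustrative choices; the out-of-vocabulary gap this baseline leaves is exactly what graph propagation is meant to fill.

```python
from collections import Counter, defaultdict

def train_tagger(tagged_sentences):
    """Learn, for each word seen in the labeled data, its most frequent tag."""
    counts = defaultdict(Counter)
    for sent in tagged_sentences:
        for word, tag in sent:
            counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag(model, sentence, default="NN"):
    """Decode: assign each word its learned tag, falling back to a
    default for words never seen in the labeled source domain."""
    return [(w, model.get(w, default)) for w in sentence]
```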
The label propagation works like this:
1. Initialize the labels at all nodes in the network; for a given node, the starting label is its initialization label.
2. Set the iteration counter to 1.
3. Take the nodes in a random order and put them in a list.
4. For each node in that list, use a function to assign it a label based on the labels of its adjacent nodes from the previous iteration. For example, this could be the most frequently occurring label among the neighbors.
5. If every node satisfies the chosen function, stop the algorithm; otherwise repeat from step 3 with the next iteration.
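The steps above can be sketched as follows, using majority vote among neighbors as the chosen function. The function name and the adjacency-list input format are my own; note that, as the description says, each iteration reads labels from the previous iteration's assignment.

```python
import random
from collections import Counter

def propagate_labels(adjacency, seed_labels, max_iters=100, rng=None):
    """Majority-vote label propagation over an undirected graph.
    adjacency: node -> list of neighbor nodes
    seed_labels: node -> initial label (every node starts with one)."""
    rng = rng or random.Random(0)
    labels = dict(seed_labels)        # step 1: initialize every node
    for _ in range(max_iters):        # steps 2 and 5: iterate until stable
        nodes = list(adjacency)
        rng.shuffle(nodes)            # step 3: visit nodes in random order
        previous = dict(labels)       # snapshot of the previous iteration
        changed = False
        for node in nodes:
            # step 4: adopt the most frequent label among the neighbors,
            # reading from the previous iteration's assignments
            votes = Counter(previous[n] for n in adjacency[node])
            best = votes.most_common(1)[0][0]
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:               # every node satisfies the majority rule
            return labels
    return labels
```

On a small triangle graph where two nodes start with one label and the third starts with another, the minority node adopts the majority label after one iteration and the algorithm stops.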
#trivia question
What is a keep-alive header?
The Keep-Alive header advertises connection-use policies. It is a hop-by-hop header that provides information about a persistent connection, including a timeout that determines how long the connection can stay idle before it is closed.
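For illustration, a server response using it might look like the following; the timeout (seconds of allowed idle time) and max (remaining requests on this connection) values here are arbitrary example numbers.

```
HTTP/1.1 200 OK
Connection: Keep-Alive
Keep-Alive: timeout=5, max=100
```

Being hop-by-hop, this header applies only to the single connection it travels on and is not forwarded by proxies.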