Thursday, September 21, 2017

We continue to review the Stanford slides that introduce Natural Language Processing via vector semantics. We said that vector representations are useful and open up new possibilities, and we saw that a simple lookup such as a thesaurus does not help.
The Stanford NLP slides describe four kinds of vector models:
A sparse vector representation, where a word is represented in terms of its co-occurrences with other words, with a weight attached to each co-occurrence. The weight is usually based on mutual information, typically positive pointwise mutual information (PPMI); a small sketch of this appears after the list.
A dense vector representation, which takes one of the following forms:
A representation in which the word-context co-occurrence matrix is reduced to a small number of dense dimensions, typically with singular value decomposition, an approach referred to as latent semantic analysis.
Neural-network-based models, where the embeddings are learned either by predicting a word from its surrounding words or by predicting the surrounding words from the current word, as in word2vec's CBOW and skip-gram variants.
A set of word clusters produced by Brown clustering.
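As a rough illustration of the first model, here is a minimal Python sketch that computes PPMI weights from a toy co-occurrence count matrix. The words, contexts, and counts below are made up purely for illustration and are not from the slides.

import numpy as np

# Toy co-occurrence counts (rows = target words, columns = context words);
# the numbers are invented for this example only.
words = ["apricot", "pineapple", "digital", "information"]
contexts = ["computer", "data", "pinch", "result", "sugar"]
counts = np.array([
    [0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1],
    [2, 1, 0, 1, 0],
    [1, 6, 0, 4, 0],
], dtype=float)

total = counts.sum()
p_wc = counts / total                              # joint probability P(w, c)
p_w = counts.sum(axis=1, keepdims=True) / total    # marginal P(w)
p_c = counts.sum(axis=0, keepdims=True) / total    # marginal P(c)

with np.errstate(divide="ignore"):
    pmi = np.log2(p_wc / (p_w * p_c))              # pointwise mutual information
ppmi = np.maximum(pmi, 0)                          # clip negatives (and -inf from zero counts) to 0

print(np.round(ppmi, 2))

Each row of the resulting matrix is the sparse PPMI-weighted vector for one word; only the positive associations carry weight.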
#codingexercise
Find the minimum number of perfect squares whose sum equals a given number n.
We handle a few base cases directly, say up to n = 3, where the answer is n itself.
For any larger number, we start with the number itself as an upper bound, since every number can be written as a sum of unit squares.
Next, for each i whose square does not exceed the number, we recursively compute the minimum count for the number minus i squared and add one for the square just used, keeping the smallest result found. All the results are memoized for easy lookup, so the answer for n ends up in its table entry. A sketch of this appears below.
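A minimal Python sketch of this memoized recursion; the function name and structure are my own choice:

import functools

def min_squares(n: int) -> int:
    """Minimum number of perfect squares that sum to n."""
    @functools.lru_cache(maxsize=None)
    def solve(m: int) -> int:
        if m <= 3:              # base cases: 0, 1, 2, 3 need m unit squares
            return m
        best = m                # upper bound: m ones
        i = 1
        while i * i <= m:       # try every square not exceeding m
            best = min(best, 1 + solve(m - i * i))
            i += 1
        return best
    return solve(n)

print(min_squares(6))    # 3  (4 + 1 + 1)
print(min_squares(12))   # 3  (4 + 4 + 4)
print(min_squares(13))   # 2  (9 + 4)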
