Thursday, July 17, 2014

We mentioned that a random walk is a special case of a Markov chain. To understand Markov chains, we take the example of a mouse in a cage with two cells - the first has fresh cheese and the second has stinky cheese. If at time n the mouse is in cell 1, then at time n+1 the mouse is in either cell 1 or cell 2. Let us say the probability of moving from cell 1 to cell 2 is alpha and the reverse is beta. In this case, alpha is much smaller than beta. The transition diagram then shows the following transitions:
1 -> 1 with probability 1 - alpha
1 -> 2 with probability alpha
2 -> 1 with probability beta
2 -> 2 with probability 1 - beta.
At each step the mouse independently chooses whether to stay or move, so the number of steps until the first move from cell 1 to cell 2 follows a geometric distribution with parameter alpha, whose mean is 1/alpha.
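A quick simulation illustrates this waiting time (a minimal sketch; alpha = 0.1 is an assumed value, not one from the example above):

```python
import random

def first_move_time(alpha, rng):
    """Number of steps until the mouse first leaves cell 1 for cell 2."""
    steps = 1
    while rng.random() >= alpha:  # the mouse stays with probability 1 - alpha
        steps += 1
    return steps

rng = random.Random(42)
alpha = 0.1
trials = [first_move_time(alpha, rng) for _ in range(100_000)]
mean = sum(trials) / len(trials)
print(round(mean, 1))  # should be close to 1 / alpha = 10
```

Averaging over many trials, the empirical mean converges to 1/alpha, as the geometric distribution predicts.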
Thus the moves are described in terms of random variables Xn. Consider a sequence of random variables {Xn : n belongs to T}, where T is a subset of the integers. The Markov property is then defined as follows:
for any n belonging to T, the future process (Xm, m > n) is conditionally independent of the past process (Xm, m < n) given the present state Xn.
In the case of the mouse, the probability of the next move does not depend on where the mouse was at earlier times, i.e. the future is independent of the past given the present.
This dependence of the future only on the present is easy to generalize with random variables instead of deterministic objects. The states are countable and form a state space S, and since the moves are between states, {Xn} is called a Markov chain.
The Markov property applies to entire stretches of the process before and after time n, so (Xn+1, ..., Xn+m) and (X0, ..., Xn-1) are conditionally independent given Xn.
The walk based on a Markov chain is thus dependent on transition probabilities. If the transition probabilities are defined, then the walk is just iterative.
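This iterative walk can be sketched as follows (a minimal Python sketch; the two-state matrix reuses the mouse example with assumed values alpha = 0.1 and beta = 0.4):

```python
import random

def sample_chain(P, start, steps, rng):
    """Iteratively sample a Markov chain: at each step, draw the next
    state from the transition probabilities of the current state."""
    states = [start]
    for _ in range(steps):
        current = states[-1]
        # the row P[current] holds the transition probabilities out of current
        nxt = rng.choices(range(len(P)), weights=P[current])[0]
        states.append(nxt)
    return states

# Two-state mouse chain: state 0 = cell 1, state 1 = cell 2
P = [[0.9, 0.1],   # 1 - alpha, alpha
     [0.4, 0.6]]   # beta, 1 - beta
rng = random.Random(0)
path = sample_chain(P, start=0, steps=10, rng=rng)
print(path)
```

Once the transition matrix is fixed, the entire walk is just this loop repeated; no other state is needed, which is exactly the Markov property.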
The joint probability distribution can be expressed in terms of two functions -
one for the initial distribution, P(X0 = i), for each i in the state space S
and the other for the transitions pi,j(n, n+1) = P(Xn+1 = j | Xn = i), where i, j belong to S and n >= 0.
We can generalize the transition from one step to multiple steps with pi,j(m, n) = P(Xn = j | Xm = i).
We call a Markov chain time-homogeneous when the single-step transition probability does not depend on n.
In the case of the mouse, the probability of a move is the same no matter when it is made. The single-step transition matrix, denoted by just P, is the same at every step, and the multi-step transition is P(m, n) = P * P * ... * P, multiplied n - m times, i.e. P^(n-m).
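The matrix-power formula can be checked numerically (a sketch with plain lists; alpha = 0.1 and beta = 0.4 are assumed values for the mouse chain):

```python
def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def n_step(P, n):
    """n-step transition matrix of a time-homogeneous chain: P multiplied n times."""
    result = [[float(i == j) for j in range(len(P))] for i in range(len(P))]  # identity
    for _ in range(n):
        result = mat_mul(result, P)
    return result

P = [[0.9, 0.1],   # 1 - alpha, alpha
     [0.4, 0.6]]   # beta, 1 - beta
P5 = n_step(P, 5)
# Each row still sums to 1, and the rows approach the stationary
# distribution (beta/(alpha+beta), alpha/(alpha+beta)) = (0.8, 0.2)
print([round(x, 3) for x in P5[0]])
```

For this two-state chain the entry P5[0][0] equals beta/(alpha+beta) + alpha/(alpha+beta) * (1 - alpha - beta)^5 = 0.80625, so the five-step matrix is already close to stationary.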
In a random walk, we add a few more properties we discussed earlier.
Now let us consider a random walk on a bipartite graph starting at an unclassified term and arriving at a semantically equivalent domain term. The walker starts from any node and then moves to another node with transition probability Pi,j.
Pi,j is the normalized probability of the co-occurrence counts of the term and the corresponding context. The transition probability between any two nodes i, j is defined as Pi,j = Wi,j / Sum(Wi,k) over all k.
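This normalization is straightforward to sketch (the co-occurrence counts and node names below are hypothetical, purely for illustration):

```python
def transition_probs(W):
    """Normalize co-occurrence counts W[i][j] into transition
    probabilities P[i][j] = W[i][j] / sum over k of W[i][k]."""
    P = {}
    for i, neighbors in W.items():
        total = sum(neighbors.values())
        P[i] = {j: w / total for j, w in neighbors.items()}
    return P

# Hypothetical co-occurrence counts: terms connect only through contexts
W = {
    "unclassified_term": {"ctx_a": 3, "ctx_b": 1},
    "ctx_a": {"unclassified_term": 3, "domain_term": 6},
    "ctx_b": {"unclassified_term": 1, "domain_term": 2},
    "domain_term": {"ctx_a": 6, "ctx_b": 2},
}
P = transition_probs(W)
print(P["unclassified_term"])  # {'ctx_a': 0.75, 'ctx_b': 0.25}
```

Each outgoing row sums to 1, so the normalized counts form a valid transition distribution for the walker.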
The terms are either unclassified or domain and cannot be connected with each other except through context.
Each walk uses the transition probabilities for the move.
Each walk starts at some unclassified term and ends at a domain term or exceeds the maximum number of moves without finishing.
Several walks, up to say K of them, can be initiated.
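Putting the pieces together, the K walks can be sketched as below (a minimal sketch; the graph, node names, and the choices of max_moves = 10 and K = 1000 are assumptions for illustration):

```python
import random

def walk(P, start, domain_terms, max_moves, rng):
    """One walk: move by transition probabilities until a domain term
    is reached, or give up after max_moves (returning None)."""
    node = start
    for _ in range(max_moves):
        neighbors = list(P[node])
        node = rng.choices(neighbors, weights=[P[node][j] for j in neighbors])[0]
        if node in domain_terms:
            return node
    return None

def run_walks(P, start, domain_terms, max_moves=10, K=1000, rng=None):
    """Initiate up to K walks and count which domain terms they reach."""
    rng = rng or random.Random(0)
    counts = {}
    for _ in range(K):
        end = walk(P, start, domain_terms, max_moves, rng)
        if end is not None:
            counts[end] = counts.get(end, 0) + 1
    return counts

# Hypothetical bipartite graph: terms connect only through contexts
P = {
    "unclassified_term": {"ctx_a": 0.75, "ctx_b": 0.25},
    "ctx_a": {"unclassified_term": 1/3, "domain_term": 2/3},
    "ctx_b": {"unclassified_term": 1/3, "domain_term": 2/3},
    "domain_term": {},
}
counts = run_walks(P, "unclassified_term", {"domain_term"})
print(counts)
```

The domain term reached most often across the K walks can then be taken as the closest semantic match for the starting unclassified term.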
