Cluster computing

Thursday, August 7, 2014

Today we will continue our discussion on counting paths in random walks. In the previous posts, we discussed the number of paths from (0,0) to (n,y) where y >= 0 and remain above the horizontal axis. We also counted the number of paths from (0,0) to (n,0) that remain above the horizontal axis. as well as the number of paths that start from (0, 0) and have length n and remain above the horizontal axis.
With the paths of length n that start at (0, 0) and remain above the horizontal axis, we showed that they represent the shaded region in the lower right half triangle above the horizontal axis. By reflection, the same holds true for below the horizontal axis. However, what we went ahead and in proving but did not conjecture apriori is that the paths of Ilength n starting at (0,0) that remain above the horizontal axis is the the sum of the paths of length n starting at (0, 0) and ending at (n, 0) both above and below the horizontal axis. When the shaded area is represented in the graph, we can see by symmetry and reflection how this is self-evident.
We will next look at probability calculations.
Since all paths of certain length have the same probability, we can translate the path counting formulae into corresponding probability formulae:
The probability distribution after n steps is as follows:
P0(Sn = y) = (n Choose (n+y)/2) times (2 ^ -n)
Pn (Sn = y | Sm = x) = ( n-m Choose ((n - m + y - x) / 2) times 2 ^ (-n+m)
These equations are written based on the definitions of the probability as a ratio of paths satisfying an event to the total possible paths as in a grid. We have introduced this in earlier posts. In particular with the above two equations if we take n as even, then
u (n) = P0 (Sn = 0) = (n Choose n/2) times 2 ^ (-n).
The proof is described as below :
For the first equation we take the fraction of the number of paths from (0,0) to (n, y ) to 2^n
For the second equation we take the fraction of the number of paths from (m, x) to (n, y) to 2 ^ ( n -m).

The number of paths between two points (m, x) and (n, y) is given by (n-m) Choose (n-m+y-x )/ 2 ^ (n-m)
Substituting m = 0 and x = 0 we get equation 1
The two equations are thus proved.
Courtesy : Konstantopoulous mcrw.pdf

Wednesday, August 6, 2014

In this post, we continue our discussion on counting paths in single dimension simple random walks. Tonight we look at Catalan numbers. The number of paths from (0, 0) to (n, 0) that remain >= 0 is given by 1 / (n+1) times (n + 1) Choose (n/2)
How do we prove this? We take the formula we described in the previous post for finding the number of paths from (0,0) to (n, y), where y >= 0, that remain >= 0. And in this formula we substitute y=0
That number of paths was given by n Choose (n + Y) / 2 - n Choose (( n + y ) / 2 + 1)
And substituting y = 0 we get
N Choose n/2 - n Choose (n/2 - 1)
We can expand the definition of the combination n Choose r = n! / r! (N-r)!
And we will reduce the result to 1/(n+1) times (n+1) Choose n/2
We look at another corollary as well. The number of paths that start from (0,0), have length n, and remain non-negative is given by :

(n Choose ceil(n/2)) where ceil(x) denotes the smallest integer that is just greater than or equal to x.

This is different from the other corollaries we have mentioned so far. This one talks about paths that have length n. We show that their number is (n Choose ceil(n/2)) this way.

We take the number of paths we computed earlier in this post for paths from (0,0) to (n,y) that don’t touch the horizontal axis as = n Choose (n+y)/2 - n Choose ((n+y)/2 + 1)

and then we sum over all y >= 0.

If n is even (2m) then we need to take y as even (2z) otherwise the formula above is equal to zero, and we have the number of desired paths

Sum z >= 0 ( 2m Choose (m+z) - 2m Choose (m+z+1))

which we can expand the series by listing each z >= 0 and summing it.

We will see that adjacent terms cancel out and we will get the result as the first term = 2m Choose m which is n Choose n/2

If n is odd (2m +1) then we need to take y as odd (2z + 1) otherwise the formula is again equal to zero, and we have

the number of paths from (0,0) to (2m+1, any) where paths remain above horizontal) =

Sum ( z >= 0 [ (2m + 1) Choose (m+z+1) - (2m + 1) Choose (m+z+2)]

and again we expand this series for each term z >= 0 , we see that the adjacent terms cancel out leaving us with only the first term as 2m+1 Choose m+1 which is

(n Choose ceil(n/2))

Tuesday, August 5, 2014

Today we will continue our discussion on path counting in random walks.
The number of paths from (m,x) to (n,y) that do not touch level z < min(x,y) is given by :
(n-m) Choose 1/2 (n-m+y-x) - (n-m) Choose (1/2(n-m-y-x)-z)
We prove it this way:
Here the relative positions of x and y with respect to z plays any role as compared to the equation we derived in the earlier post where we found the number of paths for level z = 0 or the horizontal axis. So we can take the relative distance the other way and state that
the number of paths between (m,x) to (n,y) that do not touch level z =
the number of paths between (m, x+z) to (n, y+z) that do not touch level 0 or horizontal axis.
which is equal to
(n-m) Choose 1/2(n-m+ y + z -x -z ) - (n-m) Choose 1/2 (n - m - y -z -x -z )
resulting in
(n-m) Choose 1/2 (n-m+y-x) - (n-m) Choose (1/2(n-m-y-x)-z)
Another corollary is that the number of paths from (0,0) to (n,y) where y >=0 , that remain >= 0 is given by
n Choose (n+y)/2 - n Choose ((n+y)/2 + 1)
We prove it this way:
The target number of paths is the same as the number of paths that do not touch level -1 and are above it
which is the same as
the number of paths between (0,1) and (n, y+1) that remain > 0
which we can write as
n Choose 1/2(n+y) - n Choose 1/2(n-y-1-1)
=n Choose 1/2(n+y) - n Choose (1/2(n-y) - 1)
Now we use the equation we derived at the beginning of this post for paths that do not touch level z < min(x,y) = -1

and rewrite the above as
n Choose 1/2(n+y) - n Choose (1/2 (n-y-1-1) - (-1))

In the previous post, we discussed a path counting technique in a simple symmetric random walk in one dimension. We said that we can compute the probability of an event described by elements in a sample space by counting paths that make up the event and representing it as a fraction of all the possible paths. As an example, we took a grid with unit distance squares and the path as a sequence of + or -1. Taking the initial position S0 = 0 and a choice of other positions, we have an event. We then draw all the paths and take the ratio with 2^n paths because there are that many elements in the space.
We next looked at the number of paths that start from a specific position and end at another specific position. We represented it as a combination in terms of x and y axis displacement in the sample space
In today's post we continue the discussion on paths that start and end at specific points (m, x) and (n,y) and and do not touch the horizontal axis at all. The number of such paths is given by
(n-m) Choose 1/2(n-m+y-x) - (n-m) Choose 1/2 (n-m-y-x).
Such paths cover a shaded region that lie above the horizontal axis in the sample space grid/graph. Consider a path that does touch the horizontal axis From the point where it touches the horizontal axis, let us consider a path that we get based on reflection of this path. If we know the reflected path, we can find the original path. So we can count the number of reflected paths. Each reflected path starts from (m,x) but ends at (n,-y) so we substitute -y in place of y in the general case as (n-m)Choose 1/2(n-m-y-x).
Now we can find the number of paths that don't touch zero as the difference between the total number of paths for all x, y > 0 and the count of all reflected paths.
This gives us the formula (n-m) Choose 1/2(n-m+y-x) - (n-m) Choose 1/2 (n-m-y-x)
For a special case where we look at the paths from {1,1} to (n,y) where any y > 0 and the path doesn't touch the horizontal axis. Here we can show that the number of such paths is y Choose n times the total number of paths between {0,0} and {n,y} by substituting m =1 x = 1 in the formula above.

Monday, August 4, 2014

In this post, we will describe path counting using simple symmetric random walk in dimension 1. In this random walk there is equal probability to go forward or backward in one dimension. When the walk starts from zero, we can describe it as sum of 1 to n i.i.d random variables each with a probability of 1/2. This is a vector with values likely between -1,1 taken n times and each are equally likely. Since there are 2^n elements in {0,1} ^ n, we have the probability distribution as P(E1 =e1, E2=e2, ... En=en) = 1/2 * 1/2 * 1/2 ... = as 1/(2^n)
We now say that the Events A that depend only on the first n i.i.ds are subsets of this space.
So if there are #A elements in this space, we have P(A) = #A/2^n.
Therefore, if we can count the number of elements of A, then we can compute its probability.
First, we assign to each initial position and to each sequence of n signs, a path of length n. We can construct a grid and trace this path as cartesian coordinates.
Now let us take an example where we have an Event on this graph/grid constituting of positions and we want to compute its probability.
Here is such an example : A = {S2 >= 0, S4 = -1, S6 < 0, S7 < 0}.
The paths comprising this event are shown below. We can count that there are six paths in A.
So P(A) = 6/2^6 = 6//128 = 3/64 which is approximately 0.047.
We have thus found probabilities by counting path.
Next we look at paths that start and end at specific points. The number of paths that start at one specific point (x,y) and end at another specific point (m,n) is given by (m - x) Choose ((m -x + n - y)/2).
Additionally, If we take the start as (0,0) and finish at (n,i) then if (n+i) is not an even number, then there is no path that starts from (0,0) and ends at (n, i). Hence the totat number of paths = n Choose (n+i)/2 which is equal to zero.

Sunday, August 3, 2014

In the previous post, we did not mention how we use the law of large numbers. We will complete that today.
We claimed that pi(y) = g(y) - g(y-1) and this satisfies the balance equation. We then attempted to say the Markov chain describing the supply and demand at an electronics store can be expressed as
Mn+1 = 0 or max Zj where 1 <= j <= n-1 where Zj was the sum of all differences between demand and supply till date.
and this we approximated to Mn+1 = 0 or (Mn + ksi-0) because we can replace the random letter variables ksi without affecting the probability.
P(Mn+1) = P(0 or (Mn + ksi-0)
= Sigma x>= 0 P(Mn = x) pxy
Since the n-dependent terms are bounded, we may take the limit of both sides as n-> infinity.
and lim n-> infinity for P(Mn = y) is our definition for pi(y)
we can write the earlier equation as pi(y) = Sigma x >= 0 pi(x) pxy. This shows that pi satisfies the balance equations however we need to prove further that Sigma-y pi(y) = 1
That is we have a weighted distribution of the probabilities where the sum is normalized to unity.
Here we say g(y) is not identically equal to 1 when the pi adds up to unity.
we prove this by using the assumption E.ksi-n = -mu < 0
P(lim n->infinity Zn/n = -mu ) = 1 from the law of large numbers.
This implies that the P(lim n->infinity Zn = -infinity) = 1
But we know that we wrote down P(lim n-> infinity Mn = 0 or max {z1, z2 ...} < infinity ) = 1
This gives g(y) = P(0 or max {z1, z2, ...} > y) is not identically equal to 1.

I'm going to take a short break today to discuss storage or queuing application of a Markov chain. A storage facility has Xn electronic items at time n. A number An of the new components are added to the stock, while a number of them are sold to the market. If the demand is for Dn components , it is met instantaneously with what is available. If all of Dn cannot be satisfied, then it is satisfied with the stock reaching level zero. at time n+1. Thus the store has items Xn + An - Dn components in the stock or zero at time n+1 .
we write this Xn+1 = (Xn + An - Dn) or 0.
We make the following assumptions:
1) The pairs of random variables (An, Dn), n >= 0 are independent and identically distributed random variables (iid). The supply and demand at any time n is independent of the previous states and come from the same distribution and
2) E(An - Dn) = -mu < 0 . Thus the demand is larger than the supply per unit time on average which is indicated by the negative sign.
Now, Xn, n >= 0 is a Markov chain and we will show this Markov chain is positive recurrent.
Let us take xi (greek-letter) = An - Dn
Then the Markov chain is a simple recursion Xn+1 = Xn + xi or 0 for all n >= 0
Let us try solving the recursion.
In step 1, X1 = (X0 + xi-0) or 0
In step 2, X2 = (X1 + xi-1) or 0 = X0 + xi-0 + xi-1 or xi-1 or 0
In step n, Xn = (X0 + xi-0 + ...xi-n-1) or (xi-1 + ... + xi-n-1) or ... or xi-n-1 or 0
The probability that Px(Xn > y) can now be written in terms of step n. Thus the event whose probability we want to compute is a function of (xi-0, ..., xi-n-1) . Moreover these xi-0, ... xi-n-1 are i.i.ds so we can move them around and even reverse it without altering the distribution.
Then for abbreviations, we use
Zn = xi-0 + ... + xi-n-1 and
Mn = Zn or Zn-1 or ... Z1 or 0
and g = Px(Xn>y)
P0(Xn > y ) can then be written as P( Mn > y ) and since Mn+1 is >= Mn, we know that
P0(Xn > y) is less than and tending to left hand side than P0 ( X n+1 > y ) for all n. At infinity the limit of this bounded and increasing sequence exists.
We can take this difference as pi(y) = g(y) - g(y-1) and we claim that it satisfies the balance equations.
But Mn+1 = 0 or Max Zj where j is in 1 to n+1
which we can say is = 0 or (Mn + xi-0) . We make this approximation based on the law of large numbers.
Hence for y >= 0
P(Mn+1 = y ) = Sigma- x>=0 P(Mn = x) .pxy. Since these terms are bounded, we may take the limit of both sides as n-> infinity, we find pi(y) = Sum x>=0 pi(x)pxy. With this definition, where the weights add up to 1 we can say that pi satisfies the balance equations and g(y) does not become 1 at infinity.
Courtesy : Kostantopoulous