Sunday, April 27, 2014

Today we look at some more uses of random variables. We have seen so far that random variables can be combined, that they can be independent, and that the expected value gives a good one-number summary of the values a random variable can take. Let us take an interesting application of these ideas in a hiring problem. Say you want to hire an office assistant. We can use indicator random variables here, but let us first describe the problem. To hire an office assistant, you interview candidates one at a time and always want to keep the most qualified person you have seen so far. Actually hiring somebody, however, is expensive: you have to fire the current assistant and pay a large fee to the employment agency that is sending the candidates. You are interested in estimating this cost. This is the hiring problem.
This is written as follows:
int HireAssistant(int[] candidates)
{
    int best = -1;  // score of a least-qualified dummy candidate
    int hired = -1; // candidate currently holding the job
    foreach (var candidate in candidates)
    {
        int score = interview(candidate); // assumed helper returning a numeric rank
        if (score > best)
        {
            best = score;
            hired = candidate;
            hire(candidate); // fire the current assistant and hire this one
        }
    }
    return hired;
}
We now use probability to analyze this problem. To do so, we must make assumptions about the distribution of inputs; we then analyze our algorithm and compute an expected cost or running time.
Since we take a distribution over the inputs, we are averaging the cost over all possible inputs.
We use probabilistic analysis when we can make assumptions about the distribution of inputs, i.e., when we can assume something about the set of all possible inputs, both as a means of designing an efficient algorithm and as a means of gaining insight into a problem like the hiring problem. For the hiring problem, we can assume that the candidates arrive in random order. This means we can compare any two candidates and decide which is better; in fact, we can use this to establish a distinct ranking of the candidates. The quantity we want to estimate is then the expected number of times a new candidate is hired.
An indicator random variable associated with an event A is defined as 1 if the event occurs and 0 otherwise.
Let us determine the expected number of successes in an interview. Our sample space is S = {Success, Failure}, and we define a random variable Y that takes one of the two values Success or Failure with equal probability. We can then define an indicator random variable I for the event Y = Success. The expected number of successes obtained in one interview is simply the expected value of our indicator variable:
E[I] = 1 · Pr[Y = Success] + 0 · Pr[Y = Failure] = 1/2.
In general, the expected value of an indicator random variable for an event A equals Pr[A].
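To connect this back to the hiring problem, here is a minimal simulation sketch (my own illustration, with assumed values for the candidate count and the number of trials): when candidates arrive in random order, the i-th candidate is hired exactly when it is the best of the first i, which happens with probability 1/i, so the expected number of hires is the harmonic number H_n = 1 + 1/2 + ... + 1/n.

using System;

// Sketch: estimate the expected number of hires with candidates in random order.
var rng = new Random();
const int n = 100;
const int trials = 10000;
double totalHires = 0;
for (int t = 0; t < trials; t++)
{
    // Random permutation of candidate ranks 0..n-1 (Fisher-Yates shuffle).
    var rank = new int[n];
    for (int i = 0; i < n; i++) rank[i] = i;
    for (int i = n - 1; i > 0; i--)
    {
        int j = rng.Next(i + 1);
        (rank[i], rank[j]) = (rank[j], rank[i]);
    }
    int best = -1, hires = 0;
    foreach (var r in rank)
        if (r > best) { best = r; hires++; }
    totalHires += hires;
}
double hn = 0;
for (int k = 1; k <= n; k++) hn += 1.0 / k;
Console.WriteLine($"simulated hires: {totalHires / trials:F3}, H_n: {hn:F3}");

The two printed numbers should agree closely, which is what makes indicator random variables so convenient: the expectation of a sum of indicators is just the sum of their probabilities.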

Saturday, April 26, 2014

In today's post we discuss discrete random variables from the textbook we have been referring to. A random variable X is a function from a finite or countably infinite sample space S to the real numbers. It associates a real number with each possible outcome of an experiment, which allows us to work with the probability distribution induced on the resulting set of numbers. Random variables can also be defined on uncountably infinite sample spaces, but here we look only at random variables that are discrete.
For a random variable X and a real number x, we define the event X = x to be {s in S : X(s) = x}; thus
Pr[X = x] = Sum over {s in S : X(s) = x} of Pr[s].
The function f(x) = Pr[X = x] is the probability density function of the random variable X.
Per the definitions of probability, we know that Pr[X = x] >= 0,
and that the sum over all x of Pr[X = x] is equal to 1.
If we take the example of a pair of dice, each with six possible outcomes, and we define a random variable X to be the maximum of the two values showing on the dice, then we have
Pr[X = 3] = 5/36
because there are 36 equally likely outcomes when we take the values in pairs,
and exactly 5 of them assign X the value 3: (1,3), (2,3), (3,3), (3,2), (3,1).
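A brute-force enumeration sketch (not from the text, just a check) confirms the count of 5 favorable outcomes out of 36:

using System;

// Count outcomes of two dice whose maximum equals 3.
int favorable = 0, total = 0;
for (int d1 = 1; d1 <= 6; d1++)
    for (int d2 = 1; d2 <= 6; d2++)
    {
        total++;
        if (Math.Max(d1, d2) == 3) favorable++;
    }
Console.WriteLine($"Pr[X = 3] = {favorable}/{total}"); // prints 5/36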
It is common for several random variables to be defined on the same sample space.
If there are two random variables defined on the same sample space, say X and Y,
then the function
Pr[X = x and Y = y]
is their joint probability distribution.
If we fix one of the values, we can sum over the other:
for a fixed value y, Pr[Y = y] = Sum over all x of Pr[X = x and Y = y],
and the same goes for a fixed value of x, where we sum over y.
We can extend this to conditional probabilities as well. For example,
Pr[X = x | Y = y] = Pr[X = x and Y = y] / Pr[Y = y]
We can say that two random variables X and Y are independent if, for all x and y,
the events X = x and Y = y are independent, which we can express as
Pr[X = x and Y = y] = Pr[X = x] · Pr[Y = y].
The simplest summary of the distribution of a random variable is the average of the values it takes: the expected value E[X] = Sum over all x of x · Pr[X = x].
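For the dice example above, the same enumeration computes E[X] directly as a sketch, summing X(s) · Pr[s] over all 36 equally likely outcomes:

using System;

// E[X] for X = max of two fair dice.
double expectation = 0;
for (int d1 = 1; d1 <= 6; d1++)
    for (int d2 = 1; d2 <= 6; d2++)
        expectation += Math.Max(d1, d2) / 36.0;
Console.WriteLine($"E[X] = {expectation:F4}"); // 161/36, about 4.4722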

Friday, April 25, 2014

Today we look at some counting theory:
Counting theory explains how many options exist without actually enumerating them. This is very helpful not only when counting a set of items is daunting, but also when it is difficult to construct the set itself.
Consider, for example: how many different ways can we arrange n distinct elements?
We review some of the elements of counting theory.
A set of items that we wish to count can sometimes be expressed as a union of disjoint sets or as a Cartesian product of sets.
The rule of sum says that the number of ways to choose an element from one of two disjoint sets is the sum of the cardinalities of the sets.
The rule of product says that the number of ways to choose an ordered pair is the number of ways to choose the first element times the number of ways to choose the second element.
We look at them in detail now.
If A and B are two finite sets with no members in common, then the number of ways to choose an item from one of the sets is |A| + |B|, the sum of the counts of items in the two sets. For example, each position on a license plate may hold either a letter or a digit. Since there are 26 letters and 10 digits, there are 26 + 10 = 36 choices for each position. Note that the rule depends only on the cardinalities and on the sets being disjoint; if the sets overlapped, the common elements would be double-counted.
Using the same sets A and B, the number of ways to choose an ordered pair is |A| · |B|: the number of ways to choose the first element times the number of ways to choose the second. For example, an ice cream shop with 28 flavors and 4 toppings can mix and match to give 28 · 4 = 112 different ice creams.
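Both rules reduce to one line of arithmetic each; a tiny sketch with the counts from the examples above:

using System;

int perPosition = 26 + 10; // rule of sum: letter OR digit, disjoint sets
int iceCreams = 28 * 4;    // rule of product: ordered (flavor, topping) pair
Console.WriteLine($"choices per plate position: {perPosition}, ice creams: {iceCreams}");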

We sometimes need to bound the size of a binomial coefficient. For 1 <= k <= n, we have the lower bound
(n choose k) = n(n-1)···(n-k+1) / (k(k-1)···1) >= (n/k)^k,
since each factor (n-i)/(k-i) is at least n/k when n >= k.
Using the inequality k! >= (k/e)^k, which follows from Stirling's approximation, we obtain the upper bounds
(n choose k) = n(n-1)···(n-k+1) / (k(k-1)···1) <= n^k / k! <= (en/k)^k.
For all 0 <= k <= n, we can use mathematical induction to prove the bound
(n choose k) <= n^n / (k^k · (n-k)^(n-k)),
where for convenience we assume that 0^0 = 1.
For the trivial case k = 0 we have
(n choose 0) = 1 = n^n / (0^0 · n^n),
so the bound holds. For the inductive step, assuming the bound holds for k, we want
(n choose k+1) <= n^n / ((k+1)^(k+1) · (n-k-1)^(n-k-1)).
A clean way to verify the bound for every k at once is to note that, by the binomial theorem,
n^n = (k + (n-k))^n >= (n choose k) · k^k · (n-k)^(n-k),
since the right-hand side is a single term of the expansion; dividing through by k^k · (n-k)^(n-k) gives the bound.
For k = lambda·n, where 0 <= lambda <= 1, this bound can be rewritten as
(n choose lambda·n) <= n^n / ((lambda·n)^(lambda·n) · ((1-lambda)·n)^((1-lambda)·n)).
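As a sanity check, a small numerical sketch (the sample values n = 20 and k = 5 are my own choice; doubles suffice at this size, though exact arithmetic would be needed for large n) compares the exact coefficient against all three bounds:

using System;

// Compare C(n, k) against the lower bound (n/k)^k and the upper bounds
// (en/k)^k and n^n / (k^k (n-k)^(n-k)).
int n = 20, k = 5;
double c = 1;
for (int i = 1; i <= k; i++) c = c * (n - i + 1) / i; // C(n, k) by the product formula
double lower = Math.Pow((double)n / k, k);
double upper1 = Math.Pow(Math.E * n / k, k);
double upper2 = Math.Pow(n, n) / (Math.Pow(k, k) * Math.Pow(n - k, n - k));
Console.WriteLine($"(n/k)^k = {lower:G5} <= C(n,k) = {c:G5} <= (en/k)^k = {upper1:G5}");
Console.WriteLine($"n^n / (k^k (n-k)^(n-k)) = {upper2:G5}");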

Thursday, April 24, 2014

We look at approximation by integrals today. When a summation is expressed in terms of a function f(k) that is monotonically increasing, we can approximate it by integrals, because the summation is the total area of unit-width rectangular slices while the integral is the area under the curve. By comparing the rectangles against the curve we get good bounds on the sum. Moreover, this applies to both monotonically increasing and monotonically decreasing functions.
In the case of a monotonically increasing function f, the unit-width slice for f(k) lies above the curve on [k-1, k] and below it on [k, k+1]. Sliding the whole row of rectangles one unit to the left or right therefore sandwiches the sum between two integrals:
Integral from m-1 to n of f(x) dx <= Sum k = m to n of f(k) <= Integral from m to n+1 of f(x) dx.
The integral approximation gives a tight estimate for the nth harmonic number. For a lower bound, we obtain
Sum k = 1 to n of (1/k) >= Integral from 1 to n+1 of (1/x) dx = ln(n+1),
because the unit-width rectangle of height 1/k covers the curve on [k, k+1]. For the upper bound we derive the inequality
Sum k = 2 to n of (1/k) <= Integral from 1 to n of (1/x) dx = ln(n),
again based on unit-width slices. Adding back the first term 1/1 yields the bounds for the harmonic series:
ln(n+1) <= Sum k = 1 to n of (1/k) <= ln(n) + 1.
We note that the total area of the rectangles is the value of the summation, while the area under the curve is the integral. Comparing the two areas, the rectangles extend slightly above the curve when the integral serves as a lower bound on the sum, and lie slightly below it when the integral serves as an upper bound. We compute the areas as
Integral from m-1 to n of f(x) dx <= Sum k = m to n of f(k),
and by shifting the rectangles one to the right we establish
Sum k = m to n of f(k) <= Integral from m to n+1 of f(x) dx.
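A one-off numeric sketch (n = 1000 is an arbitrary choice) confirms the sandwich for the harmonic number:

using System;

// Check ln(n+1) <= H_n <= ln(n) + 1.
int n = 1000;
double hn = 0;
for (int k = 1; k <= n; k++) hn += 1.0 / k;
Console.WriteLine($"ln(n+1) = {Math.Log(n + 1):F4} <= H_n = {hn:F4} <= ln(n)+1 = {Math.Log(n) + 1:F4}");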

Wednesday, April 23, 2014

We will now consider some other techniques in addition to those in the previous post. One way to obtain bounds on a difficult summation is to express the series as the sum of two or more series by partitioning the range of the index, and then to bound each of the resulting series. For example, suppose we want a lower bound on the arithmetic series Sum k = 1 to n of (k) seen earlier. If the number of terms is even, we can partition the range into two halves. The lower half can be bounded below by 0 and ignored, because each term in the upper half is at least as large as any term in the lower half; moreover, each of the n/2 terms in the upper half is at least n/2. The sum of the upper half is therefore at least (n/2) · (n/2) = (n/2)^2. This gives us the lower bound Omega(n^2), which is an asymptotically tight bound, because the arithmetic sum of those n numbers is also O(n^2).
The caveat here is that if we had instead bounded each term by the smallest term, which happens to be 1, we would have obtained a lower bound of only n, nowhere near the better bound we found.
This technique tells us that when performing an analysis of an algorithm, we can split a summation and ignore a constant number of initial terms; it is generally applicable when each term in the summation is independent of n.
Another way this technique can help us is with infinite series.
For example, to find an asymptotic upper bound on Sum k = 0 to infinity of (k^2 / 2^k), we observe that the ratio of consecutive terms is
a(k+1) / a(k) = ((k+1)^2 / 2^(k+1)) / (k^2 / 2^k) = (k+1)^2 / (2k^2),
which is at most 8/9 once k >= 3.
So for k >= 3 the terms decay at least geometrically, and the summation can be split into a fixed set of initial terms (in this case k = 0 up to 2) and the rest (k = 3 onwards to infinity).
The latter tail is at most (9/8) · Sum k = 0 to infinity of ((8/9)^k), since a(3) = 9/8 and each subsequent term shrinks by a factor of at least 8/9.
In other words, the total of such an infinite series is bounded by a constant.
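A short sketch shows the head, the tail, and the geometric bound side by side (truncating the tail at 60 terms is an arbitrary choice; the series converges much sooner):

using System;

// Split Sum k^2/2^k at k = 3: a fixed head plus a geometrically bounded tail.
double head = 0, tail = 0;
for (int k = 0; k < 3; k++) head += k * k / Math.Pow(2, k);
for (int k = 3; k < 60; k++) tail += k * k / Math.Pow(2, k); // converges quickly
double tailBound = (9.0 / 8.0) * 9.0; // (9/8) * Sum of (8/9)^k = (9/8) * 9
Console.WriteLine($"head = {head}, tail = {tail:F4} <= tail bound = {tailBound:F4}");
Console.WriteLine($"total = {head + tail:F4} (the exact sum is 6)");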
This technique can also work for more difficult series, such as the harmonic series.
In the harmonic series Sum k = 1 to n of (1/k), we split the range into about lg n pieces and upper-bound the contribution of each piece by 1. Piece i consists of the terms starting at 1/(2^i) and going up to, but not including, 1/(2^(i+1)): it has 2^i terms, each at most 1/(2^i), so it sums to at most 1. With each piece contributing at most 1 and there being at most lg n + 1 pieces, we get the upper bound lg n + 1 for the harmonic series.
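A sketch of the piece decomposition (n = 1024 is an assumed power of two, chosen so the pieces come out exact):

using System;

// Split H_n into pieces [2^i, 2^(i+1)); each piece sums to at most 1.
int n = 1024;
double total = 0;
int pieces = 0;
for (int i = 0; (1 << i) <= n; i++)
{
    double piece = 0;
    for (int k = 1 << i; k < (1 << (i + 1)) && k <= n; k++)
        piece += 1.0 / k;
    total += piece;
    pieces++;
    Console.WriteLine($"piece {i}: {piece:F4} (at most 1)");
}
Console.WriteLine($"H_n = {total:F4} <= number of pieces = {pieces}");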
We will next look at approximation by integrals, another technique used in algorithm analysis.
Today we also continue the discussion of summation properties.
There are many techniques for bounding the summations that describe the running times of algorithms. We will outline a few methods:
1) Mathematical induction :
This is a technique where we first show that a particular claim holds for a trivial base case. We then assume it holds for the nth case and prove that it holds for the (n+1)th case. As an example, we show that the arithmetic series Sum k = 1 to n of (k) evaluates to n(n+1)/2. We can easily verify this for n = 1. Assuming it holds for n, we prove it for n + 1 as follows:
Sum k = 1 to n+1 of (k) = Sum k = 1 to n of (k) + (n+1)
                        = n(n+1)/2 + (n+1)
                        = (n+1)(n+2)/2
A caveat with proving bounds by mathematical induction is that the constant hidden by the "big-oh" could grow with n and thus may not be constant. 
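A trivial numeric spot-check of the closed form (my own illustration, not part of the proof):

using System;

// Check Sum k = 1 to n of k == n(n+1)/2 for small n.
for (int n = 1; n <= 10; n++)
{
    int sum = 0;
    for (int k = 1; k <= n; k++) sum += k;
    Console.WriteLine($"n = {n}: sum = {sum}, n(n+1)/2 = {n * (n + 1) / 2}");
}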
2) Bounding the terms:
Sometimes a good upper bound on a series can be obtained by bounding each term of the series, and it often suffices to use the largest term to bound the others. For example, a quick upper bound on the arithmetic series (A.1) is
Sum k = 1 to n of (k) <= Sum k = 1 to n of (n) = n^2.
In general, for a series Sum k = 1 to n of a(k), if we let a_max = max over 1 <= k <= n of a(k),
then Sum k = 1 to n of a(k) <= n · a_max.
This technique of bounding each term by the largest term is a weak method when the series can be bounded by a geometric series.
For example, suppose we have a series Sum k = 0 to n of a(k) in which a(k+1)/a(k) <= r for all k >= 0, where 0 < r < 1 is a constant. The sum can be bounded by an infinite decreasing geometric series, since a(k) <= a(0) · r^k:
Sum k = 0 to n of a(k) <= Sum k = 0 to infinity of (a(0) · r^k)
                       = a(0) · Sum k = 0 to infinity of (r^k)
                       = a(0) · 1/(1-r).
For a series whose terms decay geometrically, this constant bound is far better than the quick n · a_max upper bound above.
There is a caveat here: a common bug in applying this method is to show that the ratio of consecutive terms is less than 1 and then to assume that the summation is bounded by a geometric series. The series can still diverge if r is not a constant strictly less than 1, i.e., if the ratio of some or all pairs of consecutive terms becomes arbitrarily close to 1. This is the case in the harmonic series, where the ratio a(k+1)/a(k) = k/(k+1) approaches 1 yet the series diverges.
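A small sketch makes the caveat concrete (the loop bound of one million is arbitrary): the ratios crawl toward 1 while the partial sums keep growing.

using System;

// Harmonic series: term ratios approach 1, partial sums grow without bound.
double sum = 0;
for (int k = 1; k <= 1_000_000; k++)
{
    sum += 1.0 / k;
    if (k % 200_000 == 0)
        Console.WriteLine($"k = {k}: a(k+1)/a(k) = {(double)k / (k + 1):F6}, partial sum = {sum:F4}");
}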