Saturday, March 8, 2014

In today's post we revisit P, NP and NP-completeness in algorithms, with examples.
There is a very celebrated open problem, namely whether P = NP, and it has not been answered.
We know polynomial-time solutions for searching, sorting, minimum spanning trees and so on.
P is the class of all problems that can be solved by some algorithm that takes polynomial time.
There are certain problems that are not decidable. But there is a set of problems that are decidable yet may or may not be solvable in polynomial time. For example, deciding whether a context-free grammar generates any string at all is decidable. So far, then, the space of decidable problems includes the class P as a subset. Within this decidable space sits NP, the class of all problems that can be solved by some non-deterministic algorithm in polynomial time. In other words, NP is a superset of P, but it also includes problems that we only know how to solve non-deterministically. Non-determinism is an unintuitive notion.
Deterministic choice is when we use conditional branching, such as if (i == 5). Non-deterministic choice is when we guess the value and make the choice, for example if (*) then .... Programming languages are usually deterministic because they look at the available data before making a choice. A non-deterministic algorithm magically picks the right alternative: it does not have to exhaust the search space, so it is much faster.
There is actually a large class of very interesting problems from various domains that have a non-deterministic polynomial-time algorithm but no known deterministic one.
We know that P is a subset of NP, but we don't know whether P = NP. Most researchers tend to believe the two classes are not equal.
Further, there is a subclass of problems in NP called NP-complete (NP-C): the hardest problems in NP. They are the hardest because if you can solve one of them in polynomial time, then every problem in NP can be solved in polynomial time.
One interesting observation is that : If A is NP-c, then A belongs to P iff P = NP.
These are again from various domains.
In practice, if you cannot find a polynomial-time algorithm for a problem A, you can try to show that A is NP-complete, typically by reducing a known NP-complete problem to it. If it is, then solving A fast, i.e. in polynomial time, is unlikely, and the problem can be claimed to be above the pay grade of any known efficient algorithm.
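To make the guess-then-verify idea concrete, here is a minimal C++ sketch (the subset-sum example and the function names are my own illustration, not from the lecture notes). A non-deterministic algorithm can be thought of as guessing a certificate; membership in NP only requires that a guessed certificate can be verified deterministically in polynomial time, which is what this verifier does.

#include <vector>
#include <iostream>

// Subset sum: given numbers and a target, does some subset add up to the target?
// A non-deterministic algorithm "guesses" the subset; the deterministic part below
// only has to verify that guess, and does so in polynomial (here linear) time.
bool verifySubsetSum(const std::vector<int>& numbers,
                     const std::vector<int>& certificate, // indices of the guessed subset
                     int target)
{
    long long sum = 0;
    for (int idx : certificate) {
        if (idx < 0 || idx >= (int)numbers.size()) return false; // invalid certificate
        sum += numbers[idx];
    }
    return sum == target;
}

int main()
{
    std::vector<int> numbers = {3, 34, 4, 12, 5, 2};
    std::vector<int> guess   = {2, 4};   // the subset {4, 5}
    std::cout << std::boolalpha << verifySubsetSum(numbers, guess, 9) << std::endl; // true
    return 0;
}

Finding the certificate is the hard part; checking it is easy, and that asymmetry is exactly the P versus NP question.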

Friday, March 7, 2014

In today's post we look at some more difficulties with numerical algorithms and their computations.
The roots of a quadratic are given by well-known formulas.
The first problem is how to compute the square root itself. There is a very neat iterative algorithm due to Newton that converges rapidly:
x0 = (1 + D) / 2 and x_(n+1) = (x_n + D / x_n) / 2
where D > 1. In 4 iterations the error is within 10^-15.
We stop when successive values barely change.
For example, to compute sqrt(2):
we compute x0 = ( 1 + 2 ) / 2 = 1.5
x1  = (1.5 + 2 / 1.5 ) / 2 = 1.416667
x2 = ( x1 + 2 / x1 ) / 2 = 1.414216
and in the 4th iteration there is little or no change.
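A small C++ sketch of the iteration (the stopping tolerance, the iteration cap and the function name are my own choices):

#include <cmath>
#include <cstdio>

// Newton's iteration for sqrt(D): x0 = (1 + D) / 2, x_(n+1) = (x_n + D / x_n) / 2
double newtonSqrt(double D, double tolerance = 1e-15)
{
    double x = (1.0 + D) / 2.0;
    for (int i = 0; i < 50; ++i) {            // a handful of iterations suffices in practice
        double next = (x + D / x) / 2.0;
        if (std::fabs(next - x) < tolerance)  // stop when the value barely changes
            return next;
        x = next;
    }
    return x;
}

int main()
{
    printf("%.12f\n", newtonSqrt(2.0));   // 1.414213562373
    return 0;
}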
The roots of a quadratic are given by x = (-b ± sqrt(b^2 - 4ac)) / (2a).
Here another problem arises: one of the numerators subtracts two nearly equal quantities, which leads to subtractive cancellation in the numerator of x2.
Take for example x^2 - 10^5 x + 1 = 0, whose roots are
x1* ≈ 99999.99999
x2* ≈ 0.000010000000001

In this case, we work around it by multiplying the numerator and denominator of the troublesome root by its conjugate, -b + sqrt(b^2 - 4ac) (or -b - sqrt(b^2 - 4ac), depending on which root cancels). The numerator then becomes the difference of squares b^2 - (b^2 - 4ac) = 4ac, while the two terms in the new denominator have the same sign, so the subtractive cancellation is eliminated in favor of an addition.

If b < 0, we take x1 as before and x2 = 2c / (-b + sqrt(b^2 - 4ac)).
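Here is a hedged C++ sketch of the rewritten formula, assuming b < 0 and a real discriminant as in the example above (the function name is mine):

#include <cmath>
#include <cstdio>

// Roots of a*x^2 + b*x + c = 0 for b < 0, avoiding subtractive cancellation:
// x1 uses the usual formula (an addition of like-signed terms), x2 uses 2c / (-b + sqrt(...)).
void stableRoots(double a, double b, double c, double& x1, double& x2)
{
    double s = std::sqrt(b * b - 4.0 * a * c);  // assumes b^2 - 4ac >= 0
    x1 = (-b + s) / (2.0 * a);
    x2 = (2.0 * c) / (-b + s);
}

int main()
{
    double x1, x2;
    stableRoots(1.0, -1.0e5, 1.0, x1, x2);
    printf("x1 = %.5f  x2 = %.15g\n", x1, x2);  // x2 keeps its accuracy
    return 0;
}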
To summarize, numerical algorithms may suffer from
truncation errors (we use finite approximations),
round-off errors (we use a bounded representation),
overflow and underflow,
subtractive cancellation,
ill-conditioned problems.
Therefore they require careful analysis of errors. 







I read an MSDN article written a while back titled "Don't let memory allocation failures crash your legacy STL application". It mentions that application developers who use STL for C++ with the Visual C++ 6.0 compiler run a high risk of crashes under low-memory conditions. This is because the new operator does not respond in a standard manner when it fails: the response could be to throw an exception or to return NULL. In 6.0 no exception was thrown, while in 7.0 and later an exception is thrown. Furthermore, MFC adds its own behavior to this operator, and Visual C++ and MFC are different products; 6.0 was not STL compliant, and the C++ .NET version was more compliant than the earlier versions. The exceptions thrown by the C++ compiler and by MFC are also different.
So a project written in 6.0 and using STL has some challenges.
Generally the result of new is not checked, either because it is assumed never to fail or because it is assumed to throw an exception on failure. Both assumptions are flawed: memory can run low, and no exception may be thrown.
The STL implementation at the time of 6.0 for a string copy simply allocated and incremented the pointer. That code was correct only under the assumption that new would throw when it failed and that all such exceptions were caught, so handlers were added.
Sometimes it is preferred to use the new operator with the nothrow specification, and this is also common. What is not common or expected is that new may still throw even with that specification. This can happen because the constructor fails, or because the 6.0 Release build throws where the Debug build does not. When this happens the pointer is not NULL, and in the Release build the exception is bad_alloc.
This is fixed as follows: first we specify that the handler should not throw, then we wrap the new operator in a try/catch that catches the bad_alloc exception, and finally we set the pointer to NULL in the catch block.
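A minimal sketch of that fix (the helper name is illustrative, not from the article; the article's fix also declares the handler as non-throwing):

#include <new>       // std::nothrow, std::bad_alloc
#include <cstddef>

// Allocation helper that never lets an allocation failure escape as an exception:
// even if new(std::nothrow) still throws, as described above for some builds,
// the catch block turns the failure into a NULL pointer for the caller to check.
char* allocateBuffer(std::size_t size)
{
    char* p = 0;
    try {
        p = new (std::nothrow) char[size];
    } catch (const std::bad_alloc&) {
        p = 0;    // treat the exception as an ordinary allocation failure
    }
    return p;
}

Callers then test the returned pointer against NULL before using it, which restores the single failure convention the rest of the legacy code expects.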
If new were changed to throw an exception, as the standard STL mandates, the fix would be to wrap the std::nothrow calls. This works for code that is compiled together with these handlers, but existing third-party libraries that use std::nothrow still pose a challenge. In that case the code path using the nothrow specification has to be modified or avoided, or the static library recompiled.
One more issue in this context is that the STL itself may use this form of nothrow allocator, so those uses may need to be avoided, or an STL implementation that does not use nothrow can be used instead.



 

Thursday, March 6, 2014

In today's post we will look at some more  challenges with algorithms for numerical computations.
The second type of problem is round-off error. Here we use bounded representations, i.e. we specify values with a sign, a mantissa, an implicit base and an exponent. We use this to represent 6.02 x 10^23 as + .602 x 10^24.
However, this cannot exactly represent numbers that do not have a finite number of digits, such as pi, which therefore get truncated.
Binary representations use a p-bit mantissa with base 2.
More precision implies more accuracy but it also implies slower computations.
Single precision gives 6-7 decimal digits, double precision gives 13-14 decimal digits, and extended precision gives 19-20 decimal digits.
This rounding off error has to be measured.
We do it this way.
If we denote the actual value of a numeric variable as alpha* and its representation as alpha, then the error is the magnitude of the difference, | alpha* - alpha |.
There are other errors to representation such as
1. Overflow. This means the value is too large to represent.
We handle this by reordering operations. For example, a large number 1 multiplied by a large number 2 and divided by a large number 3 can be computed by first doing the division and then the multiplication.
Another approach is to use alternate representations, where we pre-compute or simplify: to avoid computing 100!/(2! 98!) directly, we cancel the common factors and compute (100 x 99)/2 instead.
The third way to avoid overflow is to use logarithms, since they reduce the magnitude of the numbers considerably.
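For instance, a quantity like 100!/(2! 98!) can be computed without ever forming the huge factorials, either by cancelling common factors first or by working with logarithms. A small C++ sketch (lgamma(n+1) gives log(n!)):

#include <cmath>
#include <cstdio>

int main()
{
    // 100! / (2! * 98!) computed two ways that avoid overflow:
    // 1) cancel the common factors first: (100 * 99) / 2
    double direct = (100.0 * 99.0) / 2.0;

    // 2) work with logarithms of the factorials and exponentiate at the end
    double logValue = std::lgamma(101.0) - std::lgamma(3.0) - std::lgamma(99.0);
    double viaLogs  = std::exp(logValue);

    printf("%.1f  %.1f\n", direct, viaLogs);   // both print 4950.0
    return 0;
}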
The second problem is underflow.
This manifests itself when values are too close to zero: the computation then proceeds with the value zero, when instead the value should be marked with a flag to indicate underflow. This differs from overflow in that the computation could plausibly proceed as if the value were zero.
The third problem is that arithmetic operations are not exact.
For example, if we take the value of pi and drop the digit in the seventh decimal place, which happens to be 6, we expect the difference between the old and new values to be 6 x 10^(-7); yet it can be computed as zero, because the difference is small relative to the operands and falls below the precision that is carried.
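The pi example is hard to reproduce without controlling the precision by hand, but the same effect shows up with ordinary doubles; a minimal sketch with numbers of my own choosing:

#include <cstdio>

int main()
{
    // 1e16 + 1 cannot be represented exactly in a double (53-bit mantissa),
    // so the true difference of 1 is computed as 0.
    double big  = 1.0e16;
    double diff = (big + 1.0) - big;
    printf("%.1f\n", diff);    // prints 0.0, not 1.0
    return 0;
}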
Errors also cascade through a numerical computation: errors made in early stages are propagated to later stages, and a slight change in the numbers can produce a completely different answer.

Wednesday, March 5, 2014

We look at the challenges of numerical algorithms i.e we show that there are limitations of algorithmic power.

Numerical algorithms are those for computations of problems in  a continuous mathematical world.
Therefore the first obstacle is that we work with finite approximations of what are really infinite objects.
This results in truncation errors.
The second class of errors comes from round-off: real numbers cannot be represented exactly, so we can only pin the real answers down to within a range.
To overcome the first limitation, we use what is called the Taylor's expansion.

e^x = 1 + x + x^2/2! + x^3/3! + ... + x^n/n! + ...

We stop the expansion at a definite n; anything beyond that fixed n is the truncation error.

The second class is evident when we take the graph of a curve and want to compute the area under it. In this case, we slice the area into geometric shapes, trapezoids to be precise, and this is called the trapezoidal rule, by which we approximately compute the area under the curve.
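As an illustration, the trapezoidal rule fits in a few lines of C++ (the integrand e^x on [0, 1] and the choice of 1000 slices are my own example):

#include <cmath>
#include <cstdio>

// Approximate the area under f between a and b with n trapezoids of equal width.
double trapezoid(double (*f)(double), double a, double b, int n)
{
    double h = (b - a) / n;
    double area = 0.5 * (f(a) + f(b));
    for (int i = 1; i < n; ++i)
        area += f(a + i * h);
    return area * h;
}

double f(double x) { return std::exp(x); }

int main()
{
    // The exact integral of e^x from 0 to 1 is e - 1, about 1.718282
    printf("%.6f\n", trapezoid(f, 0.0, 1.0, 1000));
    return 0;
}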

One way to get an upper bound on the truncation error is to use the inequality:
| e^x - (1 + x + x^2/2! + ... + x^n/n!) |  <=  (M / (n+1)!) |x|^(n+1),  where M = max e^y for 0 <= y <= x
i.e. by choosing an upper bound M on e^y for y between 0 and x, we are able to establish an upper bound on the truncation error.

We can take an example to show how we come up with M.
Let's say we target e^0.5 with an error of at most 0.0001.
Since 0 < y < 0.5 and e^y is increasing, e^y is less than e^0.5; and since e is between 2 and 3, we can take M = 4^0.5 = 2.
Now the truncation error is bounded by (M / (n+1)!) |x|^(n+1), which with our chosen values becomes (2 / (n+1)!) (0.5)^(n+1); requiring this to be less than 0.0001 lets us find n, and in this case n = 5.
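A quick check of that bound in C++, using n = 5 terms as derived above:

#include <cmath>
#include <cstdio>

// Taylor approximation of e^x truncated after the x^n / n! term.
double expTaylor(double x, int n)
{
    double term = 1.0, sum = 1.0;
    for (int k = 1; k <= n; ++k) {
        term *= x / k;     // builds x^k / k! incrementally
        sum  += term;
    }
    return sum;
}

int main()
{
    double approx = expTaylor(0.5, 5);
    double error  = std::fabs(std::exp(0.5) - approx);
    printf("approx = %.8f  error = %.2e\n", approx, error);   // error is below 0.0001
    return 0;
}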


We can similarly write a formula for the upper bound on the error of the trapezoidal rule. Here the question is how thin the trapezoids should be, i.e. how many of them to put between the lower and upper limits of the curve.
In today's post we also look at the limitations of algorithmic power, specifically lower-bound arguments.
If a problem P has a lower bound t(n), then no algorithm can solve P in time less than t(n).
Knowing a lower bound is advantageous. For example, we know that sorting algorithms take O(n^2) in some cases and O(n log n) in others. Searching a sorted array by brute force takes O(n), while binary search takes O(log n), an improvement. A minimum spanning tree of a connected graph is found by Prim's algorithm in O(m log n), and single-source shortest paths by Dijkstra's algorithm in O(m log n). How far can we go in each of these cases? This is what a lower bound answers.
Some problems can be really hard to solve; for example, factoring a number into its prime factors is hard. Others require a lot of processing. For example, the RSA cryptographic scheme, as used by, say, a bank to secure its clients' messages, generates two large prime numbers p and q and takes their product n = p.q. It then creates an encryption/decryption pair with d.e = 1 mod (p-1)(q-1), where (e, n) is the public key and d is the private key. A client encrypts a message m as m^e mod n; when the bank receives it, it decrypts it by computing (m^e)^d mod n, which equals m. Since factoring n into p.q is hard, this is a good security scheme.
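A toy illustration of that round trip, using the small textbook primes p = 61 and q = 53 (real keys are of course far larger; the helper name is mine):

#include <cstdio>

// Modular exponentiation by repeated squaring: computes (base^exp) mod m.
unsigned long long powMod(unsigned long long base, unsigned long long exp,
                          unsigned long long m)
{
    unsigned long long result = 1;
    base %= m;
    while (exp > 0) {
        if (exp & 1) result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

int main()
{
    // p = 61, q = 53, n = 3233, (p-1)(q-1) = 3120, e = 17, d = 2753 (since 17 * 2753 = 1 mod 3120)
    unsigned long long n = 3233, e = 17, d = 2753, m = 65;

    unsigned long long cipher = powMod(m, e, n);        // apply the public key
    unsigned long long plain  = powMod(cipher, d, n);   // apply the private key
    printf("m = %llu  cipher = %llu  decrypted = %llu\n", m, cipher, plain);   // decrypted == 65
    return 0;
}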
A lower bound can be trivially established by counting the outputs that must be generated. This is the case for generating all permutations of n elements, since there are n! of them. Similarly, a search in an unsorted array has to examine all n elements, so O(n) becomes the lower bound.
Another way to establish a lower bound is information-theoretic: we count the number of questions that must be answered before the problem can be solved, as in binary search.
Another way to establish a lower bound is by using an adversary argument. In this case we don't fix the full input in advance; an adversary answers, on behalf of the input, the questions the algorithm asks.
As an example, let us take the problem of merging two sorted lists of n elements each. An algorithm is forced to ask 2n - 1 questions about the input to come up with the merged list: the first element of the first list is compared with the first element of the second list, then the second element of the first list with the first element of the second list, and so on, so the comparisons that must be made are finite and pre-determined. If an algorithm asked fewer or different questions and still produced a valid output, the adversary could supply another input consistent with the same answers for which that output is wrong, i.e. two different valid outputs for the same given answers, which should not happen. Thus this is another way of establishing a lower bound.
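For reference, the standard merge of two sorted lists of n elements each makes at most 2n - 1 comparisons; a short C++ sketch with a comparison counter added for illustration:

#include <vector>
#include <cstdio>

// Merge two sorted vectors and count how many element comparisons are made.
std::vector<int> mergeSorted(const std::vector<int>& a, const std::vector<int>& b,
                             int& comparisons)
{
    std::vector<int> out;
    std::size_t i = 0, j = 0;
    comparisons = 0;
    while (i < a.size() && j < b.size()) {
        ++comparisons;
        if (a[i] <= b[j]) out.push_back(a[i++]);
        else              out.push_back(b[j++]);
    }
    while (i < a.size()) out.push_back(a[i++]);   // one list is exhausted; copy the rest
    while (j < b.size()) out.push_back(b[j++]);
    return out;
}

int main()
{
    std::vector<int> a = {1, 3, 5, 7};
    std::vector<int> b = {2, 4, 6, 8};
    int comparisons = 0;
    mergeSorted(a, b, comparisons);
    printf("comparisons = %d\n", comparisons);    // 7, i.e. 2n - 1 for n = 4
    return 0;
}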
The final way of establishing a lower bound is by a class of techniques called Reduction. Here if P reduces to Q in constant time, then any lower bound for P will hold for Q as well.
If Q were solvable any faster, then P could be solved faster too, which contradicts the lower bound we started with. Hence reduction can also be used to establish a lower bound.
Courtesy: @mecr.org

Tuesday, March 4, 2014

In today's post we discuss the Rabin-Karp algorithm for string matching, based on the mecr lectures. We match a string pattern P of length m in a target string T of length n. Going character by character in the target string takes O(mn) time. This can be improved with two good ideas:
One is that on a mismatch we try to skip ahead, and we can skip up to m characters. This gives O(m) extra space and O(m + n) worst-case time, which is already an improvement.
The second is that we do approximate matches: on a mismatch we know it for sure, but on a match it may or may not be a true match. This way we can skip ahead on the definitive mismatches.
We implement the approximation with a hash match. For each index i from 0 to n - m, we hash m characters of the target and check whether the hash matches that of the pattern.
If a portion of the string matches the pattern, then their hashes must match. But we use this the opposite way here: if the hash differs, we know the portion is not the same as the pattern and we can skip ahead. We do a little more work for the hashing, and consequently our worst case is worse than the first approach, but in most cases the hash function works very well. We just need a hash function that matches our needs, which are:
1) we should pre-compute a hash of the pattern
2) we compute the hash of the string at index i to i+m-1 for each i
3) the hash function should be such that
h(T[i, i+m-1]) = g(T[i]) "+" g(T[i+1]) "+" .... g(T[i+m-1])
where + allows us to compute
h(T[i+1, i+m]) = g(T[i+1]) "+" g(T[i+2]) "+" .... g(T[i+m])
with these properties it takes a constant time to compute
h(T[i+1, i+m]) from h(T[i, i+m-1])
In addition if this hash function has very few collisions, then we have a fast running time of O(n+m)
As an example, we can take a hash function h(string.letter) = 4 h(string) + g(letter).
If a string is made up of the DNA letters ACGT,
h(ACT) = 4 h(AC) + g(T), and so on.
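Putting these pieces together, here is a compact Rabin-Karp sketch over the DNA alphabet. The multiplier 4 follows the example above; the mapping of A, C, G, T to 0-3 and the modulus are my own choices to keep the rolling hash small.

#include <string>
#include <vector>
#include <cstdio>

// Rabin-Karp over the alphabet {A, C, G, T}, using h(s.letter) = 4*h(s) + g(letter).
int g(char c) { return c == 'A' ? 0 : c == 'C' ? 1 : c == 'G' ? 2 : 3; }

std::vector<int> rabinKarp(const std::string& T, const std::string& P)
{
    const long long MOD = 1000000007LL;     // keeps hashes bounded; collisions stay rare
    std::vector<int> matches;
    int n = T.size(), m = P.size();
    if (m == 0 || m > n) return matches;

    long long hP = 0, hT = 0, power = 1;    // power = 4^(m-1) mod MOD
    for (int i = 0; i < m; ++i) {
        hP = (4 * hP + g(P[i])) % MOD;
        hT = (4 * hT + g(T[i])) % MOD;
        if (i < m - 1) power = (4 * power) % MOD;
    }
    for (int i = 0; ; ++i) {
        if (hT == hP && T.compare(i, m, P) == 0)   // hash match, then confirm exactly
            matches.push_back(i);
        if (i + m >= n) break;
        // roll the hash: drop T[i], shift, append T[i+m]
        hT = ((hT - g(T[i]) * power % MOD + MOD) * 4 + g(T[i + m])) % MOD;
    }
    return matches;
}

int main()
{
    std::vector<int> hits = rabinKarp("GCATCGCAGAGAGTATACAGTACG", "GCAG");
    for (std::size_t k = 0; k < hits.size(); ++k)
        printf("match at %d\n", hits[k]);   // match at 5
    return 0;
}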