Cluster computing

Wednesday, September 23, 2015

[Russia Olympiad 1961]
Real numbers are written in an m × n table. It is permissible to reverse the signs of all the numbers in any row or column. Prove that after a number of these operations, we can make the sum of the numbers along each line (row or column) nonnegative.

Real numbers can be positive, negative or just about any number that is not imaginary or hold special meaning such as infinity.

We make the following observations:
1) Note that the size of the board is not a square. So the conclusion doesn't depend on the size of the rows or the columns to be equal. In other words, the sum of the elements along each line can be assumed far greater than any one element.

2)When we pick a row and reverse its sign, one cell of all column changes sign. After this turn, each column sum still remains positive because the sum of the elements of the other cells is still positive
If we repeat this operation twice, the cells are returned to their original sign leaving the sum positive.
The same holds true if we substitute the column for the row in above. Even if the values changing sign is the largest in their respective orthogonal lines, the sum along these lines does not change sign from 1) above
So any number of operations on the same line does not alter the sign of the sum of the numbers along each line.

3)When a row or a column is picked alternatively and reversed once there is only one element that returns to the sign of its original. If the same row and column is reversed again and again only the sign of this element changes to negative. If we repeat this N times for this row or column pair, the sum will be greater than zero when N is even.

4) When we pick different rows incrementally and flip their signs, the sum of the numbers along each line becomes negative but as we do a number of these operations, the rows wraparound and we have all the rows positive again

5) when all the rows and columns both change , after every alternate turn at most 2m elements such as from the diagonals will have flipped their sign unfavorably where m > n. Therefore, we will have the board again favorably

Tuesday, September 22, 2015

A marker is placed at the origin of an integer lattice. Calvin and Hobbes play the following game. Calvin starts the game and each of them takes turns alternatively. At each turn, one can choose two (not necessarily distinct) integers a and b, neither of which was chosen earlier by any player and move the marker by a units in the horizontal direction and b units in the vertical direction. Hobbes wins if the marker is back at the origin any time after the first turn. Determine whether Calvin can prevent Hobbes from winning.

Here Hobbes can return to the origin at any time by 'undoing' the displacements from the origin.
Suppose Calvin can moves include (a,b), followed by (-c,-d) by Hobbes and (e,f) by Calvin again . Then Hobbes can move back to the origin by (a+e-c, b+f-d).
We observe the following:
1) the Hobbes tries to move back towards origin while Calvin tries to move away.
2) none of the moves repeat that is a, e, c are all distinct and so are b, d, f
3) for every positive move that Calvin makes, there is a negative move that Hobbes can make which is distinct and non-repeating. Hence the origin is a P-Position. A P-position is one where the player who has just played can force a win. A position is called an N-position if the player whose turn it is can force a win and Hobbes uses it to his advantage. Therefore Calvin must find a P position other than the origin.
4) when the moves are strictly horizontal or strictly vertical, Hobbes cannot reset back to the origin because one of the co-ordinates is 0 and it will have to be repeated to get back to the P-position.
Therefore Calvin can find P-positions by either moving horizontally only or vertically only
For example, Calvin can move (1,0) in the first move and Hobbes is forced to move in the vertical direction, say (-1, -1), then Calvin can undo the vertical direction by flipping the marker to the opposite side with (2,-2) In other words Calvin forces two of the diagonals to be P-positions and maintains his moves to these diagonals with non-repeating numbers.
5) Since the numbers are incrementing by 1 and Calvin uses up both the positive and negative of the same integer, Calvin has a winning strategy.
6) the axes can be the players N positions since moving to the axes can let them use the diagonal.
7) once both players arrive at the diagonal they can remain on the diagonal by alternating

Monday, September 21, 2015

Example 10 [China 2010, Problem 5]
There are some (finite number of) cards placed at the points A1, A2, …, An and O, where n ≥ 3. We can perform one of the following operations in each step:

(1) If there are more than 2 cards at some point Ai, we can remove 3 cards from this point and place one each at Ai-1, Ai+1 and O (here A0 = An and An+1 = A1)

(2) If there are at least n cards at O, we can remove n cards from O and place one each at A1, A2, …, An.

Show that if the total number of cards is at least n2+3n+1, we can make the number of cards at each vertex at least n + 1 after finitely many steps.

Solution : The total number of cards stays the same.

(a) if we balance out the cards at each Ai, then there won't be disproportionate cards at any one point.
(b) We can make each of the Ai's have 0, 1, or 2 cards.
(c) from observation b, we can have 1,2 or 3 cards from the stage where they have 0,1 or 2 cards, by applying operation 2.
(d) based on observation c, we can make each of the Ai's have 1, 2, or 3 cards. Suppose x of the Ai's have one card, y of the Ai's have 2 cards and z of the Ai's have 3 cards. The number of cards at O is (n2+3n+1)-(x+2y+3z) Since (x+y+z) = n, the latter component is <= 2n if x >= z and O will have n2+n+1 cards. Then all of Ai's will now have at least n+1 cards and O will have n2+n+1-n2 cards and we are done.
Based on observation d(), we can start with points on the perimeter having 1,2 or 3 cards and ends in a position having 1, 2 or 3 cards but the number of points having 3 cards is not more than those with 1 card. We can do this by ensuring that any two points having three cards has a point with one card in between.

We can do this by applying 1 on all consecutive 3's to transform the set (x,3, ... 3,y) to (x+1, 1, 2 ... 2, 1, y+1) and there are no adjacent 3's. Now suppose there are two 3's with only 2's in between them like (x,3,2,2, ... 2, 3, y). Then we can do operation 1) on the first 3 to covert its adjacent to 3 and then again to the next adjacent 3. This we can please repeat as long as there are no adjacent 3's.

Sunday, September 20, 2015

IMO 2007
In a mathematical competition some competitors are friends; friendship is always mutual. Call a group of competitors a clique if each two of them are friends. The number of members in a clique is called its size. It is known that the size of the largest clique(s) is even. Prove that the competitors can be arranged in two rooms such that the size of the largest cliques in one room is the same as the size of the largest cliques in the other room.
Let M be one of the cliques of largest size. It is given that the largest size is even, therefore abs(M) = 2m. Let's put all of them in the same room A and all the others in room B. Now if c(A) and c(B) denotes the size of the largest cliques in rooms A and B at a given point in time. Since M is a clique of the largest size, we initially have c(A) >= c(B)
Now our task is to balance them and so we move members from room A to room B. As we break the clique membership in room A, we decrease its size by 1. In the other room the clique size may or may not increase by 1. In other words, it may increase by at most 1. Therefore while the clique size of A is greater than B, keep sending one member to room B.
The Balance happens when c(A)<=c(B)<=c(A) +1. In other words they are just about the same or exactly the same. At this point the size of the clique in room A has to be at least m. We can argue the contrary and see that we prove this. If there were less than m in A, then there would be m+1 in B and at most m-1 in room A, implying c(B)-c(A) >= (m+1)-(m-1) = 2. But we just now argued that they should be just about the same or exactly the same when we stop balancing.
if c(A) = c(B), we are done. At this point the required proof holds already.
So far we have covered the cases when the they are unbalanced or overbalanced or exactly the same.
All that is left now is when they are just about balanced which is the case when the clique sizes in the two rooms differ by one member.
Let us say c(A) = k and c(B) = k + 1. Now if there is a competitor in B who is also in M but is not in the biggest clique in B then by sending her to A we increase the clique size there by 1 but do not affect the clique size of B. At this point again we are done.
Now suppose that there is no such competitor. This case remains to be handled. For this case, we take a member from the clique of B and send it to A. The clique size reduces in B but it does not increase the size of A. How do we guarantee that ? Well suppose there is no competitor in B who is also in M but not in the biggest clique of B This would mean that the all the members of the intersection B and M would be in the cliques of sizes k+1 That means the competitor would have no membership in cliques of A and therefore no increase in the size of A.

There are 2n people seated around a circular table, and m cookies are distributed among them. The cookies can be passed under the following rules:

(a) Each person can only pass cookies to his or her neighbors

(b) Each time someone passes a cookie, he or she must also eat a cookie
Let A be one of these people. Find the least m such that no matter how m cookies are distributed initially, there is a strategy to pass cookies so that A receives at least one cookie.

Let us denote the people with symbols such as A-n, A-n+1, A-n+2, ...A-1, A0, A1, ... An where A-n and An are the same person since it is a circular table.

A weight 1/2^abs(i) is assigned to each cookie held by a person Ai. Thus for example if A3 passes a cookie to A2, the cookie's weight increases from 1/8 to 1/4. Since A3 must also eat a cookie of weight 1/8 in this step, we see in this case, the sum of the weights of all the cookies has remained the same. The sum of the weight of all the cookies has remained the same. If Ai has ai cookies for each i, then the total weight of all cookies is Sum of i from -n+1 to n is ai / 2^ abs(i).

Whenever a cookie is passed towards A0, (from Ai to Ai-1 or the reverse direction), one cookie is eaten and another cookie doubles its weight, so the total weight remains invariant. If a cookie has passed away from A0, then the total weight decreases. Thus the total weight is indeed a monovariant.

If m >= 2^n, we can always ensure that the A0 gets a cookie. Any of the directions could be chosen but it should not pass A0 because the weight would reduce. In each step therefore a cookie progresses towards A0 from either side of the diametrically opposite end. We use a new quantity to indicate the direction. Let W+ be the sum of the weights of the cookies held by A0, A1 ... An and let W be the sum of the weights of cookies held by A0, A-1, A-2 ... and we can suppose W+ >= W=. Then this suggests An can pass cookies to only An-1 and we use only this semi-circle containing non-negative indices, since this is the semi-circle having more weight. In each step, as m any cookies are passed as possible to An-1 and similarly forwarded. This works only if and only if W+ is > 1 which is necessary as W+ is a monovariant.
To show that this is sufficient, we note that the algorithm leaves the W+ an invariant. The algorithm terminates when we cannot pass anymore cookies from any of the Ai with i positive, and A0 does not have any cookies.A1, A2, ... An all have at most one cookie at the end. If they had more, they would eat one and pass one and the algorithm would not have terminated. Then W+ would sum upto 1/2 + 1/4 + ... + 1/2 ^ n < 1, contradicting the fact that W+ is an invariant and >= 1. Thus W+ >= 1 is a sufficient condition for the algorithm to work. Finally we prove that we indeed have W+ > 1 We assumed W+ > W-. Now simply note that each cookie contributes at least 1 / 2 ^(n-1) to the sum(W+ and W-), because each cookie has weight at least 1/2^(n-1) except for cookies at An. However, cookies at An are counted twice since they contribute to both W+ and W-, so they also contribute 1/ 2^n-1 to the sum. Since we have at least 2 ^ n cookies, W+ and W- >= 2, so W+ >= 1 and have proved that this is both necessary and sufficient.

Note that the use of a geometric progression ensures that the cookie can be consumed and passed along and using the property of the sum of this progression we bound it at the diametrically opposite end of the table because that suffices to have everyone get a cookie.

The criteria for weights on a semi-circle to be more than 1 is therefore of a consequence of the above.

Saturday, September 19, 2015

IMO Shortlist 1994

Peter has 3 accounts in a bank, each with an integral number of dollars. He is only allowed to transfer money from one account to another so that the amount of money in the latter is doubled. Prove that Peter can always transfer all his money into two accounts. Can he always transfer all his money into one account?

Let A, B, C be the number of dollars in the account 1, account 2 and account 3 respectively and A <= B <= C. If A = 0, we are done. At any time we want to reduce min(A,B,C) to the point where it is 0. The values of A, B and C can keep changing

Euclidean theorem monotonically reduces a number by using form B = qA + r with 0 <= r < A. We use this form. With this form if we can reduce B to r then we are done. Since r < A, we would have reduced min(A,B,C) which was our aim.

The question however says we have to double so we use binary representations. Let

q = 2M^k + ... + 2M1 + M0 be the binary representation of q and where Mi is 0 or 1. To reduce B to r, in step i of the algorithm, we transfer money to account 1. The transfer is from account 2 if Mi-1 = 1 and from account 3 if Mi-1 = 0. The number of dollars in the first account starts with A and keeps doubling in each step. Thus we transfer Aq dollars from account 2 to account 1, and we are left with B-Aq = r dollars in account 2. At this point we have reduced min

The answer to the last question is that it is not possible when the number is odd.

Lets try this out with sample numbers

We have accounts with 2, 5, 16

B = 2* A + 1

so q =2 and it can be written as 0 + 2 * 1 for the binary representation of 2

We transfer Aq = 2 * 2 = 4 dollars to 1 and now have 6, 1,16

Now A, B, C has changed and we have A = 1, B= 6, C = 16
now B has q=6 and r = 0 and since m1 was 1 and m2 is 1, we have the transfer again from B.
This leaves us with A=0, B=7 and C=16.

Note that we cannot reduce any further.

Friday, September 18, 2015

Today we discuss another Olympiad problem.
[IMO Shortlist 2009]
Five identical empty buckets of 2-liter capacity stand at the vertices of a regular pentagon. Cinderella and her wicked Stepmother go through a sequence of rounds: At the beginning of every round the Stepmother takes one liter of water from the nearby river and distributes it arbitrarily over the five buckets. Then Cinderella chooses a pair of neighboring buckets, empties them into the river, and puts them back. Then the next round begins. The Stepmother’s goal is to make one of these buckets overflow. Cinderella’s goal is to prevent this. Can the wicked Stepmother enforce a bucket overflow?

Let us take the volumes as V1, V2, V3, V4 and V5. Cinderella (C) can only empty adjacent buckets. If the alternate buckets are more than 1 liter, she will not be able to empty both. Therefore she prevents this condition. For each i = 1 to 5, C must ensure that Vi + Vi+2 is at most one on each of her turn. She maintains this 'good' state.

The good configuration enforces return to good state. For example, if two buckets are empty, V4=V5=0, Then V1 + V3 <= 1 and V2 <= 1 because V2 + V4 <= 1 After the step mothers turn, V1 + V3 + V4 + V5 <= 2. Therefore either V5 V3 <=1 or V4 + V1 <=1 So C empties both V1 and V2 and the new configuration is still good V4 <=1 and V5 + V3 <=1

Since at the end of each turn for C, we have a good configuration, C has a winning strategy.

#################### Paper Review #####################

Today we discuss "Ontology based text summarization - The case of Texminer".
This paper talks about Texminer which does summarization by a combination of summarization techniques.Before delving into the techniques, it might be a good reminder to go over the summarization techniques.

One technique is to use the structure of the discourse to generate abstracts. The Rhetorical Structure Theory, for example, attempts to identify the internal structure of the text and the relations of the discourse formed within it, giving priority to the nuclear components of these relations.On the other hand, Marcu segments the text into small units of discourse. He then proceeds to build a rhetorical structure in the form of a tree by analyzing the set of relations that exist between the units. Once the discourse structure has been created, he assigns weight and an order to each element of the structure. - the higher the element within the structure the greater its weight.

Different ways of using the text discourse structure has been shown. Some have used the rhetorical status of affirmations contained in documents to identify their internal structure. The main contribution of these authors lies in the algorithm that deals with the non-hierarchical structure : given seven fixed catagories (aim, textual, own, background, contrast, basis and other) it is capable of distributing the contents of the articles within each category.

Another way has been to use templates to generate summaries and retrieve information. This technique can only be applied when the text is previously structured. Information retrieval systems include Fies that extracts financial information from digital articles.

Mateo combined superficial with deep structure analysis to enhance the coherence and cohesion of the abstract. Alonso and Fuentes for example combine lexical chains plus the rhetorical and argumentative structure, derived using discourse markers.

Using combination of complex linguistic techniques, Aretoulaki developed a model for automatic summarization that selects sentences using content features of a pragmatic and rhetorical nature, obtained by means of superficial linguistic analysis such as the Theory of Speech Acts, the Rhetorical structure Theory and the theories centered on cohesion and coherence.

Automatic production of abstracts was initially based on statistical methods. in the wake of research by Luhn and now uses diverse methodologies. With the extraction of terms or strings of significant words, systems build discourse models. These systems have been used to identify rhetorical structures highly related with the content of the documents and their organizational scheme.

The growth of cognitive science has allowed the incorporation of semantic-conceptual models.Together with the use of knowledge bases, these models vastly improve the process of summarizing texts on a specific topics.

The demand for domain knowledge has seen exceptional growth recently. Models for extraction and disambiguation of text has changed recently. It may even have become harder to find tools to help with such narrow domain texts. The lack of specific dictionaries, the absence of a defined theory and the dire need for professionals in the sector to have summarization tools to carry out their work has become pervasive. Documentation, Terminology and Natural Language Processing (NLP) indicate that there is a demand here in this narrow-domain texts.

This is where TexMiner may help solve the problem. It is based on the conviction that summarizing texts within a specialized domain requires a model capable of processing its semantic and socio-cognitive components.

The socio-cognitive user paradigm aspect of TexMiner takes into account the historic, social and cultural factors. These techniques consider a domain as the mindshare in terms of concepts, terms and knowledge from a community.
It uses ontologies as algebraic descriptions of a conceptual network in which the elements are binary relations established through concepts.
Lastly it uses three agents for processing information.
The first is a reading agent that reads the documents and labels them in XML. Here the goal is to discern if a document belongs to a particular domain based on the count of words that fall in the domain.
The second is a summarization agent that carries out automatic tagging of the syntactic discourse structure of the text. The design of the tags is governed by
- the sequence in which the levels are tagged in practice
- the symbols used to denote cohesive elements of the text, and
- the presence in the text of geographic elements, verbs statistical data formulae and processes.
The last agent is the Information retrieval agent. Texminer allows for the searches to be made through the ontology, as well as the lexical databases developed to take advantage of the functionalities of summaries for the purpose of text retrieval.