Tuesday, August 26, 2014

Today we will be reviewing the Atom web standards. There are two: the Atom Syndication Format and the Atom Publishing Protocol. The former is the XML language used to describe Atom feeds. The latter is the HTTP-based protocol for creating and updating web resources.
Web feeds follow a publisher-subscriber model. A web feed provider is usually a site owner who publishes resources in, say, XML format. Subscribers then use feed readers to view the content. Browsers typically support feed readers. One use is when bloggers keep track of each other's posts via Atom feeds.
A feed can contain both data and metadata. It has labeled content and is usually timestamped to communicate when it was created or updated. Each text element can be internationalized using xml:lang. The Atom format came after the RSS format, which was struggling with backward compatibility. While RSS is still used by Apple and iTunes, the Atom format has been adopted widely by Google.
To provide a link to an Atom feed, we use the following in HTML:
<link href="atom.xml" type="application/atom+xml" rel="alternate" title="feedtitle" />
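The feed is specified, schematically, as <feed xmlns="http://www.w3.org/2005/Atom"><metadata/><data/></feed>, where <metadata/> and <data/> stand in for the feed-level elements and the entries.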
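To see what a feed reader does with such a document, here is a minimal sketch in Python that reads the entries of a feed using only the standard library. The sample feed, its titles and ids, and the function name list_entries are made up for illustration; the namespace is the standard Atom one, http://www.w3.org/2005/Atom.

```python
import xml.etree.ElementTree as ET

# Namespace of the Atom Syndication Format.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# A made-up feed used only for this sketch.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
  <title>Example Feed</title>
  <id>urn:example:feed</id>
  <updated>2014-08-26T12:00:00Z</updated>
  <entry>
    <title>First post</title>
    <id>urn:example:entry-1</id>
    <updated>2014-08-26T12:00:00Z</updated>
    <summary>Hello, Atom.</summary>
  </entry>
</feed>
"""


def list_entries(feed_xml):
    """Return (title, updated) for every entry in an Atom feed document."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        updated = entry.findtext("atom:updated", default="", namespaces=ATOM_NS)
        entries.append((title, updated))
    return entries


if __name__ == "__main__":
    for title, updated in list_entries(SAMPLE_FEED):
        print(updated, title)
```

The feed-level title, id and updated elements are the metadata; each entry carries the data along with its own timestamp.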
ASP.NET MVC web servers allow an RSSResult ActionResult to be cached with an expiration time because the result is just a file.
The iCalendar format is different from Atom in that it specifies meeting requests and tasks for sharing as a file with the extension .ics. The content is delimited by BEGIN and END lines, with the top level being VCALENDAR. The items are listed as key:value pairs and form a succinct description of the meeting. It has syntax to specify events, to-dos, journal entries and free/busy time.
Flair is a JavaScript-based application framework for Adobe's AIR HTML+Ajax SDK. Its name is an acronym for Framework Leveraged AIR. It can be used to specify a list of items, for example based on the source IP of the request as gathered from the request headers HTTP_X_FORWARDED_FOR and REMOTE_ADDR.
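The BEGIN/END nesting and the key:value lines are easy to see in a small example. The following Python sketch builds a one-event calendar and reads the properties back. The function names, the PRODID value and the sample meeting are assumptions made for illustration, and the parser deliberately ignores line folding, property parameters and escaping, which a real iCalendar reader has to handle.

```python
def build_event(uid, start, end, summary):
    """Build a minimal single-event calendar as iCalendar text."""
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//sketch//EN",  # made-up product identifier
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{start}",
        f"DTSTART:{start}",
        f"DTEND:{end}",
        f"SUMMARY:{summary}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    # iCalendar content lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"


def parse_properties(ics_text):
    """Split each non-empty line into a (key, value) pair at the first colon."""
    props = []
    for line in ics_text.splitlines():
        if not line:
            continue
        key, _, value = line.partition(":")
        props.append((key, value))
    return props


if __name__ == "__main__":
    text = build_event("1@example.invalid", "20140826T090000Z",
                       "20140826T093000Z", "Standup")
    for key, value in parse_properties(text):
        print(key, "=", value)
```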
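The header lookup itself is independent of Flair. As a rough sketch, here it is in Python over a dictionary of server variables; the function name client_ip and the dictionary shape are assumptions for illustration. Note that the forwarded-for value is supplied by the client or by proxies and should not be trusted for security decisions.

```python
def client_ip(environ):
    """Pick the source IP from HTTP_X_FORWARDED_FOR, falling back to REMOTE_ADDR."""
    forwarded = environ.get("HTTP_X_FORWARDED_FOR", "")
    if forwarded.strip():
        # The header may hold a comma-separated chain; the first entry is
        # the address reported for the original client.
        return forwarded.split(",")[0].strip()
    return environ.get("REMOTE_ADDR")


if __name__ == "__main__":
    print(client_ip({"HTTP_X_FORWARDED_FOR": "203.0.113.7, 10.0.0.1",
                     "REMOTE_ADDR": "10.0.0.1"}))
```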

Monday, August 25, 2014

In this post, we describe an application that does a time series inverted search on computer logs, each described by raw text, timestamp, host, source and source type. We call these events and store them in a table in a SQL database with a clustered index on the timestamp. The application is a .NET Web API application that connects to the table through Entity Framework or any other object-relational mapping library. It exposes one API to post events to the server and another to search over the events. Because of the clustered index, posted events are kept sorted by time. When a search is executed against the table, the time range is used to select the events, which are returned to the user as search results. Code is being written at http://github.com/ravibeta/csharpexamples/SplunkLite.Net
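The post and search paths can be sketched without the Web API layer. The following Python sketch uses an in-memory SQLite database rather than the .NET stack described above, so it is an illustration of the idea and not the code in the repository. The table and column names, the function names and the optional text filter are assumptions; SQLite has no clustered index, so an ordinary index on the timestamp stands in for it. Timestamps are kept as ISO 8601 strings, which sort in time order.

```python
import sqlite3


def open_store():
    """Create an in-memory events table indexed by timestamp."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        "id INTEGER PRIMARY KEY, timestamp TEXT NOT NULL, "
        "host TEXT, source TEXT, sourcetype TEXT, raw TEXT)"
    )
    # Stand-in for the clustered index on the timestamp.
    conn.execute("CREATE INDEX ix_events_timestamp ON events (timestamp)")
    return conn


def post_event(conn, timestamp, host, source, sourcetype, raw):
    """Store one event."""
    conn.execute(
        "INSERT INTO events (timestamp, host, source, sourcetype, raw) "
        "VALUES (?, ?, ?, ?, ?)",
        (timestamp, host, source, sourcetype, raw),
    )


def search(conn, start, end, term=None):
    """Return events with start <= timestamp < end, optionally filtered by text."""
    sql = ("SELECT timestamp, host, raw FROM events "
           "WHERE timestamp >= ? AND timestamp < ?")
    params = [start, end]
    if term is not None:
        sql += " AND raw LIKE ?"
        params.append(f"%{term}%")
    sql += " ORDER BY timestamp"
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    conn = open_store()
    post_event(conn, "2014-08-25T10:00:00", "web01", "/var/log/app.log",
               "app", "disk error on /dev/sda")
    post_event(conn, "2014-08-25T09:00:00", "web01", "/var/log/app.log",
               "app", "service started")
    for row in search(conn, "2014-08-25T00:00:00", "2014-08-26T00:00:00"):
        print(row)
```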

Sunday, August 24, 2014

In the previous post, we mentioned Markov chains and closed communicating classes. If we have a Markov chain with transition matrix P and fix a number of transitions n, then the Markov chain with transition matrix P^n has exactly the same closed communicating classes. Recall that a closed communicating class contains, by definition, every state that any of its states leads to, and that a single state can form such a class on its own. A general Markov chain can have an infinite number of communicating classes. This is something we see from sets and relations. When we visualize the topological structure of a Markov chain, we see that closed communicating classes have no outgoing arrows and that arrows between two non-closed communicating classes all point in the same direction. If we remove the non-closed communicating classes, we are left with disjoint closed communicating classes.
If we have an irreducible chain, where all the states form a single closed communicating class, we can further simplify the behavior of the chain by noticing that if the chain is in one set of states it can move to only a few other sets. For example, take a chain with four states arranged in a square in which every step moves to an adjacent corner: a pair of diagonally opposite states always transitions to the other pair. We say this chain has a period of two. Three states arranged in a triangle and visited in a fixed direction have a period of three. The numbers two and three are divisors of the possible return times. If it is possible to return to a state i in n transitions and n divides m, then it is also possible in m transitions. So whether a return to i in m steps is possible can be decided from the integer divisors of m. The period of a state i is defined as the greatest common divisor of all natural numbers n such that it is possible to return to i in n steps. A state i is called aperiodic if its period is one.
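The definition of the period translates directly into a small computation. Here is a sketch in Python that takes the greatest common divisor of the return times found within a fixed horizon. The function name period, the list-of-lists representation of P and the default horizon of three times the number of states are assumptions made for this sketch; only the positions of the positive entries of P matter.

```python
from math import gcd


def period(P, i, horizon=None):
    """GCD of all n <= horizon for which a return to state i in n steps is possible.

    P is a square transition matrix given as a list of rows. Returns 0 if
    no return to i is found within the horizon.
    """
    n_states = len(P)
    if horizon is None:
        horizon = 3 * n_states  # assumed to be long enough for a finite chain
    g = 0
    reachable = {i}  # states reachable from i in exactly 0 steps
    for n in range(1, horizon + 1):
        # Advance one step: states reachable from i in exactly n steps.
        reachable = {k for j in reachable
                     for k in range(n_states) if P[j][k] > 0}
        if i in reachable:
            g = gcd(g, n)
    return g


# Four states on a square, each step moving to one of the two neighbors.
SQUARE = [
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
]

# Three states on a triangle, visited in a fixed direction.
TRIANGLE = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
]

if __name__ == "__main__":
    print(period(SQUARE, 0), period(TRIANGLE, 0))
```

For the square chain the return times to state 0 are the even numbers, and for the triangle they are the multiples of three, which matches the periods described above.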

Saturday, August 23, 2014

The topological structure of a Markov chain can be represented as a digraph, denoted by a pair (V, E) with V being the vertices and E being the edges of the graph. The graph of a chain refers to the structure of the process that does not depend on the exact values of the transition probabilities but only on which of them are positive. We say that a state i leads to a state j if, starting from i, the chain visits j at some finite time with positive probability.
We denote this by i ~ j, and it is equivalent to stating that, for some n >= 0, the probability of being in state j after n steps starting from i is greater than zero. This relation is reflexive: with n = 0 the chain is already at i, so every state leads to itself.
This relation is also transitive. If we can reach state j from i and state k from j, then there is a possible chain from i to k.
If the relation holds in both directions, that is, there is a chain from i to j and from j to i, then we say that i communicates with j.
The communication relation is an equivalence relation, which means it is symmetric, reflexive and transitive. Its equivalence classes, also called communicating classes, partition the state space. The communicating class corresponding to state i is, by definition, the set of all states that communicate with i.
Two communicating classes are either identical or completely disjoint.
A communicating class is said to be closed when none of its states leads to a state outside the class. Closed communicating classes are particularly important because they decompose the chain into smaller, more manageable parts. A state i is essential if it belongs to a closed communicating class. Otherwise the state is inessential. If all the states communicate with all others, then the chain is considered irreducible.
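Since these notions depend only on which transition probabilities are positive, they can be computed from the digraph alone. Below is a sketch in Python that builds the "leads to" relation as a transitive closure and then groups states into communicating classes, marking each class as closed or not. The function names and the example matrix are assumptions made for illustration, and the chain is assumed to have finitely many states.

```python
def leads_to(P):
    """reach[i][j] is True when state i leads to state j (in zero or more steps)."""
    n = len(P)
    # Zero steps gives reflexivity; one step needs a positive probability.
    reach = [[i == j or P[i][j] > 0 for j in range(n)] for i in range(n)]
    # Warshall's transitive closure.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = True
    return reach


def communicating_classes(P):
    """Return a list of (sorted states of the class, is_closed) pairs."""
    reach = leads_to(P)
    n = len(P)
    seen = set()
    classes = []
    for i in range(n):
        if i in seen:
            continue
        # States that communicate with i: reachable in both directions.
        cls = {j for j in range(n) if reach[i][j] and reach[j][i]}
        seen |= cls
        # Closed: no state of the class leads outside the class.
        closed = all(j in cls
                     for s in cls for j in range(n) if reach[s][j])
        classes.append((sorted(cls), closed))
    return classes


# State 0 can stay or move to 1; states 1 and 2 alternate forever.
EXAMPLE = [
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]

if __name__ == "__main__":
    print(communicating_classes(EXAMPLE))
```

In the example, state 0 is inessential because its class leads to state 1, while states 1 and 2 are essential. A chain is irreducible exactly when the function returns a single class.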

Friday, August 22, 2014

Today we will continue to discuss the Skorokhod embedding theorem. Recall that the theorem describes a stopping time at which the walk takes each value with the probability given by the distribution. The simpler case was the symmetric random walk stopped between two fixed levels, but the Skorokhod embedding theorem extends this to a whole distribution. Moreover, we use it to describe a random walk through a new random variable that takes the value at which the walk stops, either a level a below the start or a level b above it.
Today we will continue to discuss random walks. In the previous post, we were proving the Skorokhod embedding theorem. We introduced a random variable Z for a simple random walk that starts between the values taken by the random variables A and B and is stopped the first time it reaches either of them, assuming that the walk is independent of both A and B. The claim was that Z takes the value k with the probability pk from the distribution, which gives the theorem: at the stopping time, the walk has the given distribution. To prove the claim, we first look at k = 0: Z = 0 if and only if A = 0 and B = 0, and this has probability p0 by definition. Then we take k > 0 and work out the probability that Z takes the value k. We first expand this probability as the sum, over all i < 0 and j > 0, of the probability that the walk stopped at i or j ends at k times the probability that A takes the value i and B takes the value j; the product form uses independence. The walk can end at k > 0 only if j = k, so we can eliminate the sum over j and let B take the value k. Then we apply the result from the gambler's ruin: the probability that a walk started at x reaches b > x before reaching a < x equals (x - a) / (b - a). For a walk started at 0 and stopped at Ta ^ Tb, the probability of ending at level b is therefore -a / (b - a), or -i / (k - i) in this case. We multiply this by the probability of the pair (i, k), which is the normalization factor times (k - i) pi pk, so the factor (k - i) cancels. What remains is the normalization factor times pk times the sum of -i pi over all i < 0. By the zero-mean condition this sum is exactly the inverse of the normalization factor, so the expression simplifies to pk, thus proving the theorem.
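In symbols, the computation for k > 0 can be written as follows. The notation is an assumption introduced here for readability: c^{-1} stands for the normalization factor, with c equal to the sum of -i pi over i < 0, and T_z stands for the first time the walk S_n hits level z.

```latex
\begin{align*}
P(Z = k)
  &= \sum_{i<0} \sum_{j>0} P\bigl(S_{T_i \wedge T_j} = k\bigr)\, P(A = i,\, B = j) \\
  &= \sum_{i<0} P\bigl(S_{T_i \wedge T_k} = k\bigr)\, P(A = i,\, B = k) \\
  &= \sum_{i<0} \frac{-i}{k - i}\; c^{-1} (k - i)\, p_i\, p_k \\
  &= c^{-1}\, p_k \sum_{i<0} (-i)\, p_i
   \;=\; c^{-1}\, p_k\, c \;=\; p_k .
\end{align*}
```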

Thursday, August 21, 2014

In the previous post, we mentioned the Skorokhod embedding theorem. This theorem says that, for a simple symmetric random walk Sn and a zero-mean distribution pk on the integers, there is a stopping time T at which the walk takes the value k with probability pk. In this post, we try to prove it. We define a pair of random variables (A, B) that takes the value (0, 0) or a pair of integers (i, j) with i < 0 < j, with the following distribution:
the state with A = 0 and B = 0 has probability p0, and
any other state with A = i and B = j has probability equal to a normalization factor times (j - i) times the product pi pj of the probabilities for i and j from the distribution.
Summing over all the other states must give a total probability of 1 - p0. In this sum we can split the factor (j - i) into two separate terms.
The zero-mean equation says that the sum of -i pi over all i < 0 equals the sum of j pj over all j > 0. Applying this to the two terms, summed over all i < 0 < j, we get that the normalization factor is the inverse of this common value: the sum of i pi over all i > 0.
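Written out, the normalization step looks like this. The symbol c for the common value, and c^{-1} for the normalization factor, are names introduced here for convenience.

```latex
\begin{align*}
c &:= \sum_{j>0} j\, p_j \;=\; \sum_{i<0} (-i)\, p_i
   \qquad \text{(zero mean: } \textstyle\sum_k k\, p_k = 0\text{)} \\[4pt]
\sum_{i<0} \sum_{j>0} (j - i)\, p_i\, p_j
  &= \sum_{j>0} j\, p_j \sum_{i<0} p_i \;+\; \sum_{i<0} (-i)\, p_i \sum_{j>0} p_j \\
  &= c \Bigl( \sum_{i<0} p_i + \sum_{j>0} p_j \Bigr)
   \;=\; c\, (1 - p_0) .
\end{align*}
```

Multiplying by c^{-1} gives a total of 1 - p0 for the states other than (0, 0), as required.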
Let Tz denote the first time n at which Sn = z (the infimum of such n), and assume (A, B) to be independent of the walk Sn. We can then define a new random variable Z as the value of the random walk at the time TA ^ TB, that is, when it first reaches A or B, whichever comes first.
The claim is that the probability for this random variable to take a value k is the corresponding probability pk from the distribution. For k = 0, Z = 0 has probability p0. For k > 0, we can use the theorem that computes the probability of the walk reaching the upper level before the lower one, and we see that the probability of Z = k has the value pk.
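The claim can be checked exactly for a small distribution. The sketch below, in Python with exact fractions, builds the distribution of (A, B) as defined above, uses the probability -i / (j - i) that a symmetric walk started at 0 reaches j > 0 before i < 0, and adds up the resulting distribution of Z. The function name and the example distribution are assumptions made for illustration; the input is assumed to be a dictionary of probabilities that sum to one and have zero mean. Negative values of k are handled by the symmetric argument.

```python
from fractions import Fraction


def embedding_distribution(p):
    """Distribution of Z for a zero-mean distribution p (dict: int -> Fraction)."""
    # Common value c: the sum of j * p_j over j > 0.
    c = sum(j * pj for j, pj in p.items() if j > 0)
    z = {0: p.get(0, Fraction(0))}  # Z = 0 exactly when (A, B) = (0, 0)
    for i, pi in p.items():
        if i >= 0:
            continue
        for j, pj in p.items():
            if j <= 0:
                continue
            weight = (j - i) * pi * pj / c   # P(A = i, B = j)
            hit_upper = Fraction(-i, j - i)  # walk from 0 reaches j before i
            z[j] = z.get(j, 0) + weight * hit_upper
            z[i] = z.get(i, 0) + weight * (1 - hit_upper)
    return z


# A made-up zero-mean distribution: -2 * 1/4 + 0 * 1/4 + 1 * 1/2 = 0.
EXAMPLE_P = {-2: Fraction(1, 4), 0: Fraction(1, 4), 1: Fraction(1, 2)}

if __name__ == "__main__":
    print(embedding_distribution(EXAMPLE_P) == EXAMPLE_P)
```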