Saturday, September 13, 2014

In the previous post, we discussed the time series algorithm. We saw how the ART model is constructed and how we apply it to predict the next one-step transition. Note that the ART tree is created on the values of the random variable. To fit the linear regression for a restricted data set, we determine the values of the random variable from the length-p transformations of the time series data set.
For a given time series data set, nine corresponding data sets are created, one for each length-p transformation with p varying from zero to eight. Each of these transformed data sets is centered and standardized before modeling; that is, for each variable we subtract the mean and divide by the standard deviation. We then divide the data into a training set, used as input to the learning method, and a holdout set used to evaluate the model. The holdout set contains the cases corresponding to the last five observations in the sequence.
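As a rough sketch of this preparation step (the method and variable names below are illustrative and assumed, not the actual implementation), a length-p transformation turns the series into cases of p lagged values followed by the current value; each column is then centered and standardized, and the last few cases are held out:

using System;
using System.Collections.Generic;
using System.Linq;

// illustrative sketch: build the length-p cases, standardize each column,
// and hold out the cases for the last few (here five) observations
void PrepareLengthP(double[] series, int p, int holdout,
                    out double[][] train, out double[][] test)
{
    // each case: p lagged values followed by the current value
    var cases = new List<double[]>();
    for (int t = p; t < series.Length; t++)
    {
        var row = new double[p + 1];
        for (int j = 0; j <= p; j++)
            row[j] = series[t - p + j];
        cases.Add(row);
    }

    // center and standardize every column: subtract the mean, divide by the standard deviation
    for (int j = 0; j <= p; j++)
    {
        double mean = cases.Average(r => r[j]);
        double sd = Math.Sqrt(cases.Average(r => (r[j] - mean) * (r[j] - mean)));
        foreach (var r in cases)
            r[j] = sd > 0 ? (r[j] - mean) / sd : 0.0;
    }

    // the last holdout cases evaluate the model; the rest train it
    train = cases.Take(cases.Count - holdout).ToArray();
    test = cases.Skip(cases.Count - holdout).ToArray();
}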

Central to the step of fitting the linear regression is the notion of covariance stationarity. By that we mean:
the mean does not depend on t
the variance (equivalently, the standard deviation) does not depend on t
the covariance Cov(Yt, Yt-j) exists, is finite, and does not depend on t
This last quantity is called the jth-order autocovariance. The jth-order autocorrelation is the autocovariance divided by the variance (the square of the standard deviation).

The autocovariance measures the direction of the linear dependence between Yt and Yt-j, while the autocorrelation measures both the direction and the strength of that linear dependence.
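To make these definitions concrete, the sample jth-order autocovariance and autocorrelation of a series can be computed as below (a small sketch with illustrative names, not part of the ART implementation):

using System;
using System.Linq;

// sample jth-order autocovariance: average product of deviations j apart
double Autocovariance(double[] y, int j)
{
    double mean = y.Average();
    double sum = 0.0;
    for (int t = j; t < y.Length; t++)
        sum += (y[t] - mean) * (y[t - j] - mean);
    return sum / y.Length;
}

// autocorrelation: autocovariance divided by the variance (the 0th-order autocovariance)
double Autocorrelation(double[] y, int j)
{
    return Autocovariance(y, j) / Autocovariance(y, 0);
}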
An autoregressive process is one in which the time dependence decays to zero as the random variables in the process get farther and farther apart. A first-order autoregressive, AR(1), process has the following properties (checked empirically in the sketch further below):
E(Yt) = mu, the mean
Var(Yt) = sigma squared
Cov(Yt, Yt-1) = sigma squared · phi
Cor(Yt, Yt-1) = phi


Courtesy: Time Series Concepts UW.
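These properties can be checked empirically. As a quick sketch (my own illustration, not from the UW notes), the code below simulates a first-order autoregressive series Y(t) = phi · Y(t-1) + e(t); combined with the Autocorrelation helper above, the lag-1 autocorrelation of a long simulated series should come out close to phi:

using System;

// simulate an AR(1) process: Y(t) = phi * Y(t-1) + e(t), with e(t) standard normal
double[] SimulateAR1(double phi, int n)
{
    var rnd = new Random(42);
    var y = new double[n];
    for (int t = 1; t < n; t++)
    {
        // Box-Muller transform for a standard normal shock
        double u1 = 1.0 - rnd.NextDouble();
        double u2 = rnd.NextDouble();
        double e = Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
        y[t] = phi * y[t - 1] + e;
    }
    return y;
}

// e.g. Autocorrelation(SimulateAR1(0.6, 10000), 1) should be roughly 0.6,
// matching Cor(Yt, Yt-1) = phi above.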
There are other curve-fitting techniques, notably non-linear curve fitting, that could be helpful in predicting the one-step transition. Such techniques could involve finding the R value and/or the chi-square value of the fit. Perhaps a numerical approach to predicting the time series could be to use a Fast Fourier Transform (FFT).
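As a hedged illustration of that Fourier idea (using a naive O(n squared) DFT in place of a true FFT, and only as a sketch rather than anything from the ART model), one could pick out the dominant frequency in the series and extend that single sinusoid one step ahead:

using System;

// naive O(n^2) DFT used as a stand-in for an FFT: find the dominant
// frequency in the mean-removed series and evaluate that single sinusoid
// one step past the data (t = n); purely illustrative
double PredictNextByDominantFrequency(double[] y)
{
    int n = y.Length;
    double mean = 0;
    foreach (var v in y) mean += v;
    mean /= n;

    int bestK = 1;
    double bestMag = -1, bestRe = 0, bestIm = 0;
    for (int k = 1; k <= (n - 1) / 2; k++)   // skip k = 0, which is just the mean
    {
        double re = 0, im = 0;
        for (int t = 0; t < n; t++)
        {
            double angle = 2.0 * Math.PI * k * t / n;
            re += (y[t] - mean) * Math.Cos(angle);
            im -= (y[t] - mean) * Math.Sin(angle);
        }
        double mag = Math.Sqrt(re * re + im * im);
        if (mag > bestMag) { bestMag = mag; bestRe = re; bestIm = im; bestK = k; }
    }

    // amplitude and phase of the dominant component, evaluated at t = n
    double amp = 2.0 * bestMag / n;
    double phase = Math.Atan2(bestIm, bestRe);
    return mean + amp * Math.Cos(2.0 * Math.PI * bestK * n / n + phase);
}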

By the way, somebody surprised me with a question to print the Fibonacci series. I hadn't heard it in a long while, and yes, there is a recursive solution, but this iterative one works too:

int Fib(int index)
{
    if (index < 1) throw new Exception();
    int lastToLast = 1;   // Fib(1)
    int last = 1;         // Fib(2)
    if (index == 1) return lastToLast;
    if (index == 2) return last;
    int sum = 0;
    for (int i = 3; i <= index; i++)   // start at the third term
    {
        sum = last + lastToLast;
        lastToLast = last;
        last = sum;
    }
    return sum;
}

int Fib(int index)
{
    if (index < 1) throw new Exception();
    if (index == 1) return 1;
    if (index == 2) return 1;
    return Fib(index - 1) + Fib(index - 2);
}
We will next discuss OpenStack and try out a sample with Rackspace.
