Cluster computing

Saturday, March 10, 2018

Get Max product from an integer array
public int maxProduct(List<int> input)
{
int lo = 1;
int hi = 1;
int result = 1;
for (int i = 0; i < input.Count; i++)
{
if (input[i] == 0)
{
lo = 1;
hi = 1;
}
else{
int temp = hi * input[i];
if (input[i] > 0)
{
hi = temp;
lo = Math.Min(lo * input[i], 1);
}
else
{
lo = temp;
hi = Math.Max(lo * input[i], 1);
}
}
if (result < hi)
result = hi;
}
return result;
}
This is for contiguous subsequence.
We could also write the above if else in dynamic programming with recursion to include either the current element or exclude it. If an element is not included we multiply 1. If it is included we multiply the element to the maximum of the recursion found as for example in positive integer array
int getMaxProduct(List<int> A, int index)
{
assert (A.all (x=> x>=0));
if (index == A.Length)
return 1;
int remaining = GetMaxProduct (A, index +1);
return Math.max (A [index] * remaining, 1 * remaining) ;
}
which is the same as saying replace all 0 with 1 in the sorted positive integer array
Therefore we can even sort the integer array replace all 0 with 1 and trim the odd count negative number to the left of zero
double GetMaxProduct (List <int> A)
{
A.sort ();
A.replaceAll(0, 1);
int count = A.Count (x => x < 0);
if (count > 0 && count % 2 == 1)
{
A.replace (A.min (x => x < 0), 1);
}
return A.Product ();
}

Friday, March 9, 2018

Thursday, March 8, 2018

We were discussing Signature verification methods. We reviewed the stages involved with Signature verification. Then we also enumerated the feature extraction techniques. After that, we compared online and offline verification techniques. We also discussed the limitations of image processing and the adaptations for video processing.Then we proceeded to discussing image embedding in general.
We discussed Convolutional Neural Network (CNN) in image and object embedding in a shared space. We followed this discussion with shape signature and image retrieval.
The purpose of the embedding is to map an image to a point in the embedding space so that it is close to a point attributed to a 3D model of a similar object. A large amount of training data consisting of images synthesized from 3D shapes is used to train the CNN.
We now discuss shape retrieval. The methods for shape retrieval mostly operate on input query in 3D domain. However, CNN proposes a new way of searching both images and shapes by placing them in a shared embedding space. This sharing enables comparision between images and shapes easily since the shapes and images that are similar are co-located in the same neighborhood.
This is a very sought after discipline because it has appeal for many real world applications. Aubry et al proposed a part-based method that focuses on detecting image regions that match parts in 3D models. A star model is then designed to combine discriminating patches for measuring the similarity. This is helpful for matching shapes from arbitrary images. However this method requires tuning that may change from domain to domain. Instead, the CNN network provides a uniform shared embedding space where image feature extraction is based on shape distance metric.

We discussed the use of a distance metric and clustering techniques but there does not seem any equivalents for softmax in data mining. Neural net algorithms might be implemented with R-package but their use in native data mining techniques seems limited. clustering and reverse clustering could be combined for softmax but this May may be more tedious than inlined neural net method.

#codingexercise

We talked about exponential and normal distribution. Let us state the uniform distribution in the interval range a to b

double GetUniformDistribution(double a, double b, double x)

{
Debug.Assert(b-a > 0);

if (a <= x && x <=b)

{

return 1/(b-a);

}

return 0;

}
#codingexercise
int GetFirstIndexInListWithStepSizeAtMostK(List<int> A, int x, int k)
{
int i = 0;
while (i < A.Length)
{
if (A[i] == x)
return i;
i = i + Math.Max(1, Math.Abs(A[i]-x)/k);
}
return -1;
}

Wednesday, March 7, 2018

We were discussing Signature verification methods. We reviewed the stages involved with Signature verification. Then we also enumerated the feature extraction techniques. After that, we compared online and offline verification techniques. We also discussed the limitations of image processing and the adaptations for video processing.Then we proceeded to discussing image embedding in general.
We discussed Convolutional Neural Network (CNN) in image and object embedding in a shared space. Then we mentioned shape signature.
The purpose of the embedding is to map an image to a point in the embedding space so that it is close to a point attributed to a 3D model of a similar object. A large amount of training data consisting of images synthesized from 3D shapes is used to train the CNN.
Each shape is given a signature. This makes it easy to perform shape matching, organization and retrieval. The shape signatures are computed using Light Field Descriptor. By re-using projections or views, the shape signature computation does not need any generalization.
Next we read about image retrieval in this framework.
The content-based image retrieval involves searching for an image that is similar to the query. Similarity can be low-level such as color and texture or high-level such as latent objects. In this scheme, both images and shapes are embedded into the same space.This space then leverages the robust distance metric for 3D shapes and trains a CNN for purifying and mapping real world images into this pre-built embedding space. Since the space is constructed already, there is a separation between that process and the image embedding process so that this becomes tractable. Moreover some synthesized images with annotations are used thereby eliminating the need for manual work in gaining annotations.

Object embedding in images used neural net. word embedding in documents used neural net As such we don't yet have a technique that determines cohesive and adhesive properties other than vector space clustering. Is it possible to transform the embedding into a data mining problem ?

#codingexercise
How do you get the z-score in a normal probability distribution ?
double GetZScore(double X, double mu, Double sigma)
{
Debug.assert(sigma != 0);
return (X-mu)/sigma;
}

A step array is an array of integer where each element has a difference of atmost k with its neighbor. Given a key x, we need to find the index value of k if multiple element exist return the index of any occurrence of key.
Input : arr[] = {4, 5, 6, 7, 6}
k = 1
x = 6
Output : 2

int GetIndex(List<int> A, int x, int k)
{
int i = 0;
while (i < A.Length)
{
if (A[i] == x)
return i;
i = i + Math.Max(1, Math.Abs(A[i]-x)/k);
}
return -1;
}
If we were to return only the first element, we could scan the elements at positions before the index from where we exit the loop.

Tuesday, March 6, 2018

We were discussing Signature verification methods. We reviewed the stages involved with Signature verification. Then we also enumerated the feature extraction techniques. After that, we compared online and offline verification techniques. We also discussed the limitations of image processing and the adaptations for video processing.Then we proceeded to discussing image embedding in general.
Today we continue our discussion of Convolutional Neural Network (CNN) in image and object embedding in a shared space.
The purpose of the embedding is to map an image to a point in the embedding space so that it is close to a point attributed to a 3D model of a similar object. A large amount of training data consisting of images synthesized from 3D shapes is used to train the CNN.
In order to play up the latent objects in images, similar objects were often used in dissimilar images. The dissimilarity was based on viewpoint, lighting, background differences, partial hiding and so on. 3D object representations do not suffer from these dissimilarities and are therefore much easier to establish similarity.
Each shape is given a signature. This makes it easy to perform shape matching, organization and retrieval. The notion comes from computer vision and computer graphics. There shape signatures have been associated with geometric properties of the shape such as volume, distance and curvature or 2D projections. They sometimes include graph representations to get a topographical distribution of the shape. In CNN, shape signatures are computed using Light Field Descriptor. In this approach, shapes are indexed via a set of 2D projections or views. This depends directly on the views. A view based distance metric can then be derived from these metrics. The shape signature for each shape then becomes its embedding point in the shared embedding space. By re-using projections or views, the shape signature computation does not need any generalization. Instead its embedding into the robust space is utilized.

#codingexercise
How much time do we need to wait before a given event occurs.
This question is usually answered in the form of exponential probability distribution. The unknown time is represented by a random variable.
If the probability of the event happening in a given interval is proportional to the length of the interval then we have an exponential diatribution.
double GetProbabilityDistribution (double lambda, double x)
{
if (inclusive (x) == false)
return 0;
return lambda *.math.pow (e, - lambda*x);
}

Monday, March 5, 2018

We were discussing Signature verification methods. We reviewed the stages involved with Signature verification. Then we also enumerated the feature extraction techniques. After that, we compared online and offline verification techniques. We also discussed the limitations of image processing and the adaptations for video processing.Then we proceeded to discussing image embedding in general.
Today we continue our discussion of Convolutional Neural Network (CNN) in image and object embedding in a shared space.
The purpose of the embedding is to map an image to a point in the embedding space so that it is close to a point attributed to a 3D model of a similar object. A large amount of training data consisting of images synthesized from 3D shapes is used to train the CNN.
In order to play up the latent objects in images, similar objects were often used in dissimilar images. The dissimilarity was based on viewpoint, lighting, background differences, partial hiding and so on. 3D object representations do not suffer from these dissimilarities and are therefore much easier to establish similarity.
Images and 3D shapes share the embedded space. In that space, both can be measured as if their 3D form was directly available. This embedded space is plotted with each object given a co-ordinate comprising of the dimension reduced form of the distances between the underlying shape and the entire set. This way two neighboring points in the embedded space are likely to represent similar shapes, as they agree to a similar extent, on their similarities with all the other shapes.

#codingexercise
A step array is an array of integer where each element has a difference of atmost k with its neighbor. Given a key x, we need to find the index value of k if multiple element exist return the index of any occurrence of key.
Input : arr[] = {4, 5, 6, 7, 6}
k = 1
x = 6
Output : 2

int GetIndex(List<int> A, int x, int k)
{
int i = 0;
while (i < A.Length)
{
if (A[i] == x)
return i;
i = i + Math.Max(1, Math.Abs(A[i]-x)/k);
}
return -1;
}

Sunday, March 4, 2018

We were discussing Signature verification methods. We reviewed the stages involved with Signature verification. Then we also enumerated the feature extraction techniques. After that, we compared online and offline verification techniques. We also discussed the limitations of image processing and the adaptations for video processing. Today let us discuss image embedding in general.
In this case, we discuss CNN Image Purification techniques from the Li et al paper on Joint Embeddings. CNN stands for Convolutional Neural Network.It purifies images by muting distracting factor The purpose of the embedding is to map an image to a point in the embedding space so that it is close to a point attributed to a 3D model of a similar object. A large amount of training data consisting of images synthesized from 3D shapes is used to train the CNN.
In order to play up the latent objects in images, similar objects were often used in dissimilar images. The dissimilarity was based on viewpoint, lighting, background differences, partial hiding and so on. 3D object representations do not suffer from these dissimilarities and are therefore much easier to establish similarity.
Images and 3D shapes share the embedded space. In that space, both can be measured as if their 3D form was directly available. This embedded space is plotted with each object given a co-ordinate comprising of the dimension reduced form of the distances between the underlying shape and the entire set. This way two neighboring points in the embedded space are likely to represent similar shapes, as they agree to a similar extent, on their similarities with all the other shapes.
A set of ground truth co-ordinates in the embedding space is required. A large amount of images to train the CNN is also required. Another alternative for obtaining the necessary links between images and their embeddings is to manually link images to similar 3D models. But since this is time-consuming and error prone, the image training set is synthesized based on rendering a rather modest set of annotated shapes from the ShapeNet
This technique of embedding is a novel attempt to correlate 3D shapes and 2D images. The deep embedding is capable of purifying these images and interlinking the two domains based on their shared object content This linking then supports querying that was not possible before.
#codingexercise
Minimum removals from array to make max – min <= K
A.sort();
int count = 0;
int i = 0;
int j = A.Length -1;
int GetRemovals(List<int>sorted, int K, int i, int j)
{
if (i >= j) return 0;
if (A[j]-A[i]) <= K) return 0;
return 1 + Math.min(GetRemovals(sorted, K, i+1, j),
GetRemovals(sorted, K, i, j -1));
}
if (count >= A.Length-1) return -1;
1 4 6 8
N = 4
K = 5