Monday, March 26, 2018

We continue discussing the use of a Convolutional Neural Network (CNN) for image and object embedding in a shared space, shape signatures and image retrieval.
We were discussing the Euclidean distance and the distance matrix. As we know, with Euclidean distance the chi-square measure, which is the sum of squares of errors, gives a good indication of how close the objects are to the mean. It is therefore a measure of goodness of fit. The principle is equally applicable to the embedding space: by using a notion of error, we can make sure that the shapes and images embedded in the space do not violate the pairwise distances between members. The only variation the authors applied to this measure is the use of the Sammon error instead of the chi-square, because it encourages the preservation of the structure of local neighborhoods during embedding. The joint embedding space is a Euclidean space of lower dimension, while the shapes and the images are represented in the original high-dimensional space.
The Sammon error is a weighted sum of the squared differences between the original pairwise distances and the pairwise distances in the embedding, with each term weighted by the inverse of the original distance. Pairs of dissimilar shapes have large original distances, so their terms are weighted down and local neighborhoods are preserved.
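As a concrete illustration, here is a minimal sketch of the Sammon error in Python with NumPy; this is illustrative code under my own simplifying assumptions, not the authors' implementation.

import numpy as np

def sammon_error(D, d, eps=1e-12):
    # D: original pairwise distances (n x n); d: pairwise distances in the embedding (n x n).
    # Each squared difference is weighted by 1 / D_ij, so pairs that were far apart
    # (dissimilar shapes) contribute less, which preserves local neighborhoods.
    iu = np.triu_indices_from(D, k=1)            # use each pair only once
    orig, emb = D[iu], d[iu]
    return np.sum((orig - emb) ** 2 / (orig + eps)) / (orig.sum() + eps)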
The embedding of shapes and images proceeds by minimizing the Sammon error using non-linear multi-dimensional scaling (MDS). MDS is a means of visualizing the level of similarity of individual cases in a dataset: it places the items in an N-dimensional space so that the pairwise distances between members are preserved as well as possible.
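As a rough sketch of this step, scikit-learn's MDS can embed a precomputed distance matrix into a low-dimensional space. Note that it minimizes a standard stress function rather than the Sammon error, so it is only a stand-in for the authors' procedure, and the descriptors below are random placeholders.

import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 128))                               # stand-in high-dimensional shape descriptors
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)   # pairwise distance matrix

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
Y = mds.fit_transform(D)                                     # one low-dimensional point per shape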
We saw how the embedding space is created. Mapping new shapes takes slightly more effort. The space was originally constructed by jointly embedding a set of 3D shapes. Introducing a new shape requires us to find an embedding point for it. The steps for this include:
First, a feature vector is computed.
Second, pairwise distances between the new shape and the already embedded shapes are computed.
Third, we minimize the Sammon error, but this time applying the Liu-Nocedal method, a large-scale optimization method that combines BFGS steps and conjugate-direction steps. BFGS is an iterative method for solving unconstrained non-linear optimization problems. A sketch of this placement step follows.
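A minimal sketch of the placement step: it minimizes the Sammon error with respect to the new point only, using SciPy's L-BFGS-B routine (a limited-memory BFGS in the spirit of the Liu-Nocedal method) as a stand-in for the authors' optimizer; the inputs are simplified placeholders.

import numpy as np
from scipy.optimize import minimize

def place_new_shape(Y, D_new, y0=None, eps=1e-12):
    # Y: existing embedding points (n x m); D_new: original distances from the new
    # shape to each of the n already embedded shapes. Only the new point moves.
    m = Y.shape[1]
    y0 = np.zeros(m) if y0 is None else y0

    def sammon_for_point(y):
        d = np.linalg.norm(Y - y, axis=1)          # distances to the new point in the embedding
        return np.sum((D_new - d) ** 2 / (D_new + eps)) / (D_new.sum() + eps)

    res = minimize(sammon_for_point, y0, method="L-BFGS-B")
    return res.x                                   # coordinates of the new embedding point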
The 3D shapes in the embedding space carry abundant information to train the CNN and also to drive data generation. The shapes are represented as clean and complete meshes, which allows control and flexibility. Many images can be generated from the shapes using a rendering process; this is the image synthesis step. In the embedding space a shape is mapped to a point, so for each rendered image its association with a shape is automatically known. The collection of images and shapes forms the training data for the CNN.
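A hypothetical sketch of how such training pairs could be assembled; render here is a placeholder for whatever rendering pipeline produces an image of a shape under a given viewpoint, lighting and background, not a real library call.

def synthesize_training_pairs(shapes, embedding_points, viewpoints, lightings, backgrounds, render):
    # render is a hypothetical callable: render(shape, viewpoint, lighting, background) -> image.
    pairs = []
    for shape, point in zip(shapes, embedding_points):
        for view in viewpoints:
            for light in lightings:
                for bg in backgrounds:
                    image = render(shape, view, light, bg)
                    # The association of each synthesized image with its shape, and hence
                    # with the shape's embedding point, is known by construction.
                    pairs.append((image, point))
    return pairs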
CNN models can approximate high-dimensional and non-linear functions; recall that the feature vector has a large number of attributes and the Sammon error minimization objective is a non-linear function. A CNN can fit millions of parameters, so it can be precise and informative once it is trained on a large amount of data. If the data is not adequate, the CNN cannot learn enough latent information and the result is overfitting. When the images are generated with rich variation in lighting and viewpoint and superimposed on random backgrounds, the CNN has sufficient data. Approximately one million images are synthesized per category.
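A minimal PyTorch-style sketch of the idea of training a CNN to regress an image to its shape's point in the embedding space; the network, image size and embedding dimension here are my own simplifications, not the authors' architecture.

import torch
import torch.nn as nn

class EmbeddingCNN(nn.Module):
    def __init__(self, embed_dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, embed_dim)        # regress to embedding coordinates

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = EmbeddingCNN()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()                              # pull each image toward its shape's point

images = torch.randn(8, 3, 64, 64)                  # stand-ins for synthesized training images
targets = torch.randn(8, 128)                       # embedding points of the shapes they depict

loss = loss_fn(model(images), targets)
loss.backward()
optimizer.step()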

Sunday, March 25, 2018

We continue discussing the use of a Convolutional Neural Network (CNN) for image and object embedding in a shared space, shape signatures and image retrieval.
We were discussing the Euclidean distance and the distance matrix. As we know, with Euclidean distance the chi-square measure, which is the sum of squares of errors, gives a good indication of how close the objects are to the mean. It is therefore a measure of goodness of fit. The principle is equally applicable to the embedding space: by using a notion of error, we can make sure that the shapes and images embedded in the space do not violate the pairwise distances between members. The only variation the authors applied to this measure is the use of the Sammon error instead of the chi-square, because it encourages the preservation of the structure of local neighborhoods during embedding. The joint embedding space is a Euclidean space of lower dimension, while the shapes and the images are represented in the original high-dimensional space.
The Sammon error is a weighted sum of the squared differences between the original pairwise distances and the pairwise distances in the embedding, with each term weighted by the inverse of the original distance. Pairs of dissimilar shapes have large original distances, so their terms are weighted down and local neighborhoods are preserved.
The embedding of shapes and images proceeds by minimizing the Sammon error using non-linear multi-dimensional scaling (MDS). MDS is a means of visualizing the level of similarity of individual cases in a dataset: it places the items in an N-dimensional space so that the pairwise distances between members are preserved as well as possible.

We saw how the embedding space is created. Mapping new shapes takes slightly more effort. The space was originally constructed by jointly embedding a set of 3D shapes. Introducing a new shape requires us to find an embedding point for it. The steps for this include:
First, a feature vector is computed.
Second, pairwise distances between the new shape and the already embedded shapes are computed.
Third, we minimize the Sammon error, but this time applying the Liu-Nocedal method, a large-scale optimization method that combines BFGS steps and conjugate-direction steps. BFGS is an iterative method for solving unconstrained non-linear optimization problems.

#proposal for login screens ui enhancements:  https://1drv.ms/w/s!Ashlm-Nw-wnWtWrw5g03hYU5CBKL

Saturday, March 24, 2018

We continue discussing the use of a Convolutional Neural Network (CNN) for image and object embedding in a shared space, shape signatures and image retrieval.
We were discussing the Euclidean distance and the distance matrix. As we know, with Euclidean distance the chi-square measure, which is the sum of squares of errors, gives a good indication of how close the objects are to the mean. It is therefore a measure of goodness of fit. The principle is equally applicable to the embedding space: by using a notion of error, we can make sure that the shapes and images embedded in the space do not violate the pairwise distances between members. The only variation the authors applied to this measure is the use of the Sammon error instead of the chi-square, because it encourages the preservation of the structure of local neighborhoods during embedding. The joint embedding space is a Euclidean space of lower dimension, while the shapes and the images are represented in the original high-dimensional space.
The Sammon error is a weighted sum of the squared differences between the original pairwise distances and the pairwise distances in the embedding, with each term weighted by the inverse of the original distance. Pairs of dissimilar shapes have large original distances, so their terms are weighted down and local neighborhoods are preserved.
The embedding of shapes and images proceeds by minimizing the Sammon error using non-linear multi-dimensional scaling (MDS). MDS is a means of visualizing the level of similarity of individual cases in a dataset: it places the items in an N-dimensional space so that the pairwise distances between members are preserved as well as possible. The goal of MDS is to obtain the coordinate matrix. It uses the notion that the coordinate matrix can easily be formed from the eigenvalue decomposition of the scalar product matrix B. The steps of a classical MDS algorithm are as follows (a minimal sketch follows the list):
1. Form the pairwise distance matrix that gives the proximity between each pair of items
2. Apply double centering to the squared proximities to get matrix B, i.e. B = -1/2 * J * D^2 * J where J = I - (1/n) * 1 1^T is the centering matrix
3. Extract the m largest positive eigenvalues and the corresponding m eigenvectors
4. Form the coordinate matrix from the eigenvectors and the diagonal matrix of the square roots of the eigenvalues
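Here is a minimal NumPy sketch of these four steps; it is illustrative code rather than the authors' implementation.

import numpy as np

def classical_mds(D, m=2):
    # Step 1: D is the (n x n) pairwise distance matrix.
    n = D.shape[0]
    # Step 2: double centering of the squared proximities, B = -1/2 * J * D^2 * J.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    # Step 3: take the m largest positive eigenvalues and their eigenvectors.
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:m]
    L, E = vals[order], vecs[:, order]
    # Step 4: coordinate matrix X = E_m * sqrt(Lambda_m).
    return E * np.sqrt(np.maximum(L, 0.0))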

#proposal for login screens ui enhancements:  https://1drv.ms/w/s!Ashlm-Nw-wnWtWrw5g03hYU5CBKL

Friday, March 23, 2018

We continue discussing the use of a Convolutional Neural Network (CNN) for image and object embedding in a shared space, shape signatures and image retrieval.
A CNN has the ability to separate an image into various layers of abstraction while capturing different features and elements. This allows the CNN to be utilized for different learning tasks where the tasks may differ in the focus they require. It is this adaptive ability of the CNN that is leveraged for joint embedding. The CNN is first trained to map an image depicting an object similar to a shape to a corresponding point in the embedding space, such that the point for the image lies close to the point for the shape. During this training, the CNN discovers the latent connection that exists between an image and the object it features. Then, when a test image is presented, the latent connection helps place that image in the embedding space close to the object it features.
Moreover, a CNN can generalize across different tasks. This makes it useful to repurpose a well-trained network. Since it learns from a high-dimensional space, a CNN can differentiate even similar images for a variety of tasks.
We were discussing the Euclidean distance and the distance matrix. As we know, with Euclidean distance the chi-square measure, which is the sum of squares of errors, gives a good indication of how close the objects are to the mean. It is therefore a measure of goodness of fit. The principle is equally applicable to the embedding space: by using a notion of error, we can make sure that the shapes and images embedded in the space do not violate the pairwise distances between members. The only variation the authors applied to this measure is the use of the Sammon error instead of the chi-square, because it encourages the preservation of the structure of local neighborhoods during embedding. The joint embedding space is a Euclidean space of lower dimension, while the shapes and the images are represented in the original high-dimensional space.
The Sammon error is a weighted sum of the squared differences between the original pairwise distances and the pairwise distances in the embedding, with each term weighted by the inverse of the original distance. Pairs of dissimilar shapes have large original distances, so their terms are weighted down and local neighborhoods are preserved.
The embedding of shapes and images proceeds by minimizing the Sammon error using non-linear multi-dimensional scaling (MDS). MDS is a means of visualizing the level of similarity of individual cases in a dataset: it places the items in an N-dimensional space so that the pairwise distances between members are preserved as well as possible. The goal of MDS is to obtain the coordinate matrix. It uses the notion that the coordinate matrix can easily be formed from the eigenvalue decomposition of the scalar product matrix B. The steps of a classical MDS algorithm are as follows:
1. Form the pairwise distance matrix that gives the proximity between each pair of items
2. Apply double centering to the squared proximities to get matrix B, i.e. B = -1/2 * J * D^2 * J where J = I - (1/n) * 1 1^T is the centering matrix
3. Extract the m largest positive eigenvalues and the corresponding m eigenvectors
4. Form the coordinate matrix from the eigenvectors and the diagonal matrix of the square roots of the eigenvalues

#proposal for login screens ui enhancements:  https://1drv.ms/w/s!Ashlm-Nw-wnWtWrw5g03hYU5CBKL

Thursday, March 22, 2018

We continue discussing the use of a Convolutional Neural Network (CNN) for image and object embedding in a shared space, shape signatures and image retrieval.
The 2D distance matrix formed from word embeddings in text documents, which is dimensionality-reduced and classified using the softmax function, is similarly put to use with the distance matrix between 3D models, although the feature vector, distance calculation, algorithm and error function are different. Neural nets are applied to embedding in both text documents and images.
A CNN has the ability to separate an image into various layers of abstraction while capturing different features and elements. This allows the CNN to be utilized for different learning tasks where the tasks may differ in the focus they require. It is this adaptive ability of the CNN that is leveraged for joint embedding. The CNN is first trained to map an image depicting an object similar to a shape to a corresponding point in the embedding space, such that the point for the image lies close to the point for the shape. During this training, the CNN discovers the latent connection that exists between an image and the object it features. Then, when a test image is presented, the latent connection helps place that image in the embedding space close to the object it features.
Moreover, a CNN can generalize across different tasks. This makes it useful to repurpose a well-trained network. Since it learns from a high-dimensional space, a CNN can differentiate even similar images for a variety of tasks.

As we know, with Euclidean distance the chi-square measure, which is the sum of squares of errors, gives a good indication of how close the objects are to the mean. It is therefore a measure of goodness of fit. The principle is equally applicable to the embedding space: by using a notion of error, we can make sure that the shapes and images embedded in the space do not violate the pairwise distances between members. The only variation the authors applied to this measure is the use of the Sammon error instead of the chi-square, because it encourages the preservation of the structure of local neighborhoods during embedding. The joint embedding space is a Euclidean space of lower dimension, while the shapes and the images are represented in the original high-dimensional space.

#proposal for login screens:  https://1drv.ms/w/s!Ashlm-Nw-wnWtWiX5uxOG6zc4a8K
Thumbnail images instead of literals can also enhance the login screens. Avatars are an example of this.

Wednesday, March 21, 2018

We continue discussing the use of a Convolutional Neural Network (CNN) for image and object embedding in a shared space, shape signatures and image retrieval.
The CNN approach consists of four major components: embedding space construction, training image synthesis, the CNN training phase, and the final testing phase. In the first phase, a collection of 3D shapes is embedded into a common space. In the second phase, the training data is synthesized from the 3D shapes in a rendering process, which yields annotations as well. In the third phase, a network is trained to learn the mapping between images and the 3D-shape-induced embedding space. Lastly, the trained network is applied to new images to obtain an embedding into the shared space. This facilitates image and shape retrieval.
The embedding space is where both real-world images and shapes co-exist. The space organizes the latent objects shared between images and shapes. In order to do this, the objects are initialized from a set of 3D models. These are pure and complete representations of objects and do not suffer from the noise present in images. The distance between 3D models is both informative and precise. With the help of 3D models, the embedding space becomes robust.
The shape distance metric computes the similarity between two shapes as the aggregate of similarities among corresponding views. This method is called the light field descriptor. The input is a set of 3D shapes, although two would do. The shapes are aligned by applying a transformation using a rotation matrix and a translation vector. Then they are projected from k viewpoints to generate projection images.
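A rough, hypothetical sketch of this view-based comparison; render_views and view_descriptor below are placeholders for the actual projection and image-descriptor steps of the light field descriptor (which also searches over rotations), so only the aggregation over corresponding views is shown.

import numpy as np

def shape_distance(shape_a, shape_b, viewpoints, render_views, view_descriptor):
    # Hypothetical helpers: render_views(shape, viewpoints) yields one projection image per
    # viewpoint for an aligned shape; view_descriptor(image) returns a feature vector.
    views_a = [view_descriptor(img) for img in render_views(shape_a, viewpoints)]
    views_b = [view_descriptor(img) for img in render_views(shape_b, viewpoints)]
    # Aggregate per-view dissimilarities (here, Euclidean distances) over corresponding views.
    return sum(np.linalg.norm(a - b) for a, b in zip(views_a, views_b))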
The CNN approach uses this distance metric to form pairwise comparisons between the 3D models. Since the metric is informative and accurate, the models can be organized in the space along its dimensions.
The 2D distance matrix formed from word embeddings in text documents, which is dimensionality-reduced and classified using the softmax function, is similarly put to use with the distance matrix between 3D models, although the feature vector, distance calculation, algorithm and error function are different. Neural nets are applied to embedding in both text documents and images.
A CNN has the ability to separate an image into various layers of abstraction while capturing different features and elements. This allows the CNN to be utilized for different learning tasks where the tasks may differ in the focus they require. It is this adaptive ability of the CNN that is leveraged for joint embedding. The CNN is first trained to map an image depicting an object similar to a shape to a corresponding point in the embedding space, such that the point for the image lies close to the point for the shape. During this training, the CNN discovers the latent connection that exists between an image and the object it features. Then, when a test image is presented, the latent connection helps place that image in the embedding space close to the object it features.
Moreover, a CNN can generalize across different tasks. This makes it useful to repurpose a well-trained network. Since it learns from a high-dimensional space, a CNN can differentiate even similar images for a variety of tasks.
#proposal for login screens:  https://1drv.ms/w/s!Ashlm-Nw-wnWtWiX5uxOG6zc4a8K
Thumbnail images instead of literals can also enhance the login screens. Avatars are an example of this.

Tuesday, March 20, 2018

Today we continue discussing the use of a Convolutional Neural Network (CNN) for image and object embedding in a shared space, shape signatures and image retrieval.
The CNN approach consists of four major components: embedding space construction, training image synthesis, the CNN training phase, and the final testing phase. In the first phase, a collection of 3D shapes is embedded into a common space. In the second phase, the training data is synthesized from the 3D shapes in a rendering process, which yields annotations as well. In the third phase, a network is trained to learn the mapping between images and the 3D-shape-induced embedding space. Lastly, the trained network is applied to new images to obtain an embedding into the shared space. This facilitates image and shape retrieval.
The embedding space is where both real-world images and shapes co-exist. The space organizes the latent objects shared between images and shapes. In order to do this, the objects are initialized from a set of 3D models. These are pure and complete representations of objects and do not suffer from the noise present in images. The distance between 3D models is both informative and precise. With the help of 3D models, the embedding space becomes robust.
The shape distance metric computes the similarity between two shapes as the aggregate of similarities among corresponding views. This method is called the light field descriptor. The input is a set of 3D shapes, although two would do. The shapes are aligned by applying a transformation using a rotation matrix and a translation vector. Then they are projected from k viewpoints to generate projection images.
The CNN approach uses this distance metric to form pairwise comparisons between the 3D models. Since the metric is informative and accurate, the models can be organized in the space along its dimensions.
The 2D distance matrix formed from word embeddings in text documents, which is dimensionality-reduced and classified using the softmax function, is similarly put to use with the distance matrix between 3D models, although the feature vector, distance calculation, algorithm and error function are different. Neural nets are applied to embedding in both text documents and images.
#proposal for login screens:  https://1drv.ms/w/s!Ashlm-Nw-wnWtWiX5uxOG6zc4a8K