Thursday, March 22, 2018

We continue discussing the use of a Convolutional Neural Network (CNN) for embedding images and objects in a shared space, along with shape signatures and image retrieval.
In text documents, a 2D distance matrix is formed from word embeddings, reduced in dimensionality, and classified using the softmax function. A distance matrix between 3D models is put to similar use, although the feature vector, distance calculation, algorithm and error function are different. Neural nets are applied to embedding in both text documents and images.
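As a rough illustration of the two building blocks mentioned here, the sketch below computes a pairwise Euclidean distance matrix from feature vectors and applies softmax to a list of scores. The function names and the plain-list feature vectors are assumptions made for this example; the actual features and distance calculation differ between word embeddings and 3D models.

```python
import math

def pairwise_distances(features):
    """Return the symmetric matrix of Euclidean distances between feature vectors."""
    return [[math.dist(a, b) for b in features] for a in features]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 2D feature vectors, used only to show the shape of the result.
matrix = pairwise_distances([[0, 0], [3, 4], [6, 8]])
probabilities = softmax([1.0, 2.0, 3.0])
```

The matrix is symmetric with zeros on the diagonal, which is what a dimensionality reduction step would take as input.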
A CNN can separate an image into various layers of abstraction, each capturing different features and elements. This lets a CNN be used for different learning tasks, even when the tasks differ in the focus they require. It is this adaptive ability of the CNN that is leveraged for joint embedding. The CNN is first trained to map an image depicting an object similar to a shape to a point in the embedding space, such that the point for the image lies close to the point for the shape. During this training, the CNN discovers the latent connection that exists between an image and the object it features. Then, when a test image is presented, the latent connection helps to place that image in the embedding space close to the object it features.
Moreover, a CNN can generalize across different tasks, which makes it useful to repurpose a well-trained network. Since it learns from a high dimensional space, a CNN can differentiate even similar images for a variety of tasks.
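The training objective and the retrieval step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the CNN itself is left out, `embedding_loss` assumes a squared Euclidean loss between the network's output and the shape's point, and `retrieve_shapes` is a hypothetical helper that ranks shapes by distance to an already-embedded image.

```python
import math

def embedding_loss(predicted_points, target_points):
    """Mean squared Euclidean distance between predicted image points
    and the embedding points of the shapes they depict (assumed loss)."""
    total = sum(math.dist(p, t) ** 2
                for p, t in zip(predicted_points, target_points))
    return total / len(predicted_points)

def retrieve_shapes(image_point, shape_points, k=1):
    """Return indices of the k shapes nearest to the image in the embedding space."""
    order = sorted(range(len(shape_points)),
                   key=lambda i: math.dist(image_point, shape_points[i]))
    return order[:k]

# Hypothetical 2D embedding: three shapes and one test image already
# mapped into the space by a trained network.
shapes = [[0, 0], [1, 1], [5, 5]]
nearest = retrieve_shapes([0.9, 1.2], shapes, k=2)
```

Minimizing the loss pulls each image toward its shape, so at test time the nearest-neighbor lookup returns the shapes most similar to the depicted object.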

As we know from working with Euclidean distance, the chi-square measure, which is a sum of squares of errors, gives a good indication of how close the objects are to the mean. It therefore serves as a measure of goodness of fit. The principle is equally applicable to the embedding space. By using a notion of error, we can make sure that the shapes and images embedded in the space preserve the distances between members. The only variation the authors applied to this measure is the use of the Sammon error instead of chi-square, because it encourages the preservation of the structure of local neighborhoods during embedding: each squared distance error is divided by the original distance, so errors between nearby members are penalized more heavily. The joint embedding space is a Euclidean space of lower dimension, while the shapes and the images are originally represented in a high dimensional space.
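For reference, a minimal sketch of the standard Sammon error follows. It takes the original pairwise distances and the embedded points, and assumes all original distances between distinct members are nonzero. The function name and the input layout are choices made for this example.

```python
import math
from itertools import combinations

def sammon_error(original_dist, embedded_points):
    """Sammon stress between original distances and distances in the embedding.

    original_dist: square matrix of distances in the original space
    embedded_points: coordinates of the same members in the embedding space
    """
    pairs = list(combinations(range(len(embedded_points)), 2))
    scale = sum(original_dist[i][j] for i, j in pairs)
    stress = sum(
        (original_dist[i][j]
         - math.dist(embedded_points[i], embedded_points[j])) ** 2
        / original_dist[i][j]  # small original distances weigh more
        for i, j in pairs
    )
    return stress / scale

# An embedding that reproduces the original distance exactly has zero error.
perfect = sammon_error([[0, 5], [5, 0]], [[0, 0], [3, 4]])
```

Dividing each term by the original distance is what makes the measure favor local neighborhoods over long-range structure.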

#proposal for login screens:  https://1drv.ms/w/s!Ashlm-Nw-wnWtWiX5uxOG6zc4a8K
Thumbnail images instead of literals can also enhance the login screens. Avatars are an example of this.
