Cluster computing

Friday, March 2, 2018

We were discussing signature verification methods. Yesterday we reviewed the stages involved in signature verification, enumerated the feature extraction techniques, and compared online and offline verification techniques.

One reason offline image processing is preferred is that good image processing algorithms are often computationally expensive and take longer than, say, a network round trip for packets. This makes them costly to include in an interactive web page analysis widget. Some image processing algorithms have taken as long as eight seconds to execute, which is why image processing finds it difficult to keep up with the frame rate of a video. However, significant advances have improved the processing of streamed images. For example, the active contour model can help track the movement of an object across images at a rate that matches the frame rate used for video. Signatures are considered much simpler to work with in image processing. They are generally small, binary in color, and easy to capture and process. As long as the image processing can tell a real signature apart from forged specimens, an image processor can work in the backend for a signature pad widget in the front end.

We talked about the acceptance criteria for an image processing technique, which are largely measured by precision and recall. By training the processor on a signature dataset, these processors become highly effective at telling forged specimens from real ones. Today we will take a closer look at how this verification is done. We have already read how classifiers work in text processing: the document is converted into a vector space model and then classified based on the Euclidean distance between feature vectors. Signature verification should therefore be familiar. The features extracted from the image, as described in the previous posts, are transformed into the vector space and then compared with the master. If the Euclidean distance is within a tolerance threshold, the signature is accepted. Since the image processor is already trained and tested on a variety of images and measured with precision and recall, it can be relied upon to convert the given specimen into a representative feature vector. This concludes the signature verification technique.
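
The comparison step described above can be sketched in a few lines of Python. This is only a minimal illustration: the function names, the three-element feature vectors and the tolerance of 0.1 are all made up for the example, and a real system would tune the threshold against its training data.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def verify_signature(specimen, master, tolerance):
    """Accept the specimen when it lies within the tolerance of the master."""
    return euclidean_distance(specimen, master) <= tolerance


# Hypothetical feature vectors, for illustration only.
master = [0.42, 0.10, 0.77]
genuine = [0.40, 0.12, 0.75]
forged = [0.90, 0.55, 0.20]

print(verify_signature(genuine, master, tolerance=0.1))  # True
print(verify_signature(forged, master, tolerance=0.1))   # False
```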

#codingexercise

We were discussing combinations with duplicates, built in a greedy manner. Instead of enumerating combinations to the whole length, we can leverage the stars and bars theorem to be more efficient. With this theorem, we already know the number of combinations that can exist with duplicates, so we do not enumerate them but directly count them towards the goal, such as the price of the accessories shopped. The theorem uses a binomial coefficient.
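
For reference, stars and bars says that the number of ways to pick k items from n kinds, with repetition allowed, is the binomial coefficient C(n + k - 1, k). The sketch below (function name chosen for this example) computes that count directly and cross-checks it against an explicit enumeration for a small case.

```python
from itertools import combinations_with_replacement
from math import comb


def count_multisets(kinds, picks):
    """Number of ways to choose `picks` items from `kinds` kinds, repeats allowed."""
    if kinds == 0:
        return 1 if picks == 0 else 0
    return comb(kinds + picks - 1, picks)


kinds, picks = 5, 3
enumerated = sum(1 for _ in combinations_with_replacement(range(kinds), picks))
print(count_multisets(kinds, picks), enumerated)  # 35 35
```

The closed form costs a single binomial evaluation, whereas the enumeration grows with the count itself.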

Thursday, March 1, 2018

We were discussing signature verification methods. Yesterday we reviewed the stages involved in signature verification and enumerated the feature extraction techniques. Now let us proceed to comparing online and offline verification techniques.

An offline signature processing algorithm requires all the information before the algorithm starts. This gives us the opportunity to perform all the pre-processing required to normalize the dataset so that the algorithm works effectively. An online algorithm, in contrast, may work on the data while the data is being made available. The processor may reside as close to the sensing device as necessary to make this happen. In the offline case, the processor may even be in a backend system of the office. Image recognition for handwritten signatures has traditionally been done offline. Even so, it has been optical rather than magnetic. Comparing the lists of features used in online and offline systems shows which ones are available only online. Online techniques have been said to be more accurate because the system receives the data as the user feeds it. Offline comparison can eliminate the quirks of the device on which the data is submitted and can work effectively across a variety of devices and vendors. Online processing suits standalone processors that can be mobile and may have their own local database.

The acceptance criteria for an image processing technique are largely measured by precision and recall. Precision in this case is the fraction of selected items that are relevant. It is the ratio of the true positives to all the items that were selected by the image processor for this image. A true positive is one that improves the feature matching. A false positive does not, but it still shows up within the match threshold. Recall, on the other hand, measures how many of the relevant items are selected. It is the ratio of the true positives to all the items that would have improved the feature matching from the global set of feature matches, including the ones that the processor did not select. Together, precision and recall yield a different metric called the F-score, which gives the effectiveness of retrieval with respect to a given image. By training the processor on a signature dataset, these processors become highly effective at telling forged specimens from real ones.
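
As a small worked example, the three metrics can be computed from counts of true positives, false positives and false negatives. The sketch below assumes the balanced F-score (F1, the harmonic mean of precision and recall); the counts are invented for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 matches were right, 2 were wrong, 8 relevant ones were missed.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.8 0.5 0.615
```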

#codingexercise

We were discussing combinations with duplicates, built in a greedy manner. Instead of enumerating combinations to the whole length, we can leverage the stars and bars theorem to be more efficient.

Wednesday, February 28, 2018

We were discussing signature verification methods. Yesterday we reviewed the stages involved in signature verification. Let us continue by listing and comparing online and offline verification techniques.

The feature extraction techniques, paired with the classifier or comparison method used with them, include:
1) Radon transform and fractal dimension, with an SVM classifier
2) curvelet transform and Hough transform, with a neural network
3) point density and spatial frequency, with Euclidean distance and a least square error classifier
4) statistical analysis techniques and the chi-square test
5) projection and local point density, with feature vector correlation
6) Radon transformation, with an SVM
7) learning techniques
8) directional features, with a neural network

Online signature feature extraction also includes:
1) signing time
2) signature width and height
3) number of pen-ups and pen-downs
4) total signature length
5) velocity of pen
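
The online features listed above can all be derived from a stream of timestamped pen samples. The sketch below assumes a hypothetical sample format of (time in seconds, x, y, pen_down) tuples and reports velocity as the pen-down path length divided by the total signing time; a real signature pad will expose its own data format.

```python
import math


def online_features(samples):
    """samples: time-ordered list of (t_seconds, x, y, pen_down) tuples."""
    if not samples:
        raise ValueError("no samples")
    xs = [x for _, x, _, down in samples if down]
    ys = [y for _, _, y, down in samples if down]
    pen_downs = pen_ups = 0
    length = 0.0
    prev = None
    for t, x, y, down in samples:
        was_down = prev is not None and prev[3]
        if down and not was_down:
            pen_downs += 1  # pen touched the pad
        if was_down and not down:
            pen_ups += 1  # pen left the pad
        if down and was_down:
            length += math.hypot(x - prev[1], y - prev[2])  # ink drawn since last sample
        prev = (t, x, y, down)
    duration = samples[-1][0] - samples[0][0]
    return {
        "signing_time": duration,
        "width": max(xs) - min(xs) if xs else 0.0,
        "height": max(ys) - min(ys) if ys else 0.0,
        "pen_downs": pen_downs,
        "pen_ups": pen_ups,
        "length": length,
        "average_velocity": length / duration if duration > 0 else 0.0,
    }


# Two short strokes, made up for illustration.
strokes = [
    (0.0, 0, 0, True), (0.1, 3, 4, True), (0.2, 3, 4, False),
    (0.3, 6, 4, True), (0.4, 6, 8, True), (0.5, 6, 8, False),
]
print(online_features(strokes))
```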

Feature extraction depends on pre-processing. Images may need to be loaded, resized, thinned, rotated and cropped.

A grayscale image is converted into a binary image using the threshold
(mu1 + mu2) / 2
where mu1 and mu2 are the gray values of the two groups of pixels.
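
A minimal sketch of this thresholding step follows. It assumes that mu1 and mu2 are simply the darkest and lightest gray values in the image and that ink is darker than paper; both are simplifications chosen for the example.

```python
def binarize(gray):
    """gray: rows of 0-255 gray values. Returns (threshold, rows of 0/1 where 1 marks ink)."""
    pixels = [p for row in gray for p in row]
    if not pixels:
        raise ValueError("empty image")
    mu1, mu2 = min(pixels), max(pixels)  # assumed: extremes of the gray values
    threshold = (mu1 + mu2) / 2
    binary = [[1 if p < threshold else 0 for p in row] for row in gray]
    return threshold, binary


# Tiny made-up image: dark strokes on a light background.
image = [
    [250, 30, 245],
    [40, 35, 240],
]
threshold, binary = binarize(image)
print(threshold)  # 140.0
print(binary)     # [[0, 1, 0], [1, 1, 0]]
```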

#codingexercise
We were discussing a coding exercise as shown below:
A person wants to buy L items from her favorite store such that a subset of N items must contain D distinct items. The items range from 1 to A in price. Determine the maximum amount of money the person can spend.

We discussed a technique for building the combinations in a greedy manner by choosing the highest priced items first. We also discussed an alternate way to enumerate all possible combinations and select only the ones that match the criteria and return the one that has the maximum purchase.
Another way to reduce enumerations of unnecessary combinations would be to use the enumerations only from combinations with repetitions instead of exhaustive combinations.
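
The enumeration over combinations with repetitions can be sketched as a brute-force check for small inputs. The sketch reads the restriction as "every subset of N items must contain at least D distinct prices", which is an interpretation of the problem statement; the function names are made up for this example.

```python
from collections import Counter
from itertools import combinations_with_replacement


def satisfies_constraint(basket, n, d):
    """True if every n-item subset of the basket has at least d distinct prices."""
    counts = sorted(Counter(basket).values(), reverse=True)
    # The worst subset draws only from the d-1 most common prices.
    # If those can fill n slots, some subset has fewer than d distinct prices.
    return sum(counts[:d - 1]) < n


def max_spend(l, a, n, d):
    """Best total over all multisets of l prices from 1..a, or None if none qualifies."""
    best = None
    for basket in combinations_with_replacement(range(1, a + 1), l):
        if satisfies_constraint(basket, n, d):
            total = sum(basket)
            if best is None or total > best:
                best = total
    return best


print(max_spend(6, 5, 3, 2))  # 24, from the basket 3, 3, 4, 4, 5, 5
print(max_spend(2, 1, 2, 2))  # None, since only one price exists
```

This only visits multisets rather than ordered selections, but it is still exponential and is meant for checking a greedy solution on small cases.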

Tuesday, February 27, 2018

We were discussing signature verification methods. Let us review the stages involved in signature verification. We will compare online and offline verification techniques afterwards.
The first stage of the image processing is image acquisition. This is a crucial stage of any recognition system, as the quality of the image may considerably affect the subsequent stages. Moreover, the devices capturing the image may wear over time, since this is a touch-based technique. Therefore, the consistency of image quality over time is also an important factor.
The second stage of the image processing is pre-processing, which removes noise and may even introduce normalization. Some pre-processing steps may also involve resizing, binary color conversion and cleaning, rotation, thinning and cropping. A binary image that highlights only the signature may be obtained by determining the extremes of the gray values and using the midpoint between them as the threshold. For example, if mu1 and mu2 are the gray values for the two groups of pixels, the threshold may be set as (mu1 + mu2) / 2.
The third stage of the image processing is feature extraction. This is a critical stage for signature verification because the type and quality of the features determine whether the verification is accurate, predictable and consistent. While online and offline verification techniques may vary in feature extraction, both may also involve common techniques. Feature extraction is generally termed global or local depending on the features extracted.
The last stage of the image processing is the signature verification. This may be the Euclidean distance computed in the feature space. If the distance is less than a threshold, the signature may be considered as verified.

#codingexercise
We were discussing a coding exercise as shown below:
A person wants to buy L items from her favorite store such that a subset of N items must contain D distinct items. The items range from 1 to A in price. Determine the maximum amount of money the person can spend.

Since the price has to be maximized, the algorithm has to be greedy in its strategy to select the next item. When we can no longer purchase the highest priced item because it violates the given restriction, we make the subsequent selection from the next lower priced item. We determine the threshold from the range 1 to N/D. The rest is recursive combination as shown earlier.
Another way to do this would be to enumerate all possible combinations, select only the ones that match the criteria, and return the one that has the maximum purchase.

Monday, February 26, 2018

Signature detection and segmentation is a known field of study, and the techniques involve shape matching. While some of this processing involves offline techniques, there are online techniques also mentioned in the associated literature. Moreover, the MCYT-Signature corpus, the SUSIG database and GPDS-960 provide well-known databases for evaluating algorithms. For example, one method of non-rigid shape matching involves a spatial histogram, also known as a shape context, computed for each point, which describes the distribution of the relative positions of all remaining points. The correspondences between points are solved through weighted bipartite graph matching before the signatures are matched. Another method of non-rigid shape matching formulates it as an optimization problem that preserves a local neighborhood structure. This method has an intuitive graph matching interpretation where each point represents a vertex and two vertices are considered connected in the graph if they are neighbors. The problem of finding the optimal match between shapes is therefore equivalent to maximizing the number of matched edges between their corresponding graphs under a one-to-one matching constraint. In this optimization approach, an iterative framework is used to estimate the correspondences and the transformation. In each iteration, graph matching is initialized using the shape context distance and subsequently updated through relaxation labeling, a well-known formal method for expressing low-level contextual information and applying it to complete the extraction of image features.
Image processing generally involves multiple successive stages of processing the images. Signatures have the nice property that they resemble the results of Sobel edge detection, and the edges are expected to be more continuous in their formation. Moreover, signature pads produce small images, with similar curves and accents and purely black and white, so they are nearly consistent, and this helps with their processing.
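
To make the shape context idea concrete, here is a rough sketch of the per-point histogram only. The bin counts, the normalization by the farthest point and the choice of ring 0 as the outermost ring are illustrative assumptions, and the bipartite matching step that would follow is not shown.

```python
import math


def shape_context(points, index, radial_bins=3, angular_bins=8):
    """Log-polar histogram of where every other point lies relative to points[index].

    Returns histogram[ring][sector], where ring 0 is the outermost ring.
    """
    px, py = points[index]
    offsets = [(x - px, y - py) for i, (x, y) in enumerate(points) if i != index]
    distances = [math.hypot(dx, dy) for dx, dy in offsets]
    max_distance = max(distances, default=0.0)
    histogram = [[0] * angular_bins for _ in range(radial_bins)]
    if max_distance == 0.0:
        return histogram
    for (dx, dy), dist in zip(offsets, distances):
        if dist == 0.0:
            continue  # coincident point carries no direction
        # Each ring inward halves the radius (log scale), clamped to the last ring.
        ring = min(radial_bins - 1, math.floor(-math.log2(dist / max_distance)))
        theta = math.atan2(dy, dx) % (2 * math.pi)
        sector = int(theta / (2 * math.pi / angular_bins)) % angular_bins
        histogram[ring][sector] += 1
    return histogram


# Made-up sample points around the origin.
outline = [(0, 0), (2, 1), (-1, 2), (-2, -1), (1, -2)]
for row in shape_context(outline, 0):
    print(row)
```

Two points on different signatures would then be compared by the distance between their histograms.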
#codingexercise
A person wants to form a team by selecting as many participants from a list as possible. Each participant has a skill represented by an integer. The skills selected must be distinct and contiguous, even if they are negative. By making the team as large as possible, more problems can be solved. What is the size of the largest team that can be formed?
One way to do this would be to sort the skills and find the longest run of distinct values that increase by exactly one.
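
The sorting approach can be sketched as follows. It assumes that duplicate skills are simply dropped, since a team needs distinct skills; the function name is made up for this example.

```python
def largest_team(skills):
    """Length of the longest run of consecutive integers among the distinct skills."""
    best = run = 0
    prev = None
    for skill in sorted(set(skills)):
        # Extend the run on a unit step, otherwise start a new run.
        run = run + 1 if prev is not None and skill == prev + 1 else 1
        best = max(best, run)
        prev = skill
    return best


print(largest_team([4, -1, 0, 1, 2, 7, 2]))  # 4, from the skills -1, 0, 1, 2
```

Sorting dominates the cost, so this runs in O(n log n) time.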

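Another way to do this is with a variant of the longest increasing subsequence in which each step must increase by exactly one.
// Requires: using System; using System.Collections.Generic; using System.Linq;
int GetLongestIncreasingSubsequence(List<int> A)
{
    if (A.Count == 0) return 0;
    // Team membership ignores list order, so work on a sorted copy.
    var sorted = A.OrderBy(x => x).ToList();
    // best[i] is the length of the longest unit-step run ending at sorted[i].
    var best = new int[sorted.Count];
    for (int i = 0; i < best.Length; i++)
        best[i] = 1;
    for (int i = 1; i < sorted.Count; i++)
        for (int j = 0; j < i; j++)
            if (sorted[i] == sorted[j] + 1)
            {
                best[i] = Math.Max(best[i], best[j] + 1);
            }
    return best.Max();
}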

The above assumes distinct elements.


Courtesy: hackerrank

Sunday, February 25, 2018

Yesterday we were discussing how to enable user logins with something that users draw, such as their signature on a signature pad. Efficient image processing algorithms can then compare signatures. Moreover, what people draw on the signature pad is completely their call, and they can even handwrite a password instead of a signature. Since the data is private both at rest and in transit, it cannot be divulged to anybody else, and it provides a layer of security on top of the known passwords.

Saturday, February 24, 2018

We were discussing identity management with Civic.

It introduced three new components: 1) a variety of smart contracts, 2) an indigenous utility token and 3) new software applications.
Blockchain works as the ledger in these cases. The smart contracts are the code executed on the blockchain. There is a high degree of privacy for the individual whose transactions are maintained in the ledger. The transaction does not divulge any Personally Identifiable Information but an individual can easily prove ownership of the entries.
The ledger itself is decentralized and maintained by a community where no one actor can gain enough influence to submit a fraudulent transaction or alter recorded data.
Civic introduced a proprietary token which will be used as a form of settlement between participants in an identity-related transaction. It also provides a means to reward the participants. While the service provider may follow any standards such as NIST, FIPS or PIV, Civic manages the attestation and its sharing between service providers. For example:
There is a service provider A who sells a service to a user. The user sends the PII for verifying identity. A calculates a hash of the PII and records its attestation on the blockchain. The user then visits service provider B, who wants access to all or some of the PII. The user is willing to share the requested data, and A offers a price for its attestation to B, which accepts the price. B can locate and view the blockchain transaction. It would also be able to recreate the hashes for the PII and compare them to those on the blockchain. If B is satisfied, it purchases the attestation and pays the amount into escrow in Civic tokens (CVC). The Civic app with the user then transmits the PII to B. To complete the transaction, the CVC from the escrow is shared between the user and A, the original validator.
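
The hash-and-compare part of this flow can be illustrated with a toy sketch. This is not Civic's actual protocol or API: the in-memory list standing in for the blockchain, the function names, the canonical JSON encoding and the plain SHA-256 hash are all assumptions made for the example, and tokens and escrow are left out entirely.

```python
import hashlib
import json

ledger = []  # stand-in for the blockchain: an append-only list of attestation records


def hash_pii(pii):
    """Hash the PII deterministically by serializing it as canonical JSON first."""
    canonical = json.dumps(pii, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def record_attestation(validator, pii):
    """Validator A records only the hash, never the PII itself."""
    entry = {"validator": validator, "hash": hash_pii(pii)}
    ledger.append(entry)
    return entry


def verify_attestation(entry, pii_received):
    """Requester B recreates the hash from the PII it is shown and compares."""
    return entry["hash"] == hash_pii(pii_received)


# Made-up PII for illustration.
pii = {"name": "Alice Example", "date_of_birth": "1990-01-01"}
entry = record_attestation("A", pii)
print(verify_attestation(entry, pii))                           # True
print(verify_attestation(entry, {**pii, "name": "Mallory"}))    # False
```

Because only the hash is on the ledger, anyone can check a claimed record without the ledger itself revealing the PII.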
While the benefits of recording attestations on a distributed ledger are widely acknowledged to foster a new ecosystem, an identity provider may lean toward a centralized model for offering innovative technologies.
For example, it may allow users to log in merely with their signatures. Efficient image processing algorithms can then compare signatures. The signature pad at http://szimek.github.io/signature_pad/ is one such example, which could be considered as a replacement for password entry on many mobile devices. The user may have to flip the screen to landscape orientation, but the experience can be very close to the real thing. Moreover, signature pads produce small, purely black and white images, so they are nearly consistent, and this helps with the processing. What people draw on the signature pad is completely their call, and they can even handwrite a password instead of a signature. Since the data is private both at rest and in transit, it cannot be divulged to anybody else, and it provides a layer of security on top of the known passwords. Signature detection and segmentation is a known field of study, and the techniques involve shape matching. For example: http://matlab-recognition-code.com/signature-recognition-based-on-neural-networks/