Tuesday, April 5, 2016

Today we continue to discuss software testing effectiveness measures: the P-measure, E-measure and F-measure. We had read that the E-measure helps make the P-measure more useful, since their relative values change similarly. Moreover, the E-measure can differentiate the detection of more than one failure, while the P-measure regards one testing strategy as good as another as long as each detects at least one failure. The E-measure analysis assumes homogeneous subdomains, whereas the P-measure is maximized when only one subdomain contains the failure-causing inputs.
Moreover, it has been difficult to generalize such conditions to overlapping subdomains. Even in that case, though, the E-measure can identify some characteristics of subdomain testing. For example, T. Y. Chen and Y. T. Yu found that the relative performance of subdomain testing versus random testing depends on the aggregate of the differences between each subdomain's failure rate and the overall failure rate, where each difference is weighted by the number of test cases selected from that subdomain. In the overlapping case, the subdomain failure rates can be higher than the overall failure rate, which tips the score in favor of subdomain testing over random testing.
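In symbols (a restatement of that result, with the notation assumed here rather than taken from the paper): if subdomain testing draws n_i test cases from subdomain D_i with failure rate p_i, and random testing draws the same total n = n_1 + ... + n_k test cases against the overall failure rate p, then the difference in expected failures detected is E_subdomain - E_random = sum over i of n_i * (p_i - p). Subdomain testing detects at least as many failures on average exactly when this weighted sum is non-negative.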
Lastly, if the E-measure says that a subdomain testing strategy is better than random testing, then the same must hold under the P-measure, and this leads to characterizations in terms of the P-measure. It does not mean that the same program receives the same ranking from both measures. That said, if the failure rates are small, the E-measure serves as a good approximation to the P-measure. Further analysis with the E-measure can also be helpful when the subdomain testing strategies are feasible.
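As a quick numeric illustration of that approximation (hypothetical numbers, not from the paper): with failure rate p = 0.001 and n = 100 random tests, E = np = 0.1, while P = 1 - (1 - 0.001)^100 ≈ 0.095, so the two measures track each other closely whenever np is small.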

Monday, April 4, 2016

Today we continue to discuss software testing effectiveness measures: the P-measure, E-measure and F-measure. We were discussing random testing. For n test cases drawn at random with failure rate p, the P-measure equals 1 - (1 - p)^n, the complement of the probability that all n tests succeed. The E-measure is the product of the number of test cases and the failure rate, n * p. The F-measure is the ratio of the size of the input domain to the size of the failure region, i.e. 1/p. Random black-box sampling of the input domain gives the worst case. That said, randomness may still be required because the failure patterns are not always known in advance. This leads to a technique called adaptive random testing, which distributes test cases evenly within the input space without losing all randomness. The fixed-size candidate set variant of this technique involves the following steps. First, randomly generate a fixed-size set of candidate test cases in each iteration. Then, for each candidate, calculate its distance from the set of previously executed tests. Lastly, execute the candidate with the greatest distance and add it to the set of executed tests, as the sketch below shows. This technique has achieved results near the upper bound for effectiveness.
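Here is a minimal C sketch of the fixed-size candidate set variant. The one-dimensional domain, the candidate count of 10 and the use of plain absolute distance are assumptions for illustration, not details from the post.

#include <stdlib.h>
#include <math.h>

#define CANDIDATES 10

/* Distance from a candidate to its nearest previously executed test. */
static double minDistance(double candidate, const double executed[], int count)
{
    double best = INFINITY;
    for (int i = 0; i < count; i++) {
        double d = fabs(candidate - executed[i]);
        if (d < best) best = d;
    }
    return best;
}

/* Pick the candidate farthest from all executed tests; the caller
   executes it and appends it to executed[]. */
double nextTestCase(const double executed[], int count)
{
    double bestCandidate = 0.0, bestDistance = -1.0;
    for (int i = 0; i < CANDIDATES; i++) {
        double c = (double)rand() / RAND_MAX;    /* random candidate in [0, 1] */
        double d = minDistance(c, executed, count);
        if (d > bestDistance) {                  /* keep the most distant candidate */
            bestDistance = d;
            bestCandidate = c;
        }
    }
    return bestCandidate;
}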
In subdomain testing, the effectiveness of testing is measured in terms of the failures detected, and the maximum value of the E-measure characterizes such a case. In contrast, to maximize the P-measure it is only necessary for one subdomain to consist entirely of failure-causing inputs, even if all the other subdomains have low failure rates. The E-measure therefore has more merit for this purpose. Besides, it can distinguish the capability of detecting more than one failure, while the P-measure treats strategies the same as long as they each produce at least one failure. Furthermore, it has been difficult to extend the formal analysis to the more general and more common case of overlapping subdomains. By using the E-measure, T. Y. Chen and Y. T. Yu show that new properties of the P-measure can be found that would be harder to find with the P-measure alone. For example, when comparing subdomain testing with random testing, a greater or equal value of the E-measure assures a greater or equal value of the P-measure. Thus the relative value of the E-measure also gives a good indication of the P-measure.
There is no single measure that serves as the best in all situations.

Sunday, April 3, 2016

Today we discuss software testing effectiveness measures: the P-measure, E-measure and F-measure. The E-measure is the expected number of failures detected by a set of test cases. The P-measure is the probability of detecting at least one failure with the set of test cases. Both the P-measure and the E-measure have been used widely. The F-measure is the number of tests required to detect the first failure. Note that faults are different from failures. A failure is any manifestation of deviation from correctness in a program execution. One or more faults may be the cause of a failure. Until a failure occurs, a fault remains unknown. A fault is typically found by debugging; a failure is seen by observation. The F-measure therefore concerns the first failure. A lower F-measure means fewer test cases and a reduced cost of overall testing. Even with the same number of test cases, the order of the test cases matters.
The detection of more failures does not always help us identify more faults. That said, it is still better to reveal as many failures as possible: intuitively, the more failures testing reveals, the more information there is for debugging and the higher the chance of detecting more faults. In this sense, a reasonable objective is to detect as many failures as possible, which is what maximizing the E-measure attempts. (courtesy T. Y. Chen and Y. T. Yu)
Failures occur for a subset of the possible input domain of the program. Generally, inputs that produce failures cluster in regions of the input domain, and contiguous regions form patterns. For two dimensions, there are (a) the block pattern, (b) the strip pattern and (c) the point pattern.
The assumptions made in finding patterns include the following:
(a) the failure region's size, shape and orientation are known, and
(b) the region's location in the input domain is not known. (courtesy: cs.umd.edu CMSC737)
#codingexercise
#include <math.h>

/* Probability of detecting at least one failure in n random tests. */
double GetPMeasure(double p, int n)
{
    return 1 - pow(1 - p, n);
}

/* Expected number of failures detected in n random tests. */
double GetEMeasure(int n, double p)
{
    return n * p;
}

/* Expected number of tests to the first failure, i.e. 1/p. */
double GetFMeasure(double sizeOfInputDomain, double sizeOfFailureRegion)
{
    return sizeOfInputDomain / sizeOfFailureRegion;
}

Saturday, April 2, 2016


Background Tasks



Problem statement:


We have a self-service file share provisioning portal which creates and maintains file shares on remote storage clusters that are named by their location. Each cluster has a driver that we call the connector, and it exposes a set of SSH commands for these operations. An API layer translates the user portal request into a command for a target connector. The API layer is reached through a gateway. Frequently we have seen timeouts from different sources: sometimes the gateway times out, sometimes the connector takes too long, sometimes the SSH connection to a remote cluster takes too long. So far we have been working with an upper bound on the timeout. In this post, we try to find an alternative solution using an asynchronous mechanism.


The current interaction diagram looks like this:



 

In this interaction, the API is on hold until the resource is created.

Instead, if a promise could be made that is fulfilled later, then the API can be relieved right away.

This introduces two challenges:

1)      Earlier, the object was created synchronously: its properties could not be changed and no operations could be performed on it until it had been created and marked active. Moreover, we send an email to the customer when the share has been created.

2)      Earlier, the object creation was scoped within a transaction, so we had a clean state to begin or end with.

We now overcome these by introducing an interim state before the object is marked active; we call this state pending. Since each resource is identified by its location and file share name, the API can detect multiple requests for the same object creation by checking for an existing record in the database and verifying that it is not pending. If the object is pending, the API can return an error such as bad request or method not allowed. If the object is already created, a bad request error is returned with a different error message. No other operations are permitted unless the object is active. The create call is therefore effectively idempotent, as shown in the sketch below.
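Here is a minimal sketch of that check in C; the state enum, status codes and function shape are illustrative assumptions, not the actual portal code.

typedef enum { ABSENT, PENDING, ACTIVE } ShareState;

/* Map the current record state for (location, shareName) to a response. */
int handleCreate(ShareState existing)
{
    switch (existing) {
    case PENDING: return 405;  /* method not allowed: creation in progress */
    case ACTIVE:  return 400;  /* bad request: the share already exists */
    case ABSENT:  return 202;  /* accepted: insert a pending record, enqueue the work */
    }
    return 500;                /* defensive default */
}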

In order to receive a notification, the API layer can spawn workers so that the send and receive operations run as separate tasks. This lets the send operation relieve the caller immediately, while the receive operation becomes a background task. On completion of this task, the user can be emailed that the share is ready. If the share creation fails, the share entry is removed from the database so that it returns to a clean state and the create operation can be retried.

This background task happens as the processing of a message queue. During the send operation, the database record is marked pending and a message is placed on the message queue. The message queue processor reads the message, issues a command to the target connector and waits on the response. When the connector responds, the processor marks the record active or deletes the entry; only the processor can delete the entry. It then notifies the API, the portal or the user via HTTP or email. Both the processor and the API check the entries in the database. There is an understanding between the API and the processor that redundant create messages may appear, but there can only be one file share and one corresponding entry in the database. A sketch of the processor loop follows.
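The processor's behavior might be sketched as follows; the queue, connector, database and notification calls are hypothetical stand-ins declared here only to show the flow.

#include <stdbool.h>

typedef struct { const char *location, *shareName, *user; } Message;

/* Placeholders for the surrounding layers. */
Message queueReceive(void);
bool connectorCreateShare(const char *location, const char *shareName);
void dbMarkActive(const char *location, const char *shareName);
void dbDeleteEntry(const char *location, const char *shareName);
void notifyUser(const char *user, bool ok);

void processorLoop(void)
{
    for (;;) {
        Message m = queueReceive();                  /* block for the next create request */
        bool ok = connectorCreateShare(m.location, m.shareName);
        if (ok)
            dbMarkActive(m.location, m.shareName);   /* the share is ready */
        else
            dbDeleteEntry(m.location, m.shareName);  /* clean state so create can be retried */
        notifyUser(m.user, ok);                      /* email or http notification */
    }
}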

The background processor, or spool:




 

 


The interaction diagram now looks like this:



 

 

Choice of a background task processor versus concurrent APIs:

In the above description, a background task processor has been proposed. The same task separation could have been achieved with the concurrency features of the language. However, the background task processor separates the caller from the recipient of the results, and it can handle different types of tasks. A message queue also allows retries on failure and can scale out as a cluster.

Conclusion:


APIs can become more responsive with a background task processor and a new state for the resource.


Friday, April 1, 2016

Today we continue to review the paper Queues are Databases by Jim Gray.
 He talks about message-oriented middleware (MOM) and makes the observation that queues work better inside the database for transactions. When queues are outside the database, they pose performance and concurrency control problems. The mechanisms for concurrency control within a database include Read_Past, Read_Through and Notify.
Read_Past allows skipping over dirty records to find the first committed record and is used in dequeue(). Read_Through allows a program to examine records that have not yet been committed. Notify allows a program to wait for a state change in a lock. A toy sketch of read-past appears below.
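Here is that sketch in C; it is my illustration of the read-past semantics, not code from the paper.

#include <stdbool.h>

typedef struct { int payload; bool committed; } Record;

/* Read_Past: skip dirty (uncommitted) records instead of blocking on
   their locks; return the index of the first committed record, or -1. */
int dequeueReadPast(const Record queue[], int count)
{
    for (int i = 0; i < count; i++)
        if (queue[i].committed)
            return i;
    return -1;
}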
Queue managers are simple TP-monitors. Queues may have worker pools associated with them to service the entries in the queue; TP-monitors configure, manage and load balance these pools. Queued processing has many variants, including periodic, event-driven and batch. Queues may also have a scheduling policy.
When the queues are implemented as part of the database, triggers may be invoked to give a simple queued transaction processing system.
Queue managers can be built in an object-relational database system.
#codingexercise
Implement promises on message queues.
var p = new Promise(function (resolve, reject) {
    // send the request message to the queue here;
    // resolve() when the response message arrives
});
The executor passed to the constructor sends a message to the queue.
The callback resolves the promise when a message is received from the queue.

Thursday, March 31, 2016

Today we review the paper Queues are Databases by Jim Gray.
 He talks about message-oriented middleware (MOM) and makes the following points:
Queued transaction processing is less general than direct transaction processing. Queued systems are built on top of direct systems. Direct, conversational or distributed transactions are not built on top of a queued system.
Queues are interesting databases with interesting concurrency control. He suggests building queues into the database directly.
Queue systems need DBMS functionality: queues need security, configuration, performance monitoring, recovery and reorganization utilities, and database systems already have these features.
Queue managers are simple TP-monitors managing server pools driven by queues. Database systems are encompassing many server pool features as they evolve into TP-like systems.
He mentions that there are three ACID units in the queued transaction processing of a request-response pair: 1) the client places the request in the input queue, 2) the server dequeues the request, performs the task and enqueues the response, and 3) the requester dequeues the response from the output queue.
Due to these three units, a multistep dialog cannot be wrapped in a single ACID transaction unit, and there is a lot of overhead in implementing distributed transactions on top of queues. By contrast, it is easier for a direct transaction processing system to add a queued mechanism as a new transaction type.
 He recognizes that queues are important when he says that queue processing is increasingly important for workflow, disconnected operation and replication applications, and he adds that it has always been the mainstay of batch and spool operations. With the acronyms MOM for queues and DAD for databases, he says both MOM and DAD are needed.
When queues become part of the database, the queue is a database class that gets create(), enqueue(), dequeue(), poll() and destroy() methods. By using a database, the queue manager becomes a native resource manager with no special code for startup, shutdown, checkpoint, commit, query, security or utilities. The database does all the hard work: locking, logging, access paths, recovery, utilities, security, performance monitoring and so on. The queue manager even gets query, backup, restore, reorganize and replicate operations for free. A sketch of such an interface is below.
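As a sketch, the queue class might expose an interface like the following in C; the signatures are my assumptions for illustration, not taken from the paper.

#include <stddef.h>

typedef struct Queue Queue;   /* opaque handle managed by the database */

Queue *queueCreate(const char *name);                       /* create(): a new queue */
int    queueEnqueue(Queue *q, const void *msg, size_t len); /* transactional insert */
int    queueDequeue(Queue *q, void *buf, size_t maxLen);    /* read-past find, then delete */
int    queuePoll(Queue *q);                                 /* nonblocking readiness check */
void   queueDestroy(Queue *q);                              /* destroy(): drop the queue */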
If the queues were outside the database, they would pose performance and concurrency control issues.
In cloud computing with a distributed message queue cluster, we could perhaps do without transactions and instead use idempotent incremental operations with retries.
#codingexercise
Implement a Futures interface on top of queues. A sketch, assuming a hypothetical queue client with send and receive operations:
var future = new Promise(function (resolve, reject) {
    queue.send(request);                 // enqueue the request
    queue.receive(function (response) {  // background receive fulfills the future
        resolve(response);
    });
});

Wednesday, March 30, 2016

We continue reading the paper "Erasure coding in Windows Azure storage"  by Cheng Huang, Huseyin Simitci, Yikang Xu, Aaron Ogus, Brad Calder, Parikshit Gopalan, Jin Li, and Sergey Yekhanin.
Erasure coding is a data protection method where data is broken into fragments, expanded and encoded with redundant data pieces, and stored across a different set of locations or media. This paper talks about erasure coding in Windows Azure Storage (WAS). It introduces new codes called Local Reconstruction Codes. A (k, l, r) LRC divides k data fragments into l local groups and encodes l local parities, one for each local group, plus r global parities. We were reviewing the performance of LRC in WAS. We now review related work. Erasure coding has been applied in many large-scale distributed storage systems. It is better than simple replication because it provides higher reliability and requires much lower storage for the same reliability. Even when distributed systems apply erasure coding, they typically apply only Reed-Solomon codes; LRC has proved better than that in terms of the efficiency and performance tradeoff. Reed-Solomon codes are maximum distance separable (MDS) codes, which require minimum storage overhead for a given fault tolerance. LRC is not MDS and requires additional storage overhead for the same fault tolerance; this is exploited for efficient reconstruction. The tradeoff between storage overhead and reconstruction efficiency is also attempted by other systems such as Weaver codes, HoVer codes and Stepped Combination codes, but LRC achieves better trade-offs than all of the above and meets the WAS erasure coding design goals. One alternative for reconstruction is to read from more fragments instead of fewer but to reduce the amount read from each fragment; however, this does not yield the same savings in I/O and bandwidth as LRC does.
Thus we have seen that erasure coding reduces the cost of cloud storage, where our target storage overhead is 1.33x of the original data. The challenge in erasure coding is the fast reconstruction of offline data fragments, such as those due to disk, node, rack or switch failures, as well as during upgrades. LRC reduces the number of fragments that must be read to perform the reconstruction. It carries slightly more storage overhead but offers comparable latency for small I/Os and better latency for large I/Os. At the same time it provides better durability than the traditional approach of keeping three copies. Lastly we reviewed some of the implementation and design considerations.
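As a quick check of those numbers in code, using the (12, 2, 2) parameters the paper reports for WAS:

#include <stdio.h>

int main(void)
{
    int k = 12, l = 2, r = 2;   /* data fragments, local parities, global parities */
    printf("storage overhead: %.2fx\n", (double)(k + l + r) / k);      /* 1.33x */
    printf("LRC single-fragment reconstruction reads: %d\n", k / l);   /* 6 fragments */
    printf("Reed-Solomon reconstruction reads: %d\n", k);              /* 12 fragments */
    return 0;
}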
#codingexercise
Reed-Solomon decoding theoretically corrects errors by finding the most popular message polynomial. The candidate polynomials are repeatedly produced from subsets of k of the n received symbols. How many such subsets are there?
Solution: The number of subsets is the binomial coefficient n choose k.
int GetSubsets(int n, int k)
{
    // n choose k, computed multiplicatively so the intermediate
    // products stay integral and avoid factorial overflow
    long long result = 1;
    for (int i = 1; i <= k; i++)
        result = result * (n - k + i) / i;
    return (int)result;
}