Sunday, March 31, 2024

 #codingexercise

Given  a string S and a library of strings B = {b1, ... bm}, construct an approximation of the string S by using copies of strings in B. For example, B = { abab, bbbaaa, ccbb, ccaacc} and S = abaccbbbaabbccbbccaabab.  

The solution consists of taking strings from B and assigning to non-overlapping solutions of S. Strings from B may be used multiple times. There is a cost for unmatched character in S as alpha and there is a cost for mismatched character in B as beta. The mismatch between i and j positions is the number of mismatched characters in b[j] when aligned starting with position i in s. We compute the optimal costs of positions 1 to n in the given string. We determine the optimal cost of position k as a function of the costs of the previous positions.

void GetCost(String S, List<String> B, int alpha, int beta, int[] Optimals)

{

for (int k = 1; k < S.length; k++)

{

    Optimals[k] = Optimals[k-1] + alpha;

    for (int j = 1; j < B.Length; j++)

    {

        int p = k - B[j].length;

        if (p > 0)

        Optimals[k] = Math.min(Optimals[k], Optimals[p-1] + beta x Mismatch(p,j));

     }

}

}

The purpose of the above code is to show that we find the optimal solution by evaluating the cost of each position as a candidate for the use of a string from the given list. Either the current position is a continuation of the previous or a beginning of the next. We assume the former by default even if it is a mismatch and note down the cost as alpha. Then for each of the strings, we take the substring that is permissible up to the current location and add it to the previously evaluated cost at the position where the substring fits in say p. This cost is therefore the cost computed prior to p and the number of mismatched characters in the substring times the unit cost beta. The previous computations aid subsequent computations and we use memoization to avoid recomputing cost for a position.


No comments:

Post a Comment