Tuesday, June 27, 2017

Today we continue with the design discussion for the transcription service. It involves a frontend user interface hosted from the cloud so that users can upload files directly to the service over a browser. The service itself is implemented using web APIs that allow ingestion, extraction, merging of transcripts as captions, and indexing of the transcripts. Speech recognition software may be used to convert some or all of these audio files, or files generated from other formats. Text generated by such software can be cleaned using an edit distance algorithm to match the nearest dictionary word, or corrected via context. Text may also be substituted or enhanced with grammar corrections and suitable word alternatives. This may be optional for certain files, because the enhancement may alter the original text, which may not be acceptable for poems, ballads, or other such forms of speech. Each enhancement could be a separate instance of the file and can be persisted as yet another document.

This service should be capable of scaling out to serve hundreds of thousands of customers who use personal assistant devices such as Alexa or Siri, or their favorite telecom provider, to record their conversations or meetings. A transcription service for workplace conference rooms could eliminate the need for keeping minutes if it can convert the segmented audio streams into transcripts.

Cleaned text can then be stored and indexed along with the location of the original file so that it can participate in retrievals. Data may be secured per user and consequently organized based on user information. This means we don't necessarily have to keep the documents in a single database or a NoSQL store, and we don't have to keep them in a public cloud database either. The suggestion here is that files can be uploaded as S3 objects while their resource locators are kept in a database.
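The edit distance cleaning step mentioned above can be sketched with the classic dynamic-programming Levenshtein distance. This is only a minimal illustration: the `NearestWord` helper and its dictionary parameter are assumptions for the sketch, not part of the service design; a real cleaner would also weigh context, not just distance.

```csharp
using System;

static class TranscriptCleaner
{
    // Levenshtein distance: the minimum number of single-character
    // insertions, deletions, and substitutions to turn a into b.
    public static int EditDistance(string a, string b)
    {
        var dp = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) dp[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) dp[0, j] = j;
        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                dp[i, j] = Math.Min(Math.Min(dp[i - 1, j] + 1,   // delete from a
                                             dp[i, j - 1] + 1),  // insert into a
                                    dp[i - 1, j - 1] + cost);    // substitute
            }
        }
        return dp[a.Length, b.Length];
    }

    // Hypothetical helper: replace a recognized token with the nearest
    // dictionary word by edit distance.
    public static string NearestWord(string token, string[] dictionary)
    {
        string best = token;
        int bestDistance = int.MaxValue;
        foreach (var word in dictionary)
        {
            int d = EditDistance(token, word);
            if (d < bestDistance) { bestDistance = d; best = word; }
        }
        return best;
    }
}
```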
The transcripts, on the other hand, along with the resource location, fit well in a public cloud database. Such databases allow querying, document classification, ranking, sorting, and collaborative filtering much better than if the data were kept as files. Since the databases can be in the public cloud, they can grow arbitrarily large and still scale. We don't have a need for a NoSQL store, but if the transcripts and the audio files can be translated to a bag of words and we want to use MapReduce, then we can extract-transform-load from the database to a NoSQL store on a periodic basis. However, even machine learning algorithms are now executable inside a Python module within the database server. This means that, depending on the analysis we want to perform and whether the processing is offline and batch, we can choose the data stores. Finally, the user accounts and metadata for the transcripts are operational data and belong in the database server.
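Translating a transcript to a bag of words, as described above, amounts to counting term occurrences. A minimal sketch follows; the whitespace-and-punctuation tokenizer here is an assumption for illustration, where a real pipeline would use a proper text analyzer:

```csharp
using System;
using System.Collections.Generic;

static class TranscriptIndexer
{
    // Reduce a transcript to a bag of words: lowercase terms and their counts.
    public static Dictionary<string, int> BagOfWords(string transcript)
    {
        var counts = new Dictionary<string, int>();
        var tokens = transcript.ToLowerInvariant()
            .Split(new[] { ' ', '\t', '\n', '\r', '.', ',', ';', ':', '!', '?' },
                   StringSplitOptions.RemoveEmptyEntries);
        foreach (var token in tokens)
        {
            counts.TryGetValue(token, out int n);
            counts[token] = n + 1;
        }
        return counts;
    }
}
```

Emitting these (term, count) pairs per document is exactly the map step of a word-count style MapReduce job on the NoSQL side.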
#codingexercise
Get the count of subarrays whose sum is divisible by K. The idea is to track prefix sums modulo k: two prefixes with the same remainder bound a subarray whose sum is divisible by k, and a prefix with remainder 0 is itself such a subarray. For example, A = [4, 5, 0, -2, -3, 1] with k = 5 yields 7.
int GetCountDivByK(List<int> A, int k)
{
    var mod = new int[k];
    int sum = 0;
    for (int i = 0; i < A.Count; i++)
    {
        sum += A[i];
        mod[((sum % k) + k) % k] += 1; // normalize negative remainders
    }
    int result = 0;
    for (int i = 0; i < k; i++)
        if (mod[i] > 1)
            result += (mod[i] * (mod[i] - 1)) / 2; // pairs of prefixes with equal remainders
    result += mod[0]; // prefixes that are themselves divisible by k
    return result;
}
Testing whether any such subarray exists is simpler:
bool HasDivByK(List<int> A, int k)
{
    var mod = new int[k];
    int sum = 0;
    for (int i = 0; i < A.Count; i++)
    {
        sum += A[i];
        mod[((sum % k) + k) % k] += 1;
        // A repeated remainder, or a zero remainder, means a qualifying subarray exists.
        if (mod[((sum % k) + k) % k] > 1 || mod[0] > 0) return true;
    }
    return false;
}
#cc_document https://1drv.ms/w/s!Ashlm-Nw-wnWsB5CqwSCCxBs9bjc
