Friday, November 3, 2017

We were discussing MDM usage. MDM users still prefer to use MS Excel. This introduces ETL-based workflows and siloed views of data. Materialized views don't help because they are not refreshed in time. Also, separating data manipulation into stages introduces human errors and inconsistencies, in addition to delaying access to the data. The logic in the ETL also has to be idempotent, because it is needlessly exercised in full even when only a few rows need to be inserted. Moreover, the operation on each row now has to be made robust by making sure the corresponding row does not already exist. For example, to move a record from source to destination, there must be a check to see whether it already exists in the destination before it is inserted there and deleted from the source. The delete cannot happen unless the record has already been inserted, and each of these operations has to be done for each row. Error checking for the workflow now includes checks against duplicate entries, checks against syntactically or semantically equivalent entries, and checks that the state of an entry only progresses forward. These kinds of checks could all be avoided if the work were left to a service rather than an ETL workflow, but programmatic access to the service is not always preferred, so we have ADO.NET clients or cursor-like tools that translate to LINQ queries.
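The per-row move described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the source and destination are modeled as in-memory dictionaries rather than real tables, and the names RecordMover and Move are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class RecordMover
{
    // Hypothetical helper: moves one record by key and is safe to call repeatedly.
    // Returns true only when a row was actually removed from the source.
    public static bool Move(IDictionary<int, string> source,
                            IDictionary<int, string> destination,
                            int key)
    {
        // Nothing to do if the row was already moved or never existed.
        if (!source.TryGetValue(key, out var value))
            return false;

        // Insert only if the destination does not already hold the row.
        if (!destination.ContainsKey(key))
            destination.Add(key, value);

        // Delete from the source only after the row is known to be in the destination.
        return source.Remove(key);
    }
}
```

Calling Move a second time with the same key is a no-op, which is the idempotence the ETL logic has to provide by hand for every row.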
One of the ways MDM users overcome this challenge is cited in the example from Denodo. 
Denodo is a data virtualization platform, which means that it lets you work seamlessly with data regardless of which database the data is physically located in. It gives you the ability to access complete information through business entities and pre-integrated views. It allows you to explore related information via discovery and self-service. It lets you access data in real time from different apps and devices. Basically, it avoids point-to-point integration, such as the ETL workflows that IT builds case by case for business departments. As an abstraction layer, it gives a unified repository view regardless of the actual data sources, such as databases, warehouses, OLAP, applications, web services, SaaS, and NoSQL. Each CRUD operation on the unified data can then be executed against its respective data source.
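To make the idea of an abstraction layer concrete, here is a toy sketch of a unified view that forwards each operation to whichever source owns the entity. It is not Denodo's API; IDataSource, InMemorySource, and VirtualCatalog are hypothetical names, and only read and write are modeled.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical contract that each backing store would implement.
public interface IDataSource
{
    string Read(string key);
    void Write(string key, string value);
}

// Stand-in for a real database, warehouse, or web service.
public sealed class InMemorySource : IDataSource
{
    private readonly Dictionary<string, string> rows = new Dictionary<string, string>();

    public string Read(string key) => rows.TryGetValue(key, out var value) ? value : null;

    public void Write(string key, string value) => rows[key] = value;
}

// The unified view: callers name an entity, never a physical source.
public sealed class VirtualCatalog
{
    private readonly Dictionary<string, IDataSource> sources = new Dictionary<string, IDataSource>();

    public void Register(string entity, IDataSource source) => sources[entity] = source;

    // Each operation is routed to the source registered for the entity.
    public string Read(string entity, string key) => sources[entity].Read(key);

    public void Write(string entity, string key, string value) => sources[entity].Write(key, value);
}
```

The caller only sees entity names, so a source can be swapped out behind the catalog without touching the callers, which is what removes the point-to-point integration.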
If we compare this model with OData, which sought to expose the database directly to the web so that users could do pretty much the same thing, we realize that both the interface Denodo uses against all data sources regardless of origin and the REST-based OData interface correspond to the standard DML statements on a database. It would be ideal if every database vendor also supported OData browsability.
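The correspondence between the REST verbs and DML can be written down as a small lookup. This is only a sketch of the mapping; the class name ODataToDml and the Products entity in the comments are made up for illustration.

```csharp
using System;

public static class ODataToDml
{
    // Returns the DML statement type that an OData request with the given HTTP method corresponds to.
    public static string DmlFor(string httpMethod)
    {
        switch (httpMethod.ToUpperInvariant())
        {
            // e.g. GET /Products?$filter=Price gt 20  ~  SELECT * FROM Products WHERE Price > 20
            case "GET": return "SELECT";
            // e.g. POST /Products with an entity in the body  ~  INSERT INTO Products ...
            case "POST": return "INSERT";
            // e.g. PATCH /Products(1)  ~  UPDATE Products SET ... WHERE Id = 1
            case "PUT":
            case "PATCH": return "UPDATE";
            // e.g. DELETE /Products(1)  ~  DELETE FROM Products WHERE Id = 1
            case "DELETE": return "DELETE";
            default:
                throw new ArgumentException("Unsupported HTTP method: " + httpMethod);
        }
    }
}
```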
#codingexercise
Segregate the odd and even numbers to either side of the input array (odd numbers first, then even) and sort each side.
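// Requires: using System.Collections.Generic; using System.Linq;
static void SegregateAndSort(List<int> input)
{
    // Odd numbers end up in [0, oddCount). Test with != 0 so that negative odds count too.
    int oddCount = input.Count(x => x % 2 != 0);
    int i = 0;
    int j = i + 1;
    while (i < oddCount && j < input.Count)
    {
        if (input[i] % 2 == 0 && input[j] % 2 != 0)
        {
            // Even on the left, odd on the right: swap them.
            (input[i], input[j]) = (input[j], input[i]);
            i = i + 1;
            j = j + 1;
            continue;
        }
        if (input[j] % 2 == 0)
        {
            // Keep looking to the right for an odd number.
            j = j + 1;
            continue;
        }
        if (input[i] % 2 != 0)
        {
            // Already odd, nothing to move; keep j ahead of i.
            i = i + 1;
            if (j <= i)
                j = i + 1;
            continue;
        }
    }
    // Sort each side in place; a null comparer selects the default ordering.
    input.Sort(0, oddCount, null);
    input.Sort(oddCount, input.Count - oddCount, null);
}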

