Today we start reading the paper "Big Data and Cloud
Computing: A Survey of the State-of-the-Art and Research Challenges" by
Skourletopoulos et al. This paper compares data warehouses with big data
as a cloud offering. According to Gartner, more than 20 billion connected
devices are expected by the year 2020, and the amount of data exchanged by
sensors will far exceed the amount of data exchanged by human beings. The size
of data is only growing. Many find it easier to work directly on such
large-scale data.
Big Data refers to very large and complex data sets that traditional
data processing systems are incapable of handling. For a more detailed
comparison, I refer to an earlier blog post. The main takeaway is that Big Data
is not only about storage but also about a different type of algorithm. These
algorithms load, store and query data at massive scale in batches using a
technique called MapReduce, and they can run in parallel across a distributed
cluster. Social networks are one example of a Big Data source. Many cloud
providers have established new datacenters for hosting social networking,
business media content or scientific applications and services. In fact,
storage from cloud providers is metered by the gigabyte-month, and compute
cycles are priced by the CPU-hour.
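To make the MapReduce idea above concrete, here is a minimal single-process sketch of a word count. It is only an illustration under stated assumptions: the names `WordCountSketch`, `Map`, `Reduce` and `Run` are hypothetical and do not come from the paper, and PLINQ's `AsParallel` and `GroupBy` merely stand in for the distributed map tasks and the shuffle step of a real cluster.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordCountSketch
{
    // Map phase: turn one input record (a line of text) into (word, 1) pairs.
    public static IEnumerable<KeyValuePair<string, int>> Map(string line)
    {
        var words = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var word in words)
            yield return new KeyValuePair<string, int>(word.ToLowerInvariant(), 1);
    }

    // Reduce phase: combine all values that share a key into a single total.
    public static int Reduce(string word, IEnumerable<int> counts)
    {
        return counts.Sum();
    }

    // Driver (assumption: everything fits in one process). AsParallel runs the
    // map calls on several threads; GroupBy plays the role of the shuffle that
    // routes pairs with the same key to the same reducer.
    public static Dictionary<string, int> Run(IEnumerable<string> lines)
    {
        return lines.AsParallel()
                    .SelectMany(line => Map(line))
                    .GroupBy(pair => pair.Key, pair => pair.Value)
                    .ToDictionary(group => group.Key,
                                  group => Reduce(group.Key, group));
    }
}
```

Because each `Map` call looks at only one line and each `Reduce` call looks at only one key, both phases can be spread over many machines without coordination, which is what lets the batch run across a distributed cluster.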
IBM data scientists argue that the key dimensions of big
data are volume, velocity, variety and veracity. The size and type of
existing deployments show ranges along these dimensions. Many of these
deployments get data from external providers. A Big Data as a Service stack may
get data from other big data sources, operational data stores, staging
databases, data warehouses and data marts. Typically the operational data
stores, staging databases and warehouses are relational. Data marts allow
analysis over dimensions along a cube. Big Data sources can include source
systems in Compliance, Trading, CRM, Research, Finance, MDM, Pricing and other
IoT data sources.
Zheng et al. described a Big Data as a Service offering for
service-generated data. They showed that the stack for this service includes three layers: analytics, platform and infrastructure, in that hierarchy. The data feeding into this service comes from service-generated big data, which includes service logs, service quality of service (QoS) and service relationships. The log analysis is useful for visualization and diagnosis. The QoS data supports fault tolerance and prediction. The service relationships support service identification and migration.
#codingexercise
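Count all palindromic subsequences in a given string
int GetCountPalin(string A, int start, int end)
{
    if (String.IsNullOrEmpty(A)) return 0;
    // An empty range contains no subsequences; this also stops the recursion
    // when start+1 and end-1 cross over.
    if (start > end) return 0;
    // Assert(start >= 0 && start < A.Length && end >= 0 && end < A.Length);
    // A single character is a palindrome by itself.
    if (start == end) return 1;
    int count = 0;
    if (A[start] == A[end])
    {
        // The inner range is counted twice by the two calls below, but every
        // inner palindrome can also be wrapped by A[start] and A[end], so the
        // overlap cancels out. The extra 1 is the pair A[start]A[end] itself.
        count += GetCountPalin(A, start + 1, end);
        count += GetCountPalin(A, start, end - 1);
        count += 1;
    }
    else
    {
        // Inclusion-exclusion: remove the inner range that was counted twice.
        count += GetCountPalin(A, start + 1, end);
        count += GetCountPalin(A, start, end - 1);
        count -= GetCountPalin(A, start + 1, end - 1);
    }
    return count;
}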
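The recursion above revisits the same (start, end) ranges many times. As a hedged alternative, the same recurrence can be filled into a table bottom-up so that each range is computed once. This is a sketch rather than the original post's method; the names `PalindromeCount` and `Count` are hypothetical, and it assumes the result fits in an `int`.

```csharp
using System;

public static class PalindromeCount
{
    // Counts palindromic subsequences by position, so equal strings taken
    // from different positions are counted separately.
    public static int Count(string A)
    {
        if (string.IsNullOrEmpty(A)) return 0;
        int n = A.Length;
        // dp[i, j] holds the count for A[i..j]; entries with i > j stay 0.
        var dp = new int[n, n];
        for (int i = 0; i < n; i++) dp[i, i] = 1;
        // Fill ranges in order of increasing length.
        for (int len = 2; len <= n; len++)
        {
            for (int i = 0; i + len - 1 < n; i++)
            {
                int j = i + len - 1;
                dp[i, j] = dp[i + 1, j] + dp[i, j - 1];
                if (A[i] == A[j])
                    dp[i, j] += 1;                 // the pair A[i]A[j] itself
                else
                    dp[i, j] -= dp[i + 1, j - 1];  // inner range counted twice
            }
        }
        return dp[0, n - 1];
    }
}
```

For example, "aba" has five palindromic subsequences (a, b, a, aa, aba), and every one of the seven non-empty subsequences of "aaa" is a palindrome. The table costs O(n^2) time and space instead of the exponential time of the plain recursion.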
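// Brute force: enumerates every subsequence of A and collects the palindromes.
// IsPalindrome is a helper assumed to be defined elsewhere; it should return
// true when the string reads the same forwards and backwards.
// On entry b must hold exactly 'level' characters (start with an empty builder).
void Combine(string A, StringBuilder b, int start, int level, List<string> palindromeCombinations)
{
    for (int i = start; i < A.Length; i++)
    {
        b.Append(A[i]);   // choose A[i] as the character at position 'level'
        if (IsPalindrome(b.ToString()))
            palindromeCombinations.Add(b.ToString());
        // Only characters after position i may follow, so recurse from i + 1.
        Combine(A, b, i + 1, level + 1, palindromeCombinations);
        b.Length = level; // backtrack: drop the character chosen above
    }
}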