Sunday, September 10, 2017

Today we start reviewing U-SQL. It unifies the benefits of SQL with the expressive power of your own code. It is said to work very well with all kinds of data stores: file, object and relational. U-SQL works on the Azure ecosystem, which has Azure Data Lake storage as the foundation and an analytics layer over it. The Azure analytics layer consists of both HDInsight and Azure Data Lake Analytics (ADLA), which target data differently. HDInsight works on managed Hadoop clusters and allows developers to write map-reduce with open source tools. ADLA is native to Azure and enables C# and SQL over job services. We will also recall that Hadoop was inherently batch processing while the Microsoft stack allowed streaming as well. The benefit of the Azure storage is that it spans several kinds of data formats and stores. ADLA has several other advantages over the managed Hadoop clusters in addition to working with a store that holds all kinds of data. It is described as offering limitless scale and enterprise-grade features with easy data preparation. ADLA is built on Apache YARN, scales dynamically and supports a pay-per-query model. It supports Azure AD for access control, and U-SQL allows programmability in the style of C#.
U-SQL supports big data analytics, which generally must process any kind of data, allow the use of custom algorithms, and scale efficiently to any size.
This lets queries be written for a variety of big data analytics. In addition, it supports SQL for Big Data, which allows querying over structured data. It also enables scaling and parallelization. While Hive supported HiveQL, the Microsoft Sqoop connector enabled SQL over big data, and Apache Calcite became a SQL adapter, U-SQL seems to improve the query language itself. It can unify querying over structured and unstructured data. It has declarative SQL and can execute local and remote queries. It increases productivity and agility. It brings in features from T-SQL, HiveQL, and SCOPE, which has been Microsoft's internal Big Data language. U-SQL is extensible, and it can be extended with C# and .NET.
Courtesy : U-SQL slideshare
#codingexercise
Count binary strings in which adjacent set bits appear k times:
Given a length n, we want the number of binary strings of length n in which a pair of adjacent set bits appears exactly k times. We solve it recursively, keeping track of the bit the string ends with:
1) if n == 1, the count is 1 when k == 0 and 0 otherwise, whichever bit the string ends with
2) if k < 0 or k > n-1, the count is 0
3) if the string ends with 0, add the recursive counts for strings of length n-1 with k adjacent pairs, ending with either 0 or 1
4) else (the string ends with 1)
          if the substring up to n-1 ends with 0
             add the recursive count for length n-1 and k adjacent pairs
          if the substring up to n-1 ends with 1
             add the recursive count for length n-1 and k-1 adjacent pairs, since the final pair of 1s contributes one
5) return the count, summed over both choices of the last bit
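
The recursion above can be sketched in Python as follows. This is a minimal sketch, assuming the exercise asks for the number of binary strings of length n that contain exactly k pairs of adjacent set bits (overlapping pairs counted separately, so 111 has two). The names count_strings and count are illustrative and not from the original post, and memoization via functools.lru_cache is an addition to keep the recursion from recomputing subproblems.

```python
from functools import lru_cache


def count_strings(n: int, k: int) -> int:
    """Count binary strings of length n with exactly k adjacent pairs of set bits."""
    # A string of length n has only n-1 adjacent positions, so larger k is impossible.
    if n < 1 or k < 0 or k > n - 1:
        return 0

    @lru_cache(maxsize=None)
    def count(length: int, pairs: int, last: int) -> int:
        # Number of strings of the given length, ending in bit `last`,
        # that contain exactly `pairs` adjacent pairs of set bits.
        if pairs < 0:
            return 0
        if length == 1:
            # A single bit has no adjacent pair at all.
            return 1 if pairs == 0 else 0
        if last == 0:
            # A trailing 0 adds no pair, whatever the previous bit was.
            return count(length - 1, pairs, 0) + count(length - 1, pairs, 1)
        # A trailing 1 adds one pair only when the previous bit is also 1.
        return count(length - 1, pairs, 0) + count(length - 1, pairs - 1, 1)

    return count(n, k, 0) + count(n, k, 1)


if __name__ == "__main__":
    print(count_strings(5, 2))
```

For n = 5 and k = 2 the qualifying strings are 00111, 01110, 11100, 11011, 10111 and 11101. The memoized version visits each (length, pairs, last) state once, so it does on the order of n*k work rather than enumerating all 2^n strings.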
