Thursday, August 1, 2013

OData
The Open Data Protocol (OData) is a data access protocol for the web. OData provides a uniform way to query and manipulate data sets through CRUD operations, and an OData endpoint can expose a subset of those operations. OData differs from its predecessors such as ODBC, OLE DB, ADO.NET and JDBC in that it provides data access over the web: ODBC provided data access from C, OLE DB was COM based, ADO.NET was a .NET API, and JDBC was for Java. The need for standardized data access is not restricted to relational data, where it is served by SQL; structured data also needs to be accessible over the web so that it can be integrated with SharePoint 2010, the WCF Data Services framework, IBM's WebSphere and the Open Government Data Initiative.
OData is built on a standard referred to as the Atom Publishing Protocol, which is itself built on top of the Atom Syndication Format.
Atom is a simple way of exposing a "feed" of data. A feed consists of many entries, and both feed and entry have required and optional pieces of metadata. The top-level feed element has title, link, updated, author and id child elements, all of which are required except link. The id associated with an entry is unique within that feed. If a link is provided, its href can be absolute or relative. A content element of type xhtml enables the content to be packed into the entry itself.
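As a sketch, a minimal feed with one entry might look like the following (the titles, ids and content are invented for illustration):

```xml
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Categories</title>
  <id>http://example.com/Categories</id>
  <updated>2013-08-01T00:00:00Z</updated>
  <author><name>author</name></author>
  <entry>
    <title>Beverages</title>
    <id>http://example.com/Categories(1)</id>
    <updated>2013-08-01T00:00:00Z</updated>
    <!-- content of type xhtml packed into the entry itself -->
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">Soft drinks, coffees, teas</div>
    </content>
  </entry>
</feed>
```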
The Categories table of the Northwind database maps naturally onto this Atom schema. In fact the entire Northwind database is available via OData endpoints.
Atom feed documents are organized into collections, which in turn are organized into workspaces, and those in turn into service documents.
CRUD operations are performed via GET, POST, PUT, and DELETE. HTTP and XML support is sufficient to publish OData endpoints.
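A sketch of how the four verbs map onto an OData endpoint (the paths below follow the public Northwind sample service; the request bodies are omitted):

```
GET    /Northwind.svc/Categories        read the Categories feed
GET    /Northwind.svc/Categories(1)     read a single entry
POST   /Northwind.svc/Categories        create; the body is an Atom entry
PUT    /Northwind.svc/Categories(1)     replace an existing entry
DELETE /Northwind.svc/Categories(1)     delete an entry
```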
An entry created with a POST can be reviewed by following the Location header of the response, which points at the newly created category.
ASP.NET Web API is a framework that makes it easy to write RESTful applications. The Web API can be hosted within an ASP.NET application or inside a separate process.
The workflow is something like this. The HttpServer receives the request and converts it to an HttpRequestMessage. Message handlers then process the message, and custom handlers can be chained in here. Handlers can be global or per route; the latter are configured along with the routes. If a message handler does not create a response on its own, the message continues through the pipeline to the controller. Controllers are found because they derive from a well-known class or implement a well-known interface; the appropriate controller is instantiated and the corresponding action is selected. If authorization fails, an error response is created and the rest of the pipeline is skipped. Next, model binding takes place, and the OnActionExecuting and OnActionExecuted events are triggered before and after the action is invoked. The results are then converted back through the handlers and the server. Error responses and exceptions bypass the rest of the pipeline.
In the model binding stage, the request message is used to create values for the parameters of the action, and these values are passed to the action when it is invoked. The request message is broken into the URI, the headers and the entity body; these are converted to simple types by binders and value providers, or to complex types by formatters. Model binders use the URI and the query string, while a formatter reads the message body.
To convert the results, the return value of the action is checked. If the return type is HttpResponseMessage, it is passed through. If the return type is void, a response is created with status 204 (No Content). For all other return types, a media-type formatter serializes the value and writes it to the message body.
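A minimal sketch of a controller that participates in this pipeline (the Category type and the in-memory store are invented for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Web.Http;

public class Category
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Found by convention because it derives from the well-known ApiController class
public class CategoriesController : ApiController
{
    private static readonly List<Category> Store = new List<Category>();

    // GET api/categories : the return value is serialized by a media-type formatter
    public IEnumerable<Category> Get()
    {
        return Store;
    }

    // POST api/categories : model binding builds 'category' from the entity body
    public HttpResponseMessage Post(Category category)
    {
        Store.Add(category);
        // Return 201 with a Location header pointing at the new resource
        var response = Request.CreateResponse(HttpStatusCode.Created, category);
        response.Headers.Location = new Uri(Request.RequestUri, category.Id.ToString());
        return response;
    }
}
```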

Wednesday, July 31, 2013

The datetime conversions (SQL Server) include the following functions: CAST, CONVERT, DATENAME and DATEPART.
CONVERT is a good general choice, but DATENAME and DATEPART provide flexibility.
SET DATEFORMAT controls how CAST interprets ambiguous date strings. With SET DATEFORMAT mdy (the US default), the following succeeds:
SELECT CAST('7/31/2013' AS DATETIME)
You can also use different styles to convert to and from text:
SELECT CONVERT(VARCHAR, CONVERT(DATETIME, '31-Jul-2013', 106), 106)
will display 31 Jul 2013
The allowed styles include:
Style 0, 100 (default): mon dd yyyy hh:miAM (or PM)
Style 101 (USA): mm/dd/yyyy
Style 102 (ANSI): yyyy.mm.dd
Style 103 (British/French): dd/mm/yyyy
Style 104 (German): dd.mm.yyyy
Style 105 (Italian): dd-mm-yyyy
Style 106: dd mon yyyy
Style 107: mon dd, yyyy
Style 108: hh:mi:ss
Style 9, 109: default with milliseconds
Style 110 (USA): mm-dd-yyyy
Style 111 (Japan): yyyy/mm/dd
Style 112 (ISO): yyyymmdd
Style 13, 113: European default with milliseconds and 24-hour clock
Style 114: hh:mi:ss:mmm with a 24-hour clock
Style 20, 120 (ODBC canonical, 24-hour clock): yyyy-mm-dd hh:mi:ss
Style 21, 121 (ODBC canonical with milliseconds, 24-hour clock): yyyy-mm-dd hh:mi:ss.mmm
 
A look at some of the SQL constructs in no particular order:
1) Common table expression - a temporary, named result set maintained for the duration of a single statement such as a SELECT, INSERT, UPDATE or DELETE.
It can be thought of as a substitute for a view that doesn't require the view definition to be stored in the metadata. It follows the syntax WITH CTE_Name (Col1, Col2, Col3) AS (SELECT query to populate), followed by the outer query.
A CTE can be written in a recursive manner as follows:
WITH CTE_name (column_name) AS
(
CTE_query_definition -- anchor member
UNION ALL
CTE_query_definition -- recursive member, defined by referencing CTE_name
)
-- statement using the CTE
SELECT * FROM CTE_name
For example:
WITH DirectReports (ManagerID, EmployeeID, Level) AS (
select e.ManagerID, e.EmployeeID, 0 as Level from dbo.Employee  as e where ManagerID is null
union all
select e.ManagerID, e.EmployeeID, Level + 1 from dbo.Employee as e inner join DirectReports  as d on e.ManagerId = d.EmployeeID
)
SELECT ManagerID, EmployeeID, Level from DirectReports
2) Some query hint options include:
HASH GROUP or ORDER GROUP, which specifies whether aggregations use hashing or ordering
MERGE UNION, HASH UNION or CONCAT UNION, which specifies how UNION operations are performed
LOOP JOIN, MERGE JOIN or HASH JOIN, which specifies how joins are performed
FAST number_rows, which optimizes for fast retrieval of the first number_rows rows
FORCE ORDER, which specifies that the join order given in the query is preserved
MAXDOP, which overrides the max degree of parallelism setting (0 means use all available processors)
OPTIMIZE FOR UNKNOWN, which tells the query optimizer to use statistical data instead of the initial values of variables
RECOMPILE, which discards the plan generated for the query after it executes, forcing the query optimizer to recompile next time
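As a sketch, hints like these are attached to a query with the OPTION clause (the table and column names below are illustrative):

```sql
-- force hash-based grouping and joining, single-threaded, with a fresh plan each time
SELECT c.CustomerID, COUNT(*) AS OrderCount
FROM Orders AS o
INNER JOIN Customers AS c ON c.CustomerID = o.CustomerID
GROUP BY c.CustomerID
OPTION (HASH GROUP, HASH JOIN, MAXDOP 1, RECOMPILE);
```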

Tuesday, July 30, 2013

stored procedure to get stores by latitude and longitude

USE Store
GO

/****** Object:  StoredProcedure [dbo].[GetStoresByGeo]    Script Date: 7/30/2013 6:36:37 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author:           Ravishankar Rajamani
-- Create date: July 2013
-- Description:      Get a list of all Stores by co-ordinates
-- =============================================

CREATE PROCEDURE [dbo].[GetStoresByGeo]
@Latitude VARCHAR(50),
@Longitude VARCHAR(50),
@Total int OUTPUT
AS
BEGIN
SET NOCOUNT ON;

      -- get neighboring coordinates
      DECLARE @g geography; -- center
      SET @g = geography::STPointFromText('POINT(' + @Longitude + ' ' + @Latitude + ')', 4326);
      DECLARE @n geography; -- neighborhood: a buffer of 1000 miles around the center
      SET @n = @g.STBuffer(1000 * 1609.344);
      -- get count (the STBuffer computed column could be persisted)
      SELECT @Total = count(*) FROM Store (nolock) WHERE Coordinates.STWithin(@n) = 1;
      SELECT @Total as Total;
      -- get a table of StoreIDs of results
      DECLARE @StoreIds TABLE(ID int);
      INSERT INTO @StoreIds
      SELECT StoreID FROM Store (nolock) WHERE Coordinates.STWithin(@n) = 1 ORDER BY StoreID;
      -- get stores
      SELECT * FROM Store (nolock) as S INNER JOIN @StoreIds as Ids ON Ids.ID = S.StoreID
END

Caveat : This stored procedure has not been tried and merely tries to illustrate an idea from a similar real world example. 
Here's the location usage in the NerdDinner sample that we could modify with the geography data type as per the previous post:


        public IQueryable<Dinner> FindByLocation(float latitude, float longitude)
        {
            var dinners = from dinner in FindUpcomingDinners()
                          join i in NearestDinners(latitude, longitude)
                          on dinner.DinnerID equals i.DinnerID
                          select dinner;

            return dinners;
        }


/****** Object:  UserDefinedFunction [dbo].[NearestDinners]    Script Date: 7/30/2013 7:39:11 AM ******/
DROP FUNCTION [dbo].[NearestDinners]
GO

/****** Object:  UserDefinedFunction [dbo].[NearestDinners]    Script Date: 7/30/2013 7:39:11 AM ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE FUNCTION [dbo].[NearestDinners]
(
@lat real,
@long real
)
RETURNS  TABLE
AS
RETURN
SELECT     Dinners.DinnerID
FROM         Dinners
WHERE dbo.DistanceBetween(@lat, @long, Latitude, Longitude) <100

GO


Monday, July 29, 2013

The advantage of using the geography data type for REST-based location services is that the URI can continue to remain the same for the customer. Latitude and longitude can in any case be represented in the well-known text (WKT) format described by the Open Geospatial Consortium.
For example,
DECLARE @g geography;
SET @g = geography::STPointFromText('POINT(-122.34900 47.65100)', 4326);
SELECT @g.ToString();
Internally these WKTs are translated to Geography data type instances and stored like regular columns along with the attributes of the location.
They are queried using functions based on the geography data type. The functions could be based on the distance between two points or their membership in a polygon such as one representing a zip code.
They invoke corresponding methods on the geography data type, such as STDistance():
DECLARE @g geography;
DECLARE @h geography;
SET @g = geography::STGeomFromText('LINESTRING(-122.360 47.656, -122.343 47.656)', 4326);
SET @h = geography::STGeomFromText('POINT(-122.34900 47.65100)', 4326);
SELECT @g.STDistance(@h);
or find all the points that intersect a polygon, as follows:
DECLARE @Seattle geography = 'POLYGON(....)'; 
SELECT c.customerid FROM customer c WHERE c.geog.STIntersects(@Seattle.STBuffer(10 * 1609.344));
In either case, we can retrieve the points found in the vicinity of a location of interest.
By changing the column data type, we may have to handle all dependencies on the earlier data type. If location data was used for any index creation or used in conjunction with other tables or database objects, they would need to be modified to use the geography data type.  In general, this should not be a problem since pretty much all the semantic operations performed on the co-ordinates are supported by the geography data type.
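A sketch of such a migration, assuming the table currently carries Latitude and Longitude columns (the table and column names are illustrative):

```sql
-- add the new geography column alongside the existing coordinates
ALTER TABLE Store ADD Coordinates geography;
GO
-- populate it from the existing latitude/longitude pairs (SRID 4326 = WGS 84)
UPDATE Store
SET Coordinates = geography::Point(Latitude, Longitude, 4326)
WHERE Latitude IS NOT NULL AND Longitude IS NOT NULL;
```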
Also, one more thing to point out is that the geography column does not require an index up front. In fact, this might be a good opportunity to refactor all references to the geography column and limit their usage only to the database objects that support querying by the user.
Finally, database unit-tests (DUT) must be authored before the changes are made so that we can keep the same behaviour before and after we make the changes. These DUTs could include specific use case scenarios, as well as column scanning to see if any of the conversions failed, or relative distances between select points are maintained with the modified column type.
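One such column-scanning check could be sketched as follows; a DUT would assert that this query returns zero rows (the table and column names are illustrative):

```sql
-- rows where the old coordinates exist but the conversion produced no geography value
SELECT StoreID
FROM Store
WHERE Latitude IS NOT NULL AND Longitude IS NOT NULL
  AND Coordinates IS NULL;
```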