An essay on hosting a runtime
Many storage products position themselves as viable alternatives to file system storage, in the hope that offerings built on storage engineering best practices will meet and exceed the store-and-retrieve requirements of analytical workloads. Analytics drives business because it translates data into information by running queries. Traditionally, analysts ran queries over accumulated historical data; the emerging trend is to see past, present and future data all at once, in a continuous streaming manner. This leads analytics to seek out more streamlined storage products than the conventional database.
Analytical queries have an established culture and a well-recognized, database-oriented query language, and many data analysts have made the transition from spreadsheets to writing queries to view their data. As programming languages suited to different industries rapidly gained popularity, queries became easier to write and grew in shape and form, which made them very popular too.
Queries written in any language need to be compiled so that a query processing engine can execute them properly over a storage product. In the absence of such an engine, it was hard for a storage product to embrace queries. Similarly, the expansion of queries to leverage the nuances and growing capabilities of the storage product was limited by the lack of a suitable translator for the product.
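To make the idea of compiling a query concrete, here is a minimal sketch in Python. It assumes a toy query syntax of the form `field operator number`; the names `compile_filter` and `execute` are hypothetical and do not belong to any particular product. The query text is translated once into a callable, and that callable is what the "engine" runs over the stored records.

```python
import operator
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]

# Comparison operators supported by this toy query syntax (an assumption).
OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


def compile_filter(text: str) -> Callable[[Record], bool]:
    """Translate text such as 'size > 10' into a predicate over one record."""
    field, op, value = text.split()
    compare = OPS[op]
    threshold = int(value)
    return lambda record: compare(record[field], threshold)


def execute(predicate: Callable[[Record], bool], records: List[Record]) -> List[Record]:
    """Stand-in for the query processing engine: keep the matching records."""
    return [record for record in records if predicate(record)]
```

A storage product without such a translator can only hand back raw data; with one, `execute(compile_filter("size > 10"), records)` returns just the records whose `size` exceeds 10.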
Around the same time, remote access to any product became a requirement, which was addressed by adopting standard application programming interfaces, or APIs as they are more commonly called. These APIs allowed the product to be viewed as a set of resources acted on by a small set of verbs over HTTP, the protocol that powers the web. These verbs were the equivalent of create, retrieve, update and delete on a resource, and virtually all offerings from the storage product could be viewed as resources.
Thus, querying became a way to execute these verbs over the network on the data residing in the storage product. This convention enabled clients to write queries in the language of their choice, which could then be translated into a set of calls over the network against these APIs. This mechanism gained immense popularity as a bridge between analysis and storage because the network was the World Wide Web and queries could come from anywhere in the world.
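The translation from a query to API calls can be sketched as follows. The verb mapping is the conventional one for resource-oriented web APIs; the resource paths and the names `ApiCall`, `translate` and `translate_query` are assumptions made for illustration, and no network traffic is actually sent.

```python
from dataclasses import dataclass
from typing import List, Optional

# Conventional mapping of resource operations to HTTP verbs.
CRUD_TO_HTTP = {
    "create": "POST",
    "retrieve": "GET",
    "update": "PUT",
    "delete": "DELETE",
}


@dataclass(frozen=True)
class ApiCall:
    method: str
    path: str
    body: Optional[dict] = None


def translate(operation: str, resource: str,
              key: Optional[str] = None, body: Optional[dict] = None) -> ApiCall:
    """Turn one operation on a resource into a description of one API call."""
    path = f"/{resource}" if key is None else f"/{resource}/{key}"
    return ApiCall(CRUD_TO_HTTP[operation], path, body)


def translate_query(resource: str, keys: List[str]) -> List[ApiCall]:
    """A query over several objects becomes one retrieve call per object."""
    return [translate("retrieve", resource, key) for key in keys]
```

In this sketch a query touching three objects becomes three separate calls, so the number of round trips grows with the amount of data the query needs to see.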
Queries are diverse, and storage products offer the best for data at rest and in transit. Yet the bridge across the gap between storage and analysis was inefficient, delayed, chatty and layered. Some technology was needed that enabled queries to be translated into something more native to the product and perhaps executed as close to the data as possible.
The notion that programs can be written in a variety of languages but executed on a computer using a runtime was now being sought as a way to close this gap. Many databases had already become leaders in hosting business logic directly within the database, so that computations ran fast because they operated very close to the data. Even when storage products were no longer databases, quite a few of them still used algorithms and data structures that had stood the test of time in databases. Hosting a runtime that understands queries written in different languages and directly returns the results of the calculations now seemed to be becoming a requirement for storage products. This holds a lot of promise for the industry, with surprisingly few pioneers so far.
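The difference between pulling data out to the query and hosting the query next to the data can be illustrated with a small sketch. The `Store` class and its `items_sent` counter are hypothetical; the counter merely stands in for network traffic between the product and the client.

```python
from typing import Any, Callable, Dict, Iterable, List

Record = Dict[str, Any]


class Store:
    """Toy storage product that counts how many items leave it."""

    def __init__(self, records: Iterable[Record]) -> None:
        self._records = list(records)
        self.items_sent = 0

    def fetch_all(self) -> List[Record]:
        # Client-side analysis: every record crosses the network.
        self.items_sent += len(self._records)
        return list(self._records)

    def run(self, query: Callable[[Iterable[Record]], Any]) -> Any:
        # Hosted runtime: the query executes beside the data,
        # and only its result is sent back.
        result = query(self._records)
        self.items_sent += 1
        return result


def total_size(records: Iterable[Record]) -> int:
    """Example query: sum the 'size' field over all records."""
    return sum(record["size"] for record in records)
```

Calling `total_size(store.fetch_all())` and `store.run(total_size)` produce the same answer, but the first ships every record to the client while the second ships a single value, which is the appeal of running the computation close to the data.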