This is a continuation of a series of articles on the operational engineering aspects of Azure public cloud computing. The most recent discussion covered Azure SQL Edge, a full-fledged, generally available service that provides Service Level Agreements comparable to others in its category.
SQL Edge is an optimized relational database engine geared towards edge computing. It provides a high-performance data storage and processing layer for IoT applications, with capabilities to stream, process, and analyze data that can range from relational to document, graph, and time-series, which makes it a good choice for a variety of modern IoT applications. It is built on the same database engine as SQL Server and Azure SQL, so applications can seamlessly reuse queries written in T-SQL. This makes applications portable between devices, datacenters, and the cloud.
Azure SQL Edge uses the same streaming capabilities as Azure Stream Analytics on IoT Edge. This native implementation of data streaming is called T-SQL streaming, and it can handle fast streams from multiple data sources. A T-SQL streaming job consists of a stream input that defines the connection to the data source the stream is read from, a stream output that defines the connection to the data source the stream is written to, and a stream query that defines the transformations, aggregations, filtering, sorting, and joins applied to the input stream before it is written to the stream output.
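As a concrete illustration, the following is a minimal sketch of a T-SQL streaming job, assuming JSON sensor messages arriving from the Edge Hub module and a local table as the destination. The object names (EdgeHubInput, SqlOutput, SensorJob, SensorDb.dbo.SensorAggregates) and the sensor columns are illustrative assumptions; the statements follow the CREATE EXTERNAL STREAM and sys.sp_create_streaming_job pattern used by SQL Edge.

-- Stream input: JSON messages read from the Edge Hub.
CREATE EXTERNAL FILE FORMAT JsonFormat
    WITH (FORMAT_TYPE = JSON);

CREATE EXTERNAL DATA SOURCE EdgeHubSource
    WITH (LOCATION = N'edgehub://');

CREATE EXTERNAL STREAM EdgeHubInput
    WITH (DATA_SOURCE = EdgeHubSource,
          FILE_FORMAT = JsonFormat,
          LOCATION = N'sensor-readings');

-- Stream output: a table in the local SQL Edge instance
-- (assumes a database master key already exists for the credential).
CREATE DATABASE SCOPED CREDENTIAL SqlCredential
    WITH IDENTITY = '<sql-login>', SECRET = '<sql-password>';

CREATE EXTERNAL DATA SOURCE LocalSqlOutput
    WITH (LOCATION = N'sqlserver://tcp:.,1433',
          CREDENTIAL = SqlCredential);

CREATE EXTERNAL STREAM SqlOutput
    WITH (DATA_SOURCE = LocalSqlOutput,
          LOCATION = N'SensorDb.dbo.SensorAggregates');

-- Stream query: average temperature per device over 30-second tumbling windows.
EXEC sys.sp_create_streaming_job
    @name = N'SensorJob',
    @statement = N'
        SELECT deviceId, AVG(temperature) AS avgTemperature
        INTO SqlOutput
        FROM EdgeHubInput TIMESTAMP BY eventTime
        GROUP BY deviceId, TumblingWindow(second, 30)';

EXEC sys.sp_start_streaming_job @name = N'SensorJob';

The job can later be paused or removed with sys.sp_stop_streaming_job and sys.sp_drop_streaming_job.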
Azure SQL Edge is also noteworthy for bringing machine learning directly to the edge by running ML models for edge devices. SQL Edge supports the Open Neural Network Exchange (ONNX) format, and models can be deployed with T-SQL. The model can be pre-trained or custom-trained outside SQL Edge with the framework of choice; it just needs to be converted to ONNX format. The ONNX model is simply inserted into a models table in the database, a connection string is sufficient for applications to send their data into SQL Edge, and the PREDICT function can then be run on the data using the model.
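The sketch below shows the general pattern, with assumed table names (dbo.models, dbo.sensor_data), an assumed model file path, and an assumed single "score" output column: the ONNX file is loaded into a varbinary(max) column and then passed to PREDICT with the ONNX runtime.

-- Store the pre-trained ONNX model as a binary blob.
CREATE TABLE dbo.models (
    model_id   INT IDENTITY(1,1) PRIMARY KEY,
    model_name NVARCHAR(100) NOT NULL,
    model      VARBINARY(MAX) NOT NULL
);

INSERT INTO dbo.models (model_name, model)
SELECT N'anomaly-detector', BulkColumn
FROM OPENROWSET(BULK N'/var/opt/mssql/models/model.onnx', SINGLE_BLOB) AS m;

-- Score incoming rows with PREDICT using the ONNX runtime.
DECLARE @model VARBINARY(MAX) =
    (SELECT model FROM dbo.models WHERE model_name = N'anomaly-detector');

SELECT d.deviceId, d.temperature, p.score
FROM PREDICT(MODEL = @model,
             DATA = dbo.sensor_data AS d,
             RUNTIME = ONNX)
WITH (score FLOAT) AS p;

The output columns declared in the WITH clause must match what the ONNX model actually produces, so the single float score here is only a placeholder.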
The ML pipeline is a newer technology compared to traditional software development stacks, and such pipelines have generally stayed on-premises, simply because of the latitude they allow in choosing frameworks and development styles. Experimentation can also outgrow the limits of the free tiers in the public cloud. In some cases, event-processing systems such as Apache Spark and Kafka find it easier to replace the Extract-Transform-Load solutions that proliferated with data warehouses. Using SQL Edge avoids the need to perform ETL, and the machine learning models are the end products. They can be hosted in a variety of environments, not just the cloud or SQL Edge, and some ML users prefer to load the model on mobile or edge devices. Many IoT experts agree that the streaming data from edge devices can be quite heavy in traffic, and that a database system will outperform other edge device-based computing: Internet TCP relays are on the order of 250-300 milliseconds, whereas the ingestion rate for database processing can be upwards of thousands of events per second. These are some of the benefits of using machine learning within the database.