This is a continuation of a series of articles on the operational engineering aspects of the Azure public cloud, the most recent of which discussed Azure Maps, a generally available service that offers Service Level Agreements comparable to others in its category. In this article, we explore Azure SQL Edge.
Azure SQL Edge is an optimized relational database engine geared toward edge computing. It provides a high-performance data storage and processing layer for IoT applications, with capabilities to stream, process, and analyze data that can vary from relational to document, graph, and time-series, which makes it a good fit for a variety of modern IoT applications. It is built on the same database engine as SQL Server and Azure SQL, so applications can seamlessly reuse queries written in T-SQL. This makes applications portable between devices, datacenters, and the cloud.
Azure SQL Edge provides the same streaming capabilities as Azure Stream Analytics on IoT Edge. This native implementation of data streaming is called T-SQL streaming, and it can handle fast streams from multiple data sources. A T-SQL streaming job consists of a stream input that defines the connection to a data source to read the data stream from, a stream output that defines the connection to a data source to write the data stream to, and a stream query that defines the transformations, aggregations, filtering, sorting, and joins to be applied to the input stream before it is written to the stream output.
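As a rough sketch, a T-SQL streaming job reading from and writing to the Edge Hub could be declared as follows; the data source, stream, topic, and column names (EdgeHubSource, TemperatureInput, EventTime, and so on) are all hypothetical:

```sql
-- All names below are illustrative.
CREATE EXTERNAL FILE FORMAT JsonFormat WITH (FORMAT_TYPE = JSON);

CREATE EXTERNAL DATA SOURCE EdgeHubSource WITH (LOCATION = N'edgehub://');

-- Stream input: reads sensor events from an Edge Hub topic
CREATE EXTERNAL STREAM TemperatureInput WITH (
    DATA_SOURCE = EdgeHubSource,
    FILE_FORMAT = JsonFormat,
    LOCATION = N'TemperatureSensors'
);

-- Stream output: writes aggregated results to another Edge Hub topic
CREATE EXTERNAL STREAM TemperatureOutput WITH (
    DATA_SOURCE = EdgeHubSource,
    FILE_FORMAT = JsonFormat,
    LOCATION = N'TemperatureAggregates'
);

-- Stream query: a 30-second tumbling-window average, then start the job
EXEC sys.sp_create_streaming_job
    @name = N'TemperatureJob',
    @statement = N'
        SELECT AVG(Temperature) AS AvgTemperature, System.Timestamp AS WindowEnd
        INTO TemperatureOutput
        FROM TemperatureInput TIMESTAMP BY EventTime
        GROUP BY TumblingWindow(second, 30)';

EXEC sys.sp_start_streaming_job @name = N'TemperatureJob';
```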
Data can be transferred in and out of SQL Edge. For example, data can be synchronized from SQL Edge to Azure Blob storage by using Azure Data Factory. As with all SQL instances, the client tools help create the database and the tables, and SqlPackage.exe is used to create and apply a DAC package (DACPAC) file to the SQL Edge container.
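As a rough sketch, publishing a DACPAC to a local SQL Edge container might look like the following; the file path, server address, and credentials are placeholders:

```
SqlPackage.exe /Action:Publish ^
    /SourceFile:"MyEdgeDatabase.dacpac" ^
    /TargetServerName:"localhost,1433" ^
    /TargetDatabaseName:"MyEdgeDatabase" ^
    /TargetUser:"sa" /TargetPassword:"<password>"
```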
A watermark table is used to store the last timestamp up to which data has already been synchronized with Azure Storage, and a stored procedure or trigger updates the watermark for a table after every synchronization.
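A minimal sketch of such a watermark table and the stored procedure that advances it, with illustrative names (WatermarkTable, usp_write_watermark) borrowed from the common incremental-copy pattern:

```sql
-- Illustrative schema: one watermark row per synchronized table
CREATE TABLE [dbo].[WatermarkTable] (
    TableName      NVARCHAR(255) NOT NULL PRIMARY KEY,
    WatermarkValue DATETIME2     NOT NULL
);
GO

-- Run after every synchronization to record the new high-water mark
CREATE PROCEDURE [dbo].[usp_write_watermark]
    @LastModifiedTime DATETIME2,
    @TableName        NVARCHAR(255)
AS
BEGIN
    UPDATE [dbo].[WatermarkTable]
    SET WatermarkValue = @LastModifiedTime
    WHERE TableName = @TableName;
END;
GO
```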
A Data Factory pipeline is used to synchronize data to Azure Blob storage from a table in Azure SQL Edge. It is created by using the Data Factory user interface, and the pipeline name, for example PeriodicSync, is set in its properties at the time of creation. A lookup activity is used to get the old watermark value, with a dataset created to represent the data in the watermark table; this table contains the old watermark that was used in the previous copy operation. A new linked service is created to source the data from the SQL Edge server using connection credentials. Once the connection is tested, it can be used to preview the data and eliminate surprises during synchronization.
The pipeline editor is a designer tool in which the WatermarkDataset is selected as the source dataset for that lookup. A second lookup activity gets the new watermark value from the table that contains the source data, so that new rows can be copied to the destination; a query is added in the pipeline editor that selects the maximum value of the timestamp column from the source table, and only the first row of the result is used as the new watermark.
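That lookup query can be as simple as the following, assuming a hypothetical source table [dbo].[DataTable] with a timestamp column:

```sql
-- The single row returned becomes the new watermark
SELECT MAX(timestamp) AS NewWatermarkValue
FROM [dbo].[DataTable];
```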
Incremental progress is maintained by continually advancing the watermark in this way. Not only the source but also the sink must be specified in the editor; the sink uses a new linked service to the Blob storage. The success output of a Copy activity, whose source query bounds the rows between the two watermarks (see the sketch below), is connected to a stored procedure activity that then writes the new watermark. Finally, the pipeline is scheduled to be triggered periodically.
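Putting the pieces together, the Copy activity's source query selects only the rows between the old and new watermarks. The lookup activity names (OldWatermark, NewWatermark) below are illustrative, and the @{...} placeholders use Data Factory's expression syntax:

```sql
-- Copy only the rows that arrived since the last synchronization
SELECT *
FROM [dbo].[DataTable]
WHERE timestamp > '@{activity('OldWatermark').output.firstRow.WatermarkValue}'
  AND timestamp <= '@{activity('NewWatermark').output.firstRow.NewWatermarkValue}';
```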