This is a continuation of the previous articles on Azure
Databricks and Overwatch analysis. This section focuses on the role-based
access control required for the setup and deployment of Overwatch.
The use of a storage account as a working directory for
Overwatch implies that it must be accessible from the Databricks
workspace. There are two ways to do this – one that uses Azure Active
Directory credential passthrough with direct ‘abfss://container@storageaccount.dfs.core.windows.net’
name resolution, and another that mounts the remote storage account as a folder
on the local file system.
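As a quick illustration of the two addressing styles, the same Overwatch folder can be referenced either by its direct abfss URI or through a mount point. The storage account, container, and folder names below are placeholders, not values from this deployment:

```python
# Direct resolution via the abfss scheme (requires AAD credential
# passthrough on the cluster); all names here are placeholders.
direct_path = "abfss://overwatch@mystorageacct.dfs.core.windows.net/reports"

# The same folder addressed through a mount point created once up front.
mounted_path = "/mnt/overwatch/reports"
```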
The former requires that the cluster be enabled for Active
Directory credential passthrough. It works for directly resolving the
deployment and report folders, but for contents whose layout is dynamically
determined, the resolution is expensive each time. The abfss scheme also fails
with a 403 error when tokens are demanded for certain activities. The
second way, mounting, instead requires only a one-time setup. The mount is created with the
help of a service principal that obtains OAuth tokens from Active Directory,
and the mount point then becomes the prefix for all temporary files and folders.
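In a Databricks notebook, the mount described above might be set up along these lines. The tenant id, application id, secret, storage account, and container names are placeholders, not values from this deployment:

```python
# OAuth configuration for a service principal; every identifier below is
# a placeholder that would come from your own Azure AD app registration.
tenant_id = "<tenant-id>"
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<service-principal-app-id>",
    "fs.azure.account.oauth2.client.secret": "<client-secret>",
    "fs.azure.account.oauth2.client.endpoint":
        f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
}

def mount_overwatch_storage(dbutils):
    """One-time mount of the Overwatch container.

    dbutils is only available inside a Databricks runtime, so it is taken
    as a parameter here rather than referenced globally.
    """
    dbutils.fs.mount(
        source="abfss://overwatch@mystorageacct.dfs.core.windows.net/",
        mount_point="/mnt/overwatch",
        extra_configs=configs,
    )
```

In a real notebook the client secret would be read with dbutils.secrets.get from a secret scope rather than inlined.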
Using the credentials with Azure Active Directory only
works when there are corresponding role assignments and container/blob access
control lists. The role assignments for the control plane differ from those of
the data plane, so there are roles for both. This separation of roles allows
access to certain containers and blobs without necessarily allowing access to
change the storage account and container organization or management. With ACLs
applied to individual files/blobs and folders/containers, authentication, authorization, and auditing
are completely covered and scoped at the finest granularity.
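A minimal sketch of applying such a fine-grained ACL for a service principal on one directory follows, assuming the azure-storage-file-datalake package; the account URL, container, path, and object id are placeholders:

```python
def acl_entry(principal_oid: str, perms: str = "r-x") -> str:
    """Build a POSIX-style ADLS Gen2 ACL entry for a principal's object id."""
    return f"user:{principal_oid}:{perms}"

def grant_read_acl(account_url: str, container: str, path: str,
                   credential, principal_oid: str) -> None:
    """Apply a read/execute ACL for one principal on one directory.

    Requires azure-storage-file-datalake; imported lazily so the helper
    above stays usable without it.
    """
    from azure.storage.filedatalake import DataLakeServiceClient  # third-party
    service = DataLakeServiceClient(account_url=account_url,
                                    credential=credential)
    directory = (service.get_file_system_client(container)
                        .get_directory_client(path))
    # Keep the owning user/group entries and append the principal's entry.
    directory.set_access_control(
        acl=f"user::rwx,group::r-x,other::---,{acl_entry(principal_oid)}"
    )
```

The same effect can be had from the portal or the CLI; the point is that the ACL is scoped to one principal on one path, independent of the control-plane role assignments.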
Then queries like the following can come in very handy:
1. Frequent operations can be queried with:

StorageBlobLogs
| where TimeGenerated > ago(3d)
| summarize count() by OperationName
| sort by count_ desc
| render piechart
2. High-latency operations can be queried with:

StorageBlobLogs
| where TimeGenerated > ago(3d)
| top 10 by DurationMs desc
| project TimeGenerated, OperationName, DurationMs, ServerLatencyMs, ClientLatencyMs = DurationMs - ServerLatencyMs
3. Operations causing the most errors can be found with:

StorageBlobLogs
| where TimeGenerated > ago(3d) and StatusText !contains "Success"
| summarize count() by OperationName
| top 10 by count_ desc
4. The number of read transactions and the number of bytes read on each container are given by:

StorageBlobLogs
| where OperationName == "GetBlob"
| extend ContainerName = split(parse_url(Uri).Path, "/")[1]
| summarize ReadSize = sum(ResponseBodySize), ReadCount = count() by tostring(ContainerName)