Tuesday, October 3, 2023

 

This article continues the earlier series on Azure Databricks and Overwatch observability.

Overwatch dashboards are often used to view trends and plots of the collected data. The dashboards that ship with Overwatch provide a detailed set of charts under the Workspace, Clusters, Jobs, and Notebooks categories, but the underlying tables and custom SQL queries make it possible to build new, more advanced charts that suit specific business requirements. As a best practice, a comprehensive dashboard for monitoring an organization’s Databricks workspaces should cover the following dimensions.

1. Databricks workload types:

 - Jobs Compute for data engineers

 - Jobs Light Compute for data analysts

 - All-Purpose Compute (can also run jobs, for backward compatibility)

 

2. Consumption based:

 - DBUs

 - Virtual Machines

 - Public IP addresses

 - Blob Storage

 - Managed Disk

 - Bandwidth

 

3. Pricing plans

  - Pay as you go

  - Reservations: DBU pre-purchase as Databricks Commit Units (DBCU) for 1 or 3 years

      - DBU SKU

      - VM SKU

      - DBU count for each VM

      - region

      - duration

 

4. Tags based:

  - Cluster Tags

  - Pool Tags

  - Workspace Tags

  Tags propagate differently depending on how the cluster was created:

  a. Clusters created from pools

  - DBU Tag = Workspace Tag + Pool Tag + Cluster Tag

  - VM Tag = Workspace Tag + Pool Tag

  b. Clusters not created from pools

  - DBU Tag = Workspace Tag + Cluster Tag

  - VM Tag = Workspace Tag + Cluster Tag
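
The propagation rules above can be sketched in Python as a merge of tag dictionaries. This is only an illustration, not Databricks code: the function name `propagated_tags`, the sample tag keys and values, and the choice that a later source overrides an earlier one on a duplicate key are all assumptions made for the example.

```python
from typing import Dict, Optional

Tags = Dict[str, str]


def propagated_tags(workspace: Tags, cluster: Tags,
                    pool: Optional[Tags] = None) -> Dict[str, Tags]:
    """Return the tags that end up on DBU and VM usage.

    pool is None for a cluster that was not created from a pool.
    Assumption: on a duplicate key, the later source in the merge wins.
    """
    if pool is not None:
        # Cluster created from a pool
        dbu = {**workspace, **pool, **cluster}
        vm = {**workspace, **pool}
    else:
        # Cluster not created from a pool
        dbu = {**workspace, **cluster}
        vm = {**workspace, **cluster}
    return {"dbu": dbu, "vm": vm}


# Hypothetical tag values, for illustration only.
workspace_tags = {"costCenter": "1234"}
pool_tags = {"team": "data-eng"}
cluster_tags = {"project": "etl"}

pooled = propagated_tags(workspace_tags, cluster_tags, pool_tags)
standalone = propagated_tags(workspace_tags, cluster_tags)

print(pooled["dbu"])     # workspace + pool + cluster tags
print(pooled["vm"])      # workspace + pool tags only
print(standalone["vm"])  # workspace + cluster tags
```

The practical consequence is that a cluster tag on a pooled cluster shows up on the DBU charge but not on the VM charge, so a chart grouped by that tag would miss the VM portion of the cost.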

 

5. Cost calculation:

   Quantity = Number of Virtual Machines x Number of hours x DBU count

   Effective Price = DBU price based on the SKU

   Cost = Quantity x Effective Price

   Effective Cost = Organizational markup factor x Cost
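
The formulas above can be followed step by step in a short Python sketch. The function name `dbu_cost`, its parameter names, and all the numbers in the example are hypothetical; real DBU counts and prices depend on the VM SKU, the DBU SKU, and the pricing plan.

```python
def dbu_cost(vm_count, hours, dbu_per_vm, dbu_price, markup_factor=1.0):
    """Apply the cost formulas in order and return every intermediate value."""
    # Quantity = Number of Virtual Machines x Number of hours x DBU count
    quantity = vm_count * hours * dbu_per_vm
    # Cost = Quantity x Effective Price (the DBU price for the SKU)
    cost = quantity * dbu_price
    # Effective Cost = Organizational markup factor x Cost
    effective_cost = markup_factor * cost
    return quantity, cost, effective_cost


# Hypothetical inputs: 2 VMs for 5 hours, 2 DBUs per VM,
# a DBU price of 0.5, and an organizational markup of 1.5.
quantity, cost, effective_cost = dbu_cost(
    vm_count=2, hours=5, dbu_per_vm=2.0, dbu_price=0.5, markup_factor=1.5
)
print(quantity)        # 20.0
print(cost)            # 10.0
print(effective_cost)  # 15.0
```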

 

Cost/Usage Dashboard - get started in Azure Portal:

   Cost Management + Billing

   Cost Management > Cost analysis tab

 

Cost/Usage Dashboard - get started in Dashboards on the Databricks workspace hosting Overwatch:

Sample query:

SELECT sku, isActive, any_value(contract_price) * count(*) AS cost
FROM overwatch.`dbucostdetails`
WHERE isActive = true
GROUP BY sku, isActive;

 

sku          isActive  cost
jobsLight    True      0.30000000000000004
interactive  True      1.6500000000000001
sqlCompute   True      0.66
automated    True      0.30000000000000004
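
The long fractional parts in the cost column (for example 0.30000000000000004) are typical of binary floating-point multiplication rather than a real pricing difference. The sketch below reproduces the effect in Python and shows two common ways to present a cleaner figure. The price 0.10 and the row count 3 are hypothetical values chosen only because they reproduce those digits; they are not taken from the Overwatch table.

```python
from decimal import Decimal

# Hypothetical inputs, not values read from overwatch.dbucostdetails.
price = 0.10
rows = 3

# Plain float multiplication shows the representation error.
raw = price * rows
print(raw)            # 0.30000000000000004

# Option 1: round for display.
print(round(raw, 2))  # 0.3

# Option 2: use exact decimal arithmetic.
exact = Decimal("0.10") * rows
print(exact)          # 0.30
```

In the dashboard query itself, wrapping the cost expression in SQL's round(..., 2) should have the same presentational effect.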
