Friday, October 6, 2023

 

Azure Application Gateway and App Services are created, by default, for access from the public internet. When organizations want to take these resources private, they often struggle to maintain business continuity given their own network structures and rules, and the limitations and errors encountered when attempting to wire the resources together. This article explains how these resources can be made private with little or no disruption.

Both resources are complex, with many features and possible configurations. Even the networking section offers many choices under its inbound and outbound sections, and the dreaded 403 and 502 errors are commonly encountered. Code hosted in an app service may find it can connect to a storage account or event hub through virtual network (vnet) integration, and may also want a dedicated private connection to another resource or network, yet these options have requirements that differ from one another. For example, to create a private endpoint, the private endpoint network policies must be disabled, and the subnet must have no delegation and must have available IP addresses. The setting for disabling the private endpoint network policies can be hard to find in the management portal user interface. Once the endpoints are created, they must be associated with the privatelink.azurewebsites.net DNS zone so they can be reached from other resources. Certain subnets cannot be used simply because a conflicting resource is already placed there, and the private endpoint and the vnet integration must not share the same network.

Consequently, taking a resource private requires the organization to pre-create subnets and even a DNS zone specifically for ‘privatelink.azurewebsites.net’. Then the other resources must be connected to the app service. An application gateway, for instance, requires a DNS zone group to be created so that it can resolve the app services by name; this step is often overlooked after the endpoints are created on the app services. Similarly, private virtual network links must be created.
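Under assumed placeholder names (resource group rg-apps, vnet vnet-apps, subnet snet-endpoints, app service app-contoso), these pre-creation steps can be sketched with the Azure CLI. Exact flag names vary across CLI versions, so treat this as an outline rather than a verified script:

```shell
# The subnet for private endpoints: no delegation, endpoint network policies disabled
az network vnet subnet update \
  --resource-group rg-apps --vnet-name vnet-apps --name snet-endpoints \
  --disable-private-endpoint-network-policies true

# Private endpoint for the app service
appid=$(az webapp show --resource-group rg-apps --name app-contoso --query id -o tsv)
az network private-endpoint create \
  --resource-group rg-apps --name pe-app-contoso \
  --vnet-name vnet-apps --subnet snet-endpoints \
  --private-connection-resource-id "$appid" \
  --group-id sites --connection-name pe-conn-app-contoso

# Private DNS zone plus a virtual network link so names resolve inside the vnet
az network private-dns zone create \
  --resource-group rg-apps --name privatelink.azurewebsites.net
az network private-dns link vnet create \
  --resource-group rg-apps --zone-name privatelink.azurewebsites.net \
  --name link-vnet-apps --virtual-network vnet-apps --registration-enabled false

# DNS zone group, so the endpoint's record lands in the zone automatically
az network private-endpoint dns-zone-group create \
  --resource-group rg-apps --endpoint-name pe-app-contoso \
  --name default --zone-name privatelink \
  --private-dns-zone privatelink.azurewebsites.net
```

The DNS zone group is the piece most often missed; without it, the application gateway resolves the app service to its public address and the backend health probe fails.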

It is in the interest of the deployment to create a single unified virtual network on which all the resources and their networks are placed. Distinct virtual networks (vnets) often result from independent initiatives and then require peering or links to be established. The same is true of creating too many subnets, which exhausts IP address ranges that are often underutilized. Devices connected to a subnet take their IP addresses from the subnet’s CIDR range, and this information comes in handy for finding subnets that are unused and can be repurposed. Once the vnet and subnets are created, the options to add network security groups and gateways can be decided. Traffic from the virtual networks and subnets is hard to visualize, but by enumerating the resources and their default routes to the internet, it is possible to place the gateways appropriately; otherwise those resources might not have outbound internet connectivity.

Finally, for the application gateway to be allowed to access resources and networks as its backend pool members, its address must be allowed in the access restrictions of all those resources and networks.
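As a sketch, allowing the application gateway on an app service's access restrictions might look like the following with the Azure CLI; the resource names and the 10.0.1.0/24 prefix (the gateway's subnet) are placeholders:

```shell
# Allow traffic from the application gateway's subnet on the app service
az webapp config access-restriction add \
  --resource-group rg-apps --name app-contoso \
  --rule-name AllowAppGateway --action Allow \
  --ip-address 10.0.1.0/24 --priority 100
```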

A working example of this description is available here: network4apps.zip

Wednesday, October 4, 2023

 

This is a summary of the book “Think Bigger – How to Innovate” by Sheena Iyengar, a professor of business in the Management Division at Columbia Business School who teaches choice and decision-making.

This book builds on decades of research on creativity and human psychology and models the real-life creative process in six specific and actionable steps. It provides a structure for rigorous idea generation and vetting, applicable to everyone from corporate teams to individual artists and entrepreneurs.

She argues that creativity is not a rare and innate gift. The popular distinction between left-brained and right-brained people is also incorrect, and creativity is not a particular type of brain activity. When it is broken down, creativity resolves into building blocks familiar to everyone, and it is a skill that we can learn and practice. Killer applications, groundbreaking artwork, and disruptive business ideas are all end results of the same process: creators recycle existing parts to create something novel. “All thinking is an act of memory in some form.”

The Think Bigger process builds on Learning + Memory, the leading neuroscientific model of the brain. This theory places memory at the center of human mental activity. It argues that even solving a math problem is not purely logical but involves remembering facts and recombining those memories to find the answer. Going so far as to hold that the quality of an idea is proportionate to the memories stored on the shelves of the brain, it describes innovation in terms of cognitive tools we already possess.

Prior research has emphasized the following areas: personal qualities, such as curiosity and persistence; workspace, where an optimal space with no distractions still fosters casual connections with others; structure, which helps when people face too many options; and going solo, since individuals produce more unique ideas alone than in a group. We can complete each step of Think Bigger on our own before discussing it with others.

Innovation starts by identifying a problem we are motivated to solve and can feasibly solve. Without a well-defined problem, there is a long list of creations that all failed. If we are struggling to define a problem, taking daily notes may spark a sense of purpose. Phrasing the problem as a question that begins with “How” is one of the classic ways to get started.

With a problem in hand, we can break it down into parts for which we gather input from experts, potential users, and non-experts. As these generate leads for thinking bigger, we move on to the next step once we have clarity over 80% of the problem space.

A good solution satisfies the requirements of the target audience, the interests of third-party stakeholders, and the desires of the innovator. These three groups are essential to the solution and might warrant different approaches. By articulating our own desires in writing while interviewing the target audience and stakeholders, we build a list of three to five key wants for each group.

Next, we structure the solution by using a Choice Map and Big Picture score. “The best way to think outside the box is to literally go into other boxes.” We split the search for solutions to sub-problems in two areas: “in domain” and “out of domain”. When the choice map is filled out, we are ready to start combining tactics to find an overall solution.

Before committing to our idea, we must learn how others react to it. By explaining to others, we change, refine, or expand our idea. There are four feedback exercises.

The first is verbalization. Describing the idea to ourselves by reading and writing may be enough to change the way we see it. Describing it to others almost certainly will.

The second exercise gathers experts’ reactions. After describing the problem, the solution, and its significance, we ask neutral questions, such as how we might improve the idea.

The third exercise gauges whether others’ impressions of our idea align with our own. We ask non-experts to say the idea back to us, but give it some time and then check what they recollect.

The final exercise is to describe the solution again but giving our listeners free rein to reimagine our idea. Their answers will lead to further insights and possibilities.

Software for summarizing text: https://booksonsoftware.com/text/

Tuesday, October 3, 2023

 

This is a continuation of previous articles on Azure Databricks and Overwatch observability:

One of the frequent uses of Overwatch’s dashboard is to view trends and plots from the data collected. The dashboards that come with Overwatch provide a detailed set of charts under the Workspace, Clusters, Jobs, and Notebooks categories, but the underlying tables and custom SQL queries make it possible to create new and advanced charts that suit specific business requirements. The following are some dimensions that a comprehensive dashboard for an organization’s Databricks workspace monitoring should show, from a best-practice perspective.

1. Databricks workload types:

 - Jobs Compute for data engineers

 - Jobs Light Compute for data analysts

 - All Purpose Compute (backwards compatible to execute jobs)

 

2. Consumption based:

 - DBUs

 - Virtual Machines

 - Public IP addresses

 - Blob Storage

 - Managed Disk

 - Bandwidth

 

3. Pricing plans

  - Pay as you go

  - Reservations - DBU/DBCU 1/3 years

      - dbu sku

      - vm sku

      - dbu count for each vm

      - region

      - duration

 

4. Tags based:

  - Cluster Tags

  - Pool Tags

  - Workspace Tags

  Tags can propagate with

  a. clusters created from pools

  - DBU Tag = Workspace Tag + Pool Tag + Cluster Tag

  - VM Tag = Workspace Tag + Pool Tag

  b. clusters not from pools

  - DBU Tag = Workspace Tag + Cluster Tag

  - VM Tag = Workspace Tag + Cluster Tag
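The propagation rules above amount to a union of the tag sets at each level. A minimal sketch, with hypothetical tag values:

```shell
# Hypothetical tags at each level
workspace_tag="env:prod"
pool_tag="pool:etl"
cluster_tag="team:data"

# Cluster created from a pool
dbu_tag_pooled="$workspace_tag,$pool_tag,$cluster_tag"
vm_tag_pooled="$workspace_tag,$pool_tag"

# Cluster created directly (not from a pool)
dbu_tag_direct="$workspace_tag,$cluster_tag"
vm_tag_direct="$workspace_tag,$cluster_tag"

echo "pooled DBU tags: $dbu_tag_pooled"
echo "pooled VM tags:  $vm_tag_pooled"
```

Note the asymmetry: for pooled clusters the VM carries only workspace and pool tags, so cost reports keyed on cluster tags must join through the DBU records.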

 

5. Cost calculation:

   Quantity = Number of Virtual Machines x Number of hours x DBU count

   Effective Price = DBU price based on the SKU

   Cost = Quantity x Effective Price

   Effective Cost = Organizational markup factor * Cost

 

Cost/Usage Dashboard - get started in Azure Portal:

   Cost Management + Billing

   Cost Management + Cost analysis Tab

 

Cost/Usage Dashboard – get started in Dashboards on Databricks workspace hosting Overwatch:

Sample query:

select sku, isActive, any_value(contract_price) * count(*) as cost from overwatch.`dbucostdetails`

group by sku, isActive

having isActive = true;

 

sku          isActive  cost
jobsLight    True      0.30000000000000004
interactive  True      1.6500000000000001
sqlCompute   True      0.66
automated    True      0.30000000000000004

Monday, October 2, 2023

 

Sample customized queries for dashboard visualizations from Overwatch schema:

1.       select sku, isActive, any_value(contract_price) * count(*) as cost from overwatch.`dbucostdetails`

group by sku, isActive

having isActive = true;

 

sku          isActive  cost
jobsLight    True      0.30000000000000004
interactive  True      1.6500000000000001
sqlCompute   True      0.66
automated    True      0.30000000000000004

 

2.       SELECT created_by, count(*) FROM (SELECT DISTINCT cluster_id, created_by FROM overwatch.`cluster`)

GROUP BY created_by

ORDER BY count(*) desc

limit 1000;

 

created_by   count(1)
JobsService  20051
User1        13
User2        13
User3        6
User4        3
User5        2
User6        1

 

3.       SELECT cluster_id, SUM(uptime_in_state_S) as uptime FROM overwatch.clusterstatefact

GROUP BY cluster_id

ORDER BY uptime DESC

limit 1000;

 

cluster_id            uptime
0822-134022-ssn7p7zy  2656586.3910000008
0909-211040-g7gw6ze   2655716.523000001
0914-142202-nx0u3s1a  2634530.8240000005
0907-170325-qf4ypd19  2611126.8639999996
0109-204324-dba1c5o   2602285.5589999994
0831-160354-2gds4r56  2601205.147000001
0728-171334-wqfvw8lm  2599745.636
1220-150950-1xfqwfeq  2533890.514
0828-204151-rqw3um2a  1986805.3609999998
0302-190420-h8rv9prn  1983515.9470000002
0803-144506-g98h4fl2  1975430.0520000001
0908-095703-w31xe9fb  1842740.3310000005
0917-185549-g4n3dqjl  1052153.248
0918-031805-t3zdjacw  1002694.213

 

4.       SELECT created_by, sum(total_dbu_cost) as sum_dbu_cost FROM

(SELECT distinct cluster_id, job_id, created_by, terminal_state, total_dbu_cost from overwatch.jobruncostpotentialfact where  terminal_state = "Succeeded")

GROUP BY created_by

HAVING created_by != 'null'

ORDER BY sum_dbu_cost desc

limit 1000;

 

 

created_by  sum_dbu_cost
User1       253.60490000000007
User2       83.07065199999978
User3       80.84025400000019
User4       58.004314
User5       56.34171099999961
User6       49.40466399999997
User7       12.238729
User8       2.528845
User9       1.4531079999999597
User10      0.4258950000000001
User11      0.30644
User12      0.17414799999999972

 

 

Sunday, October 1, 2023

Network for applications

 


Saturday, September 30, 2023

 

This is a continuation of a previous article on the use of Artificial Intelligence and Product Development. This article talks about the bias against AI as outlined in reputed journals.

A summary of the bias against AI is that some of it comes from inaccurate information produced by generative AI, while some comes from bias served up by the AI tools themselves. These are overcome with a wider range of datasets; AI4ALL, for instance, works to feed AI a broad range of content so that it is more inclusive of the world. Another concern has been over-reliance on AI, and a straightforward way to resolve this is to balance the use of AI with tasks requiring skilled supervision.

The methodical approach to managing bias involves three steps: first, data and design must be decided; second, outputs must be checked; and third, problems must be monitored.

Complete fairness is impossible, in part because decision-making committees are rarely diverse enough, and because choosing an acceptable threshold for fairness and determining whom to prioritize are challenging. This makes a single blueprint for fairness in AI across companies and situations daunting. An algorithm can check for adequate representation or apply a weighted threshold, and both are in common use, but unless equal numbers of each class are included in the input data, these selection methods are mutually exclusive. The choice of approach is therefore critical. Along with choosing the groups to protect, a company must determine the most important issue to mitigate: differences could stem from the sizes of the groups or from accuracy rates between the groups. The choices might form a decision tree, where the decisions must align with company policy.

Missteps remain common. Voice recognition, for example, can leverage AI to reroute sales calls but might be prone to failure with regional accents. In this case, fairness could be checked by creating a more diverse test group. The final algorithm and its fairness tests need to consider the whole population, not just those who made it past the early hurdles. Model designers must accept that data is imperfect.

The second step, checking outputs, involves checking fairness by way of intersections and overlaps in data types. Even when companies have good intentions, there is a danger that an ill-considered approach can do more harm than good; an algorithm that is deemed neutral can still have a disparate impact on different groups. One effective strategy is a two-model solution, such as the generative adversarial networks approach: the original model is balanced against a second model that checks for fairness to individuals, and the two converge to produce a more appropriate and fair solution.

The third step is to create a feedback loop. Frequently examining the output and looking for suspicious patterns on an ongoing basis is important, especially where the input evolves over time. Since bias usually goes unnoticed, this can catch it. A fully diverse outcome can look surprising, so people may reinforce bias when developing AI; this is evident in rare events, where people may object when one occurs but not when it fails to happen. A set of metrics, such as precision and recall, can be helpful, since predictive factors and error rates are affected. Ongoing monitoring can be rewarding: demand forecasting, for example, can show improved accuracy by adapting to changes in data and correcting for historical bias.

A conclusion is that bias may not be eliminated but it can be managed.

 

Friday, September 29, 2023

This is a summary of the book titled “The Power of Not Thinking” by Simon Roberts, a business anthropologist, who describes embodied knowledge of a kind that has not been inculcated into Artificial Intelligence. Embodied knowledge derives from the body through movement, muscle memory, sight, hearing, taste, smell, and touch. It includes experiences that evoke deep sensory memories, allowing us to take actions without thought and to recognize patterns. These embodied memories enable us to feel, rather than merely reason, our way through many decisions. He makes a case for companies to pair data with experiential learning.

A long tradition has created a dichotomy between mind and body in which thinking belongs to the brain, but we learn in ways different from computers. He takes the example of driving: we take the wheel, feel the road, engage both body and brain along with our common sense, and master the skill over time until we can drive on autopilot. AI, on the other hand, depends on sensors and pattern recognition, processing inputs in milliseconds and responding immediately. Neither can cope with every driving situation, but experienced drivers manage to do so automatically.

The idea that mind and body are distinct, also called Cartesian dualism, regards the body as a thing that the mind operates. By dismissing senses and emotions as unreliable inputs, this worldview initiated the scientific method, experimentation, and evidence-based thinking. Yet human intellect is not merely a product of the brain; the body’s engagement with its surroundings also forges comprehension of the world. Both the body and the brain gain knowledge, and experience and routine help us create embodied knowledge.

Embodied knowledge is acquired through the following five methods:

Observation – an experience involving the whole body: watching a tennis stroke, for example, we feel the grip and hear the racket hitting the ball, triggering the same reactions in the brain and the body as when we actually play.

Practice – Observing others ride is not enough to learn to ride a bike; acquiring new skills like skiing or sailing demands experience, practice, observation, and instruction. With more experience and practice, we can do the activity without thinking.

Improvisation – AI is still governed by supervised learning and big data. For humans, on the other hand, judgement based on incomplete information proves crucial; firefighters, for example, learn to sense how structures will collapse because they can feel it.

Empathy – experiencing how another person uses a tool or navigates the world goes beyond reading about it or talking to them.

Retention – when we taste or smell, memories flood the mind, demonstrating that recollection resides in the body as well as the brain.

Firms spend a lot to collect and crunch data, but through experience, decision makers can make better use of that data. When leaders at Duracell wanted to understand their market for outdoor adventures, they pitched tents in the dark, cooked in the rain, and slept in a range of temperatures. This helped them pair their insights with the data analysis, and the resulting campaign was one of their most successful. The author asserts that statistics can tell a relevant story but have limited ability to tell a nuanced human story. Policymakers, just like business leaders, can also benefit from this dual approach, and the author provides examples for that as well.

Software developers are improving AI and robots by introducing state read from sequences, and they have found that AI that learns through trial and error can outperform some humans in the most complex games. At the same time, it is our embodiment that makes our intelligence hard to reproduce.