Cluster computing

Tuesday, October 10, 2023

This is a summary of the book “How to Stay Smart in a Smart World: Why Human Intelligence Still Beats Algorithms” written by Gerd Gigerenzer and published by MIT Press in 2022. The author is a psychologist known for his work on bounded rationality who directs the Harding Center for Risk Literacy at the University of Potsdam. He is also a partner at Simply Rational – The Decision Institute.

Recent advances in artificial intelligence have set a different form of intelligence beside our own and raise a question about the role of each. Reactions range from embracing AI openly to being apprehensive about its prevalence or dominance; the author takes a cautious approach that plays to the strengths of each form and avoids its weaknesses. With several examples and case studies, he argues that machine intelligence works well in stable environments with well-defined rules, while human intelligence will never lose its relevance outside that world.

The salient points of this book are as follows. AI excels in stable environments and follows rules dictated by humans, but it does not perform well in dynamic environments filled with uncertainty. Humans must stay involved with AI to get the best results. In unexplored territory, simple and transparent algorithms perform better than complex ones. Among the negative impacts, the ad-based business model of social media platforms stands out. It is possible to separate human activity from machine-supervised activity with a clear demarcation; for example, self-driving cars could be given their own dedicated lanes where possible. Market hype and profit incentives can lead companies to overpromise and underdeliver on digital technologies.

AI wins hands down in many games such as chess and Go because the rules of the game are fixed, the system is tuned by human experts, and it uses brute-force calculation to determine the best possible move. The better defined and more stable the premise, the better the performance. The flip side shows up in facial recognition, which works 99.6% of the time under stable conditions; in dynamic environments, the number drops significantly. When UK police scanned the faces of 170,000 soccer fans in a stadium for matches against a criminal database, 93% of the matches were false.

AI is good at finding correlations in huge amounts of data, even some that would have escaped humans, but it cannot recognize scenarios or deal with ambiguity. For example, Maine’s divorce rate and the United States’ per capita consumption of margarine are significantly correlated, but the correlation makes no sense. Such spurious findings are hard to replicate, and they lead to a lot of waste and error in areas such as health science and biotechnology, to the tune of hundreds of billions of dollars. Assertions made today, such as eating blueberries to prevent memory loss, eating bananas to get a higher verbal SAT score, or eating kiwis late at night to sleep better, may well be reversed in due time.

Whenever the effectiveness of AI decreases, human intervention can significantly boost its performance. The human brain has a remarkable ability to adapt to constantly changing cues, contexts and situations, which is termed vicarious functioning. Staying smart means leveraging AI’s capabilities while staying in charge. AI lacks four components of common sense: a capacity to think causally, an awareness of others’ intentions and feelings, a basic understanding of space, time and objects, and a longing to join in group norms. Some tasks, like recommending the nearest restaurant, do not need common sense, but deciding whether a person crossing the road in a war zone is a threat requires it.

Complex problems do not necessarily justify complex solutions. Google Flu Trends tried to predict the spread of flu with approximately 160 search terms, but it still overpredicted doctors’ visits. In comparison, an algorithm from the Max Planck Institute for Human Development used just one data point, the most recent visits to the doctor as reported on the CDC website, and performed much better in predicting the flu’s spread.
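The single-data-point idea can be illustrated with a minimal sketch. It assumes the rule is simply "next week will look like the most recent week"; the function names are hypothetical and the weekly figures are made up for illustration, not CDC data.

```python
from typing import List


def recency_forecast(observed: List[float]) -> List[float]:
    """Forecast each week as the value observed in the week before it.

    The first week has no predecessor, so forecasts start at week 2.
    """
    return observed[:-1]


def mean_absolute_error(forecast: List[float], actual: List[float]) -> float:
    """Average absolute gap between the forecasts and what was observed."""
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)


# Made-up weekly percentages of doctor visits that were flu-related.
weekly_visits = [1.2, 1.5, 2.1, 3.0, 2.8, 2.2]

forecast = recency_forecast(weekly_visits)  # predictions for weeks 2..6
actual = weekly_visits[1:]                  # observations for weeks 2..6
error = mean_absolute_error(forecast, actual)
```

A rule like this has no parameters to tune, which is what makes it transparent and easy to compare against a more complex model on the same data.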

Information served subliminally, or without our knowing, has the potential to alter our behavior. This is why the ad-based model for social media can be harmful: it creates distractions. With attention-control technology, the user is held captive by these algorithms. Texting while driving caused about 3,000 deaths per year in the United States between 2010 and 2020. In areas other than driving, too, smartphones have proven to be very distracting.

Finally, the business aspect of artificial intelligence must be understood in the context of historical trends with killer technologies and the commerce behind them. The author says we should be able to profit from AI but not be easily misled by expectations and predictions.

Earlier book summaries: BookSummary10.docx

Monday, October 9, 2023

Locking:

Resources can be locked to prevent unexpected changes. A subscription, resource group or resource can be locked to prevent other users from accidentally deleting or modifying critical resources. The lock overrides any permissions the users may have. The lock level can be set to CanNotDelete or ReadOnly, with ReadOnly being more restrictive. Locks are inherited: when a lock is applied at a parent scope, all resources within that scope inherit the same lock. Some considerations still apply after locking. For example, a CanNotDelete lock on a storage account does not prevent data within that account from being deleted. A ReadOnly lock on an application gateway prevents you from getting the backend health of the application gateway because that operation uses POST. Among the built-in roles, only Owner and User Access Administrator are granted access to Microsoft.Authorization/locks/* actions.

When the IaC is applied, it can be quite frustrating to find resources locked in the public cloud, which prevents the IaC actions from completing. For example, a resource might have a private endpoint, which in turn might be associated with a DNS entry and a private NIC, and locks on these sub-resources prevent the private endpoint from being deleted, which in turn fails the IaC application. The resolution for the owner of the subscription is to delete the lock from the said resource via the Azure Portal or the command-line interface and then proceed to apply the IaC, iterating over the ‘unlock’ and ‘apply’ steps until there are no further obstructions.

While this works for a role with elevated privileges, many developers who use the CI/CD pipeline credentials to make changes to the subscription do not have that privilege and might find the problem harrowing to resolve without external intervention. One way to overcome this is to run the unlock commands in a pipeline step prior to the application of the IaC. Fortunately, there are ways to unlock at the subscription scope rather than resource by resource. Even so, it might not be clear when the locks reappear, and the unlocking might need to be repeated. It is good practice to check the policies to make sure that locking is not enforced automatically, since that would interfere with infrastructure changes made by code; the policies can also reveal the intent behind the locking. If the locking is simply meant to prevent accidental deletions across a broad range of resources, then unlocking in order to apply the changes is straightforward.
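The unlock step described above could be sketched as follows. This is a minimal illustration that assumes the lock listing has already been fetched as JSON (for example with `az lock list --output json`) and that each entry carries its full resource id in an `id` field. It only builds the `az lock delete --ids ...` commands and does not run them. The function name and the sample lock are hypothetical.

```python
import json
from typing import List


def build_unlock_commands(lock_list_json: str) -> List[List[str]]:
    """Turn the JSON of a lock listing into one delete command per lock."""
    locks = json.loads(lock_list_json)
    commands = []
    for lock in locks:
        # Assumption: each lock entry exposes its full resource id as "id".
        commands.append(["az", "lock", "delete", "--ids", lock["id"]])
    return commands


# Hypothetical sample shaped like a lock listing; the ids are placeholders.
sample_locks = json.dumps([
    {
        "id": "/subscriptions/0000/resourceGroups/rg-demo/providers/"
              "Microsoft.Authorization/locks/no-delete",
        "level": "CanNotDelete",
        "name": "no-delete",
    }
])

unlock_commands = build_unlock_commands(sample_locks)
```

In a pipeline, each command list could be handed to `subprocess.run` before the apply step, provided the pipeline identity is allowed to delete locks.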

Let us take a specific association, say between a firewall and a network resource such as a gateway. The firewall must be associated with the gateway to block unwanted traffic flowing through that appliance. While they remain associated, each remembers the identifier and the state of the other. Initially, the firewall may remain in detection mode, where it is merely passive. It becomes active in prevention mode. When an attempt is made to toggle the mode, the association prevents it. Neither end of the association can tell what state to be in without exchanging information, and when they are deployed or updated in place, neither knows about nor informs the other.

The above resolutions are easy when the error messages are descriptive and indicate that the failure of the IaC is exclusively due to locks. There are other forms of errors where the cause may not be straightforward. In such cases, the activity log on the resource or at the subscription level can be quite helpful, because the JSON content of a logged event explains exactly what happened. This feature also helps to determine whether something was changed by an actor other than the deployment of the infrastructure changes.
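Filtering the activity log for failures can be sketched as below. The field names (`operationName.value`, `status.value`, `resourceId`, `caller`) are assumptions about the shape of the exported events and should be checked against the actual JSON. The function name and sample events are hypothetical.

```python
import json
from typing import Dict, List


def failed_operations(activity_log_json: str) -> List[Dict[str, str]]:
    """Return a compact summary of every event whose status is 'Failed'."""
    events = json.loads(activity_log_json)
    failures = []
    for event in events:
        status = event.get("status", {}).get("value", "")
        if status.lower() != "failed":
            continue
        failures.append({
            "operation": event.get("operationName", {}).get("value", ""),
            "resource": event.get("resourceId", ""),
            # The caller shows whether the pipeline or someone else acted.
            "caller": event.get("caller", ""),
        })
    return failures


# Hypothetical events; ids and caller names are placeholders.
sample_log = json.dumps([
    {
        "operationName": {"value": "Microsoft.Network/privateEndpoints/delete"},
        "status": {"value": "Failed"},
        "resourceId": "/subscriptions/0000/resourceGroups/rg-demo/providers/"
                      "Microsoft.Network/privateEndpoints/pe-demo",
        "caller": "pipeline-service-principal",
    },
    {
        "operationName": {"value": "Microsoft.Network/virtualNetworks/write"},
        "status": {"value": "Succeeded"},
        "resourceId": "/subscriptions/0000/resourceGroups/rg-demo/providers/"
                      "Microsoft.Network/virtualNetworks/vnet-demo",
        "caller": "pipeline-service-principal",
    },
])

failures = failed_operations(sample_log)
```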

Sunday, October 8, 2023

These are some more additions to the common errors faced during the authoring and deployment of Infrastructure-as-Code aka IaC artifacts along with their resolutions:

First, resources might pass identifiers from one to another by virtue of one being created before the other, and in some cases these identifiers might not exist at compile time. For example, code that assigns an RBAC role based on the managed identity of another resource might not have that identity at compile time and only finds it once the resource is created at execution time. The RBAC IaC requires a principal_id, which comes from the managed identity of the created resource. This might require two passes of the execution: one to create the resource so that its principal id exists and another to generate the role assignment with that principal id.

The above works for newly created resources with two passes, but it is still broken for existing resources that might not have an associated managed identity, where the RBAC IaC tries to apply a principal id that is empty. In such cases, no matter how many times the role assignment is applied, it will fail due to the incorrect principal id. The workaround is to check for the existence of the principal id before it is applied.
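The existence check can be sketched in a language-neutral way as below. This is not the syntax of any IaC tool; it only illustrates the guard, with hypothetical resource records and a hypothetical function name.

```python
from typing import Dict, List, Optional


def plan_role_assignments(resources: List[Dict[str, Optional[str]]],
                          role: str) -> List[Dict[str, str]]:
    """Build role assignments only for resources that expose a principal id."""
    assignments = []
    for resource in resources:
        principal_id = resource.get("principal_id")
        if not principal_id:
            # No managed identity yet: skip rather than submit an empty id.
            continue
        assignments.append({
            "assignee": resource["name"],
            "principal_id": principal_id,
            "role": role,
        })
    return assignments


# Hypothetical inventory: one resource with an identity, two without.
resources = [
    {"name": "app-new", "principal_id": "11111111-1111-1111-1111-111111111111"},
    {"name": "app-legacy", "principal_id": None},
    {"name": "app-empty", "principal_id": ""},
]

assignments = plan_role_assignments(resources, "Storage Blob Data Reader")
```

Skipped resources can be reported separately so that a managed identity is added to them before the next pass.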

A second type of case occurs when the application requires an IP address to be assigned up front, because the elaborate firewall rules must be written against the literal IP address value rather than a reference, and so the IP address is provisioned in the portal before the IaC is applied. The IaC must then import the pre-created IP address into the state so that the IaC and the state match.

Third, there may be objects in the Key Vault that were created as part of the prerequisites for the IaC deployment and now their ids need to be reconciled with the IaC. Again, the import of that resource into the state would help the IaC provider to reconcile the actual with the expected resource.
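Both import cases above follow the same pattern, which can be sketched as below. The sketch assumes the IaC tool is Terraform, whose `terraform import ADDRESS ID` command brings an existing resource into the state. The resource addresses and ids are hypothetical placeholders, and the commands are only built, not executed.

```python
from typing import Dict, List


def build_import_commands(existing: Dict[str, str]) -> List[List[str]]:
    """Pair each resource address in the code with its pre-created resource id."""
    return [["terraform", "import", address, resource_id]
            for address, resource_id in sorted(existing.items())]


# Hypothetical mapping of addresses in the code to pre-created resources.
pre_created = {
    "azurerm_public_ip.app": (
        "/subscriptions/0000/resourceGroups/rg-demo/providers/"
        "Microsoft.Network/publicIPAddresses/pip-demo"
    ),
    "azurerm_key_vault_secret.cert": (
        "https://example-vault.vault.azure.net/secrets/demo/0123"
    ),
}

import_commands = build_import_commands(pre_created)
```

Running a plan after the imports should then show no pending creation for these resources if the code and the actual resources agree.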

Fourth, friendly names are often references to actual resources that may have long been dereferenced, orphaned, changed, expired, or even deleted. The friendly names, also called keys, are just references that hold meaning for the author in a particular context, and even the same author cannot guarantee that a moniker is used consistently unless some validation and review are involved.

Fifth, there are always three stages between the design and the deployment of Infrastructure-as-Code, namely “init”, “plan” and “apply”, and they are distinct. Success in one stage does not guarantee success in the next, which holds especially between the plan and apply stages. Another limitation is that the plan can be easily validated on the development machine, but the apply stage can be performed only as part of pipeline jobs in commercial deployments. The workaround is to scope the change down or target a different environment for applying.

Sixth, ordering and sequence can only be partially expressed through the attributes that declare dependencies between resources. Even if resources are self-descriptive, combinations of resources must be carefully put together by the system for a deterministic outcome.

These are only some of the reasons why care is required when developing and deploying IaC.

Saturday, October 7, 2023

This is a summary of the book “How to Say It for First-Time Managers” written by Jack Griffin and published by Prentice Hall Press, 2010. This book teaches winning words and strategies for earning your team’s confidence.

Managers must be able to communicate with their reports. If newly appointed managers can’t communicate their ideas, directions and instructions, the areas they supervise will fall apart. By paying attention to what needs to be said and how and when it needs to be said, this book provides invaluable advice to newcomers. The author suggests the best words to use, those to avoid, and even the body language that an inexperienced manager must adopt.

The language of leadership is both verbal and non-verbal. Effective leadership requires effective communication. The best posture is one that imparts a sense of relaxed energy. The eyes must be wide open during direct communications. Fidgeting or yawning must be avoided. Signaling an engagement by nodding or leaning forward is necessary. Eyes, ears, or nose must not be rubbed because they signal doubt. Similarly, scratching your head signals confusion. Smiling is very helpful.

Leadership language fluency helps new managers establish authority and credibility. The language of business concerns money and time. Words explain, motivate, encourage, discourage, inspire, depress, demand, invite, guide, mislead, clarify, confuse, hearten, and terrify. The author mentions ten touchstones for day-to-day communications: 1. Accountability, where someone is responsible for something; 2. Collaboration, where teamwork is essential to business; 3. Decisions, where conflicts are resolved and trade-offs are balanced; 4. Ethics, for guarding against falls; 5. Evaluations, for making value judgements; 6. Excellence, for leading the reports to high-quality work; 7. Learning, which involves distilling knowledge from experience; 8. Mission, for a well-defined sense of purpose; 9. Performance, for continuous improvement; and 10. Quality, for business that can succeed with excellence.

“Every Manager needs a useful, effective, and productive vocabulary.” Part of the vocabulary builds with “active listening” because repeating what the other person says earns co-operation. Negatives can be balanced in several ways: avoiding shaking the head, which signals rejection; keeping eye contact so that the person feels engaged; never lowering the chin, because it signals defensiveness; and avoiding or alleviating “rapid breathing”, because it suggests anxiety.

On the first day as a manager, always speak from knowledge, says the author, and if in doubt, say nothing. Plan how to conduct meetings, including the preamble, the body and the epilogue. Pausing before speaking can imply confidence and self-assuredness. Focusing on what one is going to say helps both the speaker and the listener.

Clarity in written and spoken communication depends on speaking to the point and staying focused. The five W’s approach of delineating who, what, when, where, and why can help in this regard. Using a step-by-step format in chronological order is much better than a long narrative. All rules, policies and procedures must be written out. Do not delegate work by starting out with a pep talk. Goals must be specified in order: lay out the intent, explain the benefits, show the fit within the big picture, discuss the reachability of the goal, call out the tasks that are necessary, delegate those tasks, and explain what must be completed and by when. Praise is much better than criticism for motivation, but give it with a story. Supportive words include reset, overcome, self-starter, and retry. Negative responses must be provided with an explanation. Meetings must have an agenda and must never be a monologue; ideas must be requested, and they must also be examined.

Friday, October 6, 2023

Azure application gateways and app services are created for access from the public internet. When organizations want to take these resources private, they often struggle to maintain business continuity given their own network structures and rules and the limitations and errors they encounter when attempting to wire the resources together. This article explains how these resources can be effectively made private with little or no disruption.

Both these resources are complicated, with many features and configurations possible. Even the networking section provides many choices under its incoming and outgoing sections. Some of the most frequently encountered and dreaded errors are 403 and 502. Code hosted in the app service might be able to connect to a store or event hub if it has vnet integration, and it might also need a private dedicated connection with another resource or network, yet when these options are added, each has requirements different from the other. For example, to create a private endpoint, the private endpoint network policies must be disabled on the subnet, the subnet must have no delegation, and it must have available IP addresses. The option to disable the private endpoint network policies might be hard to find in the Management Portal user interface. When the endpoints are created, they must be associated with the privatelink.azurewebsites.net DNS zone for them to be reached from other resources. Certain subnets cannot be used simply because they have a conflicting resource already placed there. The private endpoint and the vnet integration must not share the same subnet.
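The three subnet requirements for a private endpoint named above can be checked up front with a small sketch like the one below. The dictionary keys mirror what a subnet description might contain, but `usedAddresses` and the function name are hypothetical, and the figure of five reserved addresses per subnet is Azure's documented reservation.

```python
import ipaddress
from typing import Dict, List

AZURE_RESERVED_PER_SUBNET = 5  # Azure reserves five addresses in every subnet.


def private_endpoint_blockers(subnet: Dict) -> List[str]:
    """List the reasons a subnet cannot host a private endpoint (empty if none)."""
    blockers = []
    if subnet.get("privateEndpointNetworkPolicies") != "Disabled":
        blockers.append("private endpoint network policies are not disabled")
    if subnet.get("delegations"):
        blockers.append("subnet is delegated")
    total = ipaddress.ip_network(subnet["addressPrefix"]).num_addresses
    # "usedAddresses" is a hypothetical count of addresses already assigned.
    free = total - AZURE_RESERVED_PER_SUBNET - subnet.get("usedAddresses", 0)
    if free <= 0:
        blockers.append("no available IP addresses")
    return blockers


# Hypothetical subnets: one eligible, one failing all three checks.
eligible = {
    "addressPrefix": "10.0.4.0/28",
    "privateEndpointNetworkPolicies": "Disabled",
    "delegations": [],
    "usedAddresses": 3,
}
blocked = {
    "addressPrefix": "10.0.5.0/28",
    "privateEndpointNetworkPolicies": "Enabled",
    "delegations": ["Microsoft.Web/serverFarms"],
    "usedAddresses": 11,
}

eligible_blockers = private_endpoint_blockers(eligible)
blocked_blockers = private_endpoint_blockers(blocked)
```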

Consequently, taking a resource private requires the organization to pre-create subnets and even a DNS zone specifically for ‘privatelink.azurewebsites.net’. Then the other resources must be connected to the app service. In the case of the application gateway, a DNS zone group must be created so that the application gateway can resolve the app services by their names. This step is often overlooked after the endpoints are created on the app services. Similarly, virtual network links must be created for the private DNS zone.

It is in the interest of the deployment to create a single unified virtual network on which all the resources and their subnets are placed. Distinct virtual networks, aka vnets, often result from independent initiatives, and they require peering or links to be established. Creating too many subnets is a similar problem because they exhaust the IP address ranges, which are often underutilized. Devices connected to a subnet have their IP addresses within the subnet’s CIDR, and this information comes in handy to determine which subnets are unused and can be reused for other purposes. Once the vnet and subnets are created, the options to add network security groups and gateways can be decided. The traffic from the virtual networks and subnets is hard to visualize, but by enumerating the resources and their default routes to the internet, it is possible to place the gateways appropriately. Otherwise, those resources might not have outbound internet connectivity.
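Finding reusable subnets from device addresses can be sketched with Python's standard `ipaddress` module. The subnet names, CIDR ranges and device addresses below are hypothetical, as is the function name.

```python
import ipaddress
from typing import Dict, List


def unused_subnets(subnets: Dict[str, str], device_ips: List[str]) -> List[str]:
    """Return names of subnets whose CIDR contains none of the device addresses."""
    addresses = [ipaddress.ip_address(ip) for ip in device_ips]
    unused = []
    for name, cidr in subnets.items():
        network = ipaddress.ip_network(cidr)
        if not any(address in network for address in addresses):
            unused.append(name)
    return unused


# Hypothetical inventory of subnets and connected device addresses.
subnets = {
    "web": "10.0.1.0/24",
    "data": "10.0.2.0/24",
    "spare": "10.0.3.0/24",
}
device_ips = ["10.0.1.4", "10.0.2.10"]

reusable = unused_subnets(subnets, device_ips)
```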

Finally, for the application gateway to be allowed to access resources and networks as its backend pool members, its address must be allowed in the access restrictions of all those resources and networks.
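A quick audit of which backends still lack a rule for the gateway could look like the sketch below. It assumes the access restrictions of each backend have been collected as lists of allowed CIDR ranges. The resource names and addresses are hypothetical.

```python
import ipaddress
from typing import Dict, List


def backends_missing_gateway_rule(gateway_ip: str,
                                  restrictions: Dict[str, List[str]]) -> List[str]:
    """Return the backends whose allowed ranges do not cover the gateway address."""
    address = ipaddress.ip_address(gateway_ip)
    missing = []
    for backend, allowed_ranges in restrictions.items():
        covered = any(address in ipaddress.ip_network(cidr)
                      for cidr in allowed_ranges)
        if not covered:
            missing.append(backend)
    return missing


# Hypothetical allow-lists per backend; the addresses are documentation ranges.
restrictions = {
    "app-frontend": ["203.0.113.0/24"],
    "app-backend": ["198.51.100.0/24"],
}

missing = backends_missing_gateway_rule("203.0.113.10", restrictions)
```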

A working example of this description is available here: network4apps.zip

Wednesday, October 4, 2023

This is a summary of the book “Think Bigger – How to Innovate” by Sheena Iyengar who is a professor of Business in the Management Division at Columbia Business School and teaches choice and decision-making.

This book builds on decades of research on creativity and human psychology and models the real-life creative process in six specific and actionable steps. It provides a structure for rigorous idea generation and vetting that serves everyone from corporate teams to individual artists and entrepreneurs.

She argues that creativity is not a rare and innate gift. The popular distinction between left and right brained people is also incorrect. Creativity is also not a particular type of brain activity. When it is broken down, creativity appears familiar to everyone as building blocks. It is also a skill that we can learn and practice. The killer applications, groundbreaking artwork, disruptive business ideas are all the end results of the same process. Creators recycle existing parts to create something novel. “All thinking is an act of memory in some form.”

The Think Bigger process builds on Learning + Memory, the leading neuroscientific model of the brain. This theory places memory at the center of human mental activity. It argues that even solving a math problem is not purely logical but involves remembering and recombining memories to find the answer. It goes as far as holding that the quality of an idea is proportional to the memories stored on the shelves of the brain, and it describes innovation as the use of cognitive tools that we already possess.

Prior research has emphasized the following areas: personal qualities such as curiosity and persistence; a workspace that is free of distractions yet still fosters casual connections with others; structure, since people struggle when they face too many options; and going solo, since individuals produce more unique ideas alone than in a group. We can complete each step of Think Bigger on our own before discussing it with others.

Innovation starts by identifying a problem that we are motivated to solve and can feasibly solve. There is a long list of creations that failed because they did not start from a problem. If we are struggling to define a problem, then taking daily notes may spark a sense of purpose. Phrasing a problem as a question that begins with “How” is one of the classic ways of getting started.

With a problem in hand, we can break it down into parts on which we gather input from experts, potential users and non-experts. As these generate leads for thinking bigger, we move on to the next step once we have clarity over 80% of the problem space.

A good solution satisfies the requirements of the target audiences, the interest from the third-party stakeholders and the desires of the innovator. These three groups are essential for the solution and might warrant different ways of going about them. Articulating one’s own desires in writing while interviewing target audience and stakeholders, we build a list of three to five key wants for each group.

Next, we structure the solution by using a Choice Map and Big Picture score. “The best way to think outside the box is to literally go into other boxes.” We split the search for solutions to sub-problems in two areas: “in domain” and “out of domain”. When the choice map is filled out, we are ready to start combining tactics to find an overall solution.

Before committing to our idea, we must learn how others react to it. By explaining to others, we change, refine, or expand our idea. There are four feedback exercises.

The first is verbalization. Describing the idea to ourselves by reading and writing may be enough to change the way we see it. Describing it to others almost certainly will.

The second exercise gathers experts’ reactions. After describing the problem, the solution and its significance, we ask neutral questions such as how we might improve our idea.

The third exercise gauges whether others’ impressions of our idea align with our own. We ask non-experts to say the idea back to us, but only after some time has passed, to check what they recollect.

The final exercise is to describe the solution again, this time giving our listeners free rein to reimagine our idea. Their answers will lead to further insights and possibilities.

Software for summarizing text: https://booksonsoftware.com/text/

Tuesday, October 3, 2023

This is a continuation of previous articles on Azure Databricks and Overwatch observability:

One of the frequent uses of Overwatch’s dashboards is to view trends and plots from the data collected. The dashboards that come with Overwatch provide a detailed set of charts under the Workspace, Clusters, Jobs, and Notebooks categories, but the tables and custom SQL queries make it possible to create new and advanced charts that suit specific business requirements. The following are some dimensions that a comprehensive dashboard for monitoring an organization’s Databricks workspace should show, from a best-practice perspective.

1. Databricks workload types:

- Jobs Compute for data engineers

- Jobs Light Compute for data analysts

- All Purpose Compute (backwards compatible to execute jobs)

2. Consumption based:

- DBUs

- Virtual Machines

- Public IP addresses

- Blob Storage

- Managed Disk

- Bandwidth

3. Pricing plans

- Pay as you go

- Reservations - DBU/DBCU for 1 or 3 years

- DBU SKU

- VM SKU

- DBU count for each VM

- region

- duration

4. Tags based:

- Cluster Tags

- Pool Tags

- Workspace Tags

Tags can propagate with

a. clusters created from pools

- DBU Tag = Workspace Tag + Pool Tag + Cluster Tag

- VM Tag = Workspace Tag + Pool Tag

b. clusters not from pools

- DBU Tag = Workspace Tag + Cluster Tag

- VM Tag = Workspace Tag + Cluster Tag

5. Cost calculation:

Quantity = Number of Virtual Machines x Number of hours x DBU count

Effective Price = DBU price based on the SKU

Cost = Quantity x Effective Price

Effective Cost = Organizational markup factor x Cost
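The cost formulas above can be worked through with a small sketch. The function and parameter names are hypothetical, and the numbers in the example (4 virtual machines, 10 hours, 0.75 DBU per VM per hour, a DBU price of 0.30, a markup of 1.1) are illustrative only, not actual prices.

```python
def effective_dbu_cost(vm_count: int,
                       hours: float,
                       dbu_per_vm: float,
                       dbu_price: float,
                       markup: float = 1.0) -> float:
    """Apply the formulas: quantity, then cost, then the organizational markup."""
    quantity = vm_count * hours * dbu_per_vm   # DBUs consumed
    cost = quantity * dbu_price                # price depends on the DBU SKU
    return markup * cost


# Illustrative numbers only: 4 x 10 x 0.75 = 30 DBUs at 0.30 each, marked up 10%.
example_cost = effective_dbu_cost(vm_count=4, hours=10, dbu_per_vm=0.75,
                                  dbu_price=0.30, markup=1.1)
```

This covers only the DBU portion of the bill. The virtual machines, storage, public IP addresses and bandwidth listed under consumption are charged separately.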

Cost/Usage Dashboard - get started in Azure Portal:

Cost Management + Billing

Cost Management + Cost analysis Tab

Cost/Usage Dashboard – get started in Dashboards on Databricks workspace hosting Overwatch:

Sample query:

select sku, isActive, any_value(contract_price) * count(*) as cost
from overwatch.`dbucostdetails`
where isActive = true
group by sku, isActive;

sku          isActive  cost
jobsLight    True      0.30000000000000004
interactive  True      1.6500000000000001
sqlCompute   True      0.66
automated    True      0.30000000000000004