Thursday, June 6, 2024

 This is a continuation of previous articles on IaC shortcomings and resolutions. 

As with all automation, it is important to keep these scripts in source control so that they are easy to maintain. It is equally important to secure the credentials with which they run. Finally, locking down all resources in terms of network access and private planes is just as important as keeping them accessible for automation.

Many organizations don’t invest in Azure DevOps for a variety of reasons, such as attachment to on-premises automation technology or avoidance of public-cloud automation over misplaced cost concerns. Other reasons can be genuine, though. For example, one of the most common tasks is to download and run an executable instead of calling an API. This is convenient for portability, and the same executable can be run against various Azure accounts or subscriptions. But with an automation account or runbook, the downloaded executable might not run because the execution policy cannot be changed. The same goes for a hybrid worker, and the only way to overcome the limitation is to spin up dedicated compute and modify it to allow execution. It is also worth noting that a compute instance or cluster created in a Databricks workspace or an Azure Machine Learning workspace might not work either: their pass-through Active Directory authentication works only for notebooks, not for the shell on the compute. This limitation extends to other data-oriented automations because they in turn leverage these workspaces and allow no activity in scripts or notebooks that the workspaces themselves cannot run.

Such a limitation, on the other hand, does not apply to GitOps, which has been one of the favorites for code delivery pipelines. In GitOps, it is not only easy to download an executable from, say, an artifact repository such as Artifactory, but also easy to pass command-line parameters whose values are already known to the pipeline, as the sketch below shows. Organizing automation in GitOps is also fairly straightforward, as pipelines are scoped to the code they push.
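A minimal sketch of such a pipeline step, written in Python for portability across agents; the repository URL, environment variables, and tool name are hypothetical placeholders for whatever the pipeline already provides:

import os
import stat
import subprocess
import requests

# Hypothetical artifact location and credentials supplied by the pipeline environment
ARTIFACT_URL = "https://artifactory.example.com/artifactory/tools/mytool"
LOCAL_PATH = "./mytool"

# Download the executable from the artifact repository
response = requests.get(ARTIFACT_URL, auth=(os.environ["ARTIFACTORY_USER"], os.environ["ARTIFACTORY_TOKEN"]))
response.raise_for_status()
with open(LOCAL_PATH, "wb") as f:
    f.write(response.content)

# Mark it executable and invoke it with parameters already known to the pipeline
os.chmod(LOCAL_PATH, os.stat(LOCAL_PATH).st_mode | stat.S_IEXEC)
subprocess.run([LOCAL_PATH, "--subscription", os.environ["AZURE_SUBSCRIPTION_ID"]], check=True)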

Finally, there is a lot of maintenance work required with scripts and automations, and the use of source control becomes inevitable. Keeping the automations and the code they service in repositories enables them to be shared as appropriate.

These are some of the examples where a cloud-native approach might not be straightforward. When organizations do enable Azure DevOps, they enhance their capabilities and gain a more secure, manageable, and forward-looking platform, albeit one that dictates rewriting or rehosting the legacy scripts.


Wednesday, June 5, 2024

This is a summary of the book “Grace Under Pressure: Leading Through Change and Crisis,” written by John Baldoni and published by Savio Republic in 2023. The author teaches how to maintain composure in stressful situations. The first priority when pressure hits is to make sure our people are okay. Then we make sure we are okay and take a deep breath. We must look wisely into the future and consider how to prevent today’s stress from compounding. Any decision must not harm our people. He also stresses the importance of courage, compassion, empathy, hope, resilience, and selflessness in our lives.

Grace serves us and our stakeholders. We must practice astute situational assessment and intelligent follow-up.  Leaders with grace under pressure  plan ahead and take care.

Logic helps to find the truth. Our values help answer the “why,” but grace helps answer the “how.” Control what we can. Instead of focusing on “winner takes all,” practice “winner shares all.” Those who practice this remain resolute.

"Grace under pressure" is a leadership concept that emphasizes maintaining one's coolness in challenging situations. It originated from the words of Ernest Hemingway, who described it as a dependable, composed leader directing a firefight. Grace is a mysterious, spiritual, and magical force that can appear unexpectedly and propel people towards their higher aspirations. High-quality leaders must objectively assess their teams, units, departments, or organizations to plan and implement actions to help their people succeed. They should ask three crucial questions: "What is happening?", "What is not happening?", and "What can I do to influence the outcome?" By designing their investigation to help them and their team take the best steps going forward, leaders can help their teams succeed in challenging situations. However, it is important to remember that sometimes the best group action might be no action at all.

Grace under pressure is a key leadership trait that involves taking care of one's people, taking care of oneself, and planning for the future. These leaders are able to adapt to change, demonstrating integrity, courage, and logical reasoning. They are also humble, reason-driven, courageous, humorous, and loving.


Grace is essential for leading a meaningful life, as it fuels purpose and becomes the "how" to achieve goals. It enriches connections and creates a sense of community among employees. Leaders who can muster grace under pressure can take their organization in a different direction, fostering connections among their people and building internal cohesion.


Corporate cultures that foster a sense of community can foster connections among their people, helping organizations become a setting for connected communities. Transformations require grace, which involves listening before talking, solving problems, encouraging open communication, instilling hope, banishing fear, and acting forthrightly and with courage.

We must focus on the present, recharge or renew, orient ourselves to the future, anchor ourselves, and have humility. Leaders must be flexible and adaptable in the unpredictable world they face. We must be careful, deliberate, measured, and reflective in our thoughts and actions, making “mutuality” our watchword, orienting ourselves to the goal of making things better for those we lead and coaching them to benefit from our values and long-range thinking.

Leaders should also walk behind their people; stress-resilience expert Dr. Sharon Melnick believes leaders don't have the time not to. Many business-school students are afraid of coming across as too friendly and civil once they gain authority at work, but a leader with grace knows that they can overcome challenges and become better people. By following these ground rules, leaders can control their actions and achieve success in their organizations.

Grace under pressure leaders maintain resoluteness despite challenges, focusing on dignity and creating a workplace where people feel valued and want to contribute. They communicate effectively, avoid anxiety and fear, remain positive, stress mental health, and engage with team members. To exhibit grace under pressure, leaders should ask three questions: what to do, how to effectively engage with their team, and how their leadership will be portrayed during a crisis.

 




Tuesday, June 4, 2024

Subarray Sum Equals K

Given an array of integers nums and an integer k, return the total number of subarrays whose sum equals to k. 

A subarray is a contiguous non-empty sequence of elements within an array. 

Example 1: 

Input: nums = [1,1,1], k = 2 

Output: 2 

Example 2: 

Input: nums = [1,2,3], k = 3 

Output: 2 

Constraints: 

1 <= nums.length <= 2 * 10^4 

-1000 <= nums[i] <= 1000 

-10^7 <= k <= 10^7 

 

class Solution {
    public int subarraySum(int[] nums, int k) {
        if (nums == null || nums.length == 0) return 0;
        // prefix sums: sums[i] = nums[0] + ... + nums[i]
        int[] sums = new int[nums.length];
        int sum = 0;
        for (int i = 0; i < nums.length; i++){
            sum += nums[i];
            sums[i] = sum;
        }
        int count = 0;
        for (int i = 0; i < nums.length; i++) {
            for (int j = i; j < nums.length; j++) {
                // sum of the subarray nums[i..j]
                int current = nums[i] + (sums[j] - sums[i]);
                if (current == k){
                    count += 1;
                }
            }
        }
        return count;
    }
}

 

[1,3], k=1 => 1 

[1,3], k=3 => 1 

[1,3], k=4 => 1 

[2,2], k=4 => 1 

[2,2], k=2 => 2 

[2,0,2], k=2 => 4 

[0,0,1], k=1=> 3 

[0,1,0], k=1=> 4 

[0,1,1], k=1=> 3 

[1,0,0], k=1=> 3 

[1,0,1], k=1=> 4 

[1,1,0], k=1=> 3 

[1,1,1], k=1=> 3 

[-1,0,1], k=0 => 2 

[-1,1,0], k=0 => 3 

[1,0,-1], k=0 => 2 

[1,-1,0], k=0 => 3 

[0,-1,1], k=0 => 3 

[0,1,-1], k=0 => 3 
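The nested-loop solution above is O(n^2); a running prefix sum with a hash map brings this down to O(n). A minimal sketch in Python (the function name and the test cases below are mine, not part of the original solution):

from collections import defaultdict

def subarray_sum(nums, k):
    # counts[s] = number of prefixes seen so far whose sum is s
    counts = defaultdict(int)
    counts[0] = 1
    prefix = 0
    total = 0
    for value in nums:
        prefix += value
        # every earlier prefix equal to prefix - k closes a subarray summing to k
        total += counts[prefix - k]
        counts[prefix] += 1
    return total

assert subarray_sum([1, 1, 1], 2) == 2
assert subarray_sum([2, 0, 2], 2) == 4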

 

 

 



Monday, June 3, 2024

Problem:

Make Array Zero by Subtracting Equal Amounts

You are given a non-negative integer array nums. In one operation, you must:

Choose a positive integer x such that x is less than or equal to the smallest non-zero element in nums.

Subtract x from every positive element in nums.

Return the minimum number of operations to make every element in nums equal to 0.

 

Example 1:

Input: nums = [1,5,0,3,5]

Output: 3

Explanation:

In the first operation, choose x = 1. Now, nums = [0,4,0,2,4].

In the second operation, choose x = 2. Now, nums = [0,2,0,0,2].

In the third operation, choose x = 2. Now, nums = [0,0,0,0,0].

Example 2:

Input: nums = [0]

Output: 0

Explanation: Each element in nums is already 0 so no operations are needed.

 

Constraints:

1 <= nums.length <= 100

0 <= nums[i] <= 100


import java.util.*;
import java.util.stream.*;

class Solution {
    public int minimumOperations(int[] nums) {
        List<Integer> list = Arrays.stream(nums).boxed().collect(Collectors.toList());
        var nonZero = list.stream().filter(x -> x > 0).collect(Collectors.toList());
        int count = 0;
        while (nonZero.size() > 0) {
            // each pass subtracts the current minimum, zeroing out one distinct value
            var min = nonZero.stream().mapToInt(x -> x).min().getAsInt();
            nonZero = nonZero.stream().map(x -> x - min).filter(x -> x > 0).collect(Collectors.toList());
            count++;
        }
        return count;
    }
}


Input

nums =

[1,5,0,3,5]

Output

3

Expected

3


Input

nums =

[0]

Output

0

Expected

0
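Since every operation subtracts the current minimum and therefore zeroes out exactly one distinct positive value, the answer is simply the number of distinct non-zero elements. A minimal sketch of that observation in Python (the function name is mine):

def minimum_operations(nums):
    # each operation removes exactly one distinct positive value
    return len({x for x in nums if x > 0})

assert minimum_operations([1, 5, 0, 3, 5]) == 3
assert minimum_operations([0]) == 0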


 





SQL Schema

 

Table: Books

+----------------+---------+

| Column Name    | Type    |

+----------------+---------+

| book_id        | int     |

| name           | varchar |

| available_from | date    |

+----------------+---------+

book_id is the primary key of this table.

 

Table: Orders

+----------------+---------+

| Column Name    | Type    |

+----------------+---------+

| order_id       | int     |

| book_id        | int     |

| quantity       | int     |

| dispatch_date  | date    |

+----------------+---------+

order_id is the primary key of this table.

book_id is a foreign key to the Books table.

 

Write an SQL query that reports the books that have sold less than 10 copies in the last year, excluding books that have been available for less than one month from today. Assume today is 2019-06-23.

Return the result table in any order.

The query result format is in the following example.

 

Example 1:

Input: 

Books table:

+---------+--------------------+----------------+

| book_id | name               | available_from |

+---------+--------------------+----------------+

| 1       | "Kalila And Demna" | 2010-01-01     |

| 2       | "28 Letters"       | 2012-05-12     |

| 3       | "The Hobbit"       | 2019-06-10     |

| 4       | "13 Reasons Why"   | 2019-06-01     |

| 5       | "The Hunger Games" | 2008-09-21     |

+---------+--------------------+----------------+

Orders table:

+----------+---------+----------+---------------+

| order_id | book_id | quantity | dispatch_date |

+----------+---------+----------+---------------+

| 1        | 1       | 2        | 2018-07-26    |

| 2        | 1       | 1        | 2018-11-05    |

| 3        | 3       | 8        | 2019-06-11    |

| 4        | 4       | 6        | 2019-06-05    |

| 5        | 4       | 5        | 2019-06-20    |

| 6        | 5       | 9        | 2009-02-02    |

| 7        | 5       | 8        | 2010-04-13    |

+----------+---------+----------+---------------+

Output: 

+-----------+--------------------+

| book_id   | name               |

+-----------+--------------------+

| 1         | "Kalila And Demna" |

| 2         | "28 Letters"       |

| 5         | "The Hunger Games" |

+-----------+--------------------+



SELECT b.book_id, b.name
FROM Books b
LEFT JOIN Orders o
  ON o.book_id = b.book_id
 AND o.dispatch_date >= DATEADD(year, -1, '2019-06-23')
WHERE b.available_from <= DATEADD(month, -1, '2019-06-23')
GROUP BY b.book_id, b.name
HAVING COALESCE(SUM(o.quantity), 0) < 10;



Case 1

Input

Books =

| book_id | name | available_from |

| ------- | ---------------- | -------------- |

| 1 | Kalila And Demna | 2010-01-01 |

| 2 | 28 Letters | 2012-05-12 |

| 3 | The Hobbit | 2019-06-10 |

| 4 | 13 Reasons Why | 2019-06-01 |

| 5 | The Hunger Games | 2008-09-21 |

Orders =

| order_id | book_id | quantity | dispatch_date |

| -------- | ------- | -------- | ------------- |

| 1 | 1 | 2 | 2018-07-26 |

| 2 | 1 | 1 | 2018-11-05 |

| 3 | 3 | 8 | 2019-06-11 |

| 4 | 4 | 6 | 2019-06-05 |

| 5 | 4 | 5 | 2019-06-20 |

| 6 | 5 | 9 | 2009-02-02 |

| 7 | 5 | 8 | 2010-04-13 |

Output

| book_id | name |

| ------- | ---------------- |

| 2 | 28 Letters |

| 1 | Kalila And Demna |

| 5 | The Hunger Games |

Expected

| book_id | name |

| ------- | ---------------- |

| 1 | Kalila And Demna |

| 2 | 28 Letters |

| 5 | The Hunger Games |




Sunday, June 2, 2024

This is a continuation of several articles on OpenAI-based semantic search for drone formation organization, using elements as reference locations and nodes as predicted positions for drones. The elements can be stored in any non-proprietary vector database; a sample implementation looks something like the following and is also called out in: https://github.com/ravibeta/semantic_search

The first step would be to install all the required packages and libraries. We use Python in this sample:

import warnings
warnings.filterwarnings('ignore')
from datasets import load_dataset
from pinecone import Pinecone, ServerlessSpec
from DLAIUtils import Utils
import DLAIUtils
import os
import time
import torch
from tqdm.auto import tqdm

We assume the elements are mapped as embeddings in a 384-dimensional dense vector space. 
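The original snippet does not show how the embedding model is created; a minimal sketch, assuming a sentence-transformers model such as all-MiniLM-L6-v2, which produces 384-dimensional vectors (any model with the same dimensionality would do):

from sentence_transformers import SentenceTransformer
# assumption: all-MiniLM-L6-v2 is used here because it emits 384-dimensional embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')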

A sample query would appear like this: 

query = "what is node nearest this element?"
xq = model.encode(query)
xq.shape   # (384,)

The next step is to set up the Pinecone vector database and upsert embeddings into it. The database indexes the vectors so that search and retrieval reduce to comparing values and finding those most similar to one another.

utils = Utils()
PINECONE_API_KEY = utils.get_pinecone_api_key()
pinecone = Pinecone(api_key=PINECONE_API_KEY)
INDEX_NAME = 'drone-elements'   # example index name
if INDEX_NAME in [index.name for index in pinecone.list_indexes()]:
    pinecone.delete_index(INDEX_NAME)
print(INDEX_NAME)
pinecone.create_index(name=INDEX_NAME, dimension=model.get_sentence_embedding_dimension(), metric='cosine', spec=ServerlessSpec(cloud='aws', region='us-west-2'))
index = pinecone.Index(INDEX_NAME)
print(index)

Then, the next step is to create embeddings for all the elements in the sample space and upsert them to Pinecone. 

batch_size = 200
vector_limit = 10000
elements = element[:vector_limit]   # 'element' is assumed to hold the full list of element descriptions loaded earlier

import json
for i in tqdm(range(0, len(elements), batch_size)):
    i_end = min(i + batch_size, len(elements))
    ids = [str(x) for x in range(i, i_end)]
    metadata = [{'text': text} for text in elements[i:i_end]]
    xc = model.encode(elements[i:i_end])
    records = zip(ids, xc, metadata)
    index.upsert(vectors=records)

index.describe_index_stats()

Then the query can be run on the embeddings and the top matches can be returned. 

def run_query(query):
    embedding = model.encode(query).tolist()
    results = index.query(top_k=10, vector=embedding, include_metadata=True, include_values=False)
    for result in results['matches']:
        print(f"{round(result['score'], 2)}: {result['metadata']['text']}")

run_query("what is node nearest this element?")

With this, the embeddings-based search over elements is ready. In Azure, Cosmos DB offers a similar semantic search and can serve as a comparable vector database.

The following code outlines the same steps using Azure AI Search (through the LangChain AzureSearch vector store):

# configure the vector store settings; the vector field name is defined in the search index
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from langchain_openai import AzureOpenAIEmbeddings
from langchain_community.vectorstores.azuresearch import AzureSearch

endpoint: str = "<AzureSearchEndpoint>"
key: str = "<AzureSearchKey>"
index_name: str = "<VectorName>"
credential = AzureKeyCredential(key)
client = SearchClient(endpoint=endpoint,
                      index_name=index_name,
                      credential=credential)


# create embeddings
embeddings: AzureOpenAIEmbeddings = AzureOpenAIEmbeddings(
    azure_deployment=azure_deployment,
    openai_api_version=azure_openai_api_version,
    azure_endpoint=azure_endpoint,
    api_key=azure_openai_api_key,
)

# create vector store
vector_store = AzureSearch(
    azure_search_endpoint=endpoint,
    azure_search_key=key,
    index_name=index_name,
    embedding_function=embeddings.embed_query,
)

# create a query
docs = vector_store.similarity_search(
    query=userQuery,
    k=3,
    search_type="similarity",
)

# 'collections' is assumed to be a handle to a Cosmos DB collection; see the sketch below
collections.insert_many(docs)
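A minimal sketch of how such a handle might be obtained, assuming a Cosmos DB account exposed through the MongoDB API; the connection string and the database/collection names are placeholders, and the LangChain documents are converted to plain dictionaries before insertion:

from pymongo import MongoClient

COSMOS_CONNECTION_STRING = "<CosmosDbMongoConnectionString>"   # placeholder
client = MongoClient(COSMOS_CONNECTION_STRING)
collections = client["drone_db"]["element_matches"]            # hypothetical database and collection names

# similarity_search returns LangChain Document objects; convert them to dicts for insertion
records = [{"content": d.page_content, "metadata": d.metadata} for d in docs]
collections.insert_many(records)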


Saturday, June 1, 2024

Automation can also be achieved with Azure Data Factory (ADF), a self-hosted integration runtime comprising a VM hosted on-premises, and a Script activity. While typically associated with data transformation activities, a self-hosted integration runtime can run arbitrary scripts, and invoking it from ADF guarantees human and programmatic access from anywhere with cloud connectivity. A self-hosted integration runtime is a component that connects on-premises or Azure VM data sources with cloud services in a secure and managed way.

The JSON syntax for defining a Script activity looks something like this:

   "name": "<activity name>", 

   "type": "Script", 

   "linkedServiceName": { 

      "referenceName": "<name>", 

      "type": "LinkedServiceReference" 

    }, 

   "typeProperties": { 

      "scripts" : [ 

         { 

            "text": "<Script Block>", 

            "type": "<Query> or <NonQuery>", 

            "parameters":[ 

               { 

                  "name": "<name>", 

                  "value": "<value>", 

                  "type": "<type>", 

                  "direction": "<Input> or <Output> or <InputOutput>", 

                  "size": 256 

               }, 

               ... 

            ] 

         }, 

         ... 

      ],     

         ... 

         ] 

      }, 

      "scriptBlockExecutionTimeout": "<time>",  

      "logSettings": { 

         "logDestination": "<ActivityOutput> or <ExternalStore>", 

         "logLocationSettings":{ 

            "linkedServiceName":{ 

               "referenceName": "<name>", 

               "type": "<LinkedServiceReference>" 

            }, 

            "path": "<folder path>" 

         } 

      } 

    } 

}

The output can be collected every time a script block is executed. There is a 5,000-row / 4 MB size limit, but this is sufficient for most purposes.


A sample call from Python to trigger the pipeline over the REST API would be something like this:

#!/usr/bin/python
import requests

# Set your ADF details
subscription_id = '<subscription_id>'
resource_group = '<resource_group>'
factory_name = '<factory_name>'

# Set the pipeline name you want to trigger
pipeline_name = 'your_pipeline_name'

# An Azure AD bearer token is required to call the ARM REST API
access_token = '<access_token>'
headers = {
    'Authorization': f'Bearer {access_token}',
    'Content-Type': 'application/json'
}

# Construct the API URL
api_url = f"https://management.azure.com/subscriptions/{subscription_id}/resourceGroups/{resource_group}/providers/Microsoft.DataFactory/factories/{factory_name}/pipelines/{pipeline_name}/createRun?api-version=2018-06-01"

# Make the POST request
response = requests.post(api_url, headers=headers)

# Check the response status; a successful createRun returns 200 with the run id in the body
if response.status_code == 200:
    print("Pipeline triggered successfully!")
    print(response.json().get("runId"))
else:
    print(f"Error triggering pipeline. Status code: {response.status_code}")

## EOF
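To confirm that the triggered run actually completed, the run id returned by createRun can be polled through the pipeline runs endpoint. A minimal sketch, reusing the subscription, factory, headers, and response objects from the script above:

import time

run_id = response.json()["runId"]
status_url = f"https://management.azure.com/subscriptions/{subscription_id}/resourceGroups/{resource_group}/providers/Microsoft.DataFactory/factories/{factory_name}/pipelineruns/{run_id}?api-version=2018-06-01"

# Poll until the run reaches a terminal state
while True:
    run = requests.get(status_url, headers=headers).json()
    if run.get("status") in ("Succeeded", "Failed", "Cancelled"):
        print(f"Run {run_id} finished with status: {run['status']}")
        break
    time.sleep(30)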