The previous article discussed a specific use case: coordinating a UAV swarm to transition through virtual structures, with the suggestion that the structures need not be supplied by humans. They can be detected as objects in images from a library, then extracted and scaled. These objects form a sequence that is passed along to the UAV swarm. This article describes the infrastructure needed to build a pipeline for controlling the swarm in this way, so that the drones make continuous, smooth transitions from one meaningful structure to the next, as if playing through a deck of animation flashcards.
Data processing begins with the user uploading images to cloud storage, say a data lake, which also stores whatever data the drones report back. Uploads are routed through an Event Grid so that suitably partitioned processing, say one partition per drone in the fleet, can compute each drone's current and desired positions in every epoch, along with corrections recommended by a machine learning model that reduce the sum of squared errors measuring how smooth the structure transitions are. The results are then vectorized, saved in a vector store, and paired with a monitoring stack that tracks key performance metrics and ensures the overall system stays healthy enough to control the UAV swarm.
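To make the error term concrete, the sketch below computes, for one epoch, the sum of squared deviations between each drone's current position and its assigned slot in the target structure, plus a simple smoothness penalty on the planned paths. The array shapes, the second-difference penalty, and the weighting factor are illustrative assumptions rather than a prescribed controller.

```python
import numpy as np

def transition_error(current, desired, planned_path, smooth_weight=0.1):
    """Sum-of-squares error for one epoch of a structure transition.

    current      : (n_drones, 3) current positions
    desired      : (n_drones, 3) target positions in the next structure
    planned_path : (n_drones, n_waypoints, 3) waypoints each drone will fly
    smooth_weight: weight on the smoothness penalty (assumed value)
    """
    # Positional error: how far each drone is from its assigned slot.
    position_sse = np.sum((current - desired) ** 2)

    # Smoothness penalty: squared second differences along each planned path,
    # which grow when a path bends sharply between consecutive waypoints.
    second_diff = np.diff(planned_path, n=2, axis=1)
    smoothness_sse = np.sum(second_diff ** 2)

    return position_sse + smooth_weight * smoothness_sse

# Example with a 4-drone fleet and 10 waypoints per drone (random data).
rng = np.random.default_rng(0)
current = rng.uniform(-5, 5, size=(4, 3))
desired = rng.uniform(-5, 5, size=(4, 3))
path = np.linspace(current, desired, num=10, axis=1)  # straight-line paths
print(transition_error(current, desired, path))
```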
This makes the processing stack look something like this:
[User Uploads Image] -> [Azure Blob Storage] -> [Azure Event Grid] -> [Azure Functions] -> [Azure Machine Learning] -> [Azure Cosmos DB] -> [Monitoring]
where the infrastructure consists of:
Azure Blob Storage: Stores raw image data and processed results. When the account is enabled for a hierarchical namespace, folders are helpful for organizing the fleet, its activities, and its feedback (a storage sketch follows this list).
Azure Functions: Serverless functions handle the image processing tasks. The idea is to define pure logic that is partitioned on the data and can scale to arbitrary load (see the trigger sketch after this list).
Azure Machine Learning: Manages machine learning models and deployments. Azure Machine Learning Studio lets us view the pipeline graph, inspect its output, and debug it, and the logs and outputs of each component are available for study. Components can optionally be registered to the workspace so they can be shared and reused. A pipeline draft connects the components, and a pipeline run is submitted using the compute resources in the workspace. Training pipelines can be converted to inference pipelines, and published pipelines can be resubmitted with different parameters and datasets, so a training pipeline can be reused for different models and a batch inference pipeline can make predictions on new data (a code sketch follows this list).
Azure Event Grid: Triggers events on image uploads, user directives, or drone feedback.
Azure Cosmos DB: Stores metadata and processed results, including their embeddings, making them suitable for vector search (see the query sketch after this list).
Azure API Gateway: Manages incoming image upload requests and outgoing processed results with OWASP protection.
Azure Monitor: Tracks performance metrics and logs events for troubleshooting (a query sketch follows this list).
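For the storage layer, an account with the hierarchical namespace enabled lets each drone's uploads and feedback sit under its own folder. A minimal sketch with the azure-storage-blob SDK follows; the container name, folder layout, and file names are assumptions for illustration.

```python
from azure.storage.blob import BlobServiceClient

# Connection string is assumed to be configured elsewhere (e.g. app settings).
service = BlobServiceClient.from_connection_string("<connection-string>")
container = service.get_container_client("uav-fleet")

# Hypothetical folder layout: a shared prefix for uploads, one prefix per drone
# for its activity and feedback.
with open("formation-star.png", "rb") as image:
    container.upload_blob(name="uploads/formation-star.png", data=image, overwrite=True)

feedback = b'{"drone": "drone-07", "epoch": 42, "position": [1.2, 3.4, 5.6]}'
container.upload_blob(name="drone-07/feedback/epoch-0042.json", data=feedback, overwrite=True)
```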
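For the serverless step, an Event Grid trigger lets each blob-created event, whether an uploaded image or a piece of drone feedback, start a partition of work. The sketch below uses the Python v2 programming model for Azure Functions; the routing convention based on the blob path is an assumption.

```python
import logging

import azure.functions as func

app = func.FunctionApp()

@app.event_grid_trigger(arg_name="event")
def process_upload(event: func.EventGridEvent):
    """Runs whenever Event Grid reports a new blob in the fleet container."""
    payload = event.get_json()
    blob_url = payload.get("url", "")
    logging.info("Processing new artifact: %s", blob_url)

    # Hypothetical partitioning: route per-drone feedback to per-drone logic,
    # and treat everything else as a new target structure to extract and scale.
    if "/feedback/" in blob_url:
        # ... recompute that drone's current vs. desired position for this epoch
        pass
    else:
        # ... extract the object from the image and enqueue it as the next structure
        pass
```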
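On the Azure Machine Learning side, the same idea the Studio designer exposes, components wired into a pipeline and submitted against workspace compute, can be sketched in code with the v2 SDK (azure-ai-ml). The workspace identifiers, training script, environment, and compute names below are placeholders and assumptions, not a prescribed training setup.

```python
from azure.ai.ml import MLClient, command, Input, Output
from azure.ai.ml.dsl import pipeline
from azure.identity import DefaultAzureCredential

# Handle to the workspace (subscription, resource group, workspace are placeholders).
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

# A single component that trains the transition-smoothing model.
train_smoother = command(
    name="train_transition_smoother",
    command="python train.py --positions ${{inputs.positions}} --model ${{outputs.model}}",
    inputs={"positions": Input(type="uri_folder")},
    outputs={"model": Output(type="uri_folder")},
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",  # assumed curated environment
    compute="cpu-cluster",                                          # assumed compute target
)

@pipeline(description="Learns corrections that reduce the transition error")
def smoothing_pipeline(positions):
    step = train_smoother(positions=positions)
    return {"model": step.outputs.model}

job = smoothing_pipeline(
    positions=Input(type="uri_folder",
                    path="azureml://datastores/workspaceblobstore/paths/positions/")
)
ml_client.jobs.create_or_update(job, experiment_name="uav-transitions")
```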
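Once the per-epoch results are embedded, a vector query against Cosmos DB (NoSQL API with vector search enabled) can retrieve the stored structures or transitions nearest to a query embedding. The database, container, field names, and embedding dimensionality below are assumptions.

```python
from azure.cosmos import CosmosClient

client = CosmosClient(url="<account-uri>", credential="<key>")
container = client.get_database_client("uav").get_container_client("structures")

# Hypothetical 3-dimensional query embedding; a real one would match the
# dimensionality configured in the container's vector embedding policy.
query_embedding = [0.12, -0.53, 0.88]

results = container.query_items(
    query=(
        "SELECT TOP 5 c.id, c.structure, "
        "VectorDistance(c.embedding, @embedding) AS score "
        "FROM c ORDER BY VectorDistance(c.embedding, @embedding)"
    ),
    parameters=[{"name": "@embedding", "value": query_embedding}],
    enable_cross_partition_query=True,
)

for item in results:
    print(item["id"], item["score"])
```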
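Finally, the monitoring stack can be queried programmatically, for example to watch the transition-error trend over time. The sketch uses the azure-monitor-query SDK; the Log Analytics workspace ID and the custom table and column names in the query are hypothetical.

```python
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())

# Hypothetical custom log table written by the processing functions.
query = """
TransitionMetrics_CL
| summarize avg(TransitionError_d) by bin(TimeGenerated, 5m)
| order by TimeGenerated asc
"""

response = client.query_workspace(
    workspace_id="<log-analytics-workspace-id>",
    query=query,
    timespan=timedelta(hours=6),
)

for table in response.tables:
    for row in table.rows:
        print(row)
```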