Tuesday, November 5, 2024

This is an extension of use cases for a drone management software platform in the cloud, the DFCS.

Autonomous drone fleet movement has been discussed with the help of self-organizing maps, as implemented in https://github.com/raja0034/som4drones, but sometimes fleet movement must be controlled precisely, especially at high velocity or high density, when the potential for congestion or collision is high. At high velocity, for example, drones fly more like missiles. A centralized algorithm that controls movement for the safe arrival of all units can be more convenient and cheaper for the units, provided there is continuous connectivity during flight. One such algorithm is the virtual structure method.

 The virtual structure method for UAV swarm movement is a control strategy where the swarm is organized into a predefined formation, often resembling a geometric shape. Instead of relying on a single leader UAV, the swarm is controlled as if it were a single rigid body or virtual structure. Each UAV maintains its position relative to this virtual structure, ensuring cohesive and coordinated movement. The steps include:

Virtual Structure Definition: A virtual structure, such as a line, triangle, or more complex shape, is defined. This structure acts as a reference for the UAVs' positions.

Relative Positioning: Each UAV maintains its position relative to the virtual structure, rather than following a specific leader UAV. This means that if one UAV moves, the others adjust their positions to maintain the formation.

Coordination and Control: The UAVs use local communication and control algorithms to ensure they stay in their designated positions within the virtual structure. This can involve adjusting speed, direction, and altitude based on the positions of neighboring UAVs.

Fault Tolerance: Since the control does not rely on a single leader, the swarm can be more resilient to failures. If one UAV fails, the others can still maintain the formation.

A simplified sample implementation, where the formation is anchored to a leader position that every UAV offsets from, might appear as follows:

import numpy as np

class UAV:
    def __init__(self, position):
        self.position = np.array(position, dtype=float)

class Swarm:
    def __init__(self, leader_position, formation_pattern):
        self.leader = UAV(leader_position)
        self.formation_pattern = formation_pattern
        self.uavs = [UAV(self.leader.position + offset) for offset in formation_pattern]

    def update_leader_position(self, new_position):
        self.leader.position = np.array(new_position, dtype=float)
        self.update_uav_positions()

    def update_uav_positions(self):
        # Each UAV re-derives its position from the leader plus its fixed offset.
        for i, uav in enumerate(self.uavs):
            uav.position = self.leader.position + self.formation_pattern[i]

    def get_positions(self):
        return [uav.position for uav in self.uavs]

# Example usage
num_uavs = 5
leader_position = [0, 0, 0]  # Initial leader position
formation_pattern = [np.array([i * 2, 0, 0]) for i in range(num_uavs)]  # Line formation

swarm = Swarm(leader_position, formation_pattern)

# Update leader position
swarm.update_leader_position([5, 5, 5])

# Get updated positions of all UAVs
print("Updated UAV positions:")
for pos in swarm.get_positions():
    print(pos)
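The leader-based sample is a simplification of the virtual structure idea, since the reference frame there is still a physical leader. A minimal leaderless sketch, with an assumed proportional gain and update loop rather than any standard controller API, treats a moving formation center as the virtual structure and has each UAV track its assigned slot:

```python
import numpy as np

def step_swarm(positions, offsets, structure_center, gain=0.5):
    """One control step: each UAV moves a fraction of the way toward
    its assigned slot (structure_center + offset) on the virtual structure."""
    targets = structure_center + offsets  # (n, 3) slot positions
    return positions + gain * (targets - positions)

# Virtual structure: a triangle formation around a moving center point.
offsets = np.array([[0.0, 2.0, 0.0], [-2.0, -1.0, 0.0], [2.0, -1.0, 0.0]])
positions = np.zeros((3, 3))  # all UAVs start at the origin
center = np.array([0.0, 0.0, 0.0])

for _ in range(20):  # fly the structure along +x
    center = center + np.array([1.0, 0.0, 0.0])
    positions = step_swarm(positions, offsets, center)
```

With a constant-velocity center and a gain below 1, the UAVs converge to their slots in the axes where the structure is stationary and trail the moving axis by a constant lag, which is the expected behavior of a proportional tracker.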


Monday, November 4, 2024

Chief executives are increasingly demanding that their technology investments, including data and AI, work harder and deliver more value to their organizations. Generative AI offers additional tools to achieve this, but it adds complexity to the challenge. CIOs must ensure their data infrastructure is robust enough to cope with the enormous data processing demands and governance challenges posed by these advances. Technology leaders see this challenge as an opportunity for AI to deliver considerable growth to their organizations, on both the top and bottom lines. Many business leaders have stated publicly that it is especially important for AI projects to reduce costs, but they also say these projects must enable new revenue generation. Gartner forecasts worldwide IT spending to grow by 4.3% in 2023 and 8.8% in 2024, with most of that growth concentrated in the software category that includes spending on data and AI. AI-driven efficiency gains promise business growth, with 81% of leaders expecting a gain greater than 25% and 33% believing it could exceed 50%. CDOs and CTOs echo this: “If we can automate our core processes with the help of self-learning algorithms, we’ll be able to move much faster and do more with the same amount of people,” and “Ultimately for us it will mean automation at scale and at speed.”

Organizations are increasingly prioritizing AI projects amid economic uncertainty and the rising popularity of generative AI. Technology leaders are focusing on longer-range projects that will have significant impact on the company rather than pursuing short-term wins. Given the rapid pace at which use cases and proofs of concept are coming at businesses, it is crucial to apply existing metrics and frameworks rather than create new ones. The key is to prioritize the right projects, weighing expected business impact, complexity, and the cost of scaling.

Data infrastructure and AI systems are becoming increasingly intertwined due to the enormous demands placed on data collection, processing, storage, and analysis. As financial services companies like Razorpay grow, their data infrastructure needs to be modernized to accommodate the growing volume of payments and the need for efficient storage. Advances in AI capabilities, such as generative AI, have increased the urgency to modernize legacy data architectures. Generative AI and the LLMs that support it will multiply workload demands on data systems and make tasks more complex. The implications of generative AI for data architecture include feeding unstructured data into models, storing long-term data in ways conducive to AI consumption, and putting adequate security around models. Organizations supporting LLMs need a flexible, scalable, and efficient data infrastructure. Many have claimed success with the adoption of Lakehouse architecture, combining features of data warehouse and data lake architecture. This architecture helps scale responsibly, with a good balance of cost versus performance. As one data leader observed “I can now organize my law enforcement data, I can organize my airline checkpoint data, I can organize my rail data, and I can organize my inspection data. And I can at the same time make correlations and glean understandings from all of that data, separate and together.”

Data silos are a significant challenge for data and technology executives, as they result from the disparate approaches different parts of an organization take to store and protect data. The proliferation of data, analytics, and AI systems has added complexity, resulting in a myriad of platforms, vast amounts of data duplication, and often separate governance models. Most organizations employ fewer than 10 data and AI systems, but the proliferation is most extensive in the largest ones. To simplify, organizations aim to consolidate the number of platforms they use and connect data seamlessly across the enterprise. Companies like Starbucks are centralizing data by building cloud-centric, domain-specific data hubs, while GM's data and analytics team focuses on reusable technologies to simplify infrastructure and avoid duplication. Organizations also need room to innovate, which can be achieved by separating the functions that manage existing data from those devoted to greenfield exploration.

Reference: previous article


Sunday, November 3, 2024

 There are N points (numbered from 0 to N−1) on a plane. Each point is colored either red ('R') or green ('G'). The K-th point is located at coordinates (X[K], Y[K]) and its color is colors[K]. No point lies on coordinates (0, 0).

We want to draw a circle centered on coordinates (0, 0), such that the number of red points and green points inside the circle is equal. What is the maximum number of points that can lie inside such a circle? Note that it is always possible to draw a circle with no points inside.

Write a function that, given two arrays of integers X, Y and a string colors, returns an integer specifying the maximum number of points inside a circle containing an equal number of red points and green points.

Examples:

1. Given X = [4, 0, 2, −2], Y = [4, 1, 2, −3] and colors = "RGRR", your function should return 2. The circle contains points (0, 1) and (2, 2), but not points (−2, −3) and (4, 4).

Since only the relative order of distances matters, the solution can sort the points by squared distance from the origin and sweep outward, closing the circle only between groups of equidistant points (stepping a floating-point radius by a fixed amount risks skipping valid radii when distances are close):

import java.util.Arrays;

class Solution {

    public int solution(int[] X, int[] Y, String colors) {
        int n = X.length;
        // Encode each point as (squaredDistance << 1) | colorBit so that
        // sorting the longs sorts by distance from the origin.
        // Assumes coordinates are small enough that 2 * d2 fits in a long.
        long[] points = new long[n];
        for (int i = 0; i < n; i++) {
            long d2 = (long) X[i] * X[i] + (long) Y[i] * Y[i];
            points[i] = (d2 << 1) | (colors.charAt(i) == 'R' ? 1L : 0L);
        }
        Arrays.sort(points);
        int best = 0, red = 0, green = 0;
        for (int i = 0; i < n; i++) {
            if ((points[i] & 1L) == 1L) red++; else green++;
            // Only consider a radius after the last point at this distance,
            // so the circle never cuts through a group of equidistant points.
            if (i == n - 1 || (points[i] >> 1) != (points[i + 1] >> 1)) {
                if (red == green) best = red + green;
            }
        }
        return best;
    }

}

Compilation successful.

Example test: ([4, 0, 2, -2], [4, 1, 2, -3], 'RGRR')

OK

Example test: ([1, 1, -1, -1], [1, -1, 1, -1], 'RGRG')

OK

Example test: ([1, 0, 0], [0, 1, -1], 'GGR')

OK

Example test: ([5, -5, 5], [1, -1, -3], 'GRG')

OK

Example test: ([3000, -3000, 4100, -4100, -3000], [5000, -5000, 4100, -4100, 5000], 'RRGRG')

OK
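The same observation can be cross-checked with a short brute-force sketch in Python (the helper name here is hypothetical, not part of the submitted solution): only the squared distances that actually occur need to be tried as inclusive circle boundaries.

```python
def max_balanced_points(X, Y, colors):
    # Squared distance of each point from the origin; no square roots needed.
    d2 = [x * x + y * y for x, y in zip(X, Y)]
    best = 0
    # A circle boundary only matters at distances where points actually lie,
    # so try each occurring squared distance as the inclusive cutoff.
    for cutoff in set(d2):
        red = sum(1 for d, c in zip(d2, colors) if d <= cutoff and c == 'R')
        green = sum(1 for d, c in zip(d2, colors) if d <= cutoff and c == 'G')
        if red == green:
            best = max(best, red + green)
    return best

print(max_balanced_points([4, 0, 2, -2], [4, 1, 2, -3], "RGRR"))
```

This is O(N^2) rather than O(N log N), but it is handy for validating the sorted-sweep solution against the example tests.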


Saturday, November 2, 2024

A previous article talked about ETL, its modernization, new takes on old issues and their resolutions, and efficiency and scalability. This section talks about the bigger picture where this fits in.

In terms of infrastructure for data engineering projects, customers usually start with a roadmap that progressively builds a more mature data function. One approach to drawing this roadmap, which experts observe repeated across deployments, builds the data stack in distinct stages, with a stack for each phase of the journey. While needs, level of sophistication, maturity of solutions, and budget determine the shape these stacks take, the four phases are more or less distinct and recur across these endeavors: starter, growth, machine learning, and real-time. Customers begin with a starter stack whose essential function is to collect the data, often by implementing a drain; a unified data layer at this stage significantly reduces engineering bottlenecks. The second stage is the growth stack, which solves the proliferation of data destinations and independent silos by centralizing data into a warehouse that also becomes the single source of truth for analytics. As this matures, customers want to move beyond historical analytics into predictive analytics; at this stage, a data lake and a machine-learning toolset come in handy to leverage unstructured data and mitigate problems proactively. The final frontier is overcoming a limitation of this stack: it cannot deliver personalized experiences in real time.

In this way, organizations solve the point-to-point integration problem by implementing a unified, event-based integration layer in the starter stack. When the needs become a little more sophisticated, to enable downstream teams (and management) to answer harder questions and act on all of the data, they centralize both clickstream data and relational data to build a full picture of the customer and their journey. After implementing a cloud data warehouse as the single source of truth for all customer data, and using reverse ETL pipelines to activate that data, organizations gear up for the next stage. As the business grows, optimization requires moving from historical analytics to predictive analytics, including incorporating unstructured data into the analysis. To accomplish that, organizations implement the ML stack, which includes a data lake (for unstructured data) and a basic machine-learning toolset that can generate predictive outputs like churn scores. Finally, these outputs are put to use by sending them through the warehouse and reverse ETL pipelines, making them available as data points in downstream systems, including the CRM for customer touchpoints.
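As a concrete illustration of that last hop, a reverse ETL step reads predictive scores out of the warehouse and writes them onto CRM records. A minimal sketch follows; the table, field, and function names here are hypothetical, not any specific vendor's API.

```python
# Hypothetical reverse ETL step: move churn scores from the warehouse
# (single source of truth) into a CRM field that customer-facing teams see.
def fetch_churn_scores(warehouse_rows):
    """Build a customer_id -> churn_score map from warehouse query results."""
    return {row["customer_id"]: row["churn_score"] for row in warehouse_rows}

def sync_to_crm(crm_records, churn_scores):
    """Annotate CRM records in place with the latest churn score, if any."""
    for record in crm_records:
        score = churn_scores.get(record["customer_id"])
        if score is not None:
            record["churn_score"] = score
    return crm_records

# Example: rows as a warehouse query might return them.
warehouse_rows = [
    {"customer_id": "c1", "churn_score": 0.82},
    {"customer_id": "c2", "churn_score": 0.11},
]
crm_records = [{"customer_id": "c1"}, {"customer_id": "c2"}, {"customer_id": "c3"}]
synced = sync_to_crm(crm_records, fetch_churn_scores(warehouse_rows))
```

The point of the sketch is the direction of flow: scores are computed upstream in the ML stack, landed in the warehouse, and only then pushed to operational systems, so the warehouse remains the single source of truth.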

#codingexercise: CodingExercise-11-02-2024.docx