Thursday, April 30, 2026

 This is a summary of a book titled “Wait, You Need It When?!?: The Essential Guide to Time Management, Productivity, and Powerful Habits That Get Things Done” written by Peter Economy and published by Career Press in 2026. This book argues that time is the one resource you can never replenish, yet many people treat it as if it were infinite. The result is a workday filled with drift: low-value tasks, constant interruptions, and habits that quietly consume hours. One estimate suggests employees spend about 51% of the workday on tasks that add little value, while social media, email checking, and unnecessary meetings further erode focus. The author stresses that this isn’t merely an efficiency issue; it is a life-management issue. “Money you can get more of, belongings come and go, but once you’ve burned through a particular piece of time, you can never retrieve it….There’s no going back, only forward.”

When time management breaks down, the consequences show up everywhere. Individually, it can mean rushed or sloppy work, missed deadlines, and fewer opportunities to grow. For organizations, it translates into productivity losses, lower quality, delayed delivery, and higher turnover. The damage can ripple outward to customers when follow-through falters, and to colleagues who may feel they are compensating for someone else’s disorganization. The author also highlights a less visible cost: when work expands to fill evenings and weekends, personal relationships and basic self-care are often the first to be squeezed out, leaving people both less present at home and less effective at work.

To regain control, the book emphasizes making deliberate choices about attention and priorities. That starts with ranking tasks by importance and urgency, setting goals that are challenging but realistic, and then translating those goals into small, actionable steps. It also means protecting concentration by eliminating distractions, delegating where appropriate, and using breaks strategically so focus can recover before it collapses. Practical tactics—like scheduling blocks of uninterrupted time for demanding work, tracking how you actually spend your hours, and learning to say no to nonessential requests—create the conditions for consistent progress. He encourages mindfulness as well: noticing the patterns that sabotage your intentions and staying flexible enough to adapt when circumstances change.

Because time feels different depending on what you’re doing, the author recommends building awareness of your subjective experience of it. Meaningful work can make hours pass quickly, while monotonous tasks can feel endless; stress and feeling “behind” can warp your sense of the day. A brief reset, such as a short mindfulness practice, can reduce the sensation of rushing and help you return to the present, where better choices are easier to make.

The author calls for a “serious business mindset”—a purpose-driven attitude that builds credibility and keeps your efforts aligned with your goals. One concrete way to support that mindset is to design a workspace that signals focus. Ergonomic tools, lighting and noise adjustments, and an organized layout all reduce friction. Even small environmental choices matter: research cited in the book suggests that the freedom to personalize a workspace can raise productivity, while plants can provide a modest boost; clutter, by contrast, makes sustained attention harder. He also notes that productivity is not simply a function of longer hours. Regular breaks and clear boundaries protect both performance and work-life balance, and they prevent others from assuming you are available at all times.

Interruptions are especially costly because each shift of attention has a recovery price; the book cites an average of 23 minutes and 15 seconds to fully return to a task after an interruption. To reduce that tax, he advises setting expectations with colleagues by blocking deep-work periods and clearly communicating when you will and won’t be reachable. Technology can reinforce these boundaries through “do not disturb” settings and website blockers, while collaboration tools can replace meetings that don’t require real-time discussion. Physical cues—like closing a door or using headphones—can help others recognize focus time. Just as important is practicing single-tasking: scheduling one to three hours for a single priority rather than bouncing between demands, and keeping “digital hygiene” strong by unsubscribing from unwanted lists, turning off nonessential notifications, and maintaining an orderly file system.

Sustained performance, the book suggests, comes from routines that balance structure with adaptability. By identifying your peak energy windows and building time blocks around them, you can create consistency without becoming rigid. Techniques like the Pomodoro method—working in focused 20- to 30-minute intervals followed by short breaks, with a longer break after several rounds—provide a simple rhythm that prevents burnout while keeping momentum. Goal setting, too, should be both disciplined and flexible. The author highlights the CLEAR framework (Collaborative, Limited, Emotional, Appreciable, Refinable), which encourages seeking input, keeping goals to a manageable number, tying them to what genuinely matters to you, breaking them into milestones you can recognize and celebrate, and refining them as conditions evolve.

Daily to-do lists play an important supporting role by freeing mental bandwidth and making priorities explicit. To make lists actionable rather than overwhelming, he draws on David Allen’s Getting Things Done approach: capture everything that demands attention, clarify the next action and desired outcome, organize tasks in a system that fits your contexts and deadlines, reflect regularly to delete, delegate, or reprioritize, and then engage with the items that will have the greatest impact. The same respect for time applies to meetings. With a significant portion of meetings viewed as ineffective and many running longer than an hour, the book recommends clarifying purpose, using a timed agenda, limiting attendance to the people who can decide or contribute meaningfully, and ending with clear action items and follow-up dates. Finally, he connects productivity to intrinsic motivation: when your work aligns with values, passions, and purpose, focus becomes easier to sustain. He encourages experimentation (trying new classes, volunteering, or networking in inspiring spaces) and reflecting on what energizes you, because “As long as you’re still living and breathing, you can do something different. So if you need to make a change, don’t hesitate: The time is now.”


Wednesday, April 29, 2026

What can Confluent tell us about video sensing applications?

Confluent’s Streaming Data platform is a cloud-native, fully managed event streaming system built on Apache Kafka but rearchitected from the ground up for elastic scalability, real-time processing, and enterprise-grade governance. At its heart, the platform turns raw data in motion into reliable, governed data products that power real-time applications, analytics, and AI.

The Foundation: KORA, Confluent’s Cloud-Native Kafka Engine

Everything starts with KORA, Confluent’s custom-engineered version of Apache Kafka. Unlike traditional Kafka deployments, KORA is designed for a multi-tenant, serverless cloud architecture. It delivers millions of messages per second with sub-10ms latency and guarantees 99.99% uptime through multi-availability-zone clustering. Topics are partitioned across brokers for horizontal scale and fault tolerance, and producers and consumers are fully decoupled—meaning you can add or evolve services without breaking dependencies.
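As a rough illustration of how partitioning spreads keyed events across brokers while keeping each key on one partition, the sketch below hashes a record key to a partition index. The class name `KeyPartitioner` and the simple byte-array hash are assumptions made for illustration; Kafka's default partitioner uses a murmur2 hash rather than this one.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of any Kafka or Confluent API.
final class KeyPartitioner {
    private final int numPartitions;

    KeyPartitioner(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.numPartitions = numPartitions;
    }

    // The same key always maps to the same partition, which preserves per-key ordering.
    int partitionFor(String key) {
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        // Math.floorMod keeps the result non-negative even when the hash is negative.
        return Math.floorMod(hash, numPartitions);
    }
}
```

Because the mapping depends only on the key and the partition count, producers need no knowledge of which consumers will read the partition, which is one way to picture the decoupling described above.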

Storage That Scales: Tiered Storage Architecture

One of KORA’s most powerful innovations is its three-tier storage system, which replaces Kafka’s traditional single-layer local-disk storage:

• Hot tier (memory/SSD): Stores recent data for ultra-low-latency access.

• Warm tier (local SSD cache): Handles intermediate retention.

• Cold tier (cloud object storage like S3, GCS, or Azure Blob): Provides infinite, cost-effective retention.

After data segments are flushed, they’re automatically moved to colder, cheaper storage while metadata is tracked internally. This separation of compute and storage lets you scale each independently and retain data for months or years at a fraction of the cost—something vanilla Kafka can’t do efficiently.
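The age-based movement between tiers can be pictured with a small sketch. The class `TierPolicy`, its two thresholds, and the idea that age alone drives placement are assumptions for illustration; the actual tiering policy is internal to the platform and is not described in this post.

```java
import java.time.Duration;

// Hypothetical policy for illustration; thresholds and names are assumptions.
final class TierPolicy {
    enum Tier { HOT, WARM, COLD }

    private final Duration hotWindow;
    private final Duration warmWindow;

    TierPolicy(Duration hotWindow, Duration warmWindow) {
        if (warmWindow.compareTo(hotWindow) < 0) {
            throw new IllegalArgumentException("warmWindow must be at least hotWindow");
        }
        this.hotWindow = hotWindow;
        this.warmWindow = warmWindow;
    }

    // Picks a tier purely from the age of a flushed log segment.
    Tier tierFor(Duration segmentAge) {
        if (segmentAge.compareTo(hotWindow) <= 0) {
            return Tier.HOT;
        }
        if (segmentAge.compareTo(warmWindow) <= 0) {
            return Tier.WARM;
        }
        return Tier.COLD;
    }
}
```

With a 10-minute hot window and a 6-hour warm window (made-up values), a 5-minute-old segment stays hot, a 1-hour-old segment is warm, and a 30-day-old segment lands in object storage.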

Governance and Quality: Schema Registry and Stream Governance

To keep streaming data trustworthy, Confluent includes a centralized Schema Registry that manages Avro, Protobuf, and JSON Schema with strict compatibility rules (backward, forward, full, or none). This ensures producers and consumers stay in sync even as schemas evolve.
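To make the compatibility modes concrete, here is a deliberately simplified sketch. It models a schema as a map from field name to "has a default value" and ignores type changes entirely, so it is an assumption-laden approximation and not how Schema Registry actually evaluates Avro, Protobuf, or JSON Schema. The class and method names are hypothetical.

```java
import java.util.Map;

// Simplified model: schema = field name -> whether the field declares a default.
final class CompatibilityCheck {

    // A reader can decode a writer's data if every reader field that the
    // writer did not produce has a default to fall back on.
    static boolean canRead(Map<String, Boolean> reader, Map<String, Boolean> writer) {
        for (Map.Entry<String, Boolean> field : reader.entrySet()) {
            boolean missingInWriter = !writer.containsKey(field.getKey());
            if (missingInWriter && !field.getValue()) {
                return false;
            }
        }
        return true;
    }

    // Backward: consumers on the new schema can read data written with the old one.
    static boolean isBackward(Map<String, Boolean> newSchema, Map<String, Boolean> oldSchema) {
        return canRead(newSchema, oldSchema);
    }

    // Forward: consumers on the old schema can read data written with the new one.
    static boolean isForward(Map<String, Boolean> newSchema, Map<String, Boolean> oldSchema) {
        return canRead(oldSchema, newSchema);
    }

    // Full: both directions hold.
    static boolean isFull(Map<String, Boolean> newSchema, Map<String, Boolean> oldSchema) {
        return isBackward(newSchema, oldSchema) && isForward(newSchema, oldSchema);
    }
}
```

Under this model, adding an optional field with a default is fully compatible, adding a required field breaks backward compatibility, and removing a required field breaks forward compatibility.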

Built on top is the Stream Governance suite, which delivers three critical capabilities:

1. Stream Quality: Enforces data contracts with schema validation and business rule checks.

2. Stream Catalog: Provides data discovery with tagging and rich business metadata.

3. Stream Lineage: Maps end-to-end event flows, showing exactly where data comes from and where it goes.

Together, these tools turn chaotic data streams into governed, high-quality data products.

Connectors: Plug-and-Play Data Integration

Confluent ships with 120+ pre-built Kafka connectors for databases, data warehouses, cloud services, and more. These source and sink connectors abstract away the complexity of data integration. You can also apply transformations on the fly using Single Message Transformations (SMTs), making it easy to clean, enrich, or reformat data as it moves through the platform.
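The idea behind Single Message Transformations, small per-record steps applied in order, can be sketched in plain Java. This does not use the Kafka Connect API; the class `TransformChain` and the helpers `dropField`, `renameField`, and `insertField` are hypothetical names, and a record is modelled as a simple map.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a chain of per-record transformations.
final class TransformChain {
    private final List<UnaryOperator<Map<String, Object>>> steps;

    TransformChain(List<UnaryOperator<Map<String, Object>>> steps) {
        this.steps = List.copyOf(steps);
    }

    // Applies each step in order to a copy, leaving the input record untouched.
    Map<String, Object> apply(Map<String, Object> record) {
        Map<String, Object> current = new HashMap<>(record);
        for (UnaryOperator<Map<String, Object>> step : steps) {
            current = step.apply(current);
        }
        return current;
    }

    static UnaryOperator<Map<String, Object>> dropField(String name) {
        return r -> {
            Map<String, Object> copy = new HashMap<>(r);
            copy.remove(name);
            return copy;
        };
    }

    static UnaryOperator<Map<String, Object>> renameField(String from, String to) {
        return r -> {
            Map<String, Object> copy = new HashMap<>(r);
            if (copy.containsKey(from)) {
                copy.put(to, copy.remove(from));
            }
            return copy;
        };
    }

    static UnaryOperator<Map<String, Object>> insertField(String name, Object value) {
        return r -> {
            Map<String, Object> copy = new HashMap<>(r);
            copy.put(name, value);
            return copy;
        };
    }
}
```

Each step sees exactly one record and keeps no state between records, which is what makes this style of transformation cheap to apply in flight.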

Stream Processing: Real-Time Computation at Scale

For real-time computation, the platform supports multiple processing engines:

• Apache Flink®: A powerful engine for stateful stream processing with automatic schema evolution handling.

• Kafka Streams: A lightweight client library for building stream processing applications with a processor topology of source, processor, and sink nodes. It uses a depth-first processing strategy and partition-based state stores, avoiding backpressure issues.

• ksqlDB: A streaming SQL engine that lets you query and transform data using familiar SQL syntax.

• Tableflow: Creates materialized views for real-time analytics.
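The depth-first strategy mentioned for Kafka Streams in the list above can be illustrated without the library itself. In the sketch below, each record is pushed through every downstream node before the next record enters the source, so nothing queues up between nodes. `MiniTopology` and `Node` are hypothetical names, and this is not the Kafka Streams API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical miniature of a source -> processor -> sink topology.
final class MiniTopology {

    static final class Node {
        final String name;
        final Function<String, String> fn; // returning null drops the record
        final List<Node> children = new ArrayList<>();

        Node(String name, Function<String, String> fn) {
            this.name = name;
            this.fn = fn;
        }

        // Wires a child node and returns it so calls can be chained.
        Node to(Node child) {
            children.add(child);
            return child;
        }

        // Depth-first: the record visits all descendants before this call returns.
        void process(String record, List<String> trace) {
            String out = fn.apply(record);
            if (out == null) {
                return;
            }
            trace.add(name + ":" + out);
            for (Node child : children) {
                child.process(out, trace);
            }
        }
    }
}
```

Feeding two records through a three-node chain produces a trace in which the first record reaches the sink before the second record is seen by the source.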

The platform even supports LLM and ML model inference directly inside stream processing, enabling streaming agents that can invoke external tools—bringing AI capabilities into real-time data pipelines.

Multi-Cloud, Hybrid, and Geo-Replication

Confluent is built for modern cloud realities:

• Cluster Linking enables geo-replication across clusters and clouds.

• Multi-cloud support includes native integration with S3, GCS, and Azure Blob, plus a Bring-Your-Own-Cloud (BYOC) option.

• Networking security includes private linking, VPC peering, and end-to-end in-transit encryption.

• The platform supports the Kappa architecture, unifying operational and analytical workloads in a single pipeline.

This flexibility lets you run Confluent consistently across AWS, GCP, Azure, or on-premises environments.

How Data Flows Through the Platform

Imagine this journey:

1. Producers send events into Kafka topics, which are partitioned distributed logs.

2. Schema Registry validates each event against its schema.

3. Data lands in tiered storage, automatically moving from hot to cold as it ages.

4. Connectors pull data in from or push data out to external systems.

5. Flink, Kafka Streams, or ksqlDB process the data in real time.

6. Processed data flows to consumer applications, data warehouses, analytics dashboards, or AI models.

Because producers and consumers are decoupled, you can add, remove, or scale any part of this pipeline without disrupting the rest.

Why Confluent Stands Out

Compared to vanilla Kafka, Confluent delivers:

• Tiered storage for infinite retention at low cost.

• Auto-scaling that’s 30× faster than manual Kafka rebalancing.

• Built-in governance with Schema Registry and Stream Governance.

• Fully managed operations with a 99.99% uptime SLA.

• Multiple processing engines: Flink + ksqlDB + Kafka Streams, not just one.

In short, Confluent’s Streaming Data platform transforms the challenge of managing real-time data into a seamless, governed, and scalable experience—enabling event-driven architectures, real-time analytics, and AI applications powered by high-quality, trusted data in motion.

What this architecture means for AI agents in drone video sensing applications

AI agents are usually arranged in one of the following patterns:

• Automatic query decomposition: one agent coordinates with other agents to run the sub-queries in parallel, incurring token costs per agent.

• Lambda processing or function app agents: scale with the workload and run predefined routines on a task-by-task basis.

• Reasoning agent: breaks a request into step-by-step tasks for execution, then reassembles the query response.

• Model Context Protocol (MCP) enabled agents: let agents reach each other independently to fulfil a request.

• Grounding agents: connect to online or specific data sources or services.

What the Confluent architecture suggests is to do this work on an event-by-event basis in a perpetual agent, as follows:

package com.sms.event;

/**
 * This represents an observable notification.
 * @param <T> The type of event that is to be observed.
 */
public interface Notifier<T> {
    /**
     * Attach a listener for notification type T.
     * @param listener This is the listener.
     */
    void subscribe(final Listener<T> listener);

    /**
     * Detach a listener.
     */
    void unsubscribe();

    /**
     * Finished notifying.
     */
    void onCompleted();

    /**
     * Regular event processing.
     */
    void onNext(T notification);

    /**
     * Failed event processing.
     */
    void onError(Throwable exception);
}

package com.sms.event;

/**
 * Listener interface for receiving notifications.
 * @param <T> Notification type.
 */
public interface Listener<T> {
    /**
     * Attach a notifier for notification type T.
     * @param notifier This is the notifier.
     */
    void subscribe(final Notifier<T> notifier);

    /**
     * Detach a notifier.
     */
    void unsubscribe();

    /**
     * Finished notifying.
     */
    void onCompleted();

    /**
     * Regular event processing.
     */
    void onNext(T notification);

    /**
     * Failed event processing.
     */
    void onError(Throwable exception);
}

package com.sms.event;

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.annotation.concurrent.GuardedBy;
import lombok.Synchronized;
import lombok.extern.slf4j.Slf4j;

/**
 * Equivalent of a message broker.
 * @param <T> Type of notification.
 */
@Slf4j
public class NotificationSystem<T extends Notification> {
    @GuardedBy("$lock")
    private final Map<String, Notifier<T>> notifierMap = new HashMap<>();
    @GuardedBy("$lock")
    private final Map<String, Listener<T>> listenerMap = new HashMap<>();
    // A single worker thread delivers notifications in submission order.
    private final ExecutorService executorService = Executors.newFixedThreadPool(1);

    /**
     * Register a listener for a notification type unless it is already registered.
     */
    @Synchronized
    public void addListener(final String type,
                            final Listener<T> listener) {
        if (!isSubscriberPresent(listener)) {
            listenerMap.put(type, listener);
        }
    }

    /**
     * This method will notify the listener registered for the notification's type.
     *
     * @param notification Notification.
     */
    @Synchronized
    public void notify(final T notification) {
        final String type = notification.getClass().getSimpleName();
        final Listener<T> listener = listenerMap.get(type);
        if (listener == null) {
            // Avoid a NullPointerException on the worker thread when nothing is registered.
            log.warn("No listener registered for type: {}", type);
            return;
        }
        log.info("Executing listener of type: {} for notification: {}", type, notification);
        executorService.submit(() -> {
            try {
                listener.onNext(notification);
            } catch (Throwable ex) {
                listener.onError(ex);
            }
        });
    }

    /**
     * Remove the listener only if it is the one registered for the type.
     */
    @Synchronized
    public void removeListener(final String type, final Listener<T> listener) {
        listenerMap.remove(type, listener);
    }

    /**
     * Register a notifier for a notification type unless it is already registered.
     */
    @Synchronized
    public void addNotifier(final String type,
                            final Notifier<T> notifier) {
        if (!isNotifierPresent(notifier)) {
            notifierMap.put(type, notifier);
        }
    }

    /**
     * Remove the notifier only if it is the one registered for the type.
     */
    @Synchronized
    public void removeNotifier(final String type, final Notifier<T> notifier) {
        notifierMap.remove(type, notifier);
    }

    @Synchronized
    public boolean isNotifierPresent(final Notifier<T> notifier) {
        return notifierMap.values().stream().anyMatch(n -> n.equals(notifier));
    }

    @Synchronized
    public boolean isSubscriberPresent(final Listener<T> listener) {
        return listenerMap.values().stream().anyMatch(l -> l.equals(listener));
    }
}
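To show the event-by-event idea in a form that runs on its own, here is a minimal self-contained sketch of a long-lived agent loop: handlers are registered once per event type, and each published event is handled individually on a single worker thread. It is a simplified stand-in and not a drop-in use of the classes above; `EventLoopSketch`, `FrameEvent`, and the labels are hypothetical names chosen for a drone video scenario.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch of a perpetual agent that reacts to one event at a time.
final class EventLoopSketch {

    // Stand-in for a per-frame detection emitted by a video pipeline.
    static final class FrameEvent {
        final long frameId;
        final String label;

        FrameEvent(long frameId, String label) {
            this.frameId = frameId;
            this.label = label;
        }
    }

    private final Map<Class<?>, Consumer<Object>> handlers = new ConcurrentHashMap<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Registers the handler that stays attached for the life of the agent.
    <E> void on(Class<E> type, Consumer<E> handler) {
        handlers.put(type, event -> handler.accept(type.cast(event)));
    }

    // Queues one event; events with no registered handler are ignored.
    void publish(Object event) {
        Consumer<Object> handler = handlers.get(event.getClass());
        if (handler == null) {
            return;
        }
        worker.submit(() -> handler.accept(event));
    }

    // Lets queued events drain, then stops the worker. Returns false on timeout.
    boolean shutdown(long timeoutMillis) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A single worker thread keeps events in arrival order, which matters when later frames depend on earlier ones; a pool with more threads would trade that ordering for throughput.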


Tuesday, April 28, 2026

This is a summary of a book titled “Trust Agents: Using the Web to Build Influence, Improve Reputation, and Earn Trust” written by Chris Brogan and Julien Smith and published by Wiley in 2009. Public trust has eroded over time, and skepticism toward institutions runs high. In this environment, traditional marketing and polished corporate messaging don’t build confidence; they often deepen suspicion. The authors argue that the web, because it is connective, searchable, and radically transparent, offers companies a different path. Instead of trying to control the message or hide imperfections, organizations can earn credibility by showing up as real participants in online communities. The people who make this work are what Brogan and Smith call “trust agents”: individuals who represent a business without acting like salespeople, who trade pressure for presence, and who build influence by being useful and genuine. For implementors of the OAuth protocol, this relates to bringing in an audience from third-party websites.

A trust agent’s influence comes from understanding a core shift: online, people don’t want to be “managed” by brands; they want to be cared for by humans. The authors stress that effective participants are not infiltrators who join groups to extract value, and they are not loud promoters trying to “convert” every interaction into a transaction. They are power users of modern web tools—blogs, feeds, social networks, audio and video platforms—but the tools matter less than the approach. The web is described as a gigantic lever: once you publish something helpful publicly, it can continue to reach new people long after you press “post,” and one thoughtful answer can save you from repeating the same response in countless private emails. Over time, that visible generosity becomes reputation, and reputation becomes trust.

To act with credibility, the book says, you first have to listen. Brogan and Smith recommend building a “listening station” so you can understand what online communities already believe about your company and your competitors—what they praise, what they distrust, and what questions keep resurfacing. Their 2009 instructions are anchored in the tools of that moment (Google services, feed readers, and blog search engines like Technorati), but the underlying practice is timeless: set up a system that continuously surfaces mentions of your organization, your products, and the themes your customers care about. The goal is not surveillance for its own sake; it is awareness. Only by paying attention can you participate in ways that feel responsive rather than performative.

Once you can “see the map” of what people are saying, you can begin to contribute. The authors emphasize that the content you create online—whether posts, videos, podcasts, or simple comments—has durability. Because it remains discoverable, it can keep answering questions and demonstrating your expertise long after the moment has passed. This is where social capital forms: when you repeatedly help people solve problems, clarify confusing topics, or point them toward useful resources, the community starts to recognize you as someone worth listening to. That recognition is not merely popularity; it is a kind of stored goodwill you can draw on later when you need to introduce an idea, request feedback, or rally people around a project.

From there, Brogan and Smith organize the trust agent’s mindset into six interlocking principles. Each principle is less a rigid rule than a way of behaving that makes trust more likely to form in public spaces where anyone can evaluate you. Together they encourage experimentation, belonging, leverage, relationship-building, empathy, and collective action—skills that turn the web from a broadcasting channel into a place where influence is earned.

The first principle, “Make Your Own Game,” argues that the internet rewards those willing to challenge industry habits. Online you can set new terms, reach audiences directly, and bypass gatekeepers who once controlled distribution. The book highlights musicians who rewrote the rules: the Arctic Monkeys built momentum through MySpace, and Radiohead experimented with a pay-what-you-want release that still generated massive sales. These examples illustrate the broader point: trust agents don’t wait for permission. They watch what the community values, take smart risks, and create approaches that feel fresh rather than formulaic.

To support that spirit of experimentation, the authors borrow a framework from Douglas Rushkoff: treating culture—and the web—as a kind of game you can learn, hack, and even redesign. At first you “play,” learning the norms and feedback signals of your space: links, comments, followers, revenue, and the general sentiment people express in public. Then you begin to “cheat,” not by being dishonest but by thinking laterally—finding unusual, effective ways to use familiar tools or sell familiar offerings. Finally, you may move into “programming,” building something new entirely and discovering its rules through trial, error, and persistence. In the trust agent’s world, that willingness to learn and iterate becomes a visible marker of competence and confidence.

The second principle, “One of Us,” focuses on belonging. Trust online is rarely granted to outsiders who sound like advertisements, and it is quickly withdrawn from anyone who appears self-serving. The book points to an early and influential example: Microsoft employee Robert Scoble, who blogged candidly about his company—even criticizing products. That openness helped him gain standing in technical communities, not because he was perfect, but because he was plainly real. Brogan and Smith connect this to the “trust equation” described in The Trusted Advisor: credibility, reliability, and intimacy raise trust, while self-orientation lowers it. Online, these factors still apply, but they are shaped by what other people publicly say about you, by the consistency of your visible actions over time, and by the surprising power of “verbal intimacy” in a world with fewer nonverbal cues.
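The trust equation referred to above is commonly written as a single ratio, where T is trustworthiness, C is credibility, R is reliability, I is intimacy, and S is self-orientation:

```latex
\[
T = \frac{C + R + I}{S}
\]
```

Because self-orientation sits alone in the denominator, even strong credibility and reliability are discounted when someone appears to be acting mainly in their own interest.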

The third principle, the “Archimedes Effect,” explains how the web turns small efforts into outsized outcomes. Like a lever, online platforms amplify reputation, relationships, and time: a single introduction can connect networks, and a single well-placed resource can help thousands. Yet the authors warn that leverage collapses the moment you treat your audience as targets. Trust agents serve as helpful gatekeepers for their communities, curating information, connecting people, and staying focused on long-term value rather than short-term selling.

Using that leverage well requires what the authors call “multicapitalism”: the ability to recognize different forms of value—money, attention, credibility, access, goodwill—and to exchange them intelligently. They offer Donald Trump as an example of turning one kind of capital into another: wealth into visibility, visibility into new ventures. For a trust agent, the more ethical version of this is building a presence online, meeting people in person when possible, and then sustaining the relationship with ongoing online touches. Over time, those repeated, generous interactions become the compounding force behind influence.

The fourth principle, “Agent Zero,” describes a particular kind of network position. Trust agents often sit at the hub of conversations, not because they demand attention, but because they continuously connect people and ideas. They comment, respond, congratulate, and share—quickly and sincerely. They use their network to solve problems, introduce collaborators, and spotlight other people’s work. Ironically, by staying out of the spotlight and acting with a service mindset, they become highly visible in the way that matters: as dependable human links within a community.

The fifth principle, “Human Artist,” is about interpersonal skill—especially empathy, observation, and respect for social norms. Brogan and Smith argue that trust agents succeed because they are good at reading the room, even when “the room” is a comment thread, a forum, or a fast-moving social feed. They take time to learn which communities matter to them, what those people value, and what behavior is considered acceptable. They listen before they speak, match the tone of the space, and follow a web-friendly version of the Golden Rule: treat online contacts the way you would want to be treated. Most importantly, they resist the temptation to market to new online friends. In community settings, aggressive selling is often treated as a violation, and it can damage reputation faster than any single mistake.

The sixth principle, “Build an Army,” highlights the web’s ability to coordinate people at scale. With platforms such as wikis, review sites, and social networks, trust agents can gather large groups around a shared purpose, helping them collaborate in ways that were once impractical. Wikipedia is an obvious example of crowdsourcing’s potential, but the authors also point to corporate efforts that succeed when they prioritize participation over persuasion. General Motors’ GMNext.com, for instance, gave customers wiki-style tools and space to share stories about vehicles they loved. The initiative worked precisely because GM didn’t treat the community like a pipeline for hard sales; it treated it as a place where customers could express identity and enthusiasm in their own words—marketing that feels credible because it isn’t forced.

In the final pages, the advice becomes practical and immediate: show up where your communities already gather, and communicate more than you think you need to. Join relevant networks, build a base of contacts, and don’t be overly cautious about connecting with people you haven’t met yet—online, relationships often begin as lightweight interactions that deepen over time. Use tools like Twitter (and today’s equivalents) to learn what people care about in real time. Comment thoughtfully on blogs and forums, answer questions, and “check in” regularly so your presence is steady rather than sporadic. The authors’ challenge is simple: aim to become the best communicator the web has ever seen, not by talking the most, but by listening well, contributing generously, and earning trust one visible interaction at a time.


Monday, April 27, 2026

 Azure Web App Logging

An Azure Web App can log in two broad ways: locally on the app host for quick troubleshooting, or externally through Azure Monitor diagnostic settings for longer-lived and downstream analytics use. The best choice depends on whether you value speed and simplicity or durability, integration, and centralized operations.

Logging options

Local logging writes logs to the App Service file system, where you can download them or access them over FTPS. This is the lightest-weight option for development and short investigations, and Azure App Service supports FTPS-only mode so you can avoid plain FTP. If you are using file-system logging, a common optimization is to keep retention at 0 days and the size quota around 35 MB so you do not accumulate unnecessary storage or incur avoidable cost on the app resource.

Diagnostic settings send logs to a Storage account, Event Hub, or Log Analytics. This is the better fit when you need centralized retention, querying, or forwarding to operational tools such as Splunk through Event Hub or another ingestion pipeline, but it can generate meaningful storage and ingestion volume depending on how verbose the selected log categories are.

Practical trade-offs

Local file-system logging is usually faster to access and easier for developers because the logs sit close to the app and can be pulled immediately. The downside is that it is not designed for long-term retention or enterprise-scale observability, and the footprint should be kept intentionally small so it does not compete with the app for space or create unnecessary overhead.

Diagnostic settings are better for compliance, analytics, and cross-team access because they move data out of the app into durable Azure services. The trade-off is cost and volume: app logs, HTTP logs, and platform logs can grow quickly, and sending all categories to Storage or Event Hub increases both ingestion and downstream processing costs, especially if a SIEM such as Splunk also charges for indexed volume.

Blob storage option

Sending logs to Azure Blob Storage is often the middle ground between local-only logs and a full streaming pipeline. Compared with keeping logs on the app host, blob storage gives you better retention, easier central access, and stronger separation of duties; compared with Event Hub, it is simpler and usually cheaper for archive-style retention, but less suitable for real-time operational forwarding.

From a security perspective, blob storage is preferable when you want to restrict access with managed identities, RBAC, and private networking rather than exposing the app host file system or broadly granting FTPS access. In general, the more external the log destination, the better your control plane story becomes, but the more important it is to secure identities, network paths, and storage permissions.

Cost impact

When logging is turned on for all log types, the monthly cost increases in two places: the App Service side and the destination side. On the app side, local logging can consume file-system quota and operational overhead, while external logging can add Azure Monitor, Storage, Event Hub, and downstream SIEM costs; in practice, the biggest cost driver is usually log volume rather than the mere act of enabling logging.

A full “everything on” configuration can become expensive if verbose application logs, HTTP logs, and platform diagnostics are all emitted continuously. The right way to manage cost is to limit categories to what is actually needed, reduce verbosity in production, and set retention policies that match the business need instead of defaulting to indefinite collection.
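Since volume is the main cost driver, a back-of-the-envelope estimate helps before enabling every category. The sketch below multiplies request rate, log lines per request, and bytes per line over a month. The class name and all input values are hypothetical, and it deliberately computes volume only, because per-GB prices vary by service and region.

```java
// Hypothetical estimator; inputs are assumptions you would replace with measured values.
final class LogVolumeEstimate {

    // Returns decimal gigabytes (1 GB = 1,000,000,000 bytes) emitted over the period.
    static double monthlyGigabytes(double requestsPerSecond,
                                   double logLinesPerRequest,
                                   double bytesPerLine,
                                   int daysInMonth) {
        double secondsPerMonth = daysInMonth * 24.0 * 60.0 * 60.0;
        double bytes = requestsPerSecond * logLinesPerRequest * bytesPerLine * secondsPerMonth;
        return bytes / 1_000_000_000.0;
    }
}
```

With made-up inputs of 50 requests per second, 4 log lines per request, and 300 bytes per line, a 30-day month comes to about 155.5 GB; halving the lines per request halves the volume, which is why reducing verbosity is usually the first lever to pull.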

Premium tier considerations

If the app service plan is upgraded to the lowest Premium tier, turning on logging through diagnostic settings is generally a better production pattern than relying on only local file logging. Premium gives more headroom for performance-sensitive workloads, but logging still adds CPU, I/O, and network overhead, especially if the destination is remote and every write must be exported out of the app path

The main security concern is not the Premium tier itself, but the expanded data flow: logs may contain request paths, headers, identifiers, or exception details, so access to the destination must be tightly limited. The main performance concern is bursty log generation, which can increase latency if the app spends too much time serializing and exporting log data rather than serving requests.
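
Limiting access to the destination is one half of the answer; the other is keeping secrets out of the log lines in the first place. Here is a minimal sketch using Python's standard logging module. The patterns and the names redact and RedactingFilter are assumptions for illustration and would need to be adapted to whatever the application actually logs.

```python
import logging
import re

# Hypothetical patterns: bearer tokens and a few sensitive query parameters.
_PATTERNS = [
    (re.compile(r"(?i)(bearer\s+)\S+"), r"\1[REDACTED]"),
    (re.compile(r"(?i)\b(token|api_key|password)=[^&\s]+"), r"\1=[REDACTED]"),
]


def redact(text):
    """Mask values that should not leave the app host in clear text."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text


class RedactingFilter(logging.Filter):
    """Logging filter that rewrites each record's message before it is emitted."""

    def filter(self, record):
        # Format first so that secrets passed as arguments are also masked.
        record.msg = redact(record.getMessage())
        record.args = None
        return True
```

Attaching it with handler.addFilter(RedactingFilter()) on the handler that exports logs keeps the masking at the boundary where the data leaves the app.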

Dev and ops access

A good pattern is to optimize for both developer and operational needs by splitting access modes. Developers can use local logs or near-real-time access for low-latency troubleshooting and faster iteration, while operations teams consume the same data centrally with read-only access, least privilege, and controlled retention in Storage, Event Hub, or a SIEM pipeline.

This reduces friction because developers get interactive access without waiting on a downstream pipeline, while operations gets governed, durable visibility with auditability and restricted permissions. In practice, that usually means keeping local logs small and temporary, and pushing only the logs needed for production observability into centralized destinations.

Recommendations

Azure’s general direction for App Service logging is to use local logs for short-lived troubleshooting, diagnostic settings for durable monitoring, and secure transport and access controls for anything beyond the app host. FTP access should be limited to FTPS-only or disabled when not needed, detailed error pages should not be exposed to clients in production, and logging categories should be scoped narrowly to reduce cost and noise.

A popular policy posture is:

• Keep local file-system logs small, temporary, and developer-focused.

• Use diagnostic settings for production retention and centralized monitoring.

• Route only necessary categories to Storage or Event Hub.

• Restrict destination access with least privilege and private connectivity where possible.

• Treat log content as sensitive operational data and control retention accordingly.

Sunday, April 26, 2026

 Continued from previous article 


Some replicas are asynchronous by nature and are called observers. They do not participate in the in-sync replica set or become a partition leader, but they can restore availability to the partition and allow producers to produce data again. Connected clusters might span distinct geographic regions and usually involve linking between the clusters. Linking is an extension of the replica fetching protocol that is inherent to a single cluster. A link contains all the connection information necessary for the destination cluster to connect to the source cluster. A topic on the destination cluster that fetches data over the cluster link is called a mirror topic. The mirror topic may keep the same name or take a prefixed name, and it carries synced configurations, a byte-for-byte copy of the data, consumer offsets, and access control lists.

Managed services over brokers complete the delivery of value to the business beyond standalone broker deployments by automating cluster sizing, over-provisioning, failover design and infrastructure management. They are known to raise availability to a 99.99% uptime service-level agreement. Often, they involve a replicator, which is a worker that executes a connector and its tasks to coordinate data streaming between source and destination broker clusters. A replicator has a source consumer that consumes the records from the source cluster and then passes these records to the Connect framework. The Connect framework has a built-in producer that then produces these records to the destination cluster. It might also have dedicated clients to propagate overall metadata updates to the destination cluster.
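
The consume-then-produce flow of a replicator can be sketched with plain dictionaries standing in for the two clusters. This is a toy model, not the Connect framework: the function names, the dictionary layout, and the prefix convention are all assumptions made for illustration.

```python
def mirror_topic_name(topic, prefix=""):
    """Destination topic name: same as the source, or prefixed."""
    return f"{prefix}{topic}"


def replicate_once(source, destination, topic, fetch_offset, prefix=""):
    """Copy records at or after fetch_offset from source to the mirror topic.

    source and destination map topic names to lists of records, where the
    list index plays the role of the offset. Returns the next fetch offset.
    """
    mirror = destination.setdefault(mirror_topic_name(topic, prefix), [])
    new_records = source[topic][fetch_offset:]   # the source consumer side
    mirror.extend(new_records)                   # the built-in producer side
    return fetch_offset + len(new_records)


source = {"telemetry": [b"r0", b"r1"]}
destination = {}
position = replicate_once(source, destination, "telemetry", 0, prefix="west.")
source["telemetry"].append(b"r2")
position = replicate_once(source, destination, "telemetry", position, prefix="west.")
```

Because records are appended in fetch order starting from offset zero, the mirror list ends up with the same offsets as the source, which mirrors the offset-preserving property described for linked clusters.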

In geographically distributed replication for business continuity and disaster recovery, the primary region has the active cluster that the producers and consumers write to and read from, and the secondary region has read-only clusters with replicated topics for read-only consumers. It is also possible to configure two clusters to replicate to each other so that both of them have their own sets of producers and consumers, but even in these cases, the replicated topic on either side will only have read-only consumers. Fan-in and fan-out are other possible arrangements for such replication.

Disaster recovery almost always occurs with a failover of the primary active cluster to a secondary cluster. The maximum amount of data, usually measured in terms of time, that can be lost when disaster strikes is referred to as the Recovery Point Objective, and replication is what keeps it small. The targeted duration until the service level is restored to the expectations of the business process is referred to as the Recovery Time Objective. The recovery brings the system back to operational mode. Cost, business requirements, use cases and regulatory and compliance requirements mandate this replication, and the considerations made for the data in motion for replication often stand out as best practice for the overall solution.
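
To make the two objectives concrete, the achieved values for a given incident can be computed from three timestamps and compared against the targets. The function names and the choice of timestamps below are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def recovery_metrics(last_replicated_at, failure_at, service_restored_at):
    """Return (data loss window, downtime) for one failover."""
    # Anything written after the last replicated record is lost on failover.
    data_loss_window = failure_at - last_replicated_at
    # Downtime runs from the failure until the service level is restored.
    downtime = service_restored_at - failure_at
    return data_loss_window, downtime


def meets_objectives(data_loss_window, downtime, rpo, rto):
    """True when the incident stayed within both objectives."""
    return data_loss_window <= rpo and downtime <= rto


failure = datetime(2026, 4, 26, 12, 0, 0)
loss, outage = recovery_metrics(
    last_replicated_at=failure - timedelta(seconds=20),
    failure_at=failure,
    service_restored_at=failure + timedelta(minutes=12),
)
within_targets = meets_objectives(
    loss, outage, rpo=timedelta(minutes=1), rto=timedelta(minutes=15)
)
```

The replication lag at the moment of failure is what sets the data loss window, which is why lag is the metric to watch on cross-region links.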

One of the toughest challenges in data engineering has been the diversity of stacks, platforms, products and logic to the detriment of smooth operations, business continuity and disaster recovery. The problem stems from the dichotomy between assets and debt. When developers spend time writing to, say, SQL Edge, they find a greater debt when moving to an open-source stack because the data operations proliferate and there is very little curating. That is why planning for all the Ops considerations is just as necessary at design time as the feature itself.


#codingexercise: CodingExercise-04-26-2026.docx

Saturday, April 25, 2026

 (Continued from previous article)

When these IoT resources are shared, the isolation model, impact-to-scaling performance, state management and security of the IoT resources become complex. Scaling resources helps meet the changing demand from the growing number of consumers and the increase in the amount of traffic. We might need to increase the capacity of the resources to maintain an acceptable performance rate. Scaling depends on the number of producers and consumers, payload size, partition count, egress request rate and usage of IoT hubs capture, schema registry, and other advanced features. When additional IoT resources are provisioned or a rate limit is adjusted, the multitenant solution can perform retries to overcome the transient failures from requests. When the number of active users reduces or there is a decrease in the traffic, the IoT resources could be released to reduce costs. Data isolation depends on the scope of isolation. When the storage for IoT is a relational database server, then the IoT solution can make use of IoT Hub. Varying levels and scope of sharing of IoT resources demand simplicity from the architecture. Patterns such as the deployment stamp pattern, the IoT resource consolidation pattern and the dedicated IoT resources pattern help to optimize the operational cost and management with little or no impact on usage.
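
Retries for transient failures are usually paired with exponential backoff so that a throttled resource is not hammered while it scales. The following is a minimal sketch; TransientError, call_with_retries and the default delays are hypothetical names and values, and a real client would map its own SDK's throttling errors onto this shape.

```python
import random
import time


class TransientError(Exception):
    """Stand-in for a throttling or temporarily-unavailable response."""


def call_with_retries(operation, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up and surface the failure to the caller
            delay = base_delay * 2 ** (attempt - 1)
            # A little jitter keeps many tenants from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
```

The sleep parameter is injectable only so the function can be exercised without waiting; in production the default is fine.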

Edge computing relies heavily on asynchronous backend processing. Some form of message broker becomes necessary to maintain order between events, retries and dead-letter queues. The storage for the data must follow the data partitioning guidance where the partitions can be managed and accessed separately. Horizontal, vertical, and functional partitioning strategies must be suitably applied. In the analytics space, a typical scenario is to build solutions that integrate data from many IoT devices into a comprehensive data analysis architecture to improve and automate decision making.
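
For horizontal partitioning, a common approach is to hash a stable key such as the device ID so that each device's events always land in the same partition. The sketch below assumes a string device ID and a fixed partition count; partition_for is a hypothetical helper, not part of any broker client.

```python
import hashlib


def partition_for(device_id, partition_count):
    """Map a device ID to a partition index in [0, partition_count)."""
    # A cryptographic digest is used only because it is stable across
    # processes and runs, unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count
```

Keeping all events of one device in one partition preserves their relative order, at the cost of having to move keys if the partition count changes.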

Event Hubs, blob storage, and IoT hubs can collect data on the ingestion side, while the results are distributed after analysis via alerts and notifications, dynamic dashboarding, data warehousing, and storage/archival. The fan-out of data to different services is itself a value addition, but the ability to transform events into processed events also generates more possibilities for downstream usages including reporting and visualizations.

One of the main considerations for data pipelines involving ingestion capabilities for IoT-scale data is the business continuity and disaster recovery scenario. This is achieved with replication. A broker stores messages in a topic, which is a logical group of one or more partitions. The broker guarantees message ordering within a partition and provides a persistent log-based storage layer where the append-only logs inherently guarantee message ordering. By deploying brokers over more than one cluster, geo-replication is introduced to address disaster recovery strategies.

Each partition is associated with an append-only log, so messages appended to the log are ordered by time. The log has a few important offsets: the first available offset in the log; the high watermark, which is the offset of the last message that was successfully written and committed to the log by the brokers; and the end offset, where the last message was written to the log, which can exceed the high watermark. When a broker goes down, subsequent durability and availability must be addressed with replicas. Each partition has many replicas that are evenly distributed, but one replica is elected as the leader and the rest are followers. The leader is where all the produce and consume requests go, and followers replicate the writes from the leader.
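
The relationship between the three offsets is easier to see in a toy model. The class below is a sketch, not a broker implementation. One convention is an assumption of this sketch: the high watermark and end offset are exclusive, meaning they point at the next offset to be committed or written rather than at the last message itself.

```python
class PartitionLog:
    """In-memory stand-in for one partition's append-only log."""

    def __init__(self):
        self._records = []
        self.log_start_offset = 0   # first available offset
        self.high_watermark = 0     # exclusive: next offset to be committed

    @property
    def log_end_offset(self):
        # Exclusive: the offset the next appended record will receive.
        return self.log_start_offset + len(self._records)

    def append(self, record):
        offset = self.log_end_offset
        self._records.append(record)
        return offset

    def commit_up_to(self, offset):
        # The high watermark only moves forward and never passes the end offset.
        if not self.high_watermark <= offset <= self.log_end_offset:
            raise ValueError("offset outside the uncommitted range")
        self.high_watermark = offset

    def read_committed(self, from_offset):
        # Consumers see only records below the high watermark.
        start = max(from_offset, self.log_start_offset) - self.log_start_offset
        end = self.high_watermark - self.log_start_offset
        return self._records[start:end]

    def delete_before(self, offset):
        # Retention: drop committed records and advance the first available offset.
        offset = min(max(offset, self.log_start_offset), self.high_watermark)
        del self._records[: offset - self.log_start_offset]
        self.log_start_offset = offset
```

Records between the high watermark and the end offset exist on the leader but are not yet visible to consumers, which is the gap that replication has to close.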

A pull-based replication model is the norm for brokers, where dedicated fetcher threads periodically pull data between broker pairs. Replicas are byte-for-byte copies of one another, which makes this replication offset preserving. The number of replicas is determined by the replication factor. The leader maintains a ledger called the in-sync replica (ISR) set, and a message is committed by the leader only after all replicas in the ISR set have replicated it. Global availability demands that brokers are deployed with different deployment modes. Two popular deployment modes are 1) a single broker that stretches over multiple clusters and 2) a federation of connected clusters.
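
The commit rule can be expressed in a couple of lines: the committed position is the smallest log end offset among the replicas currently in the ISR set. The sketch below assumes exclusive end offsets and uses hypothetical replica names; it leaves out how replicas enter and leave the set.

```python
def high_watermark(log_end_offsets, isr):
    """Committed position: the slowest in-sync replica bounds it.

    log_end_offsets maps replica name to its (exclusive) log end offset;
    isr is the set of replica names currently considered in sync.
    """
    return min(log_end_offsets[replica] for replica in isr)


log_end_offsets = {"leader": 10, "follower-1": 8, "follower-2": 3}
all_in_sync = high_watermark(log_end_offsets, {"leader", "follower-1", "follower-2"})
after_shrink = high_watermark(log_end_offsets, {"leader", "follower-1"})
```

Dropping the lagging follower from the set lets the committed position advance, which shows the trade-off: a smaller ISR set commits faster but leaves fewer up-to-date copies.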


Thursday, April 23, 2026

 Data in motion – IoT solution and data replication

The transition of data from edge sensors to the cloud is a data engineering pattern that does not always get a proper resolution with the boilerplate event-driven architectural design proposed by the public clouds, because much of the fine tuning is left to the choice of the resources, event hubs and infrastructure involved in the streaming of events. This article explores the design and data-in-motion considerations for an IoT solution, beginning with an introduction to the design proposed by the public clouds, the choices between products and the considerations for the handling and tuning of distributed, real-time data streaming systems, with particular emphasis on data replication for business continuity and disaster recovery. A sample use case is continuous events for geospatial analytics in fleet management, where the data can include weblogs from driverless vehicles.

Event Driven architecture consists of event producers and consumers. Event producers are those that generate a stream of events and event consumers are ones that listen for events. The right choice of architectural style plays a big role in the total cost of ownership for a solution involving events.

The scale out can be adjusted to suit the demands of the workload and the events can be responded to in real time. Producers and consumers are isolated from one another. IoT requires events to be ingested at very high volumes. The producer-consumer design has scope for a high degree of parallelism since the consumers are run independently and in parallel, but they are tightly coupled to the events. Network latency for message exchanges between producers and consumers is kept to a minimum. Consumers can be added as necessary without impacting existing ones.

Some of the benefits of this architecture include the following: The publishers and subscribers are decoupled. There are no point-to-point integrations. It's easy to add new consumers to the system. Consumers can respond to events immediately as they arrive. They are highly scalable and distributed. There are subsystems that have independent views of the event stream.

Some of the challenges faced with this architecture include the following: Event loss is tolerated by default, so guaranteed delivery, which IoT traffic mandates, poses a challenge. Processing events in exactly the order they arrive is another challenge, because each consumer type typically runs in multiple instances for resiliency and scalability; this becomes a problem if the processing logic is not idempotent or the events must be processed in order.
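
One way to make processing idempotent is to have the consumer remember the IDs of events it has already handled and skip redeliveries. The sketch below keeps that memory in a Python set, which is an assumption for brevity; with multiple consumer instances the set would have to live in a shared, durable store. The class name and the "id" field are hypothetical.

```python
class IdempotentConsumer:
    """Wraps a handler so that redelivered events are processed only once."""

    def __init__(self, handler):
        self._handler = handler
        self._seen_ids = set()

    def handle(self, event):
        """Process event unless its ID was already handled. Returns True if processed."""
        event_id = event["id"]
        if event_id in self._seen_ids:
            return False  # duplicate delivery, safely ignored
        self._handler(event)
        # Record the ID only after the handler succeeds, so a failed
        # attempt can be retried on redelivery.
        self._seen_ids.add(event_id)
        return True


totals = {"km": 0}
consumer = IdempotentConsumer(lambda e: totals.update(km=totals["km"] + e["km"]))
consumer.handle({"id": "evt-1", "km": 5})
consumer.handle({"id": "evt-1", "km": 5})   # redelivered
consumer.handle({"id": "evt-2", "km": 7})
```

With this in place, at-least-once delivery from the broker is enough to get effectively-once results in the handler.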

The benefits and the challenges suggest some of these best practices. Events should be lean and mean and not bloated. Services should share only IDs and/or a timestamp. Large data transfer between services is an antipattern. Loosely coupled event driven systems are best.
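
A lean event in this sense carries a reference rather than the payload, and the consumer fetches the details from wherever they are stored. The sketch below uses a dictionary as a stand-in for that store; the event type, field names and helper functions are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ReadingRecorded:
    """Lean event: only an ID and a timestamp travel on the wire."""
    reading_id: str
    recorded_at: str  # ISO 8601 timestamp


def to_message(event):
    """Serialize the event for publishing."""
    return json.dumps(asdict(event)).encode("utf-8")


def load_reading(message, store):
    """Consumer side: resolve the reference against the shared store."""
    reference = json.loads(message)
    return store[reference["reading_id"]]


store = {"r-1001": {"vehicle": "truck-7", "lat": 47.61, "lon": -122.33}}
message = to_message(ReadingRecorded("r-1001", "2026-04-23T08:15:00Z"))
reading = load_reading(message, store)
```

The message stays a few dozen bytes no matter how large the stored reading grows, which keeps the broker out of the business of bulk data transfer.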

IoT solutions can be proposed either with an event-driven stack involving open-source technologies or via a dedicated and optimized storage product such as a relational engine that is geared towards edge computing. Either way, capabilities to stream, process and analyze data are expected by modern IoT applications. IoT systems vary in flavor and size. Not all IoT systems have the same certifications or capabilities.