Monday, January 22, 2024

People and Data - a summary.

 

This is a summary of the book titled “People and Data,” written by Thomas C. Redman and published by Kogan Page in 2023. Data is essential for people to act meaningfully in business, government, and private life. It is the fuel that runs the modern world, and meaningful action requires high-quality data. However, individuals and businesses often fail to prioritize data when building technological infrastructure, organizing operations, or choosing new tech tools. Redman urges companies to modernize their approach to and use of data, focusing on improving data quality and aligning data and business priorities. Companies must optimize their application of quality data by including ordinary employees in their approach and involving them in data generation.

Data enables meaningful action in private life and commerce, such as anticipating and shopping for needs, making rational home-purchasing decisions, and analyzing sales trends and supply chains. However, the data space has problems, such as distrust of data and the monetary cost of bad data. Businesses must make their data work by utilizing data science technologies or engendering a data-driven culture. Leaders must organize their businesses to emphasize data use and build technological infrastructure to support data use at the firms' required scale.

Organizations must reconfigure their "organization for data" to address data quality. This reconfiguration should consider five issues: the people involved, data flow, information technology management, the teams working on data, and the people leading data-driven projects. Missing people is the single most important force holding data programs back: for data to become a business's central driver, everyone in the company must be involved, including those in non-specialist roles. Better data use will increase revenue, lower costs, reduce errors, and foster a closer relationship between employees and customers.

Organizing for data involves the smooth, coordinated movement of data between departments and people, with technology management and data management kept separate. Transformation comes from both the bottom up and the top down, with young employees offering innovations and senior leadership managing coordination. Everyone should join the "data generation," as the prevalence of politically motivated misinformation has changed people's views on low-quality data.

All organizations must prioritize improving data quality, as only about 3% of companies' data meets basic quality standards and only 16% of managers trust the data they commonly use. The most transformative uses of data involve data-driven decision-making, data-driven cultures, and treating data as assets.

Small data is more important for most companies than big data, as it improves a company's operations, products, and customer acquisition. Companies often overlook the value of small data because such projects involve fewer people and use only hundreds of data points. Yet small data projects offer more problem-solving opportunities; they can reduce the time wasted in interactions between colleagues and streamline in-house work processes.

Data is a team sport, but silos can inhibit effective teamwork. To address this, companies must build "fat organizational pipes" – channels of two-way communication between departments, individuals, and up and down the company hierarchy. These pipes include the "customer-supplier model," "data supply chain management," "data science bridge," and "common language."

Organizations must align their data and business priorities. A data team should include executives, technology experts, and those who manage data supply chains: an executive, an entrepreneur, a developer, and a data security officer. Effective communication and collaboration between data creators and data customers are crucial for increasing data quality and reducing errors.

Data projects require strong leadership and data program coordinators to empower employees with independence and responsibilities. Companies should align data projects with business priorities, form data teams, and interact with employees daily. They should understand project problems, embrace aspirations, address anxieties, and train employees to solve their own problems.

Previous book summaries: BookSummary42.docx
Summarizing Software:
SummarizerCodeSnippets.docx 

Sunday, January 21, 2024

 

This is a continuation of the listing of design choices for infrastructure consolidation that fall into well-known patterns, including the following:

11.   Securing the perimeter: Some security concerns arise from opening up IP addresses and ports to the public internet. Web API endpoints are notorious targets for various kinds of exploits, and proper hardening involves routing all requests through a firewall. This pattern manifests as a central resource with an attached firewall that fronts all the web applications and services hosted on different hosts.

12.   Deployment of dashboards and application insights: While not technically deemed a requirement, it is always useful to add anything that helps with continuous monitoring of the deployed resources, to the point that anyone can troubleshoot merely by glancing at the dashboards. This kind of infrastructure deployment maturity is essential for large teams and infrastructures because it offloads maintenance effort.

13.   Dependency chaining – Most infrastructure is deployed independently of its dependencies, so dependencies do not usually block another deployment. That said, a pipeline might need to run multiple times before all the dependencies and dependent resources are created, and it might take several deployments if generated resource identifiers or GUIDs must be passed in manually to other resources or their configurations (a sketch of automating this output-passing appears after this list). If dependencies are deployed together for the same business purpose, it might even be beneficial to use the same user-assigned managed identity for all of the associated resources.

14.   Redeployment – All dependencies and their access-control identities must be in sync for the whole deployment to work together. Sometimes this breaks when identities change or dependencies are replaced; even re-creating an identity with the same name but a different principal_id can cause disruptions. It is best to remove and re-add the entities under the same names so that the associations are refreshed and no longer rely on stale identifiers. Redeployment does not get captured in IaC, but it is a pattern one cannot do without.

15.   Policy compliance – Almost all resource types have their best practices and recommended settings. Compliance of resources to these policies is equally important, as it safeguards the resources and assures overall safety. The effect a policy has depends on its settings, so policies might selectively apply across resource types and span different resource and infrastructure groupings. Having a common module where the policies can be enforced as a single point of maintenance is not only a good thing but also time-saving. Since the common module can be versioned, instances can still pin to older versions if they must.

16.   Metadata – The state stored for IaC is sometimes referred to as the metadata of the resources it describes, and the two must be synchronized whenever there is drift. However, state is scoped to its infrastructure grouping and can be somewhat disparate, making it necessary to locate the state before it can be interpreted. Since state usually consists of JSON with an entry for each resource or module listed in the infrastructure, it is easy to view these collectively as key-value pairs where the keys are the references to the resources in the Portal and the values are their JSON descriptions (a parsing sketch also follows this list).
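
For the dependency-chaining pattern in item 13, here is a minimal sketch of passing generated identifiers from one deployment to a dependent one, assuming Terraform is the IaC tool; the directory names and the output and variable names (such as subnet_id) are hypothetical placeholders, not a prescribed layout.

```python
import json
import subprocess

def terraform_outputs(workdir: str) -> dict:
    """Run `terraform output -json` in a deployment directory and
    return the outputs as a plain name -> value dictionary."""
    raw = subprocess.run(
        ["terraform", "output", "-json"],
        cwd=workdir, capture_output=True, text=True, check=True,
    ).stdout
    return {name: entry["value"] for name, entry in json.loads(raw).items()}

def apply_with_inputs(workdir: str, inputs: dict) -> None:
    """Apply a dependent deployment, feeding in the generated
    identifiers from an upstream deployment as -var arguments."""
    var_args = [f"-var={name}={value}" for name, value in inputs.items()]
    subprocess.run(
        ["terraform", "apply", "-auto-approve", *var_args],
        cwd=workdir, check=True,
    )

# Hypothetical directories and output names, for illustration only:
# chain the network deployment's generated subnet id into the app deployment.
network = terraform_outputs("deployments/network")
apply_with_inputs("deployments/app", {"subnet_id": network["subnet_id"]})
```

Chaining through outputs this way avoids the repeated manual copy of GUIDs between pipeline runs that the pattern describes.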
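
And for the metadata pattern in item 16, a small sketch of flattening state into key-value pairs, again assuming Terraform state exported via `terraform show -json`; the resource address queried at the end is a hypothetical example.

```python
import json
import subprocess

def state_as_key_values(workdir: str) -> dict:
    """Export the current state and flatten it into a dictionary of
    resource address -> attribute description, which is easier to
    search than the raw nested state document."""
    raw = subprocess.run(
        ["terraform", "show", "-json"],
        cwd=workdir, capture_output=True, text=True, check=True,
    ).stdout
    state = json.loads(raw)
    pairs = {}
    # Walk the root module and any child modules for their resources.
    modules = [state.get("values", {}).get("root_module", {})]
    while modules:
        module = modules.pop()
        for resource in module.get("resources", []):
            pairs[resource["address"]] = resource.get("values", {})
        modules.extend(module.get("child_modules", []))
    return pairs

# The keys resemble the references one would look up in the Portal;
# the values are the JSON descriptions of those resources.
pairs = state_as_key_values("deployments/network")
print(json.dumps(pairs.get("azurerm_virtual_network.main"), indent=2))
```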

Previous articles: IaCResolutionsPart65.docx 

 

Saturday, January 20, 2024

 

This is a summary of the book titled “Neuroscience for Learning and Development,” written by Stella Collins, co-founder of the workplace training firm Stellar Labs. In this book, she offers insights into the brain and its learning processes. She suggests creating environments that are conducive to learning and making learning "stickier" through motivation and storytelling. Neuroscience is a complex field centered on the human brain, which contains about 86 billion neurons and processes information for communication, thinking, learning, and living. Neurotransmitters and brain hormones aid various brain functions, and brain imaging techniques like MRI and PET provide a more comprehensive understanding. Neuroplasticity allows the brain to change, and neurogenesis creates new neurons throughout life. However, it is essential to maintain a healthy skepticism toward emerging neuroscientific claims, as studies often involve artificial situations and Western, educated, industrialized, rich, and democratic subjects. To critically evaluate neuroscience claims, ask questions about the researchers, their agenda, the publication date, the method, and the results. Learning is a process that occurs over time within the mind and body; because the brain conserves energy, learning requires effort and motivation.

Learning changes the way brain cells interact with each other, and various types of learning can be used to enhance learning. Models like "Brain Friendly Learning" and "MASTER model" use neuroscience to infuse learning content with meaning and memory techniques for retention. Curiosity and motivation are essential for effective learning, as they release dopamine, focus attention, and act as rewards. Attending to physical and emotional needs, fostering social connections, and encouraging persistence in practice and repetition can help learners remain receptive to new ideas and create their own.

The brain processes information from various internal and external senses, with vision being the strongest sense in learning. Designing learning to engage as many senses as possible makes the material more engaging and memorable. Arousal is crucial for learning, as it involves staying alert, directing attention, and focusing on the subject matter. Attention spans vary from five to 20 minutes, but learners choose to refocus on engaging material.

A Goldilocks level of stress is crucial for cognitive tasks to reach peak performance. To increase attention, increase the complexity of processing, reduce multitasking demands, and determine whether multisensory input is helpful or detrimental. Challenge learners with tasks beyond their current abilities and knowledge, and incorporate movement and complex digital interaction. Maintain an engaging and dynamic learning environment to avoid boring learners. Learning involves more than absorbing and remembering information. The concept of multiple intelligences suggests that individuals may have strengths in specific areas, and teaching essential materials in diverse ways can enhance design and delivery skills. Harvard University's Howard Gardner proposes eight intelligences: linguistic, logical-mathematical, visual-spatial, body/physical, musical, interpersonal, intrapersonal, and naturalist.

Learning necessitates memorization, with declarative (explicit) memories and nondeclarative (implicit) memories. Understanding the difference between these can improve how people learn. To help people remember, link new information to things they already know, use novelty, repetition, meaning, organization, smell, and context.

Learning effectively involves creating strong neural connections through regular, independent, and active reviews of important materials. Spaced repetition and learning over a longer period help build long-term connections between neurons and improve connectivity. Telling memorable stories and using straightforward language improve audience comprehension and retention. Emphasizing compliant behaviors and multisensory metaphors engages the learner's sensory cortex, making learning experiences more stimulating and memorable. Encouraging an exploratory, playful, and creative approach to learning helps learners better understand and retain new information. Remember, memories tend to decay rapidly if not actively reinforced.

Previous book summaries: BookSummary41.docx
Summarizing Software:
SummarizerCodeSnippets.docx 

Friday, January 19, 2024

 

This is a continuation of the listing of design choices for infrastructure consolidation that fall into well-known patterns, including the following:

6.       Regional redundancy: This pattern adds a resource, or a deployment of resources, in a region other than the primary region. The paired region can be deployed on demand from backups, remain on standby in an active-passive configuration, or remain operational in an active-active configuration. This pattern is used even for full deployments of critical services to ensure business continuity and disaster recovery.

7.       Managed resources/services: When a cloud resource is assigned to a single team, that team has all the features of the cloud with which to interact with it, such as the command-line interface, SDK libraries, REST APIs, and the cloud management user interface. But when the same resource must be shared between teams, certain operations that the cloud allows must be hidden or locked, and alternatives may need to be provided to support isolation between uses. When a resource is deployed in this manner, the service becomes managed by one team while it is used by many others.

8.       Automation of access control – Almost any resource or deployment is incomplete without access control, and organizing and assigning identities is team-specific. When dedicated resources are handed off, the assignee can be granted contributor access, but on shared instances, appropriate group creation and role-based access control assignments become necessary, even if these far outnumber the actual instances. The use of custom roles that grant only the minimum number of permissions also becomes necessary. Custom roles have allowed and denied sets of permissions for both the control plane and the data plane (a sketch of such a role definition follows this list). Excluding a permission from the effective permission set of a given principal guarantees that the resource is locked against that form of consumption.

9.       Alerting – The same monitoring best practices that apply to dedicated resources hold true for shared resources, except that the audience is more diverse and involves different teams and their members. It is also necessary to isolate notifications about specific trouble to their intended audience, especially when many teams share the same fate on that resource (see the routing sketch after this list).

10.   Infrastructure layering – This pattern is evident in container infrastructure layering, which allows even more scale because it virtualizes the operating system. While traditional virtual machines enable hardware virtualization and hypervisors such as Hyper-V add isolation plus performance, containers are cheap and barely anything more than applications. Azure's container service supports both Linux and Windows containers, with standard Docker tooling and API support and streamlined provisioning of DC/OS and Docker Swarm. Job-based computations use larger sets of resources, such as compute pools that involve automatic scaling and regional coverage, with automatic recovery of failed tasks and input/output handling. Azure demonstrates many fine-grained, loosely coupled microservices using an HTTP listener, page content, authentication, usage analytics, order management, reporting, product inventory, and customer databases.
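
Returning to item 8 above, here is a minimal sketch of a custom role with allowed and denied permission sets for both the control plane and the data plane, assembled in Python for readability. It assumes the Azure role-definition JSON shape; the role name, permission strings, and assignable scope are hypothetical placeholders.

```python
import json

# A custom role granting minimal control-plane read access and
# data-plane blob reads, while explicitly denying destructive and
# write operations. Scope and permission strings are placeholders.
custom_role = {
    "Name": "Shared Storage Reader (Custom)",
    "Description": "Minimum permissions for consumers of a shared storage account.",
    "Actions": [
        "Microsoft.Storage/storageAccounts/read",
    ],
    "NotActions": [
        "Microsoft.Storage/storageAccounts/delete",
    ],
    "DataActions": [
        "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read",
    ],
    "NotDataActions": [
        "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write",
    ],
    "AssignableScopes": [
        "/subscriptions/<subscription-id>/resourceGroups/<shared-rg>",
    ],
}

# The saved definition can then be registered, for example with:
#   az role definition create --role-definition @role.json
with open("role.json", "w") as handle:
    json.dump(custom_role, handle, indent=2)
```

The NotActions and NotDataActions sets are what implement the exclusion of permissions from a principal's effective permission set.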
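
And for the alerting pattern in item 9, a small sketch of isolating notifications to their intended audience on a shared resource, assuming each alert carries a tag naming the owning team; the teams, channels, and alert shape here are hypothetical.

```python
# Route alerts raised on a shared resource to the audience that owns
# the affected component, rather than broadcasting to every team.
ROUTES = {
    "platform": "mailto:platform-oncall@example.com",
    "payments": "mailto:payments-oncall@example.com",
}
DEFAULT_ROUTE = "mailto:shared-infra@example.com"

def route_alert(alert: dict) -> str:
    """Pick a notification channel from the alert's team tag,
    falling back to a shared channel when no team is tagged."""
    team = alert.get("tags", {}).get("team")
    return ROUTES.get(team, DEFAULT_ROUTE)

alert = {"severity": "Sev2", "resource": "shared-db", "tags": {"team": "payments"}}
print(route_alert(alert))  # mailto:payments-oncall@example.com
```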

Previous articles: IaCResolutionsPart63.docx 

 #codingexercise: CodingExercise-01-19-2024.docx

Thursday, January 18, 2024

 

This is a summary of the book titled “Bruce Lee Code,” written by journalist Thomas Lee. Bruce Lee, born in 1940 in San Francisco, was a martial arts legend who introduced kung fu to many Americans. He believed in the ideals and ideas behind martial arts and viewed his mistakes as opportunities for self-improvement. Lee's philosophy was that a person should be like water, taking the shape of whatever it touches. He faced racism with courage and perseverance and continually evolved as a martial artist. Lee's vision for his businesses was to integrate the East and West by promoting Chinese culture and ideas through the teaching and practice of kung fu. He used his winning personality to introduce Eastern culture to a Western audience and aimed to negate the stereotype of Asian men as submissive and unsexy. Lee executed his vision of introducing the East to the West through his movies and martial arts schools, aiming to become a role model for young Asian Americans.

As a Chinese-American martial artist, he was an inspiration to many, including Ken Hao, chairman of Silver Lake Partners, which invests in the Ultimate Fighting Championship (UFC) and mixed martial arts (MMA). Lee believed in breaking current patterns, staying simple, being different, keeping it real, spotlighting characters, and teaching the audience. He applied his martial arts philosophy to filmmaking, using his mistakes as opportunities for self-improvement. Lee believed in simplicity, expressing a lot through minimal dialogue and action. He fused the kinetic, dynamic action of Hong Kong kung fu movies with the character- and story-driven nature of Hollywood films, elevating mass entertainment to a new level of ambition and excellence. Lee's films had depth of character, using martial arts battles to convey insight into the characters. He believed in the ideals and ideas behind martial arts, seeing kung fu as a way to teach people to live in the moment and connect with their opponents.

Bruce Lee, a renowned martial artist, successfully connected cinematic kung fu with his martial arts philosophy through intricate choreography and sophisticated cinematography. His films, despite higher budgets, proved more profitable than competing films. Lee also attracted audiences by casting pop culture stars like Kareem Abdul-Jabbar and Chuck Norris. Dug Song, chief strategy officer of Cisco Security, cites Lee as a role model: seeing Lee create his own style of martial arts inspired him to break the mold in his business. Lee's philosophy emphasizes efficiency and effectiveness, and he treated the institutional racism he faced as a step toward growth. Lee's character Kato became a runaway hit on The Green Hornet, making him a role model for young Asian Americans. Lee's philosophy, likened to water, emphasizes the importance of adapting to change and being flexible. His background as a ballroom dancer taught him to be flexible and fluid, which he applied to all aspects of his life.

As a successful film producer, he faced racism with courage and perseverance, leading to his breakout success in Hong Kong and America. He had his own production company, which allowed him to control his projects and take an active role in screenplays and choreography. Lee's life and philosophy influenced Vijoy Rao, a former executive at Fleishman Hillard, who learned to endure life's difficulties and respect competitors. Lee's intellectual curiosity, work ethic, creativity, and willingness to experiment made him successful. He taught himself every aspect of filmmaking, both in front of and behind the camera, and was open to teaching kung fu to anyone. Lee continually evolved as a martial artist, adding aspects of judo, boxing, and other fighting styles to his skills. His life and career influenced Tony Blauer, founder of Blauer Tactical Systems, who believes that people should unify their "mind, body, and spirit" to succeed. Lee's life and career exemplify the importance of experimenting, adapting to changing situations, pivoting in one's career, and finding joy in fresh opportunities.

Previous book summaries: BookSummary40.docx
Summarizing Software: SummarizerCodeSnippets.docx 

#codingexercise https://1drv.ms/w/s!Ashlm-Nw-wnWhOcl-8PNqDu5Jj3hjg?e=k0xrzx

 

This is a summary of the book titled “Principles of Knowledge Auditing,” written by Patrick Lambe and published by MIT Press. Knowledge management (KM) is a complex field that has grown rapidly, often lacking a common vocabulary and theoretical underpinnings. In this book, the author delves into knowledge auditing to clarify its language and trace its evolution. Knowledge audits reveal the state of knowledge production, access, and use in an organization and can help support change. They emerged from the communication and information audits of the mid-20th century and target various phenomena, including knowledge stocks, flows, and processes. However, imprecise use of language can cause confusion and bias, and the personal or collective dualism can oversimplify how knowledge is held in organizations. Typologies organize types of knowledge to support auditing, sensemaking, and action.

Knowledge audits, also known as KM assessments, help identify opportunities and needs, develop action plans, and create alignment across the organization. They support organizational change by helping leaders understand how the organization produces, accesses, and uses information. However, knowledge audit teams often face problems due to a disconnect between the KM function and the business's daily operation, particularly in service domains like human resources, IT, and finance.

Knowledge auditing is a complex field that involves various practices and methodologies, including inventorying knowledge, evaluating knowledge assets, analyzing records, and assessing against a benchmark. It originated from the communication and information audits of the mid-20th century and has evolved over time. Knowledge audits target various phenomena, including knowledge stocks, flows, and processes. The scope of knowledge audits varies widely, with one framework breaking them into seven categories: stocks; flows; goals and needs; enablers; processes for knowledge creation, capture, discussion, synthesis, retention, and storage; capabilities; and the outcomes desired from the knowledge audit and the KM program overall.

The complexity of knowledge auditing often leads to confusion and bias in the field. The word "audit" in KM often implies rigor and authority, while the methodologies often lack such rigor. The use of the term "asset" is also confusing, as knowledge doesn't have the same qualities and properties as tangible or financial assets.

Syllepsis in knowledge management (KM) can lead to issues when practitioners transfer attributes from one phenomenon to another, as happens with "knowledge assets." The International Organization for Standardization's ISO 30401 standard suggests that knowledge is an intangible asset that needs to be managed like any other asset. Metaphors in KM can affect how people think about and use knowledge; "stuff" metaphors, for instance, discount the human and emotional aspects of KM. The personal or collective dualism and the tacit or explicit dualism oversimplify how knowledge is held in organizations, leading to errors in knowledge auditing and KM. The personal or collective dualism leaves out the core team, preventing an understanding of how personal knowledge is mediated and rendered actionable. The tacit or explicit dualism overemphasizes explicit knowledge, while implicit knowledge includes knowledge that could be made explicit but has not yet been. Typologies organize types of knowledge to support auditing, sensemaking, and action, but many of them fail to meet auditability criteria.

Typologies for personal and organizational knowledge have been developed, but most fail to meet the requirements for auditability. Michael Zack's typology distinguishes between declarative, procedural, causal, conditional, and relational knowledge. Harry Collins' typology identifies three forms of tacit knowledge: relational, somatic, and collective. Frank Blackler's typology describes knowledge as "embrained," embodied, encultured, embedded, and encoded. Straits Knowledge's "Wheel of Knowledge" typology helps understand and manage personal and organizational knowledge, and it meets the auditability requirements.

Previous book summaries: BookSummary39.docx
Summarizing Software: SummarizerCodeSnippets.docx 

 

Tuesday, January 16, 2024

IaC innovations continued...

 

 

Many might point to native support from existing tracking systems, including issue trackers and code repositories, given that files, not databases, serve better for difficult-to-automate clouds such as sovereign clouds and regions where fewer resource types are available. It is true that the practice of annotating every commit in a repository with rich links to origin, growth, and timelines can also provide independent sources of information that can be spanned by custom queries as the need arises, but it remains an extra mile that most teams are left to walk themselves, leading to boutique solutions. On the other hand, incident tracking software alone has demonstrated the effectiveness of a knowledge base that supports ITSM, ITBM, ITOM, and CMDB capabilities.

 

In addition, a realization dawns as the size and scale of infrastructure grow: the veritable tenets of IaC (reproducibility, self-documentation, visibility, error-free deployment, lower TCO, drift prevention, the joy of automation, and self-service) somewhat diminish when the time and effort needed to overcome its brittleness increase exponentially. Packages go out of date, features become deprecated and stop working, backward compatibility is hard to maintain, and all existing resource definitions have a shelf life. Similarly, assumptions are challenged when the cloud provider and the IaC provider describe attributes differently. The information contained in IaC can be hard to summarize in an encompassing review unless we go block by block, and without a knowledge base this costly exercise is often repeated. It is also easy to shoot oneself in the foot with a typo or a stray command during the exercise, especially when the state of the infrastructure disagrees with that of the portal.

 

The data model would articulate Infrastructure-as-Code blueprints, resources, policies, and accesses as a single entity and become a unit of provisioning for the environment. It would include issue- and code-tracking references, key performance indicators, x-ray and service-map references, and alerts and notifications, and it would be continuously updated with each deployment.
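
A minimal sketch of such an entity follows, using Python dataclasses; the field names and the canned query are illustrative assumptions drawn from the description above, not a finished schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

@dataclass
class DeploymentRecord:
    """One unit of environment provisioning, appended to the
    knowledge base by each pipeline-based deployment."""
    blueprint: str                     # IaC blueprint or module reference
    resources: List[str]               # identifiers of resources provisioned
    policies: List[str]                # policy assignments in effect
    accesses: List[str]                # role assignments and identities
    issue_refs: List[str] = field(default_factory=list)   # issue-tracking links
    commit_refs: List[str] = field(default_factory=list)  # code-tracking links
    kpis: dict = field(default_factory=dict)              # key performance indicators
    service_map_ref: Optional[str] = None                 # x-ray / service-map reference
    alerts: List[str] = field(default_factory=list)       # alerts and notifications
    deployed_by: str = ""              # contributor, even behind a shared automation account
    deployed_at: datetime = field(default_factory=datetime.utcnow)

def who_changed(records: List[DeploymentRecord], resource_id: str):
    """A canned query: which deployments touched a resource, when, and by whom."""
    return [(r.deployed_at, r.deployed_by, r.blueprint)
            for r in records if resource_id in r.resources]
```

A canned query like who_changed speaks directly to the difficulty, noted below, of telling who made a change and why when common automation accounts are in use.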

 

The TCO of IaC for a complex deployment does not include the man-hours required to keep it in working condition and to assist with redeployments and syncing. One-off investigations are too many to count when deployments are large and complex, and the sheer number of resources and their tracking via names and identifiers can be exhausting. A sophisticated CI/CD system for managing accounts and deployments is good automation, but it is also likely to be run by several contributors. When edits are allowed and common automation accounts are used, it can be difficult to know who made a change and why. All of these shortcomings can be overcome with a cloud IaC data model that is continuously updated by each pipeline-based deployment, encompasses the siloed views of the numerous pipelines and repositories that exist, and provides a base for canned, repeated queries.

 

Some flexibility is required to make judicious use of automation and manual interventions to keep deployments robust. Continuously updating the IaC and its knowledge base, especially by the younger members of the team, is not only a comfort but also a necessity. The more mindshare the IaC data model gets, the more likely it is to reduce the costs associated with maintaining the IaC and dispel some of the limitations mentioned earlier.

 

As with all solutions, scope and boundaries apply. It is best not to let IaC or its data model spread out so much that high-priority, high-severity deployments are affected. The data model can also be treated like any other asset, with its own index, model, documentation, and co-pilot.