Thursday, August 31, 2023

 

Recently, I came across a situation where CI/CD pipelines were making unintended changes to resources in the Azure public cloud. The IaC was written in Terraform and the resource provider was Azure. The symptom was that when a code change was pushed through the pipeline, settings on unrelated resources would fall off. This impacted the uptime of those resources, and business continuity suffered whenever those settings had to be restored. In addition, it was getting hard to tell which resources were going to be affected, since the author of any given change had nothing to do with those resources. The team responsible for the IaC is referred to here as the infrastructure team.

There is also some context that preceded these symptoms. First, the subscription where these resources were impacted had long been a shared subscription and one of the first to be tried out. Consequently, there were proofs of concept, multiple versions, and many stakeholders, some even with contributor access to update their specific resources. The sheer number of resource groups, subnets, and virtual networks had grown quite large and, in a few cases, neglected. The resources most affected by this churn were the App Services, and it just so happened that the application engineering team had started requiring changes more often than ever before for an improvement they owned.

One specific example of changes that had accumulated in the portal was virtual network integration for these resources; whenever these settings fell off, connectivity was disrupted, resulting in some downtime. While this applied to outbound traffic from the resource, similar discrepancies were noticed on the inbound side, where access restrictions were lost. Since the inbound and outbound traffic settings were maintained by the infrastructure team, they were supposed to be captured in the IaC. Some of these definitions did indeed appear in the IaC, but on closer inspection they turned out to be improper or even failing enforcement. Other settings were specific to a single resource and closely tied to the code or container image deployed to it; the application engineering team managed these.

Another source of errors was the Terraform state. Irrespective of the resources in the portal or their definitions in the IaC, the state was maintained and even edited without corresponding changes elsewhere. This was done to overcome conflicts found while validating or applying the IaC, but it resulted in other conflicts when the pipeline ran; consequently, resources were sometimes destroyed during pipeline execution. It is not wrong to edit the state file, but it is usually done to keep it in sync with both the portal and the IaC. Keeping it in sync with the portal first and then back-propagating the changes to the IaC is one direction of the edits. The other direction is to write through the state with the changes in the IaC and then push them to the resources in the portal. Non-production and production resources must each have their own sets of IaC, state, and actual resources, and these must be kept separate.

Lastly, the changes made to keep all three in sync were often spread out over time and distributed among authors, which itself became a source of errors and discrepancies. Establishing a baseline combination of state, IaC, and corresponding resources is necessary before making incremental changes, and it is just as important to keep them in sync going forward. The best way to do this is to close the gap by enumerating all discrepancies to establish a baseline, and then to have the process and the practice in place to enforce that they do not drift apart again.
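One way to enumerate those discrepancies, sketched below under the assumption that a plan file has already been produced with terraform plan -out=plan.out, is to read the machine-readable plan via terraform show -json and list every resource whose planned action is not a no-op.

# Drift report sketch: list resources the next apply would touch.
# Assumes `terraform plan -out=plan.out` has already been run in this directory.
import json
import subprocess

plan = json.loads(
    subprocess.check_output(["terraform", "show", "-json", "plan.out"], text=True)
)

for change in plan.get("resource_changes", []):
    actions = change["change"]["actions"]
    if actions not in (["no-op"], ["read"]):
        print(",".join(actions), change["address"])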

References: Earlier articles on IaC shortcomings and resolutions: IacResolutionsPart21.docx

Wednesday, August 30, 2023

Databricks and Active Directory passthrough authentication.


Azure Databricks is used to process, store, clean, share, analyze, model, and monetize datasets with solutions ranging from business intelligence to machine learning. It is used to build and deploy data engineering workflows, machine learning models, analytics dashboards, and more.

It connects to different external storage locations, including Azure Data Lake Storage. Users logged in to the Azure Databricks instance can execute Python code and use the Spark platform to view tabular representations of data stored in various formats on the external storage accounts. When they refer to a file on the external storage account, they need not specify credentials to connect; their logged-in credential can be passed through to the remote storage account. For example: spark.read.format("parquet").load("abfss://container@storageAccount.dfs.core.windows.net/external-location/path/to/data")

This feature required two settings (a configuration sketch follows the list):

1. When a compute cluster is created to execute the Python code, the checkbox to pass through credentials must be checked.

2. It must also have the flag spark.databricks.passthrough.adls set to true.
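For completeness, a minimal sketch of setting the flag from item 2 when a cluster is created programmatically is shown below. It uses the Databricks Clusters REST API; the workspace URL, token, runtime version, and node type are placeholders, and the only setting taken from the text above is the passthrough flag itself.

# Hypothetical cluster creation with the passthrough flag (placeholders throughout).
import requests

response = requests.post(
    "https://<databricks-instance>/api/2.0/clusters/create",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json={
        "cluster_name": "passthrough-cluster",
        "spark_version": "<runtime-version>",
        "node_type_id": "<node-type>",
        "num_workers": 2,
        "spark_conf": {"spark.databricks.passthrough.adls": "true"},
    },
)
response.raise_for_status()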

Until recently, the Spark cluster UI allowed this flag to be set, but the configuration for passthrough changed with the new UI that accompanies Unity Catalog, a unified access control mechanism. Passthrough credentials and Unity Catalog are mutually exclusive. In most cases the flag can no longer be set when creating new clusters with the new UI, and this affected the implicit login required to authenticate the current user to the remote storage. The token provider used earlier was spark.databricks.passthrough.adls.gen2.tokenProviderClassName; with the new UI the login requires a more elaborate configuration. The error code users encounter when using the earlier clusters with the newer Databricks UI is 403.

The newer configuration is the following:

spark.hadoop.fs.azure.account.oauth2.client.id.<datalake>.dfs.core.windows.net <sp client id>

spark.hadoop.fs.azure.account.auth.type.<datalake>.dfs.core.windows.net OAuth

spark.hadoop.fs.azure.account.oauth.provider.type.<datalake>.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider

spark.hadoop.fs.azure.account.oauth2.client.secret.<datalake>.dfs.core.windows.net {{secrets/yoursecretscope/yoursecretname}}

spark.hadoop.fs.azure.account.oauth2.client.endpoint.<datalake>.dfs.core.windows.net https://login.microsoftonline.com/<tenant>/oauth2/token

This requires a secret to be created, which can be done via the https://<databricks-instance>#secrets/createScope URL. The value used for the client secret would then be {{secrets/yoursecretscope/yoursecretname}}.
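If the cluster-level settings above cannot be applied, a roughly equivalent session-scoped configuration can be set from a notebook, as in the sketch below. This is an illustration rather than the exact configuration used here: the storage account, tenant, and client id are placeholders, and the client secret is read with dbutils.secrets.get from the scope created above.

# Session-scoped sketch of the same OAuth settings (placeholders throughout).
# Runs in a Databricks notebook, where spark and dbutils are predefined.
service_credential = dbutils.secrets.get(scope="yoursecretscope", key="yoursecretname")

account = "<datalake>.dfs.core.windows.net"
spark.conf.set("fs.azure.account.auth.type." + account, "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type." + account,
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id." + account, "<sp client id>")
spark.conf.set("fs.azure.account.oauth2.client.secret." + account, service_credential)
spark.conf.set("fs.azure.account.oauth2.client.endpoint." + account,
               "https://login.microsoftonline.com/<tenant>/oauth2/token")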

Finally, the 403 error also requires that networking be checked. If the Databricks workspace and the storage account are in different virtual networks, the storage account's network rules must allow-list both the public and private subnets of the Databricks instance.


Tuesday, August 29, 2023

 

Azure Managed Instance for Apache Cassandra is a managed service for Apache Cassandra, an open-source NoSQL distributed database trusted by thousands of companies for scalability and high availability without compromising performance. Linear scalability and proven fault tolerance on commodity hardware or cloud infrastructure make it a strong platform for mission-critical data. This is a distributed database environment, but the data can be replicated to other environments, including Azure Cosmos DB for use with the Cassandra API.

The Database Migration Assistant has a preview feature to help with this database migration. The Azure Cosmos DB Cassandra connector helps with live data migration from existing native Apache Cassandra workloads running on-premises or in the Azure public cloud to Azure Cosmos DB with zero application downtime. It does this with the help of a replication agent that moves data from Apache Cassandra to Cosmos DB. The replication agent is a Java process that runs on the native Cassandra host(s) and uploads data from Cassandra via a managed pipeline. Customers need only download the agent onto the source Cassandra nodes and configure the target Azure Cosmos DB Cassandra API account information.

The replication agent runs on the native Cassandra cluster. Once it is installed, it takes a snapshot of the cluster and uploads the requisite files. After the initial snapshot, continuous ingestion commences in the following manner: first, the agent connects to the replication metadata endpoint of the Cosmos DB Cassandra account and fetches replication component information; then it sends the commit logs to the replication component; finally, mutations are replicated to the Cosmos DB Cassandra endpoint by the replication component.

Customers can begin using the data in the Azure Cosmos DB Cassandra API account by first verifying the supported Cassandra features and estimating the request units required. This can be calculated even at the granularity of each operation, which helps with planning.

The benefits of this data migration from native Cassandra clusters to a Cosmos DB Cassandra API account include no downtime, no code changes, and no manual data migration. The configuration is simple, and the replication is fast. It is also completely transparent to Cassandra and to the other workloads on the cluster.

The Cosmos DB Cassandra API account normalizes the cost of all database operations using Request Units (RUs). This is a performance currency abstracting the system resources, such as CPU, IOPS, and memory, that are required to perform the database operations, and it helps with cost estimation in dollars by virtue of a unit price.
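As a rough illustration of how such a unit price translates into a bill, the sketch below converts an assumed provisioned throughput into a monthly estimate; both the RU/s figure and the per-hour rate are placeholders rather than official prices.

# Rough cost arithmetic for provisioned throughput (all numbers are assumptions).
ru_per_second = 10000            # provisioned Request Units per second (placeholder)
price_per_100_ru_hour = 0.008    # illustrative unit price in dollars, not an official rate
hours_per_month = 730

monthly_cost = (ru_per_second / 100) * price_per_100_ru_hour * hours_per_month
print("Estimated monthly cost: $%.2f" % monthly_cost)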

Reference: This article is a continuation of articles on Azure Resources with the last one describing Cassandra Configuration: CassandraConnectivity-2.docx

 

Monday, August 28, 2023

 

Azure Managed Instance for Apache Cassandra is a managed service for Apache Cassandra, an open-source NoSQL distributed database trusted by thousands of companies for scalability and high availability without compromising performance. Linear scalability and proven fault tolerance on commodity hardware or cloud infrastructure make it a strong platform for mission-critical data.

Azure Managed Instance for Apache Cassandra is a distributed database environment. It is a managed service that automates the deployment, management (patching and node health), and scaling of nodes within an Apache Cassandra cluster. It also provides the capability for hybrid clusters, so Apache Cassandra datacenters deployed in Azure can join an existing on-premises or third-party hosted Cassandra ring. The service is deployed using Azure Virtual Machine Scale Sets.

However, Cassandra is not limited to any one compute platform. Kubernetes, for example, runs distributed applications, and Cassandra and Kubernetes can be run together; the advantages include the use of containers and interactive management of Cassandra from the command line. The Azure Managed Instance for Apache Cassandra, by contrast, allows only a limited form of connection and interactivity for managing the Cassandra instance. Most database administration options are limited to the Azure command-line interface, which takes the invoke-command option to pass the actual commands to the Cassandra instance. There is no native invocation of commands by reaching an IP address directly, because the Azure Managed Instance for Apache Cassandra does not create nodes with public IP addresses; to connect to a newly created Cassandra cluster, one needs to create another resource inside the VNet. This could be an application, or a virtual machine with Apache's open-source query tool CQLSH installed. The Azure Portal may also provide connection strings that carry all the necessary credentials for connecting with the instance using this tool.

Native support for Cassandra is therefore not limited to the nodetool and sstable commands permitted via the Azure CLI options. CQLSH is a command-line shell for interacting with Cassandra using CQL (the Cassandra Query Language). It ships with every Cassandra package and can be found in the bin/ directory. It is implemented with the Python native protocol driver and connects to a single specified node, which greatly reduces the overhead of managing the Cassandra control and data planes.

The use of containers is a blessing for developers deploying applications in the cloud, and Kubernetes helps with container orchestration. Unlike managed Kubernetes instances in Azure, where a client can populate the .kubeconfig file using az aks get-credentials and switch contexts with kubectl, the Azure Managed Instance for Apache Cassandra does not come with the option to use kubectl commands. Containers do help with adding or removing nodes in the Cassandra cluster by way of the cassandra.yaml file, found in the /etc/cassandra folder on the node. One cannot access the node directly from the Azure Managed Instance for Cassandra, so a shell prompt on the node is out of the question, and the nodetool option to bootstrap is not available via invoke-command, but it is possible to edit this file. One of the most important properties in this configuration is the seed-provider setting for existing datacenters. It allows a new node to become ready quickly by importing the necessary information from the existing datacenter. The seed provider must not be set to the new node but must point to an existing node.

The Cassandra service on a node must be stopped prior to the execution of some commands and restarted afterwards. The database must also be set to read-write for certain commands to execute. These options can be set as command-line parameters to the Azure CLI's managed-cassandra set of commands.

Sunday, August 27, 2023

This completes a set of three book summaries. This one is about the book "Viral Justice: How We Grow the World We Want" by Ruha Benjamin. She is a professor of African American studies at Princeton University and is also the author of "Race After Technology" and "People's Science".

The author uses the term "viral justice" in the context of promoting collective healing and unlearning dominant narratives. Systemic oppressions such as sexism, classism, racism, ableism, and colonialism operate like viruses. When the "privilege" of the status quo is maintained, it kills people and robs them of the material and social conditions they need to survive. It's time to treat these societal "viruses" as signals that the status quo is no longer acceptable. When opportunities to dismantle these oppressive systems are actively sought and a more inclusive, caring world is built, "viral justice" comes into play.

Systems might indeed be intractable, the wronged person might be the only victim whose heart is broken, and the shattering might be both emotional and physiological, but "viral justice" can be the rallying cry inviting others who desire change to join the individual. The first step in this direction requires us to unlearn patterns of behavior and thought that reinforce dominant narratives. The act of dreaming must be reclaimed, and the promotion of collective good must be imagined.

Support networks must be built to weather the stress and physical damage caused by oppressive systems. The term "weathering" here is a public-health concept that embodies the stress of living with oppressive systems. If the struggle to make ends meet is one of the principal causes of weathering, then viral justice is about creating social relations that are resuscitating instead of exhausting. Some examples illustrate weathering: Black teenage boys in the United States are more likely to die before the age of 65 than teenage boys in Bangladesh; the health of Latinx immigrants deteriorates with each generation after their families arrive in the United States; experiencing traumatic events ages a person prematurely. Protection from the negative impacts of weathering could include cultivating supportive relationships, committing to practices of healing and accountability, and building networks of solidarity.

One classic example is punitive policing, which must be replaced with community-centered harm-reduction policies. Police surveillance affects the health of entire communities. Some feel "hunted", and witnesses report acts of "licensed terror" ranging from pepper-spraying homeless people's sleeping bags to shooting unarmed civilians. "Viral justice" can be enacted by growing communities of care, which does not mean police reform but rather everyday people relating to one another in life-affirming ways. Technology also plays a role. Some apps, like GhettoTracker and Nextdoor, perpetuate systems of oppression, which manifests as roughly 240 million 911 calls reported annually for suspicious activity, but this can be undone with a more empathetic approach.

Such examples are clearer with racism. For instance, teachers may fail to recognize Black students as gifted and talented because their image of successful students is white. Researchers found that schools punish Black girls more often and more severely for minor infractions, such as having "too much attitude", than they punish their white female counterparts. A related example can be seen with "zero-tolerance" disciplinary approaches, which damage students' self-esteem and rob them of education and life opportunities. "Viral justice" in the educational system can be embraced by advocating reforms such as:

  1. Replacing punitive actions with "restorative practices" where authorities display a calm and loving presence.

  2. Prioritizing recruiting and fostering diversity among teachers, which can inspire students.

  3. Updating the curriculum to include ethnic studies and Black history.

  4. Hiring counselors to ensure the well-being of students rather than inviting police to walk the hallways.

Reimagining the place of work in our lives helps workers thrive. It demands understanding that rest, like healthy food, clean water, and fresh air, is essential. In a recession or pandemic, the rich could get richer while the poor become poorer. Imagining a future where the rich no longer devalue labor, and redistributing wealth to ensure everyone has access to the social and economic conditions necessary for a flourishing life, are ways to embrace "viral justice".

A similar prospect applies to healthcare institutions. For example, white babies are paying the price for anti-Black racism from the time they are born, and Black babies more so. Institutions must make reparations to victims and their families.

Reimagining a better world as an individual can be broken down into the following steps: 

  1. Reflecting on one's own biases and constantly envisioning a future that embraces all.

  2. Taking micro-actions that have a bigger collective impact.

  3. Demonstrating inclusivity by creating spaces where everybody knows they are welcome and safe, and influencing others to do the same across domains such as housing, education, and transportation.

  4. Living poetically to transform oppressive systems and embrace creative ways of thinking.

 


Saturday, August 26, 2023

 

Azure Managed Instance for Apache Cassandra is a managed service for Apache Cassandra, an open-source NoSQL distributed database trusted by thousands of companies for scalability and high availability without compromising performance. Linear scalability and proven fault tolerance on commodity hardware or cloud infrastructure make it a strong platform for mission-critical data.

One of the most common concerns with this resource is how to connect to it. Azure Managed Instance for Apache Cassandra does not create nodes with public IP addresses, so to connect to a newly created Cassandra cluster, one will need to create another resource inside the VNet. This could be an application, or a virtual machine with Apache's open-source query tool CQLSH installed. The Azure Portal may also provide connection strings that have all the necessary credentials to connect with the instance using this tool.

CQLSH is a command-line shell interface for interacting with Cassandra using CQL (the Cassandra Query Language). It is shipped with every Cassandra package and can be found in the bin/ directory. It is implemented with the Python native protocol driver and connects to a single specified node.

The configuration options for this tool are in the ~/.cassandra/cqlshrc file. All CQL commands executed are written to a history file. The three essential pieces of information for connecting to the Cassandra cluster are the database server's host name or IP address, the correct connection port, and the username and password if authentication is used.

This would look something like this:

export SSL_VERSION=TLSv1_2

export SSL_VALIDATE=false

host="<IP>"

initial_admin_password="Password provided when creating the cluster"

cqlsh $host 9042 -u cassandra -p $initial_admin_password --ssl

 

The az CLI commands for this resource type allow us to manage the cluster and the datacenters of the instance, and most of them start with the az managed-cassandra prefix. They do not help with data plane operations, for which the best bet is CQLSH once connectivity is established.
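For data plane access from code rather than the shell, a sketch along the following lines can be used with the open-source Python driver for Cassandra; the node IP and password are placeholders, and certificate verification is relaxed here only to mirror the SSL_VALIDATE=false setting shown earlier.

# Illustrative data-plane access with the Python driver (placeholders throughout).
from ssl import CERT_NONE, PROTOCOL_TLSv1_2, SSLContext

from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster

ssl_context = SSLContext(PROTOCOL_TLSv1_2)
ssl_context.verify_mode = CERT_NONE          # mirrors SSL_VALIDATE=false above; tighten in practice

auth = PlainTextAuthProvider(username="cassandra", password="<initial_admin_password>")
cluster = Cluster(["<node-ip>"], port=9042, auth_provider=auth, ssl_context=ssl_context)
session = cluster.connect()
print(session.execute("SELECT release_version FROM system.local").one())
cluster.shutdown()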

The management operations in Azure Managed Instance for Apache Cassandra include compaction, patching, and maintenance. Of these, the nodetool utility is frequently used for repairs. nodetool repair is run automatically by a service called Reaper. It repairs one or more tables, and performing an anti-entropy node repair on a regular basis helps with maintenance.

The Azure CLI provides a way to invoke nodetool with the invoke-command option for an instance.

 

Friday, August 25, 2023

Sequences are an excellent source of information that is usually not contained within the discrete units of an input stream, such as words in a text, symbols in a language, or images in a video. Yet sequences are under-utilized in many machine-learning scenarios that have done so much to enhance the information within the unit itself, by means of features, relative distance metrics, or relative co-occurrence similarities for classification. This article explores conventional and futuristic usages of sequences.

The inherent benefit of a sequence is that it is captured in the form of a state that is independent of the units themselves. This powerful concept allows us to work with all kinds of input units, be they words, symbols, images, or any other code. The conventional way to work with sequences belongs to a family of neural networks that encodes a sequence into such a state and later decodes it to form a different output sequence. These recurrent neural networks (RNNs) use the state as the essence of the sequence, almost independent of the forms of the units comprising it, and infer the meaning of those units without knowing what they are. The encoder-decoder model described by Bahdanau et al. in 2014 could be paired with different kinds of decoders that produced different outputs, but the sequences remained fixed in size and the state was accrued in a batch manner. In the future, if it were possible to build one state in an aggregated manner that continuously evolved with the growing input stream from start to finish, that state would likely be a better representation of the overall import than ever before. The difference is between building sequences as records in a table, distinct from one another, and enriching the state in a streaming manner, where the same state is continually updated for each unit, one at a time.

TensorFlow is a convenient library for writing RNNs. As with most machine learning models, at least 80% of the data is used for training and 20% for testing and prediction. The model can be developed on high-performance computing servers and later exported for use on low-resource devices and clients. The model can be tuned with continuous feedback and its releases versioned.

Let us take the example of predicting the next word from a passage. This goal is particularly suited to conventional RNNs because feeding a sequence of three words at a time, with the following word as the label, lets the neural network learn to predict the next symbol. The model can only understand real numbers, so one way to convert a symbol to a number is to assign a unique integer to each symbol based on its frequency of occurrence. The frequency table and a reverse dictionary help to articulate the next symbol.
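A small sketch of that symbol-to-integer mapping, using an assumed toy corpus, looks like this:

# Build a frequency-ordered dictionary and its reverse (toy corpus, assumed).
import collections

words = "the quick brown fox jumps over the lazy dog and the fox".split()
counts = collections.Counter(words).most_common()
dictionary = {word: i for i, (word, _) in enumerate(counts)}       # symbol -> id, most frequent first
reverse_dictionary = {i: word for word, i in dictionary.items()}   # id -> symbol, used for predictions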

As with any softmax classifier used with neural networks, each symbol is associated with a vector of probabilities. The highest probability can then be used to find the index in the reverse dictionary that determines the prediction.
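Picking the prediction out of that probability vector is then a single argmax followed by a lookup in the reverse dictionary built above; the probabilities here are made up for illustration.

# Turn the classifier's probability vector into the predicted word (assumed values).
import numpy as np

probabilities = np.array([0.1, 0.7, 0.2])            # softmax output over the vocabulary (assumed)
predicted_id = int(np.argmax(probabilities))          # index of the highest probability
predicted_word = reverse_dictionary[predicted_id]     # reverse dictionary from the sketch above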

Using TensorFlow, this is written as:

import tensorflow as tf
from tensorflow.contrib import rnn   # TensorFlow 1.x contrib RNN cells

def RNN(x, weights, biases):
    # reshape to [batch, n_input] and split into n_input time steps
    x = tf.reshape(x, [-1, n_input])
    x = tf.split(x, n_input, 1)
    # a basic LSTM cell holds the state that summarizes the sequence
    rnn_cell = rnn.BasicLSTMCell(n_hidden)
    outputs, states = rnn.static_rnn(rnn_cell, x, dtype=tf.float32)
    # project the last output onto the vocabulary
    return tf.matmul(outputs[-1], weights['out']) + biases['out']

 

The streaming form of RNN would use a summation form to continuously update the state.
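A toy sketch of that streaming idea, with made-up and untrained weights, keeps a single state vector and updates it once per incoming unit:

# One state vector, updated continuously as units stream in (weights are random, untrained).
import numpy as np

rng = np.random.default_rng(0)
dim = 8
W_state = rng.normal(scale=0.1, size=(dim, dim))
W_input = rng.normal(scale=0.1, size=(dim, dim))

state = np.zeros(dim)
for unit in (rng.normal(size=dim) for _ in range(1000)):   # stand-in for an unbounded stream
    state = np.tanh(state @ W_state + unit @ W_input)      # the same state, enriched one unit at a time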

Thursday, August 24, 2023

 

This is a summary of the book "The Devil Never Sleeps: Learning to Live in an Age of Disasters" by Juliette Kayyem, published in 2022. She is a specialist in crisis management, disaster response, and homeland security, serves on the faculty at Harvard's Kennedy School of Government, and is faculty chair of the Homeland Security Project. She is also a national security analyst for CNN and the author of "Security Mom: An Unclassified Guide to Protecting Our Homeland and Your Home."

She proposes that disasters aren't anomalies. Planners should assume disasters will occur, and people need "situational awareness" to respond effectively, especially to disasters that repeat. As part of preparing for and responding to disasters, all leaders must be on the same page, and a plan for "managed retreat" must rank high among the response choices. Controlling the losses and stopping the hemorrhaging are options that also need to be considered. When conditions don't remain the same over time, a static plan does not help, and the response must be dynamically modified. People tend to disregard near-disasters rather than recognizing them as warning signs. History has valuable lessons, especially when it comes to fatalities, and responses can be better articulated with this kind of insight.

There is a wide variety of phenomena that the general public is already aware of: natural calamities such as hurricanes, tsunamis, and earthquakes, as well as man-made ones such as financial meltdowns, pandemics, war, and cyberattacks. Leaders of all demographics can find universal lessons to their advantage when disruptions occur so that they can "fail safer". People who study disasters tend to divide their duration into two phases, before and after the disaster, with the moment in between often referred to as the boom. The time before the disaster is an opportunity to adopt measures that will prevent calamities; in the phase after, people attempt to recuperate from its consequences. Disasters might not be completely avoidable, but disaster managers can focus on what happens after a disaster to respond and rebuild effectively. In fact, the author asserts that disaster will strike, and that they should prepare for all hazards as a worst-case scenario.

For example, Boeing 737 MAX planes crashed, killing 346 people; the crashes resulted from a design error that limited the pilots' ability to control the planes. A whistleblower said the flaw had been brought to the attention of management, who downplayed it. Even the first disaster was explained away as a rare occurrence, and Boeing executives didn't seriously consider the possibility of such disasters, so the second disaster was waiting to happen. A disaster management response, in the public or private sector, should include an organizing principle such as the "Incident Command System", a hierarchical system that extends from a public information officer and a safety officer to teams for operations, planning, logistics, and finance. Some might refer to it as the war room.

As the disaster unfolds, the public must be kept aware of the rollout status so that they know what is going on and when. A method for gathering real-time information is necessary. Situational awareness involves keeping a record of what happened, indexed to time and place. An SA template includes "perception", "comprehension", and "projection". One of the recurring failures in organizational responses to disasters is how hard key players find it to understand events as they play out in real time. Gathering plenty of detail and furiously removing the noise are key activities for situational awareness. The author cites an example of better awareness: the San Francisco mayor noticed that members of the city's Asian and Asian-American community, who have strong ties to the epicenter of the pandemic, weren't attending Lunar New Year events in expected numbers, and went on to institute social distancing protocols and stay-at-home orders long before most other mayors did.

Disaster response demands a consolidated strategy and purpose. Empowering social workers to perform at the highest possible level is an advantage. Unfortunately, many institutions' approach to security architecture tends to fragment that architecture into different, specialized silos, which impedes unified action. Poor "governance structures", rather than straightforward ignorance, exacerbate disasters. Oversimplifying safety and security to "gates, guards and guns", and focusing more on buying equipment than on setting up effective processes, doesn't help. A disaster's consequences and negative impact can be better managed when people learn to "fail safer".

A "managed retreat" is sometimes referred to as a backup plan. When British Petroleum built its Deepwater Horizon oil drilling rig in the Gulf of Mexico, there were assurances that a blowout preventer would shut down the rig in the case of a spill. In April 2010, the rig exploded and the blowout preventer failed, resulting in one of the worst oil spills; BP did not have a backup plan, because the blowout preventer was also the last line of defense.

A more systemic approach enables us to mitigate negative consequences or, in the event of a disaster, render the consequences less awful. There is a literal example of this. The American military began encountering "improvised explosive devices" in Afghanistan and Iraq, and the main threat to survival was that victims bled to death. The response included minimizing damage from amputated limbs, transfers to better hospitals, training every soldier as a field medic, and developing better tourniquets and blood-clotting foam. One Pentagon study found that this response saved soldiers and their limbs some 90% of the time.

The author asserts that disasters are simply no longer random and rare, and that is where the adage "the devil never sleeps" comes from. Since conditions deteriorate over time, responses must also be dynamic and not remain locked in static plans. In June 2021, a condominium tower in Surfside, South Florida, collapsed suddenly, killing nearly 100 residents. A consulting firm had warned about structural conditions several years earlier, but the responses were put off for want of budget. Safety and security systems are designed based on conditions that existed when a structure was built, but conditions don't remain constant.

Even big companies such as Apple fail to read near-misses, such as when the iPhone 4 launched in 2010 and dropped calls and interrupted people's messages. Steve Jobs and Apple absurdly blamed the customers, and no one complained. This kind of "normalizing deviance" can be quite dangerous.

History teaches valuable lessons, and one that stands out is that a perfectly managed crisis is an oxymoron. A dozen hurricanes made landfall in the United States in 2020, and the response to Hurricane Laura in Louisiana resulted in only 28 deaths, most of them from carbon monoxide poisoning caused by unsafe generators after the electrical grid went down.

Crisis managers must pay attention to what happened and prepare for the next disaster accordingly.

 

 

Wednesday, August 23, 2023

This is a summary of the book "How the Other Half Eats: The Untold Story of Food and Inequality in America", written by Priya Fielding-Singh, PhD, a sociologist at Stanford University who studies the societal factors that influence people's health.

There are prevalent assumptions about eating that grossly misunderstand the dietary choices in America. There is societal pressure to be a “good mom” which dictates family dietary choices. The food industry pushes junk food to ease mothers' guilt.  Gendered expectations create further frustrations for mothers trying to uphold healthy eating habits. Lack of time and resources often leads to unhealthy dietary compromises. Emotional stress and misguided blame affect diets across the income spectrum.

The author makes recommendations for both mothers without resources who must be prudent to buy the right foods and those who can buy healthful food but who think the choices are not good enough. Her research targets diverse families and shows that Americans’ dietary choices have little to do with personal discipline and, instead, mainly involve family budgets and societal pressures. Personal desires – whether to be a perfect mom or to alleviate the weight of poverty – shape how Americans eat.

The American diet is overwhelmingly unhealthy. The US Department of Agriculture agrees with most nutritionists that a healthy diet is made up of fresh fruits, vegetables, low fat dairy, whole grains, and lean proteins. Most Americans don’t eat this way. The Americans who suffer the most from diets lacking in nutritional value are low-income families of color. They often eat too much sugar and too many processed foods and fatty meats, leading to higher rates of diabetes and heart problems, as well as earlier deaths than more affluent people.

As the disparity between the rich and the poor widens, some political figures, such as Michelle Obama, have sought to mitigate some of the causes behind this issue. However, those efforts operate on two assumptions about why some Americans eat unhealthily: first, that low-income families can't afford healthier foods, and second, that low-income families don't have physical access to grocery stores that sell healthy foods.

The second assumption is false. For example, the Healthy Food Financing Initiative invested more than $650 million in building supermarkets in communities that lacked nearby grocery stores. Yet making healthful food more available brought about little or no dietary change within low-income communities. The author asserts that geographical access was not a contributing factor to dietary choices; most people have cars and don't mind traveling to get the food they want.

A mother struggling to make ends meet lacks the resources to take her kids out for fun activities, such as visiting a water park. Her lack of financial security impedes her ability to provide for her children. She constantly denies her daughters' requests for new clothes, electronics, or toys. This makes her feel guilty and leaves her wondering if she's a terrible mother. However, she can say yes to junk food because it's cheap. Buying her daughters powdered donuts or a bag of Doritos puts smiles on their faces and is often the only thing she can do to ease the hardship of poverty.

On the opposite end of the economic spectrum, an affluent mother often says no to her kid’s junk food requests. However, she can say yes to most of their other requests. She can provide her children with private school, concert tickets, summer camp and consistent, healthy dietary choices.

Intensive mothering dooms moms to feelings of inadequacy and the sense that they never do enough — that they never are enough. This behavior creates a racial and economic inequality gap concerning who gets to be a good mother. Gold standard mothering now means giving your kids every opportunity to grow and learn, buying them whatever they need to thrive and providing them with nutritious food. By those unfair criteria, only the financially secure can afford to be good moms.

The food industry pushes junk food to ease mothers’ guilt. Because many low-income Americans are people of color, food choices may also reflect racial inequalities. Americans often associate childhood obesity with being Black or Hispanic – and often blame mothers instead of scrutinizing the food industry’s practices. The author states that the dads she met did not need to devote themselves to feeding their kids to feel like they were good dads.

Single mothers who work labor-intensive jobs have greater difficulty making healthy choices. Lack of time is an issue for most working parents across economic brackets. They often face long hours and long commutes, leaving them with less time to shop for food, cook or clean. Mothers often feel they must choose between spending quality time with their kids or cooking a healthy meal. This is also true for moms who are somewhat better off, though some wealthier moms can afford to hire household help to compensate for their lack of parenting time.

The author says that, as moms, "we deserve to live in a society built of infinitely more empathy, appreciation, and support." The narrative of blaming mothers will never fix these issues; the government should hold employers and corporations responsible.

 


 

Reducing Trials and Errors

Model: When trials and errors are scattered in their results, an objective function that can measure the cost or benefit will help with convergence. If the samples are large, a batch analysis mode is recommended. The objective function can be minimized or maximized via gradient descent methods, but the use of simulated annealing can escape local minima because it will sometimes accept a worse solution with a certain probability. In simulated annealing, the cost of the current solution and of a randomly perturbed neighbor are computed; the neighbor is accepted if it is better, or with a temperature-dependent probability if it is worse, and the temperature decreases by a cooling factor on every iteration so that worse moves become less likely over time.
Sample implementation follows:

import math
import random

def annealingoptimize(domain, costf, T=10000.0, cool=0.95, step=1):
    # Initialize the values randomly within each dimension's bounds
    vec = [float(random.randint(domain[i][0], domain[i][1]))
           for i in range(len(domain))]
    while T > 0.1:
        # Choose one of the indices
        i = random.randint(0, len(domain) - 1)
        # Choose a direction to change it
        dir = random.randint(-step, step)
        # Create a new list with one of the values changed, clamped to the domain
        vecb = vec[:]
        vecb[i] += dir
        if vecb[i] < domain[i][0]: vecb[i] = domain[i][0]
        elif vecb[i] > domain[i][1]: vecb[i] = domain[i][1]

        # Calculate the current cost and the new cost
        ea = costf(vec)
        eb = costf(vecb)
        # Acceptance probability for a worse solution, shrinking as T cools
        p = pow(math.e, -(eb - ea) / T)
        # Is it better, or does it make the probability cutoff?
        if eb < ea or random.random() < p:
            vec = vecb
        # Decrease the temperature
        T = T * cool
    return vec
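A hypothetical usage of the function above: minimizing the squared distance of a two-dimensional point from (3, 7) over the range 0 to 10.

def cost(solution):
    return (solution[0] - 3) ** 2 + (solution[1] - 7) ** 2

domain = [(0, 10), (0, 10)]
best = annealingoptimize(domain, cost)
print(best)   # expected to land at or near [3.0, 7.0]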

 

Monday, August 21, 2023

 Fourier Transformations for wave propagation: 

Introduction:  A Fast Fourier Transform converts wave form data in the time domain into the frequency domain. It achieves this by breaking down the original time-based waveform into a series of sinusoidal terms, each with a unique magnitude, frequency and phase. This process converts a waveform in the time domain into a series of sinusoidal functions which when added together reconstruct the original waveform. Plotting the amplitude of each sinusoidal term versus its frequency creates a power spectrum, which is the response of the original waveform in the frequency domain. 
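As a small illustration of that power-spectrum idea, the sketch below builds a made-up two-tone signal sampled at 1 kHz and computes the power of each sinusoidal term against its frequency with NumPy's FFT helpers.

# Power spectrum of an assumed two-tone signal sampled at 1 kHz.
import numpy as np

t = np.arange(0, 1.0, 0.001)                        # one second at 1 kHz
signal = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=0.001)        # frequency of each bin in Hz
power = np.abs(spectrum) ** 2                        # response of the waveform in the frequency domain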

 

When Fourier transforms are applicable, it means the "earth response" now is the same as the "earth response" later. Switching our point of view from time to space, the applicability of the Fourier transformation means that the "impulse response" here is the same as the "impulse response" there. An impulse is a column vector full of zeros with a one somewhere. An impulse response is a column of the matrix B in q = Bp, and the collection of impulse responses in q = Bp defines the convolution operation.
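The q = Bp view can be made concrete with a small sketch: each column of B carries a shifted copy of an assumed impulse response, so multiplying by B is the same as convolving with it.

# Convolution as a matrix whose columns are shifted impulse responses (assumed values).
import numpy as np

b = np.array([1.0, 0.5, 0.25])            # impulse response
p = np.array([1.0, 2.0, 3.0, 4.0])        # input
B = np.zeros((len(p) + len(b) - 1, len(p)))
for j in range(len(p)):
    B[j:j + len(b), j] = b                # column j is the impulse response shifted down by j
q = B @ p
assert np.allclose(q, np.convolve(b, p))  # q = Bp reproduces the convolution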

 

Sample FFT application: 
import numpy as nm 

import scipy 

import scipy.fftpack 

import pylab 

 

def lowpass_cosine( y, tau, f_3db, width, padd_data=True): 

    # padd_data = True means we are going to add symmetric copies of the data to the start and stop

    # to reduce/eliminate the discontinuities at the start and stop of a dataset due to filtering 

    # 

    # False means we're going to have transients at the start and stop of the data 

 

    # kill the last data point if y has an odd length 

    if nm.mod(len(y),2): 

        y = y[0:-1] 

 

    # add the weird padd 

    # so, make a backwards copy of the data, then the data, then another backwards copy of the data 

    if padd_data: 

        y = nm.append( nm.append(nm.flipud(y),y) , nm.flipud(y) ) 

 

    # take the FFT 

    ffty=scipy.fftpack.fft(y) 

    ffty=scipy.fftpack.fftshift(ffty) 

 

    # make the companion frequency array 

    delta = 1.0/(len(y)*tau) 

    nyquist = 1.0/(2.0*tau) 

    freq = nm.arange(-nyquist,nyquist,delta) 

    # turn this into a positive frequency array 

    pos_freq = freq[(len(ffty)//2):]

 

    # make the transfer function for the first half of the data 

    i_f_3db = min( nm.where(pos_freq >= f_3db)[0] ) 

    f_min = f_3db - (width/2.0) 

    i_f_min = min( nm.where(pos_freq >= f_min)[0] ) 

    f_max = f_3db + (width/2.0)

    i_f_max = min( nm.where(pos_freq >= f_max)[0] ) 

 

    transfer_function = nm.zeros(len(y)//2)

    transfer_function[0:i_f_min] = 1 

    transfer_function[i_f_min:i_f_max] = (1 + nm.sin(-nm.pi * ((freq[i_f_min:i_f_max] - freq[i_f_3db])/width)))/2.0 

    transfer_function[i_f_max:(len(freq)//2)] = 0

 

    # symmetrize this to be [0 0 0 ... .8 .9 1 1 1 1 1 1 1 1 .9 .8 ... 0 0 0] to match the FFT 

    transfer_function = nm.append(nm.flipud(transfer_function),transfer_function) 

 

    # plot up the transfer function 

    # since "freq" is only the positive frequencies, select out 

    pylab.figure(1) 

    pylab.clf() 

    pylab.plot(freq,transfer_function) 

    pylab.xlabel('Frequency [Hz]') 

    pylab.ylabel('Filter Transfer Function') 

    pylab.xlim([-10.0,10.0]) 

    pylab.ylim([-0.05,1.05]) 

 

    # apply the filter, undo the fft shift, and invert the fft 

    filtered=nm.real(scipy.fftpack.ifft(scipy.fftpack.ifftshift(ffty*transfer_function))) 

 

    # remove the padd, if we applied it 

    if padd_data: 

        filtered = filtered[(len(y)//3):(2*(len(y)//3))]

 

    # return the filtered data 

    return filtered 

 

 

# do an example of lowpass filtering 

# first make some fake data 

# a noisy sine wave with a period of 2*pi seconds

# sampled 1000 times per second

fakedata = nm.sin(nm.arange(0,11,0.001)) + nm.random.randn(len(nm.arange(0,11,0.001)))/4.0 

 

# run the filter 

# lowpass at 5 Hz, with a 1 Hz width of its roll-off 

filtered = lowpass_cosine(fakedata,0.001,5.0,1.0,padd_data=True) 

 

# plot the noisy data, with the filtered data on top 

pylab.figure(2) 

pylab.clf() 

pylab.plot(nm.arange(0,11,0.001),fakedata,label='Noisy Data') 

pylab.plot(nm.arange(0,11,0.001),filtered,label='Lowpass Filtered Data') 

pylab.xlabel('Time [s]') 

pylab.ylabel('Voltage') 

pylab.legend() 

 

pylab.ion() 

pylab.show()