Sunday, January 28, 2024

 

How to draw a graph image?

A simple way to do it is to use prebuilt libraries that implement layout algorithms such as Kamada-Kawai and Fruchterman-Reingold.

Sample implementation:

from igraph import Graph, plot

g = Graph()
vertex_labels = ['A', 'B', 'C', 'D', 'E']
g.add_vertices(5)
g.vs["label"] = vertex_labels
g.add_edges([(0, 1), (1, 2), (2, 3), (3, 4),
             (4, 0), (2, 4), (0, 3), (4, 1)])

layout = g.layout("kamada_kawai")
plot(g, layout=layout)
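Since the opening line also mentions Fruchterman-Reingold, it is worth noting that swapping layouts in python-igraph is a one-line change. A small variation on the sample above, assuming the same graph g:

# Variation: use the Fruchterman-Reingold force-directed layout instead.
layout = g.layout("fruchterman_reingold")
plot(g, layout=layout)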

If we want to control the spacing ourselves and spread the vertices so that edges have little or no crossing, we can treat crossings and overlaps as a cost and minimize that cost with simulated annealing.

# The following program lays out a graph with little or no crossing lines.

from PIL import Image, ImageDraw
import math
import random

vertex = ['A', 'B', 'C', 'D', 'E']
links = [('A', 'B'),
         ('B', 'C'),
         ('C', 'D'),
         ('D', 'E'),
         ('E', 'A'),
         ('C', 'E'),
         ('A', 'D'),
         ('E', 'B')]
domain = [(10, 370)] * (len(vertex) * 2)

def randomoptimize(domain, costf):
    best = 999999999
    bestr = None
    for i in range(1000):
        # Create a random solution
        r = [random.randint(domain[j][0], domain[j][1]) for j in range(len(domain))]
        # Get the cost
        cost = costf(r)
        # Compare it to the best one so far
        if cost < best:
            best = cost
            bestr = r
    return bestr

def annealingoptimize(domain, costf, T=10000.0, cool=0.95, step=1):
    # Initialize the values randomly
    vec = [float(random.randint(domain[i][0], domain[i][1]))
           for i in range(len(domain))]

    while T > 0.1:
        # Choose one of the indices
        i = random.randint(0, len(domain) - 1)
        # Choose a direction to change it
        direction = random.randint(-step, step)
        # Create a new list with one of the values changed
        vecb = vec[:]
        vecb[i] += direction
        if vecb[i] < domain[i][0]: vecb[i] = domain[i][0]
        elif vecb[i] > domain[i][1]: vecb[i] = domain[i][1]

        # Calculate the current cost and the new cost
        ea = costf(vec)
        eb = costf(vecb)
        p = pow(math.e, -(eb - ea) / T)
        # Is it better, or does it make the probability cutoff?
        if (eb < ea or random.random() < p):
            vec = vecb

        # Decrease the temperature
        T = T * cool
    return vec

def crosscount(v):
    # Convert the number list into a dictionary of vertex:(x,y)
    loc = dict([(vertex[i], (v[i*2], v[i*2+1])) for i in range(len(vertex))])
    total = 0

    # Loop through every pair of links
    for i in range(len(links)):
        for j in range(i + 1, len(links)):
            # Get the locations
            (x1, y1), (x2, y2) = loc[links[i][0]], loc[links[i][1]]
            (x3, y3), (x4, y4) = loc[links[j][0]], loc[links[j][1]]

            den = (y4 - y3) * (x2 - x1) - (x4 - x3) * (y2 - y1)

            # den==0 if the lines are parallel
            if den == 0: continue

            # Otherwise ua and ub are the fractions of the
            # lines where they cross
            ua = ((x4 - x3) * (y1 - y3) - (y4 - y3) * (x1 - x3)) / den
            ub = ((x2 - x1) * (y1 - y3) - (y2 - y1) * (x1 - x3)) / den

            # If the fraction is between 0 and 1 for both lines
            # then they cross each other
            if ua > 0 and ua < 1 and ub > 0 and ub < 1:
                total += 1

    for i in range(len(vertex)):
        for j in range(i + 1, len(vertex)):
            # Get the locations of the two nodes
            (x1, y1), (x2, y2) = loc[vertex[i]], loc[vertex[j]]

            # Find the distance between them
            dist = math.sqrt(math.pow(x1 - x2, 2) + math.pow(y1 - y2, 2))
            # Penalize any nodes closer than 50 pixels
            if dist < 50:
                total += (1.0 - (dist / 50.0))
    return total

def drawnetwork(loc):
    # Create the image
    img = Image.new('RGB', (400, 400), (255, 255, 255))
    draw = ImageDraw.Draw(img)
    # Create the position dict
    pos = dict([(vertex[i], (loc[i*2], loc[i*2+1])) for i in range(len(vertex))])
    # Draw links
    for (a, b) in links:
        draw.line((pos[a], pos[b]), fill=(255, 0, 0))
    # Draw vertices
    for (n, p) in pos.items():
        draw.text(p, n, fill=(0, 0, 0))
    img.save('graph.jpg', 'JPEG')
    img.show()

sol = randomoptimize(domain, crosscount)
print(crosscount(sol))
sol = annealingoptimize(domain, crosscount, step=50, cool=0.99)
print(crosscount(sol))
drawnetwork(sol)



#codingexercise: https://1drv.ms/w/s!Ashlm-Nw-wnWhOgxp_uQRRXpdZ8wOA?e=AClqyl

Saturday, January 27, 2024

 

How to write a chatbot?

The following React component shows how to write a simple chatbot frontend that calls an Azure OpenAI chat-completions endpoint:

import React, { useState } from 'react';

const Chat = () => {
    const [query, setQuery] = useState('');
    const [response, setResponse] = useState('');

    const handleQueryChange = (event: React.ChangeEvent<HTMLInputElement>) => {
        setQuery(event.target.value);
    };

    const handleSubmit = async (event: React.FormEvent<HTMLFormElement>) => {
        event.preventDefault();

        try {
            // Named res to avoid shadowing the response state variable.
            const res = await fetch('https://oai-ravinote-3.openai.azure.com/openai/deployments/key-ravinote-3/chat/completions?api-version=2023-07-01-preview', {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json',
                    'api-key': '', // specify api key here
                    'deployment-id': 'key-ravinote-3'
                },
                body: JSON.stringify({ "messages": [{ "role": "user", "content": query }] }),
            });
            const data = await res.json();
            setResponse(data.choices[0].message.content);
        } catch (error) {
            setResponse('Something went wrong (for example, 429: Too many requests). Please try again later.');
            console.error('Error:', error);
        }
    };

    return (
        <div>
            <h1>AskRavi</h1>
            <h2>What would you like to know? This search covers all his writings.</h2>
            <form onSubmit={handleSubmit}>
                <input id="query" type="text" width="50%" value={query} onChange={handleQueryChange} />
                <button type="submit">Send</button>
            </form>
            <div>{response}</div>
        </div>
    );
};

export default Chat;
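To sanity-check the endpoint outside the browser, here is a minimal Python sketch that posts the same request body with the same headers as the component above. The URL and deployment name are copied from the component; the api-key is left blank on purpose and must come from your own Azure OpenAI resource:

import requests

url = ("https://oai-ravinote-3.openai.azure.com/openai/deployments/"
       "key-ravinote-3/chat/completions?api-version=2023-07-01-preview")
headers = {
    "Content-Type": "application/json",
    "api-key": "",  # specify api key here
    "deployment-id": "key-ravinote-3",  # mirrors the component above
}
body = {"messages": [{"role": "user", "content": "What would you like to know?"}]}

response = requests.post(url, headers=headers, json=body)
data = response.json()
print(data["choices"][0]["message"]["content"])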

 

Friday, January 26, 2024

 

A conjecture about Artificial Intelligence:

ChatGPT is increasingly in the news for its ability to mimic a human conversationalist and for its versatility. The GPT-3 family introduces a significant improvement over sequential encoding by making learning parallelizable. This article wonders whether the state can be considered to be in summation form, so that as the text continues to be encoded, the overall state is continuously accumulated in a streaming manner. But first, a general introduction to the subject.

ChatGPT is based on a type of neural network called a transformer. These are models that can translate text, write poems and op-eds, and even generate code. Newer natural language processing (NLP) models like BERT, GPT-3, and T5 are all based on transformers. Transformers are incredibly impactful compared to their predecessors, which were also based on neural networks. Just to recap, neural networks are models for analyzing complicated data by discovering hidden layers that represent the latent semantics in the data, often referred to as the hidden layers between an input layer of data and an output layer of encodings. Neural networks can handle a variety of data, including images, video, audio, and text. There are different types of neural networks optimized for different types of data. If we are analyzing images, we would typically use a convolutional neural network, so called because it applies convolution filters over the data; such a pipeline often begins with embedding the original data into a common shared space before it undergoes synthesis, training, and testing. The embedding space is constructed from an initial collection of representative data. For example, it could refer to a collection of 3D shapes drawn from real-world images. The space organizes latent objects between images and shapes, which are complete representations of objects. A CNN focuses on the salient, invariant embedded objects rather than the noise.

CNNs worked great for detecting objects in images but did not do as well for language tasks such as summarizing or generating text. The recurrent neural network (RNN) was introduced to address this for language tasks such as translation, where an RNN takes a sentence in the source language and translates it to the destination language one word at a time, generating the translation sequentially. Sequential processing is important because word order matters for the meaning of a sentence, as in “The dog chased the cat” versus “The cat chased the dog”. RNNs had a few problems. They could not handle large sequences of text, like long paragraphs. They were also not fast enough to train on large data because they were sequential. The ability to train on large data is considered a competitive advantage when it comes to NLP tasks because the models become better tuned.

Transformers changed that; in fact, they were developed for the purpose of translation. Unlike RNNs, they can be parallelized, which means transformers can be trained on large data sets. GPT-3, which writes poetry, code, and conversations, was trained on almost 45 terabytes of text data, including much of the world wide web. It scales well with a huge data set.

Transformers work very well because of three components: 1. positional encoding, 2. attention, and 3. self-attention. Positional encoding is about enhancing the data with positional information rather than encoding it in the structure of the network. As the network is trained on lots of text data, the transformer learns to interpret those positional encodings. This made transformers easier to train than RNNs. Attention refers to a concept that originated in the paper aptly titled “Attention Is All You Need”. It is a structure that allows a text model to look at every single word in the original sentence when deciding how to translate a word in the output. A heat map of attention weights helps with understanding each word and its grammar. While attention is for understanding the alignment of words, self-attention is for understanding the underlying meaning of a word, to disambiguate it from other usages. This often involves an internal representation of the word, also referred to as its state. When attention is directed towards the input text, the model can distinguish between, say, “Server, can I have the check?” and “I crashed the server”, interpreting the reference as a human versus a machine. The context of the surrounding words establishes this state.
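As a rough illustration (a minimal sketch, not how production transformers are implemented), sinusoidal positional encoding and scaled dot-product self-attention can be written in a few lines of NumPy. The encoding formula follows “Attention Is All You Need”, and the query/key/value projections are simplified to the identity:

import numpy as np

def positional_encoding(seq_len, d_model):
    # Sinusoidal positional encoding from "Attention Is All You Need".
    pos = np.arange(seq_len)[:, None]    # (seq_len, 1)
    i = np.arange(d_model)[None, :]      # (1, d_model)
    angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
    enc = np.zeros((seq_len, d_model))
    enc[:, 0::2] = np.sin(angle[:, 0::2])
    enc[:, 1::2] = np.cos(angle[:, 1::2])
    return enc

def self_attention(X):
    # Scaled dot-product self-attention with identity projections,
    # i.e. Q = K = V = X, to keep the sketch minimal.
    d_k = X.shape[-1]
    scores = X @ X.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ X

# Toy usage: 4 "tokens" with 8-dimensional embeddings plus positions.
X = np.random.rand(4, 8) + positional_encoding(4, 8)
print(self_attention(X).shape)  # (4, 8)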

BERT, an NLP model, makes use of attention and can be used for a variety of purposes such as text summarization, question answering, classification, and finding similar sentences. BERT also helps with Google Search and Google Cloud AutoML Natural Language. Google has made BERT available for download via its TensorFlow library, while the company Hugging Face has made transformers available in Python.

The basis for the conjecture to use stream processing for state encoding, rather than parallel batches, is derived from the online processing of Big Data as discussed in: http://1drv.ms/1OM29ee
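To make the conjecture concrete, here is a toy Python sketch (purely illustrative, not a real encoder): if the encoded state were a summation of per-token contributions, it could be updated in constant time per token as the text streams in, rather than recomputed over parallel batches. The embed() function here is a hypothetical stand-in for a learned embedding model:

import numpy as np

def embed(token, d_model=8):
    # Hypothetical stand-in for a learned embedding: a deterministic
    # pseudo-random vector derived from the token (within one run).
    rng = np.random.default_rng(abs(hash(token)) % (2**32))
    return rng.standard_normal(d_model)

# Accumulate the "state" one token at a time, in streaming fashion.
state = np.zeros(8)
for token in "the overall state is accumulated in a streaming manner".split():
    state += embed(token)  # O(1) update per token
print(state)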

Wednesday, January 24, 2024

 

This is a summary of the book Inclusion on Purpose, written by Ruchika Tulshyan and published by MIT Press in 2022. She had earlier published the viral article “Stop Telling Women They Have Imposter Syndrome.” Racism and prejudice against women of color persist worldwide, affecting both the US and global economies. Racism is present daily and can be attributed to cultural white supremacy, which associates wealth and achievement with whiteness. Women of color face additional barriers because gender- and race-based biases combine. Diversity, equity, and inclusion (DEI) efforts often ignore intersectionality, even though women of color outnumber women of all other racial identities in the workforce.

Racism costs everyone, including about $1 trillion in lost US GDP each year. Women of color still fill lower-paying jobs out of proportion to their population numbers. Stereotyping, bias, and discrimination hold them back, and inclusion in STEM professions remains stagnant due to biases against women of color in math and computing.

To make work safe for women of color, leaders and influencers must admit their racism and support them. They should embrace willful inclusion, encourage candid conversations about race, remove fear of reprisal, and provide direct feedback about job performance.

White people must acknowledge their privilege and develop purposeful inclusion skills by taking others' perspectives and developing empathy. Acknowledge that two people in the same workplace experience it differently, and show compassion to build trust and to recruit and retain people of all backgrounds. Admit your racism and address it by challenging every instance of racism you encounter and by questioning your own advancement over that of marginalized peers. Overcome complacency and build empathy by seeking different ideas and perspectives, advocating for unbiased hiring systems, and including representatives of every group in workforce decisions.

Managers and leaders must embrace willful inclusion and speak up when they witness acts of bias, racism, discrimination, or exclusion. Recognize that women of color have likely dealt with hostility and discrimination for years or decades, and actively listen to them. Adopt an inclusion mindset and stay open and curious about conversations about race, religion, nationality, and other factors. Gather anonymous workforce data to gather information on experiences, promotion rates, engagement, and performance of women, women of color, and other minority groups.

Women of color are not the problem; workplaces that expect them to hide parts of themselves are. Establish a specific code of conduct concerning diversity, equity, and inclusion and communicate consequences for breaches. Remove fear of reprisal and ensure psychological safety for women of color. Provide anonymous means for reporting bias and take action when you see bias.

Create conditions that allow people of color to bring their whole selves to work. Encourage marginalized employees to have a voice through well-funded and respected employee resource groups. Consider how your privilege and influence can help someone with a weaker voice.

Managers should recognize and give credit to women of color for their contributions and ideas. In meetings, ensure no one dominates or interrupts and everyone gets an opportunity to share their thoughts and ideas. When hiring, avoid cultural fit and seek cultural addition. Take responsibility as a leader or hiring manager and don't make HR solely responsible for diversity.

Hiring for cultural fit is a prevalent and exclusionary hiring practice. It is crucial to examine sourcing channels, including the qualifications listed in job ads, and to design structured interviews led by diverse interviewers. Pay women fairly, as they earn less on average than men. Leaders should lead pay conversations without requiring negotiation, share their compensation openly, and ask others about theirs. Coach women to ask for fair wages, but don't force them. Provide direct, specific feedback about job performance, starting with positive examples. Focus on objective goals in annual performance reviews and offer supportive guidance if you encounter defensiveness.

Previous book summary: BookSummary43.docx
Summarizing Software:
SummarizerCodeSnippets.docx 

 #codingexercise: CodingExercise-01-24-2024.docx

Tuesday, January 23, 2024

 


Friends of appropriate ages:

There are n persons on a social media website. You are given an integer array ages where ages[i] is the age of the ith person.

A Person x will not send a friend request to a person y (x != y) if any of the following conditions is true:

  • ages[y] <= 0.5 * ages[x] + 7
  • ages[y] > ages[x]
  • ages[y] > 100 && ages[x] < 100

Otherwise, x will send a friend request to y.

Note that if x sends a request to y, y will not necessarily send a request to x. Also, a person will not send a friend request to themselves.

Return the total number of friend requests made.

 

Example 1:

Input: ages = [16,16]

Output: 2

Explanation: 2 people friend request each other.

Example 2:

Input: ages = [16,17,18]

Output: 2

Explanation: Friend requests are made 17 -> 16, 18 -> 17.

Example 3:

Input: ages = [20,30,100,110,120]

Output: 3

Explanation: Friend requests are made 110 -> 100, 120 -> 110, 120 -> 100.

 

Constraints:

  • n == ages.length
  • 1 <= n <= 2 * 10^4
  • 1 <= ages[i] <= 120

class Solution {
    public int numFriendRequests(int[] ages) {
        // Brute force: check every ordered pair (O(n^2) time, O(1) space).
        int sum = 0;
        for (int i = 0; i < ages.length; i++) {
            for (int j = 0; j < ages.length; j++) {
                if (i == j) continue;
                if (ages[j] <= 0.5 * ages[i] + 7) continue;
                if (ages[j] > ages[i]) continue;
                if (ages[j] > 100 && ages[i] < 100) continue;
                sum += 1;
            }
        }
        return sum;
    }
}
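The brute force above is quadratic in the number of people. Since ages are bounded by 120, a faster alternative (a sketch of a different approach, not part of the original solution) counts people per age and compares ages instead of people; the third condition is omitted because it cannot hold once ages[y] <= ages[x]:

from collections import Counter

def num_friend_requests(ages):
    # Count people at each age; ages are bounded by 120, so the
    # double loop below is O(120^2) instead of O(n^2).
    count = Counter(ages)
    total = 0
    for a in count:
        for b in count:
            if b <= 0.5 * a + 7:   # condition 1: y is too young
                continue
            if b > a:              # condition 2: y is older than x
                continue
            # Condition 3 (b > 100 and a < 100) cannot hold once b <= a.
            if a != b:
                total += count[a] * count[b]
            else:
                total += count[a] * (count[a] - 1)  # no self-requests
    return total

print(num_friend_requests([16, 16]))                 # 2
print(num_friend_requests([16, 17, 18]))             # 2
print(num_friend_requests([20, 30, 100, 110, 120]))  # 3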

 

Monday, January 22, 2024

People and Data - a summary.

 

This is a summary of the book titled “People and Data,” written by Thomas C. Redman and published by Kogan Page in 2023. Data is essential for people to act meaningfully in business, government, and private life. It is the fuel that runs the world, and that fuel must be of high quality. However, individuals and businesses often fail to prioritize data when building technological infrastructure, organizing operations, or choosing new tech tools. Redman urges companies to modernize their approach to and use of data, focusing on improving data quality and aligning data and business priorities. Companies must optimize their application of quality data, including ordinary employees in their approach and involving them in data generation.

Data enables meaningful action in private life and commerce, such as anticipating and shopping for needs, making rational home-purchasing decisions, and analyzing sales trends and supply chains. However, the data space has problems, such as distrust of data and the monetary cost of bad data. Businesses must make their data work by utilizing data science technologies or engendering a data-driven culture. Leaders must organize their businesses to emphasize data use and build technological infrastructure to support data use at the firms' required scale.

Organizations must reconfigure their "organization for data" to address data quality. This reconfiguration should consider five issues: people involved, data flow, information technology management, teams working on data, and people leading data-driven projects. Missing people is the single most important force holding data programs back. Data becoming a business's central driver requires the involvement of everyone in the company, including non-specialist roles. Better data use will increase revenue, lower costs, reduce errors, and foster a closer relationship between employees and customers.

Organizing for data involves the smooth and coordinated movement of data between departments and people, with technology management and data management treated as separate concerns. Transformation comes from both the bottom up and the top down, with young employees offering innovations and senior leadership managing coordination. Everyone should join the "data generation," as the prevalence of politically motivated misinformation has changed people's views on low-quality data.

All organizations must prioritize improving data quality, as only about 3% of companies' data meets basic quality standards and only 16% of managers trust the data they commonly use. The most transformative uses of data involve data-driven decision-making, data-driven cultures, and treating data as assets.

Small data is more important for most companies than big data, as it improves a company's operations, products, and customer acquisition. Companies often overlook the value of small data, as it involves fewer people and uses hundreds of data points. Small data projects offer more problem-solving opportunities and can reduce time wasted in interactions between colleagues and streamline in-house work processes.

Data is a team sport, but silos can inhibit effective teamwork. To address this, companies must build "fat organizational pipes" – channels of two-way communication between departments, individuals, and up and down the company hierarchy. These pipes include the "customer-supplier model," "data supply chain management," "data science bridge," and "common language."

Organizations must align their data and business priorities, and a data team should include executives, technology experts, and those managing data supply chains. Data teams should include an executive, entrepreneur, developer, and data security officer. Effective communication and collaboration between data creators and customers are crucial for increasing data quality and reducing errors.

Data projects require strong leadership and data program coordinators to empower employees with independence and responsibilities. Companies should align data projects with business priorities, form data teams, and interact with employees daily. They should understand project problems, embrace aspirations, address anxieties, and train employees to solve their own problems.

Previous book summaries: BookSummary42.docx
Summarizing Software:
SummarizerCodeSnippets.docx