Tuesday, August 1, 2023

 

Multi-dimensional optimization:

Introduction: This is a continuation of the previous article on the dynamic walk for optimization. Here we extend the same stochastic optimization to more than two random variables, represented by higher-dimensional equations involving a greater number of random variables. Instead of treating random variables pair-wise, we now study optimization over a vector space, specifically a Hilbert space.

Description:

The function representing a pair of random variables can be written as a quadratic form:

f(x) = (1/2) x^T A x - b^T x + c

where A is a matrix, x and b are vectors, and c is a scalar constant.

The function is minimized by the solution to Ax = b: setting the gradient Ax - b to zero shows that minimizing the quadratic form is equivalent to solving this linear system, so the optimization for a pair of random variables reduces to a linear equation.
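As a quick sanity check, the short sketch below (with made-up values for A, b, and c, chosen only for illustration) evaluates the quadratic form and confirms that its gradient vanishes at the solution of Ax = b:

import numpy as np

# A made-up symmetric positive-definite system (illustrative values only)
A = np.array([[3.0, 2.0],
              [2.0, 6.0]])
b = np.array([2.0, -8.0])
c = 0.0

def f(x):
    # The quadratic form f(x) = (1/2) x^T A x - b^T x + c
    return 0.5 * x @ A @ x - b @ x + c

x_star = np.linalg.solve(A, b)          # solution of A x = b
print("x* =", x_star)
print("gradient at x* =", A @ x_star - b)   # should be (numerically) zero

# f is larger at nearby points than at x*, consistent with x* being the minimum
for delta in [np.array([0.1, 0.0]), np.array([0.0, -0.1])]:
    print(f(x_star + delta) >= f(x_star))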

When we have several random variables, each random variable contributes a dimension, and the simple two-dimensional contour map becomes an n-dimensional Euclidean space.

Just as the shortest path from a point to a line is along the perpendicular, and the shortest path from a point to a plane is along the normal, the shortest distance between a point and a subspace is achieved by a vector orthogonal to that subspace. This is the basis for an optimization principle in n-dimensional space called the projection theorem. The theorem does not hold in an arbitrary normed space, but it does hold in a Hilbert space. A normed linear space is a vector space with a measure of distance or length defined on it. A Hilbert space is a special kind of normed space that also has an inner product, analogous to the dot product of two vectors in analytic geometry; two vectors are orthogonal if their inner product is zero. Using orthogonality to characterize the minimum, the optimum can be found by solving a minimization problem. Least-squares minimization is the technique that expresses this minimization in a Hilbert space.
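The sketch below illustrates the projection theorem with a made-up matrix and point (illustrative values only): it projects the point onto the column space of the matrix and checks that the residual is orthogonal to that subspace:

import numpy as np

# Columns of M span a two-dimensional subspace of R^3 (illustrative values)
M = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
y = np.array([3.0, 1.0, 2.0])   # the point to project

# Projection of y onto the column space of M: p = M (M^T M)^-1 M^T y
coeffs = np.linalg.solve(M.T @ M, M.T @ y)
p = M @ coeffs

residual = y - p
# The residual is orthogonal to every column of M, so the inner products are ~0
print("projection:", p)
print("M^T (y - p):", M.T @ residual)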

Least squares regression starts from an estimate of the data at each point, which gives a set of equations. If the data had no noise, the resulting set of equations could be written exactly as a matrix equation, and solving a matrix equation is well understood. Since real data does not fit a matrix equation perfectly, least squares regression finds the estimate that comes as close to satisfying the matrix equation as possible. The estimation function is fitted across all the data points rather than being required to pass through each one. The least squares solution is written as Beta = (A^T A)^-1 A^T y.

A sample least squares program in Python would look like this:

import numpy as np
import matplotlib.pyplot as plt

# Generate noisy data around the line y = 1 + x
x = np.linspace(0, 1, 101)
y = 1 + x + x * np.random.random(len(x))

# Design matrix with one column for the slope and one for the intercept
A = np.vstack([x, np.ones(len(x))]).T
y = y[:, np.newaxis]

# Least squares solution Beta = (A^T A)^-1 A^T y (called alpha here)
alpha = np.dot(np.dot(np.linalg.inv(np.dot(A.T, A)), A.T), y)
print(alpha)

# Plot the data points and the fitted line
plt.figure(figsize=(10, 8))
plt.plot(x, y, 'b.')
plt.plot(x, alpha[0] * x + alpha[1], 'r')
plt.show()
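Explicitly inverting A^T A is fine for a small example like this, but numpy's built-in least-squares solver is generally preferred for numerical stability. Reusing the A and y from the program above, a minimal variant of the same fit might look like this:

# Same fit using numpy's least-squares solver instead of an explicit inverse
alpha_ls, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print(alpha_ls)   # should match the alpha computed above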

Layering in neural networks applies a similar optimization technique at a higher level of abstraction, but it does not transform the problem from one space to another.
