Tuesday, November 7, 2017

We were discussing the difference between data mining and machine learning, given that there is some overlap. In this context, I want to attempt to explain modeling in general terms. We will be following slides from Costa, Kleinstein and Hershberg on model fitting and error estimation.
A model articulates quantitatively how a system behaves. It might involve an equation or a system of equations, using variables to denote what is observed. The purpose of the model is to give a prediction based on those variables. To make the prediction reasonably accurate, the model is often trained on one set of data before being used to predict on test data. This is referred to as model fitting, and loosely as model tuning. Models use numerical methods to examine complex situations and come up with predictions. The most common techniques for coming up with a model include statistical techniques, numerical methods, matrix factorizations and optimizations. Starting with Newton's laws, we have used this kind of technique to understand and make use of our world.
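
To make the train-then-predict idea concrete, here is a minimal sketch in C that fits a straight line to training data by ordinary least squares and then measures the error on held-out test data. The LinearModel type and the names fit_line, predict and mean_squared_error are my own choices for illustration and do not come from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model for illustration: y = intercept + slope * x. */
typedef struct {
    double intercept;
    double slope;
} LinearModel;

/* Fit the model to n training points by ordinary least squares. */
LinearModel fit_line(const double *x, const double *y, size_t n)
{
    assert(n >= 2);

    double mean_x = 0.0, mean_y = 0.0;
    for (size_t i = 0; i < n; i++) {
        mean_x += x[i];
        mean_y += y[i];
    }
    mean_x /= (double)n;
    mean_y /= (double)n;

    double sxy = 0.0, sxx = 0.0;
    for (size_t i = 0; i < n; i++) {
        sxy += (x[i] - mean_x) * (y[i] - mean_y);
        sxx += (x[i] - mean_x) * (x[i] - mean_x);
    }
    assert(sxx > 0.0); /* the x values must not all be equal */

    LinearModel m;
    m.slope = sxy / sxx;
    m.intercept = mean_y - m.slope * mean_x;
    return m;
}

/* Prediction of the fitted model for a single input. */
double predict(LinearModel m, double x)
{
    return m.intercept + m.slope * x;
}

/* Mean squared error of the model on n held-out points. */
double mean_squared_error(LinearModel m, const double *x, const double *y, size_t n)
{
    assert(n > 0);

    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        double residual = y[i] - predict(m, x[i]);
        sum += residual * residual;
    }
    return sum / (double)n;
}
```

A caller would pass the training points to fit_line and then hand the returned model, together with the test points, to mean_squared_error. A test error much larger than the training error suggests the model does not generalize well.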


#codingexercise
Describe the k-means clustering technique
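#include <stdbool.h>
#include <stdlib.h>

/* Helpers assumed to be defined elsewhere; signatures are inferred from the calls below. */
int* initialize_cluster_labels(double **vectors, int size, int k);
int* initialize_centroids(double **vectors, int size, int k);
void assign_clusters(int dimension, double **vectors, int *centroids, int *cluster_labels, int size, int k);
bool fix_centroids(int dimension, double **vectors, int *centroids, int *cluster_labels, int size, int k);
void print_clusters(int *cluster_labels, int size);

void kmeans(int dimension, double **vectors, int size, int k, int max_iterations)
{
     /* Reject empty or invalid input before touching the data. */
     if (vectors == NULL || size <= 0 || k <= 0 || dimension <= 0 || *vectors == NULL) return;

     int* cluster_labels = initialize_cluster_labels(vectors, size, k);
     int* centroids = initialize_centroids(vectors, size, k);
     if (cluster_labels == NULL || centroids == NULL)
     {
         free(centroids);
         free(cluster_labels);
         return;
     }

     /* Alternate the assignment and update steps until the centroids
        stop changing or the iteration budget runs out. */
     bool centroids_updated = true;
     int count = 0;
     while (centroids_updated && count < max_iterations)
     {
         count++;
         assign_clusters(dimension, vectors, centroids, cluster_labels, size, k);
         centroids_updated = fix_centroids(dimension, vectors, centroids, cluster_labels, size, k);
     }

     print_clusters(cluster_labels, size);
     free(centroids);
     free(cluster_labels);
}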
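
The routine above leaves its helpers undefined, so here is a self-contained sketch of what the two steps might look like. It is one possible reading, not the definitive implementation. The names kmeans_labels, assign_nearest, update_means and squared_distance are hypothetical, and so are the choices to store centroids as a flat array of doubles, to seed them with the first k vectors, and to return the labels instead of printing them.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Squared Euclidean distance between two points of the given dimension. */
static double squared_distance(int dimension, const double *a, const double *b)
{
    double sum = 0.0;
    for (int d = 0; d < dimension; d++) {
        double diff = a[d] - b[d];
        sum += diff * diff;
    }
    return sum;
}

/* Assignment step: label each vector with the index of its nearest centroid. */
static void assign_nearest(int dimension, double **vectors, const double *centroids,
                           int *labels, int size, int k)
{
    for (int i = 0; i < size; i++) {
        int best = 0;
        double best_distance = DBL_MAX;
        for (int c = 0; c < k; c++) {
            double dist = squared_distance(dimension, vectors[i],
                                           centroids + (size_t)c * dimension);
            if (dist < best_distance) {
                best_distance = dist;
                best = c;
            }
        }
        labels[i] = best;
    }
}

/* Update step: move each centroid to the mean of its members.
   Returns true if any centroid changed. An empty cluster keeps its centroid.
   An allocation failure also returns false, which ends the caller's loop. */
static bool update_means(int dimension, double **vectors, double *centroids,
                         const int *labels, int size, int k)
{
    double *sums = calloc((size_t)k * dimension, sizeof *sums);
    int *counts = calloc((size_t)k, sizeof *counts);
    bool changed = false;
    if (sums == NULL || counts == NULL) {
        free(sums);
        free(counts);
        return false;
    }
    for (int i = 0; i < size; i++) {
        counts[labels[i]]++;
        for (int d = 0; d < dimension; d++)
            sums[(size_t)labels[i] * dimension + d] += vectors[i][d];
    }
    for (int c = 0; c < k; c++) {
        if (counts[c] == 0) continue;
        for (int d = 0; d < dimension; d++) {
            size_t idx = (size_t)c * dimension + d;
            double mean = sums[idx] / counts[c];
            if (mean != centroids[idx]) {
                centroids[idx] = mean;
                changed = true;
            }
        }
    }
    free(sums);
    free(counts);
    return changed;
}

/* Returns a malloc'd array of `size` cluster labels that the caller must free,
   or NULL on invalid input or allocation failure.
   Assumption: centroids are seeded with the first k vectors. */
int *kmeans_labels(int dimension, double **vectors, int size, int k, int max_iterations)
{
    if (vectors == NULL || dimension <= 0 || size <= 0 || k <= 0 || k > size ||
        max_iterations <= 0)
        return NULL;

    double *centroids = malloc((size_t)k * dimension * sizeof *centroids);
    int *labels = malloc((size_t)size * sizeof *labels);
    if (centroids == NULL || labels == NULL) {
        free(centroids);
        free(labels);
        return NULL;
    }
    for (int c = 0; c < k; c++)
        memcpy(centroids + (size_t)c * dimension, vectors[c],
               (size_t)dimension * sizeof(double));

    bool changed = true;
    for (int iter = 0; changed && iter < max_iterations; iter++) {
        assign_nearest(dimension, vectors, centroids, labels, size, k);
        changed = update_means(dimension, vectors, centroids, labels, size, k);
    }
    free(centroids);
    return labels;
}
```

Seeding with the first k vectors keeps the sketch deterministic, but the result of k-means depends on the starting centroids, so a real program would more likely pick them at random and possibly run the whole thing several times.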
