Cluster computing

Applications of Data Mining to Reward points collection service

A neural network is built from layers, and the layers can be combined with regressors, so the technique fits a variety of use cases. There are four common types of layers. The fully connected layer connects every neuron in one layer to every neuron in the next. This is great for rigorous encoding, but it becomes expensive for large inputs and does not scale well. The convolutional layer is mostly used as a filter that brings out salient features from the input set. The filter, sometimes called a kernel, is represented by a set of n-dimensional weights and describes the probabilities that a given pattern of input values represents a feature. A deconvolutional layer comes from a transposed convolution, where the data is enhanced to increase its resolution or to transform it. A recurrent layer includes a looping capability such that its input consists of both the data to analyze and the output from a previous calculation performed by that layer. This is helpful for maintaining state across iterations and for transforming one sequence into another.
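To make the four layer types concrete, here is a minimal sketch in plain Python. It is only an illustration under simplifying assumptions: the function names are hypothetical, the data is one-dimensional, the recurrent layer is reduced to a single scalar unit with a tanh activation, and nothing is trained. A real framework such as Keras would provide these layers with learned weights.

```python
import math

def dense(x, weights, bias):
    """Fully connected layer: every input value feeds every output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def conv1d(x, kernel):
    """Convolutional layer: slide a small filter (kernel) over the input."""
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k))
            for i in range(len(x) - k + 1)]

def conv_transpose1d(x, kernel, stride=2):
    """Deconvolutional (transposed convolution) layer: each input value
    stamps a scaled copy of the kernel into a longer output."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, xi in enumerate(x):
        for j, kj in enumerate(kernel):
            out[i * stride + j] += xi * kj
    return out

def recurrent(sequence, w_in, w_state):
    """Recurrent layer: each step sees the new input and its own previous output."""
    state, outputs = 0.0, []
    for x in sequence:
        state = math.tanh(w_in * x + w_state * state)
        outputs.append(state)
    return outputs

print(dense([1.0, 2.0], [[1.0, 0.0], [0.5, 0.5]], [0.0, 1.0]))
print(conv1d([1, 2, 3, 4], [1, -1]))
print(conv_transpose1d([1.0, 2.0], [1.0, 1.0]))
print(recurrent([1.0, 1.0], 1.0, 1.0))
```

Note how the transposed convolution turns two input values into four output values, which is the resolution increase described above, and how the second output of the recurrent layer depends on the first.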

The choice of machine learning technique depends on both the applicability of the algorithm and the data. For example, we use a Convolutional Neural Network (CNN) when we want to perform only classification. We use a Recurrent Neural Network (RNN) when we want to retain state between encodings, such as with sequences. We use a classifier and a regressor together when we want to detect objects and their bounding boxes. The choices also vary with the data. A CNN works great with tensors that are distinct and independent from one another. The output of a K-Nearest Neighbors classifier over tensors consists of the label with the most confidence, which is a statistical parameter based on the support for the label, a class index, and a set of confidence scores, one for each label. Scalar data works very well with matrices and matrix operations. An RNN works well with a sequence of inputs.
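The shape of the K-Nearest Neighbors output described above can be sketched in a few lines of Python. This is a hedged illustration, not the API of any particular library: the function name `knn_classify`, the result keys, and the sample data are all hypothetical, and the confidence for a label is assumed to be simply the fraction of the k nearest neighbors that support it.

```python
import math
from collections import Counter

def knn_classify(examples, labels, query, k=3):
    """Return the winning label, its class index, and a confidence per label.

    examples: list of (feature_tuple, label) pairs
    labels:   ordered list of all class labels (defines the class index)
    """
    # Sort the training examples by Euclidean distance to the query.
    nearest = sorted((math.dist(features, query), label)
                     for features, label in examples)[:k]
    votes = Counter(label for _, label in nearest)
    # Confidence is the share of the k neighbors that voted for each label.
    confidences = {label: votes.get(label, 0) / k for label in labels}
    best = max(labels, key=lambda label: confidences[label])
    return {"label": best,
            "class_index": labels.index(best),
            "confidences": confidences}

# Made-up two-feature examples for two classes.
examples = [((1.0, 1.0), "low"), ((1.2, 0.8), "low"),
            ((5.0, 5.0), "high"), ((5.5, 4.5), "high"), ((4.8, 5.2), "high")]
labels = ["low", "high"]

print(knn_classify(examples, labels, (5.0, 4.9)))
print(knn_classify(examples, labels, (1.1, 1.0)))
```

For the second query only two of the three nearest neighbors are labeled "low", so the confidences are split between the labels rather than being all or nothing.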

One of the highlights of machine learning deployments, as opposed to deployments of data mining models, is that the model can be built and tuned in one place and run anywhere else. The client-friendly version of TensorFlow allows the model to run on clients with limited resources, such as mobile devices. The environment for model building usually supports GPUs. This works well for a production pipeline where data can be sent to the model regardless of where the training data was kept and used to train the model. Since the training data flows into the training environment, its pipeline is internal. The test data can be sent over the wire as web requests to the model wherever it is hosted. The model can run in containers on the server side or even in the browser on the client side. Another difference between the ML pipeline and the data mining pipeline is the heterogeneous mix of technologies and products on the ML side, as opposed to the homogeneous relational database-based stack on the data mining side. For example, logs, streams and events may be streamed into the production pipeline via Apache Kafka and processed using Apache Flink, with the models built with scikit-learn, Keras or Spark ML and the trained models run in containers that take and respond to web requests.
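The "build in one place, run anywhere" idea can be sketched with nothing but the Python standard library. This is an illustration under stated assumptions: a one-variable least-squares fit stands in for a real model, JSON stands in for a real model format, and `handle_request` is a hypothetical stand-in for the body of a web request handler running in a container. None of these names come from TensorFlow or any other product mentioned above.

```python
import json

def train(xs, ys):
    """Training environment: fit y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return {"slope": slope, "intercept": mean_y - slope * mean_x}

def export_model(model):
    """Serialize the trained parameters so they can be shipped elsewhere."""
    return json.dumps(model)

def handle_request(model_json, request_body):
    """Serving environment: load the exported model and answer one request.

    It never sees the training data, only the exported parameters.
    """
    model = json.loads(model_json)
    features = json.loads(request_body)
    prediction = model["slope"] * features["x"] + model["intercept"]
    return json.dumps({"prediction": prediction})

# Train where the data lives, then ship only the serialized model.
exported = export_model(train([1, 2, 3, 4], [3, 5, 7, 9]))

# Elsewhere, a request arrives over the wire as JSON.
print(handle_request(exported, json.dumps({"x": 10})))
```

The serving side depends only on the exported string, which is why the same model could be hosted in a container, on a mobile device, or in a browser.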

The following chart compares the data mining algorithms, including neural networks: https://1drv.ms/w/s!Ashlm-Nw-wnWxBFlhCtfFkoVDRDa?e=aVT37e

Thank you.


Sunday, April 4, 2021
