Wednesday, January 20, 2021

When-to-use-what data mining algorithms:

 

Logistic Regression 

This is a form of regression that supports binary outcomes. It uses statistical measures, is highly flexible, takes any kind of input and supports different analytical tasks. This regression folds the effects of extreme values and evaluates several factors that affects a pair of outcomes. 

IT requests based on demographics can be used to predict the likelihood of a category of request from a customer. It can also be used for finding repetitions in requests  

Neural Network 

This is widely used method for machine learning involving neurons that have one or more gates for input and output. Each neuron assigns a weight usually based on probability for each feature and the weights are normalized across resulting in a weighted matrix that articulates the underlying model in the training dataset. Then it can be used with a test data set to predict the outcome probability. Neurons are organized in layers and each layer is independent of the other and can be stacked so they take the output of one as the input to the other. 

Widely used for softmax classifier in NLP associated with Service Requests. Since descriptions, case investigation notes, case resolution notes are loose text data captured by technicians, Natural Language Processing has become a significant part of the IT data mining portfolio 

Naïve Bayes algorithm 

This is probably the most straightforward statistical probability-based data mining algorithm compared to others. 

The probability is a mere fraction of interesting cases to total cases. Bayes probability is conditional probability which adjusts the probability based on the premise. 

This is widely used for cases where conditions apply especially binary conditions such as with or without. If the input variables are independent, their states can be calculated as probabilities, and there is at least a predictable output, this algorithm can be applied. The simplicity of computing states by counting for class using each input variable and then displaying those states against those variables for a give value, makes this algorithm easy to visualize, debug and use as a predictor.   

No comments:

Post a Comment