Wednesday, January 13, 2021

Predicting relief time on service tickets – nuances between decision tree and time-series algorithms – a data science essay (continued...)

In this way, a decision tree uses the attributes of the service requests to predict the relief time. A time-series algorithm, on the other hand, needs no attributes other than the historical collection of relief times to predict the next one. It looks only at a scalar value, regardless of the factors that played into the relief time of any individual request. The historical data is used to estimate the incoming event as if the relief times were points on a scatter plot along the timeline. Unlike other data mining algorithms that involve additional attributes of the event, this approach applies a single auto-regressive method to the continuous data to make a short-term prediction. The regression is retrained automatically as the data accrue.
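To make the auto-regressive idea concrete, here is a minimal sketch in Python. It assumes a first-order model, x[t] = c + phi * x[t-1], fitted by ordinary least squares on nothing but the past relief times. The function names and the sample numbers are illustrative assumptions, not part of any particular product or library, and a production time-series algorithm would typically use more lags and handle seasonality.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] (assumed first-order model)."""
    if len(series) < 2:
        raise ValueError("need at least two historical relief times")
    xs = series[:-1]   # previous values
    ys = series[1:]    # values that followed them
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # No variation in the history: fall back to the mean, with no slope.
        return mean_y, 0.0
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    phi = cov / var_x
    c = mean_y - phi * mean_x
    return c, phi


def predict_next(series):
    """Predict the next relief time from the history alone."""
    c, phi = fit_ar1(series)
    return c + phi * series[-1]


# Hypothetical relief times in hours, oldest first.
history = [10.0, 12.0, 14.0, 16.0, 18.0]
print(predict_next(history))

# "Trained as the data accrue": refit whenever a new relief time is recorded.
history.append(19.0)
print(predict_next(history))
```

Note that no ticket attribute appears anywhere in the sketch; the only input is the sequence of scalar relief times.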

Comparing the decision tree and the time-series algorithm with the Naïve Bayes classifier, it is easy to see that while the first two work with new rows, the Bayes classifier works with the attributes of the rows against the last column, which serves as the predictable column. Although linear regressions are useful for predicting a variable, Naïve Bayes builds on conditional states across attributes and is easy to visualize, which allows experts to show the reasoning process and users to judge the quality of a prediction. All these algorithms need training data in our use case, but Naïve Bayes uses it for both exploration and prediction based on earlier requests, for example to determine whether the self-help was useful or not by evaluating both probabilities conditionally.


The conditional probability can be used both for exploration and for prediction. Each input column in the dataset has a state calculated by this algorithm, which is then used to assign a state to the predictable column. For example, the availability of a Knowledge Base article might show a distribution of input values significantly different from the others, which indicates that it is a potential predictor.


The viewer also provides values for the distribution, so that KB articles suggesting that service requests be opened with specific attributes will be easier to follow, act upon, and bring to resolution. The algorithm can then compute a probability both with and without that criterion.
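The conditional reasoning above can be sketched in a few lines of Python. This is a minimal categorical Naïve Bayes with add-one (Laplace) smoothing, assuming every attribute value seen at prediction time also appeared in training. The column names `kb_article` and `self_help_useful` and the sample rows are hypothetical and chosen only to mirror the example in the text.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, target):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(row[target] for row in rows)
    feature_counts = Counter()    # keyed by (attribute, value, label)
    values = defaultdict(set)     # distinct values seen per attribute
    for row in rows:
        label = row[target]
        for attr, value in row.items():
            if attr == target:
                continue
            feature_counts[(attr, value, label)] += 1
            values[attr].add(value)
    return class_counts, feature_counts, values


def predict_proba(model, observation):
    """Return P(label | observation); an empty observation yields the prior."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(label)
        for attr, value in observation.items():
            # Add-one smoothed estimate of P(value | label).
            seen = feature_counts[(attr, value, label)]
            score *= (seen + 1) / (count + len(values[attr]))
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


# Hypothetical earlier requests: was a KB article available, and did self-help work?
rows = (
    [{"kb_article": "yes", "self_help_useful": "yes"}] * 3
    + [{"kb_article": "yes", "self_help_useful": "no"}] * 1
    + [{"kb_article": "no", "self_help_useful": "yes"}] * 1
    + [{"kb_article": "no", "self_help_useful": "no"}] * 3
)
model = train_naive_bayes(rows, target="self_help_useful")

print(predict_proba(model, {}))                     # without the criterion (prior)
print(predict_proba(model, {"kb_article": "yes"}))  # with the criterion
```

Comparing the two printed distributions is the exploratory use: if conditioning on an attribute shifts the probabilities noticeably away from the prior, that attribute is a candidate predictor.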
