Tuesday, January 5, 2021

Performing Association Data mining on IT service requests continued ...

We were discussing how association data mining lets IT users see helpful messages such as “users who opened a ticket for this problem type also opened a ticket for this other problem type”. This article describes how the technique is implemented.

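Evaluating the three metrics (support, confidence and lift) for each association results in an Associations.Content table in which every item pair carries its support, confidence and lift. The associations can then be filtered to those with a lift > 1.0, that is, pairs that occur together more often than they would if the two items were independent.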
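As a rough illustration of how the three metrics can be computed for itemsets of size 1, here is a minimal Python sketch. It is not the code behind the sample implementation linked in this post; the function name pair_metrics and the ticket data are made up for the example, and each transaction is assumed to be the set of problem types one user opened tickets for.

```python
from itertools import permutations

def pair_metrics(transactions):
    """Return {(x, y): (support, confidence, lift)} for every ordered pair x -> y."""
    n = len(transactions)
    item_count = {}
    pair_count = {}
    for items in transactions:
        unique = set(items)
        for item in unique:
            item_count[item] = item_count.get(item, 0) + 1
        # Count both directions, since confidence depends on the antecedent.
        for x, y in permutations(sorted(unique), 2):
            pair_count[(x, y)] = pair_count.get((x, y), 0) + 1
    metrics = {}
    for (x, y), both in pair_count.items():
        support = both / n                       # P(x and y)
        confidence = both / item_count[x]        # P(y | x)
        lift = confidence / (item_count[y] / n)  # P(y | x) / P(y)
        metrics[(x, y)] = (support, confidence, lift)
    return metrics

# Hypothetical data: one set of problem types per user.
tickets = [
    {"vpn", "password"},
    {"vpn", "password"},
    {"vpn", "printer"},
    {"printer"},
]
metrics = pair_metrics(tickets)
# Keep only the associations whose lift exceeds 1.0.
strong = {pair: m for pair, m in metrics.items() if m[2] > 1.0}
```

With this toy data only the vpn/password pair survives the lift filter, in both directions; the vpn/printer pair has a lift below 1.0 and is dropped.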

The apriori algorithm works level by level, from single items up to itemsets of the size required for the antecedent and the consequent. In this case, both itemsets have size 1. The idea behind the algorithm is that an itemset can be frequent only if all of its subsets are frequent, so once an itemset falls below the minimum support, no larger itemset that contains it needs to be considered. This way the Cartesian product of antecedent and consequent itemsets can be trimmed by eliminating the candidates that contain an infrequent itemset. With several layers, the consequent itemset grows by one item per layer, one or more candidate associations are eliminated at each layer, and the consequent reaches its final size in the last layer. In our case, we require only one layer, and the associations can be sorted in descending order of lift.
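For readers who want to see the layered search spelled out, the following is a minimal sketch of the textbook Apriori candidate generation, which prunes on support and grows itemsets one item per level. It is an illustration under assumptions and not the code behind the sample implementation linked in this post; the function name apriori, the min_support parameter and the ticket data are chosen for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset: support} for all itemsets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: drop any candidate that has an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical data: one set of problem types per user.
tickets = [
    {"vpn", "password", "email"},
    {"vpn", "password"},
    {"vpn", "printer"},
    {"printer"},
]
result = apriori(tickets, min_support=0.5)
```

Here "email" is dropped at the first level, so no pair containing it is ever counted. For the size-1 antecedent and consequent used in this post, the loop effectively stops after the pair level.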

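-- Assumption: each row holds one pair, with the A and B names stored in columns name_x and name_y.
SELECT name_x, name_y, lift_y_x
FROM Associations.Content
WHERE lift_y_x > 1.0
ORDER BY lift_y_x DESC
LIMIT 10;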


Sample Implementation: https://jsfiddle.net/g2snw4da/  

