The following table summarizes how data mining algorithms apply to the
service requests that an IT department receives from the employees of an
organization. The idea in data mining is to apply a data-driven, inductive,
backward technique to identify a model. This differs from forward, deductive
methods, which build a model first, then deduce conclusions, and then match
those conclusions against the data. If the model's predictions do not match
reality, the model is then tuned.
Data Mining Algorithms | Description | Use case |
Classification algorithms | Useful for finding similar groups based on discrete variables. It is used for true/false binary classification; multi-label classification is also supported. There are many techniques, but the data should either show distinct regions on a scatter plot, each with its own centroid, or, if that is hard to tell, be scanned breadth-first for neighbors within a given radius, forming trees, or leaves where the neighbors fall short. | Useful for categorizing service requests beyond their nomenclature. The primary use case is to find clusters of service requests that match based on features. By translating the requests into a vector space and assessing cluster quality with the sum of squared errors, it is easy to analyze a large number of requests as belonging to specific clusters, which helps from a management perspective. |
Regression algorithms | Useful for calculating a linear relationship between a dependent variable and an independent variable, and then using that relationship for prediction. | IT service requests show elongated scatter plots in specific categories. Even when service requests in the same category demand different resolutions, the relief times are bounded and can be plotted along a timeline. One of the main advantages of linear regression is prediction with time as the independent variable. When data points have many factors contributing to their occurrence, linear regression gives an immediate way to predict where the next occurrence may happen. This is far easier than coming up with a model that is a good fit for all the data points. |
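
As a rough illustration of the two rows above, the sketch below computes the sum of squared errors for a given cluster assignment and fits a least-squares line that can be used for prediction. It is only a sketch under stated assumptions: the feature vectors, the cluster labels, the weekly resolution times, and the function names `sum_squared_errors` and `fit_line` are all hypothetical and do not come from real service desk data.

```python
from statistics import mean


def sum_squared_errors(points, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        # The centroid is the per-dimension mean of the cluster's members.
        centroid = [mean(dim) for dim in zip(*members)]
        for point in members:
            total += sum((x - c) ** 2 for x, c in zip(point, centroid))
    return total


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical feature vectors for six requests, e.g. (priority, reassignments).
requests = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
good_labels = [0, 0, 0, 1, 1, 1]  # groups nearby requests together
poor_labels = [0, 1, 0, 1, 0, 1]  # mixes distant requests in each cluster

sse_good = sum_squared_errors(requests, good_labels)
sse_poor = sum_squared_errors(requests, poor_labels)
print("SSE (good assignment):", sse_good)
print("SSE (poor assignment):", sse_poor)

# Hypothetical mean resolution times (hours) for one category over five weeks.
weeks = [1, 2, 3, 4, 5]
hours = [4.0, 4.5, 5.0, 5.5, 6.0]
slope, intercept = fit_line(weeks, hours)
predicted_week_6 = slope * 6 + intercept
print("Predicted resolution time for week 6:", predicted_week_6)
```

A lower sum of squared errors indicates tighter clusters, so the assignment that groups nearby requests together should score lower than the mixed one. The fitted line uses time as the independent variable, matching the use case in the regression row.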