Friday, January 9, 2015

Today we discuss a paper titled Text Knowledge Mining: An Alternative to Text Data Mining. The authors introduce a notion that differs from contemporary mining techniques, which they characterize as inductive inference. They term their own approach deductive inference, and they place some existing techniques in this category. They also discuss how existing theories could be applied in future research in this category.
They say that text mining has essentially been data mining on unstructured data, carried out by first obtaining structured datasets called intermediate forms. Some examples are:
A text unit of a word translates to an intermediate form of a bag of words or N-grams.
A concept translates to a concept hierarchy, conceptual graph, semantic graph, or conceptual dependence.
A phrase translates to N-phrases, multi-term text phrases, trends, etc.
A paragraph translates to a paragraph, N-phrases, multi-term text phrases, and trends.
A text unit of a document is retained as such.
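To make the first of these intermediate forms concrete, here is a minimal sketch in Python of turning a sentence into a bag of words and into N-grams. The function names and the simple regex tokenizer are assumptions made for illustration; the paper does not prescribe an implementation.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumption: a token is a run of letters or digits, lowercased.
    return re.findall(r"[a-z0-9]+", text.lower())

def bag_of_words(text):
    # Word counts only; the order of the words is discarded.
    return Counter(tokenize(text))

def ngrams(text, n):
    # Every run of n consecutive tokens, in order of appearance.
    tokens = tokenize(text)
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

if __name__ == "__main__":
    a = "stress can lead to loss of magnesium"
    b = "loss of magnesium can lead to stress"
    # Both sentences produce the same bag of words even though they
    # say different things.
    print(bag_of_words(a) == bag_of_words(b))
    print(ngrams(a, 2))
```

The two sample sentences share one bag of words, which shows how much of the original sentence such a form throws away. The bigrams keep a little local word order but still not the full sentence.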
They argue that text data is not inherently unstructured. Rather, it is characterized by a very complex implicit structure with very rich semantics, one that has defeated many representation attempts. The use of intermediate forms in fact loses the semantics of the text, because the chosen text unit captures only a very small part of the text's expressive power.
The authors cite the study of causal relationships in medical literature, where an attempt to piece together causes from the titles in the MEDLINE database, in order to generate previously unknown hypotheses, actually produced good results. For example, the hypothesis that migraine is attributed to a deficiency of magnesium was established from the following:
stress is associated with migraines
stress can lead to loss of magnesium
calcium channel blockers prevent some migraines
magnesium is a natural calcium channel blocker
spreading cortical depression is implicated in some migraines
high levels of magnesium inhibit spreading cortical depression
migraine patients have high platelet aggregability
magnesium can suppress platelet aggregability
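The eight statements above pair up: each concept that is tied to migraine is also tied to magnesium. A minimal sketch of that chaining step in Python follows. The (term, concept) pairs are hand-encoded from the statements, and both the encoding and the function name are assumptions made for illustration, not the method used in the study.

```python
from collections import defaultdict

# Hand-encoded (term, linking concept) pairs, one per statement above.
# This encoding is an illustrative assumption.
STATEMENTS = [
    ("migraine", "stress"),
    ("magnesium", "stress"),
    ("migraine", "calcium channel blockers"),
    ("magnesium", "calcium channel blockers"),
    ("migraine", "spreading cortical depression"),
    ("magnesium", "spreading cortical depression"),
    ("migraine", "platelet aggregability"),
    ("magnesium", "platelet aggregability"),
]

def linking_concepts(pairs, term_a, term_c):
    # Collect the concepts mentioned with each term, then intersect.
    by_term = defaultdict(set)
    for term, concept in pairs:
        by_term[term].add(concept)
    return sorted(by_term[term_a] & by_term[term_c])

if __name__ == "__main__":
    for concept in linking_concepts(STATEMENTS, "migraine", "magnesium"):
        print(concept)
```

The more concepts two otherwise unconnected terms share, the stronger the case for proposing a direct relationship between them as a hypothesis to test.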
#codingexercise
// Assumes an extension method AlternateEvenNumberRangeMedian() on double[] is defined elsewhere.
double GetAlternateEvenNumberRangeMedian(double[] A)
{
    if (A == null) return 0;
    return A.AlternateEvenNumberRangeMedian();
}
#codingexercise
// Assumes an extension method AlternateEvenNumberRangeStdDev() on double[] is defined elsewhere.
double GetAlternateEvenNumberRangeStdDev(double[] A)
{
    if (A == null) return 0;
    return A.AlternateEvenNumberRangeStdDev();
}
