Cluster computing

Saturday, January 17, 2015

#codingexercise
Double GetAlternateEvenNumberRangeSqRtSumOfSquares()(Double [] A)
{
if (A == null) return 0;
Return A.AlternateEvenNumberRangeSqRtSumOfSquares();
}

Today we read a paper from Hearst : Untangling text data mining. Hearst reminds us that the possibilities of extracting information from text is virtually untapped because text expresses a vast range of information but encodes it in a way that is difficult to decipher automatically. We recently reviewed the difference between Text Knowledge Mining and Text Data Mining. In this paper, the focus is on text data mining. Some of the new problems encountered in computational linguistics are called out in this paper and outlines ideas about how to pursue exploratory data analysis over text.
Hearst differentiates between TDM and Information Access. The goal of information access is to help users find documents that satisfy their information needs. The standard procedure is akin to looking for needles in a needlestack. The analogy comes from the fact that the problem is about finding information that coexists with other valid pieces of information. The homing in on to the information that the user is interested in is the problem here. As per Hearst, the goal of data mining on the other hand, is to derive new information from data, finding patterns across datasets, and/or separating signal from noise. The fact that an information retrieval system can return a document that contains the information the user requested implies that no new discovery is made.
Hearst points out that text data mining is sometimes discussed together with search on the web. For example, the KDD-97 panel on data mining stated that the two challenges predominant for data mining are finding useful information on the web and discovering knowledge about a domain that is represented by a collection of web-documents as well as to analyze the transactions run in a web based system. This search-centric view misses the point that the web can be considered a knowledge base that is helpful to extract new never before encountered information.
The results of certain types of text processing can yield tools that indirectly aid in the information access process. Examples include text clustering to create thematic overviews of text collection.
#codingexercise
Double GetAlternateEvenNumberRangeSumOfSquares()(Double [] A)
{
if (A == null) return 0;
Return A.AlternateEvenNumberRangeSumOfSquares();
}

Cluster computing

Saturday, January 17, 2015

No comments:

Post a Comment