The two disciplines of statistics and data mining have common aims in that both are concerned with discovering structure in data. Indeed, so much do their aims overlap that some people (perhaps, in the main, some statisticians) regard data mining as a subset of statistics. This is not a realistic assessment. Data mining also makes use of ideas, tools, and methods from other areas, especially computational areas such as database technology and machine learning, and is not heavily concerned with some areas in which statisticians are interested.
The commonality of aims between statistics and data mining has naturally caused some confusion. Indeed, it has even sometimes caused antipathy. Statistics has formal roots stretching back at least throughout this century, and the appearance of a new discipline, with new players, who purported to be solving problems that statisticians had previously considered part of their dominion, inevitably caused concern. The more so since the new discipline had an attractive name, almost calculated to arouse interest and curiosity. Contrast the promise latent in the term ‘data mining’ with the historical burden conveyed by the word ‘statistics’, a word originally coined to refer to ‘matters of state’ and which carries with it the emotional connotations of sifting through columns of tedious numbers. Of course, the fact that this historical image is far from the modern truth is neither here nor there. Furthermore, the new subject had particular relevance to commercial concerns (though it also had scientific and other applications).