The Elements of Statistical Learning: Data Mining, Inference, and Prediction
Over the past decade, there has been an explosion in computing and information technology. With it have come vast amounts of data in fields such as medicine, biology, finance and marketing. The challenge of understanding these data has led to the development of new statistical tools and has given rise to new disciplines such as data mining, machine learning and bioinformatics. Many of these tools share common foundations but are often described using different terminology. This book describes the important ideas in these fields within a common conceptual framework. Although the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with extensive use of color graphics. The book is a valuable resource for statisticians and anyone interested in data mining in science or industry. Its coverage is broad, from supervised learning (prediction) to unsupervised learning. Topics include neural networks, support vector machines, classification trees and boosting, which receives its first comprehensive treatment in a book rather than in separate publications. This substantially revised edition presents many topics not covered in the first edition, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization and spectral clustering. The book also has a chapter on methods for "wide" data (where p is greater than n), including multiple testing and false discovery rates.