Data Mining: Concepts, Models, Methods, and Algorithms
Author:
Subject:
Published by: John Wiley & Sons Inc (US)
Published: 11/10/2002
Price: $150.00
We are surrounded by data, numerical and otherwise, which must be analyzed and processed to convert it into information that informs, instructs, answers, or otherwise aids understanding and decision-making. Because today's data sets keep growing in size and complexity, a new term, data mining, was coined to describe indirect, automatic data analysis techniques that use more complex and sophisticated tools than analysts relied on in the past.
Data Mining: Concepts, Models, Methods, and Algorithms discusses data mining principles and then describes representative state-of-the-art methods and algorithms originating from different disciplines such as statistics, machine learning, neural networks, fuzzy logic, and evolutionary computation. Detailed algorithms are provided with necessary explanations and illustrative examples.
This text offers guidance on how and when to use a particular software tool (with its companion data sets) from among the hundreds offered when faced with a data set to mine. This allows analysts to create and perform their own data mining experiments using their knowledge of the methodologies and techniques provided.
This book emphasizes the selection of appropriate methodologies and data analysis software, as well as parameter tuning. These critically important, qualitative decisions can only be made with a deeper understanding of each parameter's meaning and its role in the technique, which this book provides. Data mining is an exploding field, and this book offers much-needed guidance in selecting among the numerous analysis programs that are available.
Table of Contents
1 Data Mining Concepts.
1.1 Introduction.
1.2 Data-mining roots.
1.3 Data-mining process.
1.4 Large data sets.
1.5 Data warehouses.
1.6 Organization of this book.
1.7 Review questions and problems.
1.8 References for further study.
2 Preparing the Data.
2.1 Representation of raw data.
2.2 Characteristics of raw data.
2.3 Transformation of raw data.
2.4 Missing data.
2.5 Time-dependent data.
2.6 Outlier analysis.
2.7 Review questions and problems.
2.8 References for further study.
3 Data Reduction.
3.1 Dimensions of large data sets.
3.2 Features reduction.
3.3 Entropy measure for ranking features.
3.4 Principal component analysis.
3.5 Values reduction.
3.6 Feature discretization: ChiMerge technique.
3.7 Cases reduction.
3.8 Review questions and problems.
3.9 References for further study.
4 Learning from Data.
4.1 Learning machine.
4.2 Statistical learning theory.
4.3 Types of learning methods.
4.4 Common learning tasks.
4.5 Model estimation.
4.6 Review questions and problems.
4.7 References for further study.
5 Statistical Methods.
5.1 Statistical inference.
5.2 Assessing differences in data sets.
5.3 Bayesian inference.
5.4 Predictive regression.
5.5 Analysis of variance.
5.6 Logistic regression.
5.7 Log-linear models.
5.8 Linear discriminant analysis.
5.9 Review questions and problems.
5.10 References for further study.
6 Cluster Analysis.
6.1 Clustering concepts.
6.2 Similarity measures.
6.3 Agglomerative hierarchical clustering.
6.4 Partitional clustering.
6.5 Incremental clustering.
6.6 Review questions and problems.
6.7 References for further study.
7 Decision Trees and Decision Rules.
7.1 Decision trees.
7.2 C4.5 Algorithm: generating a decision tree.
7.3 Unknown attribute values.
7.4 Pruning decision tree.
7.5 C4.5 Algorithm: generating decision rules.
7.6 Limitations of decision trees and decision rules.
7.7 Associative-classification method.
7.8 Review questions and problems.
7.9 References for further study.
8 Association Rules.
8.1 Market-Basket Analysis.
8.2 Algorithm Apriori.
8.3 From frequent itemsets to association rules.
8.4 Improving the efficiency of the Apriori algorithm.
8.5 Frequent pattern-growth method.
8.6 Multidimensional association-rules mining.
8.7 Web mining.
8.8 HITS and LOGSOM algorithms.
8.9 Mining path-traversal patterns.
8.10 Text mining.
8.11 Review questions and problems.
8.12 References for further study.
9 Artificial Neural Networks.
9.1 Model of an artificial neuron.
9.2 Architectures of artificial neural networks.
9.3 Learning process.
9.4 Learning tasks.
9.5 Multilayer perceptrons.
9.6 Competitive networks and competitive learning.
9.7 Review questions and problems.
9.8 References for further study.
10 Genetic Algorithms.
10.1 Fundamentals of genetic algorithms.
10.2 Optimization using genetic algorithms.
10.3 A simple illustration of a genetic algorithm.
10.4 Schemata.
10.5 Traveling salesman problem.
10.6 Machine learning using genetic algorithms.
10.7 Review questions and problems.
10.8 References for further study.
11 Fuzzy Sets and Fuzzy Logic.
11.1 Fuzzy sets.
11.2 Fuzzy set operations.
11.3 Extension principle and fuzzy relations.
11.4 Fuzzy logic and fuzzy inference systems.
11.5 Multifactorial evaluation.
11.6 Extracting fuzzy models from data.
11.7 Review questions and problems.
11.8 References for further study.
12 Visualization Methods.
12.1 Perception and visualization.
12.2 Scientific visualization and information visualization.
12.3 Parallel coordinates.
12.4 Radial visualization.
12.5 Kohonen self-organized maps.
12.6 Visualization systems for data mining.
12.7 Review questions and problems.
12.8 References for further study.
13 References.
APPENDIX A: Data-Mining Tools.
A1 Commercially and publicly available tools.
A2 Web site links.
APPENDIX B: Data-Mining Applications.
B1 Data mining for financial data analysis.
B2 Data mining for the telecommunications industry.
B3 Data mining for the retail industry.
B4 Data mining in healthcare and biomedical research.
B5 Data mining in science and engineering.
B6 Pitfalls of data mining.
INDEX.
ABOUT THE AUTHOR.