Data Mining for Business Analytics: Concepts, Techniques, and Applications in R presents an applied approach to data mining concepts and methods, using R software for illustration.
Readers will learn how to implement a variety of popular data mining algorithms in R (a free and open-source software) to tackle business problems and opportunities.
This is the fifth version of this successful text, and the first using R. It covers both statistical and machine learning algorithms for prediction, classification, visualization, dimension reduction, recommender systems, clustering, text mining and network analysis. It also includes:
Two new co-authors, Inbal Yahav and Casey Lichtendahl, who bring expertise in teaching business analytics courses using R, as well as data mining consulting experience in business and government
Updates and new material based on feedback from instructors teaching MBA, undergraduate, diploma and executive courses, and from their students
More than a dozen case studies demonstrating applications for the data mining techniques described
End-of-chapter exercises that help readers gauge and expand their comprehension and competency of the material presented
A companion website with more than two dozen data sets, and instructor materials including exercise solutions, PowerPoint slides, and case solutions
Data Mining for Business Analytics: Concepts, Techniques, and Applications in R is an ideal textbook for graduate and upper-undergraduate level courses in data mining, predictive analytics, and business analytics. This new edition is also an excellent reference for analysts, researchers, and practitioners working with quantitative methods in the fields of business, finance, marketing, computer science, and information technology.
"This book has by far the most comprehensive review of business analytics methods that I have ever seen, covering everything from classical approaches such as linear and logistic regression, through to modern methods like neural networks, bagging and boosting, and even much more business-specific procedures such as social network analysis and text mining. If not the bible, it is at the least a definitive manual on the subject."
Gareth M. James, University of Southern California and co-author (with Witten, Hastie and Tibshirani) of the best-selling book An Introduction to Statistical Learning, with Applications in R
Galit Shmueli, PhD, is Distinguished Professor at National Tsing Hua University's Institute of Service Science. She has designed and instructed data mining courses since 2004 at University of Maryland, Statistics.com, Indian School of Business, and National Tsing Hua University, Taiwan. Professor Shmueli is known for her research and teaching in business analytics, with a focus on statistical and data mining methods in information systems and healthcare. She has authored over 70 publications including books.
Peter C. Bruce is President and Founder of the Institute for Statistics Education at Statistics.com. He has written multiple journal articles and is the developer of Resampling Stats software. He is the author of Introductory Statistics and Analytics: A Resampling Perspective (Wiley) and co-author of Practical Statistics for Data Scientists: 50 Essential Concepts (O'Reilly).
Inbal Yahav, PhD, is Professor at the Graduate School of Business Administration at Bar-Ilan University, Israel. She teaches courses in social network analysis, advanced research methods, and software quality assurance. Dr. Yahav received her PhD in Operations Research and Data Mining from the University of Maryland, College Park.
Nitin R. Patel, PhD, is Chairman and cofounder of Cytel, Inc., based in Cambridge, Massachusetts. A Fellow of the American Statistical Association, Dr. Patel has also served as a Visiting Professor at the Massachusetts Institute of Technology and at Harvard University. He is a Fellow of the Computer Society of India and was a professor at the Indian Institute of Management, Ahmedabad, for 15 years.
Kenneth C. Lichtendahl, Jr., PhD, is Associate Professor at the University of Virginia. He is the Eleanor F. and Phillip G. Rust Professor of Business Administration and teaches MBA courses in decision analysis, data analysis and optimization, and managerial quantitative analysis. He also teaches executive education courses in strategic analysis and decision-making, and managing the corporate aviation function.
Table of Contents
Foreword by Gareth James
Foreword by Ravi Bapna
Preface to the R Edition
Acknowledgments
PART I PRELIMINARIES
CHAPTER 1 Introduction
1.1 What Is Business Analytics?
1.2 What Is Data Mining?
1.3 Data Mining and Related Terms
1.4 Big Data
1.5 Data Science
1.6 Why Are There So Many Different Methods?
1.7 Terminology and Notation
1.8 Road Maps to This Book
Order of Topics
CHAPTER 2 Overview of the Data Mining Process
2.1 Introduction
2.2 Core Ideas in Data Mining
Classification
Prediction
Association Rules and Recommendation Systems
Predictive Analytics
Data Reduction and Dimension Reduction
Data Exploration and Visualization
Supervised and Unsupervised Learning
2.3 The Steps in Data Mining
2.4 Preliminary Steps
Organization of Datasets
Predicting Home Values in the West Roxbury Neighborhood
Loading and Looking at the Data in R
Sampling from a Database
Oversampling Rare Events in Classification Tasks
Preprocessing and Cleaning the Data
2.5 Predictive Power and Overfitting
Overfitting
Creation and Use of Data Partitions
2.6 Building a Predictive Model
Modeling Process
2.7 Using R for Data Mining on a Local Machine
2.8 Automating Data Mining Solutions
Data Mining Software: The State of the Market (by Herb Edelstein)
Problems
PART II DATA EXPLORATION AND DIMENSION REDUCTION
CHAPTER 3 Data Visualization
3.1 Uses of Data Visualization
Base R or ggplot?
3.2 Data Examples
Example 1: Boston Housing Data
Example 2: Ridership on Amtrak Trains
3.3 Basic Charts: Bar Charts, Line Graphs, and Scatter Plots
Distribution Plots: Boxplots and Histograms
Heatmaps: Visualizing Correlations and Missing Values
3.4 Multidimensional Visualization
Adding Variables: Color, Size, Shape, Multiple Panels, and Animation
Manipulations: Rescaling, Aggregation and Hierarchies, Zooming, Filtering
Reference: Trend Lines and Labels
Scaling up to Large Datasets
Multivariate Plot: Parallel Coordinates Plot
Interactive Visualization
3.5 Specialized Visualizations
Visualizing Networked Data
Visualizing Hierarchical Data: Treemaps
Visualizing Geographical Data: Map Charts
3.6 Summary: Major Visualizations