R Statistical Application Development by Example Beginner's Guide
Page Count: 345
- Cover
- Copyright
- Credits
- About the Author
- About the Reviewers
- www.PacktPub.com
- Table of Contents
- Preface
- Chapter 1: Data Characteristics
- Chapter 2: Import/Export Data
- data.frame and other formats
- Time for action – understanding constants, vectors, and basic arithmetic
- Time for action – matrix computations
- Time for action – creating a list object
- Time for action – creating a data.frame object
- Time for action – creating the Titanic dataset as a table object
- read.csv, read.xls, and the foreign package
- Time for action – importing data from external files
- Exporting data/graphs
- Time for action – exporting a graph
- Managing an R session
- Time for action – session management
- Summary
- Chapter 3: Data Visualization
- Visualization techniques for categorical data
- Time for action – bar charts in R
- Time for action – dot charts in R
- Time for action – the spine plot for the shift and operator data
- Time for action – the mosaic plot for the Titanic dataset
- Visualization techniques for continuous variable data
- Time for action – using the boxplot
- Time for action – understanding the effectiveness of histograms
- Time for action – plot and pairs R functions
- A brief peek at ggplot2
- Time for action – qplot
- Time for action – ggplot
- Summary
- Chapter 4: Exploratory Analysis
- Essential summary statistics
- Time for action – the essential summary statistics for "The Wall" dataset
- The stem-and-leaf plot
- Time for action – the stem function in play
- Letter values
- Data re-expression
- Bagplot – a bivariate boxplot
- Time for action – the bagplot display for a multivariate dataset
- The resistant line
- Time for action – the resistant line as a first regression model
- Smoothing data
- Time for action – smoothening the cow temperature data
- Median polish
- Time for action – the median polish algorithm
- Summary
- Chapter 5: Statistical Inference
- Maximum likelihood estimator
- Time for action – visualizing the likelihood function
- Time for action – finding the MLE using mle and fitdistr functions
- Confidence intervals
- Time for action – confidence intervals
- Hypotheses testing
- Time for action – testing the probability of success
- Time for action – testing proportions
- Time for action – testing one-sample hypotheses
- Time for action – testing two-sample hypotheses
- Summary
- Chapter 6: Linear Regression Analysis
- The simple linear regression model
- Time for action – the arbitrary choice of parameters
- Time for action – building a simple linear regression model
- Time for action – ANOVA and the confidence intervals
- Time for action – residual plots for model validation
- Multiple linear regression model
- Time for action – averaging k simple linear regression models
- Time for action – building a multiple linear regression model
- Time for action – the ANOVA and confidence intervals for the multiple linear regression model
- Time for action – residual plots for the multiple linear regression model
- Regression diagnostics
- The multicollinearity problem
- Time for action – addressing the multicollinearity problem for the Gasoline data
- Model selection
- Time for action – model selection using the backward, forward, and AIC criteria
- Summary
- Chapter 7: The Logistic Regression Model
- The binary regression problem
- Time for action – limitations of linear regression models
- Probit regression model
- Time for action – understanding the constants
- Logistic regression model
- Time for action – fitting the logistic regression model
- Time for action – the Hosmer-Lemeshow goodness-of-fit statistic
- Model validation and diagnostics
- Time for action – residual plots for the logistic regression model
- Time for action – diagnostics for the logistic regression
- Receiving operator curves
- Time for action – ROC construction
- Logistic regression for the German credit screening dataset
- Time for action – logistic regression for the German credit dataset
- Summary
- Chapter 8: Regression Models with Regularization
- The overfitting problem
- Time for action – understanding overfitting
- Regression spline
- Time for action – fitting piecewise linear regression models
- Time for action – fitting the spline regression models
- Ridge regression for linear models
- Time for action – ridge regression for the linear regression model
- Ridge regression for logistic regression models
- Time for action – ridge regression for the logistic regression model
- Another look at model assessment
- Time for action – selecting lambda iteratively and other topics
- Summary
- Chapter 9: Classification and Regression Trees
- Recursive partitions
- Time for action – partitioning the display plot
- Time for action – building our first tree
- The construction of a regression tree
- Time for action – the construction of a regression tree
- The construction of a classification tree
- Time for action – the construction of a classification tree
- Classification tree for the German credit data
- Time for action – the construction of a classification tree
- Pruning and other finer aspects of a tree
- Time for action – pruning a classification tree
- Summary
- Chapter 10: CART and Beyond
- Improving CART
- Time for action – cross-validation predictions
- Bagging
- Time for action – understanding the bootstrap technique
- Time for action – the bagging algorithm
- Random forests
- Time for action – random forests for the German credit data
- The consolidation
- Time for action – random forests for the low birth weight data
- Summary
- Appendix: References
- Index