Modern Applied Statistics with S. Fourth Edition

by W. N. Venables and B. D. Ripley

Springer. ISBN 0-387-95457-0, 2002.

Hardback 232mm × 155mm, xi+495 pages

This fourth edition was published in late July 2002 and reprinted in May 2003. Links to material for earlier editions are also provided.

On-line material:
Description, Contents, Differences from Earlier Editions,
On-line Complements, Exercises and Selected Answers, Software and Datasets,
Errata, Contact authors, Publisher's Web Sites

Description:

S is a powerful environment for the statistical and graphical analysis of data. It provides the tools to implement many statistical ideas that have been made possible by the widespread availability of workstations having good graphics and computational capabilities. This book is a guide to using S environments to perform statistical analyses and provides both an introduction to the use of S and a course in modern statistical methods. Implementations of S are available commercially in S-PLUS® and as the Open Source R for a wide range of computer systems.

The aim of this book is to show how to use S as a powerful and graphical data analysis system. Readers are assumed to have a basic grounding in statistics, and so the book is intended for would-be users of S-PLUS or R, and for both students and researchers using statistics. Throughout, the emphasis is on presenting practical problems and full analyses of real data sets. Many of the methods discussed are state-of-the-art approaches to topics such as linear, nonlinear and smooth regression models, tree-based methods, multivariate analysis, pattern recognition, survival analysis, time series and spatial statistics. Throughout, modern techniques such as robust methods, non-parametric smoothing and bootstrapping are used where appropriate.

This fourth edition is intended for users of S-PLUS 6.0 or R 1.5.0 (or later). A substantial change from the third edition is updating for the current versions of S-PLUS and adding coverage of R. The introductory material has been rewritten to emphasise the import, export and manipulation of data. Increased computational power allows even more computer-intensive methods to be used, and methods such as GLMMs, MARS, Kohonen's SOM and support vector machines are considered.

The authors have written several software libraries that enhance S-PLUS and R; these and all the datasets used are supplied with Windows versions of S-PLUS and all versions of R, and are also available on the Internet in versions for Windows and Unix. There are extensive on-line complements covering advanced material, exercises and new features of S-PLUS and R as they are introduced.

Dr Venables is a Senior Statistician with the CSIRO in Australia. He has given many courses on statistical computing, data analysis and graphics using S in Australia, Europe and the USA. Professor Ripley holds the Chair of Applied Statistics at the University of Oxford, and is the author of four other books on spatial statistics, simulation, pattern recognition and neural networks. They are the joint authors of `S Programming', the authoritative guide to using the S language.

S-PLUS® is a commercial system of the Insightful Corporation.

The book is equally useful with R, a freely-available Open Source statistical system `not unlike S'.


Contents

  1. Introduction
  2. Data manipulation
  3. The S language
  4. Graphics
  5. Univariate statistics
  6. Linear statistical models
  7. Generalized linear models
  8. Non-linear and smooth regression
  9. Tree-based methods
  10. Random and mixed effects
  11. Exploratory multivariate analysis
  12. Classification
  13. Survival analysis
  14. Time series analysis
  15. Spatial statistics
  16. Optimization

Appendices:

  A. Implementation-specific details
  B. The S-PLUS GUI
  C. Datasets, software and libraries

On-line Complements

The `complements' provide an on-line updating of the book, as well as further details of technical material.

Currently there are `Statistical Complements' available in gzip-ed PostScript and PDF.

Future complements are planned to cover changes in S-PLUS and in R.


Exercises and Selected Answers

Some exercises on both S programming and data analysis are available for downloading. There are answers to almost all the programming exercises and to some of the data analysis problems.

VR4ex.ps.gz gzip-ed PostScript (125Kb)
VR4ex.pdf PDF (240Kb)

The PDF version has extensive hyperlinks, for example between exercises and their answers. Viewers can be downloaded from www.adobe.com; a suitable viewer is normally installed with S-PLUS 6.x on Windows.


Errata

There are errata lists available for

Edition          Printing
First Edition    first, second, third, fourth
Second Edition   first, second, third
Third Edition    first, second, third
Fourth Edition   first, second

Only those for the current edition are maintained.


Differences from the Earlier Editions

The Second Edition was written when S-PLUS 3.4 was current; version 4.0 appeared shortly after the book.

The Third Edition was extensively revised, assuming that the reader had S-PLUS 4.0 or later, and it took account of S-PLUS 5.x and 2000. As much of the material as possible was usable with S-PLUS 3.3/4 and also with R. It gave accounts of the analyses made possible by the nlme3 and survival5 software. We added enhanced software for robust regression and for proportional odds logistic regression, and provided in-depth analyses using these.

The Fourth Edition is targeted at S-PLUS 6.x and R. This enables many new features of current S to be used, and will be particularly helpful to R users for whom almost all the changes needed are present in the main text. We have re-organized the introductory material and added new material on data import/export. The statistical material uses automated bandwidth selectors for histograms and density estimation, and adds new material on visualization, ICA, Kohonen's SOM, support vector machines and fitting GLMMs. Material previously in the on-line complements on over-dispersion, factor analysis and correspondence analysis is now in the main text. The time-series material has been re-worked, using the arima() function we wrote and including material on GARCH models. There is a separate chapter on optimization, making use of the optim() function written by BDR for R and now available for S-PLUS in MASS.

The material on programming has been reduced since the first and second editions: a much more comprehensive account is given in the companion volume S Programming.


Authors:

The main contact address is MASS@stats.ox.ac.uk: please use that for general enquiries.
Dr W. N. Venables
CMIS Environmetrics Project
PO Box 120,
Cleveland, Qld, 4163
AUSTRALIA

Email: Bill.Venables@csiro.au

Professor B. D. Ripley
Department of Statistics
1 South Parks Road
Oxford OX1 3TG
UK

Email: ripley@stats.ox.ac.uk


Publisher:

Links are provided to Springer's home pages in Germany and the USA.


Last edited on 2 May 2003 by Brian Ripley ripley@stats.ox.ac.uk