Regression analysis plays a central role in statistics, being one of its
most powerful and commonly used techniques. Standard linear regression
models assume that the response variable is normal (or at least can be
transformed to a normal one). However, unfortunately (or fortunately?),
this is not always the case. Models with a categorical or count response
are typical (although not the only!) examples where the assumption of
normality cannot be accepted as reasonable. In this course we study
generalized linear models, where the response variables are allowed to be
non-normal. We start from the general theory of generalized linear models,
extending the corresponding results for standard linear regression, and then
consider the most useful particular cases in more detail.
The course assumes extensive use of a computer. There are no restrictions
on which statistical packages and software you use for this course, although
the data examples considered in class will be "oriented" toward S-Plus or its
free analog R.
Installation instructions and manuals for R can be found on the
R Home page.
The following S-Plus and R based books may be helpful for this course:
Aitkin, M., Francis, B., Hinde, J. and Darnell, R. Statistical Modelling in
R.
Some specific S-Plus and R notes you may find useful for generalized linear models:
To fit a generalized linear model you will generally use the function glm:
>glm(formula, family=...(link=...),...)
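As an illustration of the call above, here is a minimal sketch in R. The data are simulated and the variable names (x, counts, success) are made up for this example; they are not course data sets. It shows one fit with a Poisson family and log link, and one binary fit where the default link is replaced by probit.

```r
# Simulated data for illustration only (hypothetical, not a course data set)
set.seed(1)
n <- 100
x <- runif(n, 0, 2)
counts  <- rpois(n, lambda = exp(0.5 + 0.8 * x))      # count response
success <- rbinom(n, size = 1, prob = pnorm(-1 + x))  # binary response
dat <- data.frame(x, counts, success)

# Poisson regression with the (canonical) log link
fit_pois <- glm(counts ~ x, family = poisson(link = "log"), data = dat)

# Binary regression with a probit link instead of the default logit
fit_probit <- glm(success ~ x, family = binomial(link = "probit"), data = dat)

# Estimated coefficients (on the scale of the linear predictor)
print(coef(fit_pois))
print(coef(fit_probit))
```

If link is omitted, each family falls back on its own default link (log for poisson, logit for binomial).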
The glm function creates an object of class glm that contains most of the
information you need. See help(glm.object) in S-Plus, or the "Value" section
of help(glm) in R, for details.
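To get a feeling for what the fitted object holds, you can look inside it. The sketch below assumes R and uses simulated dose-response data (the names dose and resp are hypothetical); the component names shown are those of R's glm object and may differ slightly in S-Plus.

```r
# Simulated binary dose-response data (hypothetical, for illustration only)
set.seed(2)
dose <- rep(1:5, each = 20)
resp <- rbinom(100, size = 1, prob = plogis(-2 + 0.7 * dose))

fit <- glm(resp ~ dose, family = binomial(link = "logit"))

print(class(fit))    # in R the object inherits from both "glm" and "lm"
print(names(fit))    # names of the stored components

# Some commonly used pieces, via extractor functions or directly
print(coef(fit))                   # estimated coefficients
print(deviance(fit))               # residual deviance
print(fit$df.residual)             # residual degrees of freedom
print(summary(fit)$coefficients)   # estimates, standard errors, z values, p-values
```

Extractor functions such as coef and deviance are usually preferable to reaching into the components by name, since they do not depend on how the object is stored.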
For some data the iteratively reweighted least squares algorithm converges
slowly and does not finish within the default number of iterations (10 in
S-Plus; 25 in current versions of R).
This may happen, for example, in binomial models with a lot of empty cells.
S-Plus gives you a "Warning" (as does R). Don't panic! You can increase the
number of iterations with the parameter maxit:
>glm(formula,...,maxit=...)
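The following sketch, assuming R, reproduces the situation artificially: maxit is set deliberately low so that the algorithm stops early, and is then raised. The data are simulated and the names are hypothetical. In R the converged component of the fitted object records whether the iterations converged.

```r
# Simulated count data (hypothetical, for illustration only)
set.seed(3)
x <- seq(0, 2, length.out = 50)
y <- rpois(50, lambda = exp(0.5 + 0.8 * x))
dat <- data.frame(x, y)

# Deliberately too few iterations: expect a non-convergence warning here
fit_short <- glm(y ~ x, family = poisson, data = dat, maxit = 2)
print(fit_short$converged)

# Allow more iterations; in R, maxit is passed on to glm.control
fit_long <- glm(y ~ x, family = poisson, data = dat, maxit = 50)
print(fit_long$converged)
print(fit_long$iter)   # number of iterations actually used
```

In R the same setting can also be written as control = glm.control(maxit = 50). If the warning persists even with many iterations, it is worth checking the data (for example, for empty cells) rather than just raising maxit further.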
To evaluate the fitted model at some new values of the predictors use the
function predict:
>predict(glm.object, newdata=..., type=...,se=T)
The output will contain, in particular, a vector of estimated responses
(depending on type) and a vector of standard errors for constructing
confidence intervals for the mean response.
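One common way to turn this output into confidence intervals is sketched below, assuming R (where the argument is spelled se.fit; se=T works through partial matching). The data and the new predictor values are simulated and hypothetical. The interval is built on the link scale, where the normal approximation is usually more reasonable, and then mapped back to the scale of the mean; this is one possible choice, not the only one.

```r
# Simulated count data (hypothetical, for illustration only)
set.seed(4)
x <- runif(80, 0, 3)
y <- rpois(80, lambda = exp(0.3 + 0.6 * x))
fit <- glm(y ~ x, family = poisson(link = "log"))

# New predictor values; column names must match those used in the formula
new <- data.frame(x = c(0.5, 1.5, 2.5))

# Predictions and standard errors on the scale of the linear predictor
p <- predict(fit, newdata = new, type = "link", se.fit = TRUE)

# Approximate 95% intervals on the link scale, back-transformed with the
# inverse of the log link (exp) to intervals for the mean response
z <- qnorm(0.975)
ci <- data.frame(
  x     = new$x,
  mean  = exp(p$fit),
  lower = exp(p$fit - z * p$se.fit),
  upper = exp(p$fit + z * p$se.fit)
)
print(ci)
```

With type = "response" the standard errors are returned on the scale of the mean instead, and a symmetric interval built from them can fall outside the admissible range (for example, below zero for counts).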