Generalized Linear Models
Regression analysis plays a central role in statistics, being one of its
most powerful and commonly used techniques. Standard linear regression
models assume that the response variable is normal (or at least can be
transformed to normality). However, unfortunately (or fortunately?), this
is not always the case. Models with a categorical or count response are
typical (although not the only!) examples where the assumption of normality
cannot be accepted as reasonable. In this course we study generalized linear
models, in which the response variable is allowed to be non-normal. We start
with the general theory of generalized linear models, which extends the
corresponding results for standard linear regression, and then consider the
most useful particular cases in more detail.
- Standard (normal) linear regression model
- Generalized linear regression model
- Theory of Generalized Linear Models
  - Model components
    - Exponential family and its properties
    - Link functions
  - Maximum likelihood estimation
    - Newton-Raphson method
    - Iteratively reweighted least squares
  - Analysis of deviance
  - Pearson statistic
  - Analysis of residuals
  - Model selection
- Particular Models
  - Binary data
  - Binomial data
  - Multinomial data
  - Poisson data
- Overdispersion & Quasi-Likelihood Models
- Nonparametric GLM
- Normal linear models with heterogeneous variance and GLM
- Generalized linear mixed effects models
- Dobson, A.J. An Introduction to Generalized Linear Models.
- McCullagh, P. and Nelder, J.A. Generalized Linear Models.
- Wood, S.N. Generalized Additive Models. An Introduction with R (Chapter 2).
- Myers, R.H. and Montgomery, D.C. A tutorial on Generalized Linear Models.
Journal of Quality Technology, 29, 274-291.
- Green, P.J. and Silverman, B.W. Nonparametric Regression and
Generalized Linear Models (Chapter 5).
The course assumes extensive use of a computer. There are no restrictions
on the statistical packages and software you may use for this course, although
the data examples considered in class will be "R-oriented".
Installation instructions and manuals for R can be found on the
R Home page.
R-based books may also be helpful for this course.
In addition, you can use various R packages that are not part of
base R. For example, Ripley's software that accompanies the book
"Modern Applied Statistics with S" is very useful.
To use Ripley's software, start R and load it with the library command.
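A minimal sketch of that command, assuming the software in question is the
MASS package that accompanies the book (the package name is an assumption
here, not something stated on this page):

```r
# Assumption: "Ripley's software" refers to the MASS package.
# Attach it for the current R session.
library(MASS)
```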
It is a good idea to add the above command to the function .First.
This function is executed automatically every time you start R,
so you will not need to give the command in every R session.
If you have not created the function .First before, define it so that,
for the time being, it contains only the library command.
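One possible way to define such a function is sketched below. It again
assumes that the library command in question loads the MASS package. R runs
.First at startup when it finds it in the saved workspace or when it is
defined in a startup file such as .Rprofile.

```r
# Assumption: the library command to run at startup is library(MASS).
# A .First function that, for now, contains only that command.
.First <- function() {
  library(MASS)
}
```

To keep the definition across sessions, either save the workspace when you
quit R or place the definition in your .Rprofile file.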
Some specific R notes you may find useful for generalized linear models:
- To fit a generalized linear model you will generally use the function glm:
  > glm(formula, family=...(link=...), ...)
- The glm function creates an object of class glm that contains most of the
  information you need. See the "Value" section of help(glm) for details.
- For some data the convergence of the iteratively reweighted least squares
  algorithm is slow and does not occur within the default number of
  iterations (25 in R). This may happen, for example, in binomial models
  with a lot of empty cells. R then gives you a warning. Don't panic! You can
  increase the number of iterations with the parameter maxit, passed either
  directly to glm or through control=glm.control(maxit=...).
- To evaluate the fitted model at some new values of the predictors use the
  function predict:
  > predict(glm.object, newdata=..., type=..., se.fit=TRUE)
- The output will contain, in particular, a vector of estimated responses
  (on the scale set by type) and a vector of standard errors for constructing
  confidence intervals for the mean response.
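
The notes above can be sketched end to end as follows. This is only an
illustration on simulated data, not one of the class examples: the data frame
dat, the variables x and y, the coefficient values used in the simulation and
the choice maxit = 50 are all assumptions made for the example.

```r
# Hypothetical simulated binary data (not from the course).
set.seed(1)
n <- 200
x <- runif(n, -2, 2)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))
dat <- data.frame(x = x, y = y)

# Fit a logistic regression by iteratively reweighted least squares,
# allowing up to 50 iterations instead of the default.
fit <- glm(y ~ x, family = binomial(link = "logit"), data = dat,
           control = glm.control(maxit = 50))

fit$converged   # TRUE if the algorithm converged
fit$iter        # number of iterations actually used
coef(fit)       # estimated regression coefficients

# Evaluate the fitted model at new predictor values on the link scale,
# together with standard errors of the linear predictor.
new <- data.frame(x = c(-1, 0, 1))
pr <- predict(fit, newdata = new, type = "link", se.fit = TRUE)

# Approximate 95% confidence intervals for the mean response:
# build them on the link scale, then map back with the inverse link.
z <- qnorm(0.975)
ci <- data.frame(
  x     = new$x,
  fit   = plogis(pr$fit),
  lower = plogis(pr$fit - z * pr$se.fit),
  upper = plogis(pr$fit + z * pr$se.fit)
)
ci
```

Building the interval on the link scale and back-transforming keeps the
limits inside (0, 1), which an interval computed directly with
type = "response" does not guarantee.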