Generalized Linear Models
(0365.4006)
Purpose
Regression analysis plays a central role in statistics, being one of its most powerful and commonly
used techniques. Standard linear regression models assume that the response variable is normal (or at least can
be transformed to a normal one). However, unfortunately (or fortunately?), this is not always the case. Models
with a categorical or count response are typical (although not the only!) examples where the
assumption of normality cannot be accepted as reasonable. In this course we study
generalized linear models, where the response variables are allowed to be non-normal. We start from the
general theory of generalized linear models, extending the corresponding results for standard linear regression,
and then consider the most useful particular cases in more detail.
Topics:
- Introduction
  - Standard (normal) linear regression model
  - Generalized linear regression model
- Theory of Generalized Linear Models
  - Model components
    - exponential family and its properties
    - link functions
  - Maximum likelihood estimation
    - Newton-Raphson method
    - iteratively reweighted least squares
  - Goodness-of-fit
    - analysis of deviance
    - Pearson statistic
    - analysis of residuals
  - Model selection
- Particular Models
  - Binary data
  - Binomial data
  - Multinomial data
  - Poisson data
- Overdispersion & Quasi-Likelihood Models
- Nonparametric GLM
- Normal linear models with heterogeneous variance and GLM
- Generalized linear mixed effects models
Literature
- Dobson, A.J. An Introduction to Generalized Linear Models.
- McCullagh, P. and Nelder, J.A. Generalized Linear Models.
- Wood, S.N. Generalized Additive Models. An Introduction with R (Chapter 2).
- Myers, R.H. and Montgomery, D.C. A Tutorial on Generalized Linear Models. Journal of Quality Technology, 29, 274-291.
- Green, P.J. and Silverman, B.W. Nonparametric Regression and Generalized Linear Models (Chapter 5).
Computing:
The course assumes extensive use of a computer. There are no limitations on using various
statistical packages and software for this course, although the data examples considered in class will be
"R-oriented". Installation instructions and manuals for R can be found on the
R Home page. R-based books may also be helpful for this course.
In addition, you can use various R packages that are not loaded by default in a standard R session. For example,
you may find very useful Ripley's software provided with the book "Modern Applied Statistics with S" (the MASS
package). To use Ripley's software, start R and give the command:
library(MASS)
It is a good idea to add the above command to the function .First. This function is executed automatically
every time you start R, so you won't need to give the command in every R session. If you have not created the
function .First before, do so by
.First <- function() { library(MASS) }
which creates an R function .First that, for the moment, contains only the library command. For .First to be
found at startup, its definition must be saved with your workspace or placed in your .Rprofile file.
Some specific R notes you may find useful for generalized linear models:
- To fit a generalized linear model you will generally use the function glm:
glm(formula, family=...(link=...),...)
The glm function creates an object of class glm that contains most of the information you need. See
help(glm)
for details (the Value section lists the components of the fitted object).
- For some data the convergence of the iteratively reweighted least squares algorithm is slow and does not
occur within the default number of iterations (25 in R's glm.control). This may happen, for example, in binomial
models with many empty cells. R then issues a warning. Don't panic! You can increase the number of iterations
with the parameter maxit:
glm(formula,...,maxit=...)
- To evaluate the fitted model at new values of the predictors use the function predict:
predict(glm.object, newdata=..., type=..., se.fit=TRUE)
The output will contain, in particular, a vector of estimated responses (on the scale determined by type) and a vector of standard errors for constructing confidence intervals for the mean response.
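The three notes above can be put together in one short session. The following is only a minimal sketch under stated assumptions: the data are simulated here (they are not a course data set), the names dat, fit, new and ci are hypothetical, and a Poisson model with log link is chosen purely for illustration. It fits the model with a raised iteration limit, checks convergence, and builds approximate 95% confidence intervals for the mean response at new predictor values.

```r
# Simulated count data: an assumption for illustration, not course data
set.seed(1)
x <- seq(0, 2, length.out = 50)
y <- rpois(50, lambda = exp(0.5 + 0.8 * x))
dat <- data.frame(x = x, y = y)

# Fit a Poisson GLM with log link; maxit is passed on to glm.control
fit <- glm(y ~ x, family = poisson(link = "log"), data = dat, maxit = 50)

# Components of the fitted glm object
fit$converged    # did iteratively reweighted least squares converge?
fit$iter         # number of iterations actually used
coef(fit)        # estimated regression coefficients
deviance(fit)    # residual deviance

# Predictions at new predictor values, on the link (linear predictor) scale
new <- data.frame(x = c(0.5, 1.0, 1.5))
p <- predict(fit, newdata = new, type = "link", se.fit = TRUE)

# Approximate 95% Wald interval on the link scale, mapped back to the
# response scale with the inverse link of the fitted family
inv <- family(fit)$linkinv
z <- qnorm(0.975)
ci <- data.frame(
  x     = new$x,
  mean  = inv(p$fit),
  lower = inv(p$fit - z * p$se.fit),
  upper = inv(p$fit + z * p$se.fit)
)
print(ci)
```

Building the interval on the link scale and then transforming it keeps the limits inside the valid range of the mean (positive for a Poisson model). Using type="response" directly would instead return standard errors on the response scale.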