Linear Models
(0365.4004)
Purpose
Regression analysis plays a central role in statistics, being one of its most powerful and commonly
used techniques. It deals with finding appropriate models to represent relationships between a
response variable and a set of explanatory variables, based on data collected from a series of
experiments. These models are used both to describe existing data and to predict new observations.
The basic regression models are linear ones. Although they are the simplest and (hence) the most
thoroughly studied models, they nevertheless work well in numerous problems. Sometimes even a
non-linear model can be converted to a linear one by suitable transformations of variables; in other
cases, linearization of complex non-linear models may be used. In this course we'll try to understand
how linear models work and when they can be used efficiently.
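As a small illustration of transforming a non-linear model into a linear one, the sketch below simulates data from a hypothetical exponential model y = a * exp(b * x) with multiplicative noise, then fits a straight line to log(y). The parameter values, sample size, and noise level are assumptions chosen only for illustration.

```r
# Simulated data from a hypothetical exponential model y = a * exp(b * x) * error
set.seed(1)
a <- 2
b <- 0.3
x <- seq(1, 10, length.out = 50)
y <- a * exp(b * x) * exp(rnorm(50, sd = 0.1))

# Taking logs gives log(y) = log(a) + b * x + noise, which is linear in the parameters
fit <- lm(log(y) ~ x)

a_hat <- unname(exp(coef(fit)[1]))  # back-transform the intercept to estimate a
b_hat <- unname(coef(fit)[2])       # the slope estimates b directly
print(c(a = a_hat, b = b_hat))
```

The log transformation works here because the noise was assumed to be multiplicative; with additive noise the transformed model would no longer have constant error variance.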
Topics:
- Introduction
  - regression models
  - linear regression models, examples of linear regression models
- Least Squares Estimates
  - derivation of LSE for regression coefficients
  - statistical properties of LSE
  - Gauss-Markov theorem
  - geometrical interpretation of LSE
  - multiple correlation coefficient
- Statistical Inference
  - maximum likelihood estimators for normal models
  - confidence intervals and confidence regions for regression coefficients
  - hypothesis testing: t-test, likelihood ratio test (F-test)
- Model Criticism
  - analysis of residuals
  - influential observations
  - the Box-Cox transformation family
- Prediction and Forecasting
- Model Selection
  - criteria for model selection: correlation coefficient, penalized least squares, cross-validation
  - model selection and dimensionality reduction in high dimensions: stepwise procedures, lasso,
    principal component regression, partial least squares
- Some Special Topics:
  - ridge regression
  - polynomial regression, orthogonal polynomials
  - piecewise-polynomial regression, splines
- Generalized Least Squares
  - motivation, derivation of generalized LSE for regression coefficients
  - some special cases: unequal variances, repeated measurements, hierarchical models
- Random and Mixed Effects Models
  - ANOVA models with fixed and random effects
  - variance component (mixed effects) models
- Nonlinear Regression
  - least squares estimation, the Gauss-Newton method
  - statistical inference
- Generalized Linear Models
  - definition, examples
  - maximum likelihood estimation, iteratively reweighted least squares
  - goodness-of-fit
  - particular models (logistic regression, log-linear Poisson model)
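As a small taste of the least squares topic listed above, the following sketch computes the least squares estimate by solving the normal equations and compares it with the output of R's lm() function. The choice of the built-in cars data set is only an illustration and is not taken from the course materials.

```r
# Built-in 'cars' data: stopping distance (dist) versus speed
X <- cbind(1, cars$speed)   # design matrix with an intercept column
y <- cars$dist

# Least squares estimate: solve the normal equations (X'X) beta = X'y
beta_hat <- drop(solve(crossprod(X), crossprod(X, y)))

# The same estimate from R's built-in fitting function
fit <- lm(dist ~ speed, data = cars)

print(beta_hat)
print(coef(fit))
```

In practice lm() uses a QR decomposition rather than the normal equations, which is numerically more stable, but both routes give the same coefficients on well-conditioned data such as this.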
Literature
- Draper, N. and Smith, H. Applied Regression Analysis.
- Faraway, J.J. Linear Models with R.
- Rao, C.R. and Toutenburg, H. Linear Models. Least Squares and Alternatives.
- Ryan, T.P. Modern Regression Methods.
- Seber, G.A. Linear Regression Analysis.
- Sen, A. and Srivastava, M. Regression Analysis: Theory, Methods and Applications.
- and many more
Example files:
Homework Exercises:
Exams:
Computing:
The course assumes extensive use of a computer. There are no restrictions on which statistical
packages and software you use for this course, although the data examples considered in class will be
R-oriented. Installation instructions and manuals for R can be found on the
R Home page. The following R-based books may be helpful for this course:
- Aitkin, M., Francis, B., Hinde, J. and Darnell, R. Statistical Modelling in R.
- Faraway, J.J. Linear Models with R.
- James, G., Witten, D., Hastie, T. and Tibshirani, R. An Introduction to Statistical Learning with Applications in R.
- Venables, W.N. and Ripley, B.D. Modern Applied Statistics with S.
Those who are not familiar with this powerful and effective statistical language, or who want to refresh
their knowledge of S-Plus, can find tutorial material in the following documents (which can also be printed):
Two recommended books on S-Plus are
- Venables, W.N. and Ripley, B.D. Modern Applied Statistics with S-Plus.
- Chambers, J.M. and Hastie, T.J. Statistical Models in S.
A short basic description of S-Plus and linear modelling in S-Plus can be found by clicking
HERE. Advanced S-Plus users can find some useful
information about various aspects of S-Plus by clicking
HERE.
In addition, you can use various R packages that are not loaded by default when R starts. For example,
Ripley's very useful software accompanies the book "Modern Applied Statistics with S". To use it,
start R and give the command:
library(MASS)
It is a good idea to add this command to the function .First. This function is executed automatically
every time you start R, so you won't need to give the command in every R session. If you have not
created the function .First before, do it by
.First <- function() { library(MASS) }
This creates the R function .First, which for now contains only the library command. For .First to be
found at startup, it must be saved with your workspace or defined in your .Rprofile file.
To find information on a specific item, use the R help system: open a help window with
help.start()
and search for the topic you are interested in, for example "linear models" or "model
selection".
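The help system can also be searched from the command line. The sketch below uses only functions shipped with R (help.search and apropos from the utils package); the search strings are illustrations, and the search is restricted to the stats package here only to keep it quick.

```r
# Search installed documentation for a phrase (approximate match on
# titles, aliases and concepts), restricted to the 'stats' package
hits <- help.search("linear models", package = "stats")
head(hits$matches)   # data frame of matching help topics

# List objects on the search path whose names contain "lm"
found <- apropos("lm")
print(found)

# Open the help page for a specific function
# help("lm")   # uncomment in an interactive session
```

Dropping the package argument searches all installed packages, which is slower the first time but finds topics such as model selection tools in MASS.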