LONGITUDINAL DATA ANALYSIS


David Steinberg
Department of Statistics and Operations Research
dms@post.tau.ac.il
Office Hours: Monday 13-14.
Shreiber Building, Room 115.

Phone: +972-3-640-8035.

Class Hours:

Second Semester

What is Longitudinal Data Analysis?

Many research designs follow subjects over time. For example, in a study of a new drug to treat schizophrenia, patients were allocated at random to receive either the new drug or a standard treatment. The progress of each patient was monitored at regular time points, first weekly, then monthly. The main responses of interest were related to the patients' psychological well-being and were measured using a standard battery of questions. Other measures that were tracked related to possible side effects. For example, there was concern that one treatment might lead to overeating, so weight was one of the variables that was followed.

Many questions arise in analyzing a study like the one described above. What is the difference between the treatments? How should we account for time (measured from the start of treatment) in describing that difference? Is the difference constant over time, or does it depend on time? How should a statistical analysis account for the fact that repeated observations on the same individual will almost certainly be correlated?

The goal of the course will be to address these questions, to explore useful ways of thinking about and analyzing longitudinal data, and to gain experience with software for carrying out such analyses.
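
To make the correlation question concrete, here is a minimal sketch of one simple model for a two-arm trial like the schizophrenia study. It is an illustrative assumption and not necessarily the model used in class. All of the notation is introduced here for illustration only: the response $y_{ij}$, the treatment indicator $g_i$, the time $t_{ij}$, the random intercept $b_i$, the error $\varepsilon_{ij}$, and the parameters.

```latex
% Random-intercept model (an illustrative assumption, not the course's stated model)
% y_{ij}: response of patient i at visit j
% t_{ij}: time since the start of treatment
% g_i:    1 if patient i receives the new drug, 0 for the standard treatment
\begin{align*}
y_{ij} &= \beta_0 + \beta_1 t_{ij} + \beta_2 g_i + \beta_3 g_i t_{ij} + b_i + \varepsilon_{ij}, \\
b_i &\sim N(0, \sigma_b^2), \qquad \varepsilon_{ij} \sim N(0, \sigma^2), \qquad \text{all mutually independent}, \\
\operatorname{Cov}(y_{ij}, y_{ik}) &= \sigma_b^2 \quad (j \neq k), \qquad
\operatorname{Corr}(y_{ij}, y_{ik}) = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2}.
\end{align*}
```

Under these assumptions, $\beta_2$ is the treatment difference at time zero, $\beta_3$ describes how that difference changes with time ($\beta_3 = 0$ corresponds to a difference that is constant over time), and the shared term $b_i$ makes any two observations on the same patient positively correlated whenever $\sigma_b^2 > 0$.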

Announcements

Solutions to the theoretical and applied work in Homework 3 are now on the web site. There is also R code for the applied question. Thanks to Jonathan for sharing his good work with us.

The data and R code from the clinical trial on schizophrenia are available here.

Data file

R code

The data and R code from the rat growth study are available here.

Data file

R code

Click here for a file with output from SPSS analyzing the rat data.

Click here for a file of slides showing SPSS menu choices for analyzing the rat data.

Click here for the file with output from SPSS analyzing the vision responses. The file now includes an additional analysis that shows how to model the factorial structure for the within-subject responses.

The data file and R code for the vision experiment are available here.

Data file

R code

The data file and R code for comparing Vitamin E diets on guinea pigs are available here.

Data file

R code

You can download the slides for the first class meeting by clicking here.

Professor Geert Molenberghs has very kindly agreed to give us access to the slides that he prepared for teaching a course on longitudinal data analysis. We will make use of these slides throughout the semester. You can download them by clicking here.

Prerequisites

Statistical Theory

A course in regression or linear models is highly desirable.

First-degree students should get my permission to attend.

Course Requirements

The final grade will be based on a final exam and may also reflect homework assignments. These assignments will be given every week or two and will cover both the theory and application of the methods we study. Many of the assignments will involve use of a computer package to analyze data.

Course Content

You will become familiar with the basic ideas and methods of longitudinal data analysis. The goals of the class are:

To learn how to recognize longitudinal data settings and related settings in which study designs induce correlation among data.

To learn how to analyze and reach conclusions from longitudinal data.

To learn how to analyze repeated measures data.

To learn how to build and analyze regression models for longitudinal data.

To learn various approaches to modeling correlation.

To learn how to use the "mixed linear model".

To learn how to carry out statistical inference in longitudinal analysis.

To learn how to interpret the results of longitudinal data analyses.

To learn methods for handling incomplete data, such as dropouts and intermittently missing observations.

To learn methods for modeling longitudinal data that are binary or are counts.

To learn how to use statistical software packages to carry out longitudinal data analyses.

Helpful Reference Books

Topics:

  1. An Overview of Longitudinal Data
    • Examples of repeated measures data
    • Growth curves
    • Other typical applications
  2. Review
    • Analysis of variance
    • Regression
  3. Repeated Measures ANOVA
    • Typical applications
    • The basic model
    • Between- and within-subject comparisons
    • Statistical inference
    • Checking assumptions
  4. Models for Longitudinal Data
    • Two-stage analysis
    • The general linear mixed-effects model
  5. Exploratory Data Analysis
  6. Marginal Models
    • The marginal model
    • Estimation by restricted maximum likelihood
    • Model fitting and checking assumptions
    • Inference for fixed effects
    • Inference for variance components
    • Inference for random effects
  7. Models for Serial Correlation
  8. Heterogeneity and How to Model It
  9. Methods for Missing Data
    • The impact of missing data
    • Types of missing data
    • Methods for analysis with missing data
    • Modeling missing data
  10. Selection Models
    • Nature of the model
    • Use with growth curves
    • Pattern-mixture models
    • Sensitivity analyses
  11. Binary and Count Data
    • Some applications with binary and count data
    • Basic models
    • Statistical inference
  12. Case Studies (if time permits)

Homework

Homework 1

Data file for Homework 1.

Last update: November 11, 2009.