| subject | - | an identification code; there are several observations for each subject, but because the girls were hospitalized at different ages, the number of observations, and the age at the last observation, vary |
| age | - | the subject's age in years at the time of observation; all but the last observation for each subject were collected retrospectively at intervals of two years, starting at age 8 |
| exercise | - | the amount of exercise in which the subject engaged, expressed as estimated hours per week |
| group | - | a factor indicating whether the subject is "patient" or "control" |
| lcavol | - | log(cancer volume) |
| lweight | - | log(prostate weight) |
| age | - | patient's age in years |
| lbph | - | log(benign prostatic hyperplasia amount) |
| svi | - | seminal vesicle invasion |
| lcp | - | log(capsular penetration) |
| gleason | - | Gleason score |
| pgg45 | - | percentage of Gleason scores 4 or 5 |
The goal is to predict the log of PSA, prostate-specific antigen (lpsa), from these measurements.
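
As a rough illustration of that prediction task, the sketch below fits an ordinary least-squares model with one coefficient per predictor listed in the table. It is only one possible approach, not necessarily the method intended here. The helper names `fit_ols` and `predict` are hypothetical, and the data are synthetic placeholders generated on the spot, not the real prostate measurements.

```python
import numpy as np

# Predictor names taken from the table above; lpsa is the response.
PREDICTORS = ["lcavol", "lweight", "age", "lbph", "svi", "lcp", "gleason", "pgg45"]


def fit_ols(X, y):
    """Return least-squares coefficients, intercept first (hypothetical helper)."""
    # Prepend a column of ones so the first coefficient is the intercept.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Predict the response from an intercept-first coefficient vector."""
    return coef[0] + X @ coef[1:]


# Synthetic placeholder data (assumption: NOT the real measurements).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, len(PREDICTORS)))
true_coef = np.arange(1, len(PREDICTORS) + 1) / 10.0
y = 2.0 + X @ true_coef  # noise-free, so the fit recovers the coefficients

coef = fit_ols(X, y)
for name, value in zip(["(intercept)"] + PREDICTORS, coef):
    print(f"{name:12s} {value: .3f}")
```

With the real data, `X` would hold the eight predictor columns and `y` the `lpsa` column. Standardizing the predictors first is a common choice when coefficients are to be compared or penalized.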