Exercise 3
Multiple Nonparametric Regression
(warning: beware of the curse of dimensionality!)
Question 1.
The file Air.dat contains 111 observations from an environmental study that measured four variables on 111
consecutive days: ozone (surface concentration of ozone in New York, in parts per million), radiation (solar
radiation), temperature (observed temperature, in degrees Fahrenheit) and wind (wind speed, in miles per hour).
The study investigated the influence of solar radiation, temperature and wind speed on the concentration of ozone.
- Fit a linear model to the data. Does it seem adequate?
- Fit an additive nonparametric model and compare it with the linear one.
- Perform projection pursuit regression, trying various smoothing methods and different numbers of terms (one, two,
three). Choose the most reasonable projection pursuit model. What are the resulting explanatory variables? Compare the
results with those obtained in the previous steps.
- Apply neural and/or deep neural networks (you can try various network architectures). Comment on the results.
- Summarize the results.
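As a starting point for the linear and projection pursuit steps of Question 1, the fits might be set up roughly as
in the sketch below. It is only an illustration under stated assumptions: the layout of Air.dat is not described
here, so the built-in R data set airquality (complete cases only, columns renamed) is used as a stand-in, and its
units need not match those of Air.dat. The object names air, fit_lm, fits_ppr and rss are made up for this example.

```r
# Stand-in data (an assumption): complete cases of the built-in airquality
# set, renamed to the variable names used in the exercise.
air <- na.omit(airquality[, c("Ozone", "Solar.R", "Temp", "Wind")])
names(air) <- c("ozone", "radiation", "temperature", "wind")

# Linear model; plotting resid(fit_lm) against fitted(fit_lm) is one way
# to judge whether it is adequate.
fit_lm <- lm(ozone ~ radiation + temperature + wind, data = air)

# Projection pursuit regression with one, two and three terms.
# sm.method can be "supsmu", "spline" or "gcvspline".
fits_ppr <- lapply(1:3, function(m)
  ppr(ozone ~ radiation + temperature + wind, data = air,
      nterms = m, sm.method = "supsmu"))

# Residual sums of squares on the training data for all four fits.
rss_ppr <- sapply(fits_ppr, function(f)
  sum((air$ozone - predict(f, newdata = air))^2))
rss <- c(lm = sum(resid(fit_lm)^2),
         setNames(rss_ppr, paste0("ppr", 1:3)))
print(rss)

# Projection directions of the two-term fit: each column defines one
# new explanatory variable as a linear combination of the predictors.
print(fits_ppr[[2]]$alpha)
```

Training-sample residual sums of squares always favour the more flexible fit, so they should be read together with
the plots of the fitted ridge functions (plot applied to a ppr fit) rather than used alone to choose the number of
terms.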
Question 2.
The data in the file Diabetes.dat come from a study of the factors affecting patterns of insulin-dependent diabetes
mellitus in children. The objective was to investigate the dependence of the level of serum C-peptide on various other
factors in order to understand the patterns of residual insulin secretion. The response measurement is the logarithm of
C-peptide concentration (pmol/ml) at diagnosis, and the predictor measurements are age and base deficit, a measure of
acidity.
- Plot the data. Do you think that a linear model is appropriate? Verify your initial conclusions.
- Fit additive and projection pursuit models, and comment on the results.
Computational Notes for R users.
Here is a (partial) list of R functions for multivariate nonparametric regression. See the corresponding help files for
details of their use.
- gam from the CRAN package gam fits additive models by the backfitting algorithm, using spline smoothing with an
"automatically" chosen amount of smoothing.
- ppr from the standard package stats (part of every R installation) fits projection pursuit regression using
smoothing splines or the supersmoother. Read its help page carefully.
- nnet from the CRAN package nnet fits neural networks with a single hidden layer.
- neuralnet from the CRAN package neuralnet fits neural networks with several hidden layers (deep neural networks).
Note, however, that it requires all input variables to be numeric.
- There are certainly other R packages for fitting neural and deep neural networks.
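The calls to gam, nnet and neuralnet listed above might look roughly like the following sketch. It assumes that the
packages gam and neuralnet have been installed from CRAN, and it again uses the built-in airquality data as a
stand-in for Air.dat. The network sizes, the weight decay, the seed and the convergence settings are arbitrary
illustrative choices, not recommendations, and the object names are made up for this example.

```r
library(gam)        # assumed installed: install.packages("gam")
library(nnet)
library(neuralnet)  # assumed installed: install.packages("neuralnet")

# Stand-in data (an assumption): complete cases of airquality, renamed.
air <- na.omit(airquality[, c("Ozone", "Solar.R", "Temp", "Wind")])
names(air) <- c("ozone", "radiation", "temperature", "wind")

# Additive model: one smoothing-spline term per predictor.
fit_gam <- gam(ozone ~ s(radiation) + s(temperature) + s(wind), data = air)

# Networks are usually easier to train on standardized variables.
air_s <- as.data.frame(scale(air))

# Single hidden layer with 3 units; linout = TRUE gives a linear output
# unit, as needed for regression.
set.seed(1)
fit_nn <- nnet(ozone ~ radiation + temperature + wind, data = air_s,
               size = 3, linout = TRUE, decay = 0.01, maxit = 500,
               trace = FALSE)

# Two hidden layers (4 and 2 units); threshold and stepmax are loosened
# here only to make convergence more likely in a quick trial.
fit_dnn <- neuralnet(ozone ~ radiation + temperature + wind, data = air_s,
                     hidden = c(4, 2), linear.output = TRUE,
                     threshold = 0.05, stepmax = 1e6)

# Training mean squared error of the single-layer network
# (on the standardized scale).
mse_nn <- mean((air_s$ozone - fitted(fit_nn))^2)
print(mse_nn)
```

Because network fits depend on the random starting weights, it is worth refitting with several seeds before
commenting on the results.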