Exercise 2
Generalized Fourier Improvisations on Signal De-Noising
Question 1 (theoretical warming-up).
Assume the standard nonparametric regression model
yi=g(ti)+εi, i=1,...,n
Following the generalized Fourier series approach, we choose a certain
orthonormal basis and try to estimate the generalized Fourier coefficients
Di
of the unknown function g(t) from the empirical generalized Fourier
coefficients di of the data y.
Consider the following
penalized estimator w of D:
Sumi(di-wi)2 + h P(w) -> minw,
where P(w) is the penalty function and h is the parameter
controlling the trade-off between goodness-of-fit and penalty loss.
Show that
- P(w)=Sumiwi2 leads to
linear shrinkage of d
- P(w)=Sumi|wi| yields soft
thresholding of d
- P(w)=#{wi≠0 } corresponds to hard
thresholding of d
Question 2.
Generate a sample of size 512 from Doppler function
g(x)=sqrt(x * (1 - x)) * sin((2 * pi * (1 - e))/(x + e)), where
e=0.05,
by adding to g(x) a normal noise with σ=.05 at uniformly designed points
xi=i/512, i=1,...,512.
Plot the original Doppler signal and the generated noisy one. Now we
(actually you) will try to de-noise the noisy signal using wavelets.
- Perform the Discrete Wavelet Transform (DWT) of the noisy data and compare
it with the DWT of the noiseless Doppler signal. To "feel" wavelets better try
several mother wavelets from Daubechies family by taking different
filter.number (see Computational Comments). Note that filter.number=1
corresponds to the Haar basis.
Comment the plots of the resulting DWT coefficients.
- Threshold noisy wavelet coefficients using the universal Donoho & Johnstone
threshold and one of the adaptive thresholding rules (e.g., SURE, FDR or CV).
Plot
thresholded wavelet coefficients and compare them with those from 2.1.
- Reconstruct de-noised estimates by inverse DWT. Compare the two estimates
and explain the differences.
Compare the wavelet estimators with a smoothing spline.
Try several noisy samples from Doppler function to strengthen your conclusions.
Question 3.
The file NMR.dat contains a nuclear magnetic resonance
signal of length 1024.
- Plot the data and make some initial conclusions.
- Let's start with the Fourier cosine series. Perform the discrete cosine
transform (DCT) of the data. Unfortunately, to the best of my knowledge
(I'd be happy to be wrong!) there is no R function for DCT
but you can easily write it by yourself
(try to avoid explicit loops as much as you can otherwise your program might run
slowly for such a large data set).
The DCT of vector y is given by
c0=mean(y),
cj = (2/n)*Sumi[yi*cos(π*j*(i-1/2)/n)]
while the inverse DCT is
yi = c0 + Sumj[cj*cos(π*j*(i-1/2)/n)]
i=1,...,n; j=0,...,n-1
Look at the DCT of the original noisy signal and try to de-noise it -
truncate series, select largest coefficients (by fixing the number of
non-zero coefficients or by thresholding), etc. - improvise as much as you
want. Give the resulting DCT
estimator and comment the results.
- Now let's see what wavelets can do with this data.
Perform the DWT of the signal, threshold the DWT coefficients
by any method you have chosen (you can try several mother wavelets,
various thresholds, etc.) and compare them with
those of the original signal.
- Compare the resulting wavelet estimator with that of DCT estimator,
smoothing spline and super
smoother estimators. Comment the results.
Computational Note for R users.
To perform wavelet analysis you can use the R wavethresh package written
by Guy Nason.
In fact, there are 3 basic functions you will need in wavethresh:
- >y.wd<-wd(y,filter.numer= ,...)
- performs DWT of
y and creates a .wd object y.wd you'll need
further. The parameter filter.number selects the mother
wavelets you use in wavelet decomposition. For example, filter.number=1 corresponds to the Haar
mother wavelet. The more is the filter number, the more regular is the resulting
mother wavelet.
To plot the resulting wavelet coefficients simply do
- >plot(y.wd)
- >ythresh.wd<-threshold.wd(y.wd,policy= ,type= ,...)
- thresholds noisy wavelet
cofficients according to the type ("hard" or "soft") and the policy
("universal", "cv", "fdr",
"sure", etc.) you ask. You can plot the thresholded coefficients by
- >plot(ythresh.wd)
- >y.wr<-wr(ythresh.wd)
- performs the inverse DWT and derives the resulting thresholded wavelet
estimate.
See help files for details of their use.
More details can be obtained from the
WaveThresh Help or in the book "Wavelet Methods in Statistics with R"
by Guy Nason available in the library.