Exercise 2

Generalized Fourier Improvisations on Signal De-Noising

Question 1 (theoretical warming-up).

Assume the standard nonparametric regression model

yi=g(ti)+εi, i=1,...,n

Following the generalized Fourier series approach, we choose a certain orthonormal basis and try to estimate the generalized Fourier coefficients Di of the unknown function g(t) from the empirical generalized Fourier coefficients di of the data y. Consider the following penalized estimator w of D:

Sumi(di-wi)2 + h P(w) -> minw,

where P(w) is the penalty function and h is the parameter controlling the trade-off between goodness-of-fit and penalty loss.

Show that

  1. P(w)=Sumiwi2 leads to linear shrinkage of d
  2. P(w)=Sumi|wi| yields soft thresholding of d
  3. P(w)=#{wi≠0 } corresponds to hard thresholding of d

Question 2.

Generate a sample of size 512 from Doppler function g(x)=sqrt(x * (1 - x)) * sin((2 * pi * (1 - e))/(x + e)), where e=0.05, by adding to g(x) a normal noise with σ=.05 at uniformly designed points xi=i/512, i=1,...,512. Plot the original Doppler signal and the generated noisy one. Now we (actually you) will try to de-noise the noisy signal using wavelets.
  1. Perform the Discrete Wavelet Transform (DWT) of the noisy data and compare it with the DWT of the noiseless Doppler signal. To "feel" wavelets better try several mother wavelets from Daubechies family by taking different filter.number (see Computational Comments). Note that filter.number=1 corresponds to the Haar basis. Comment the plots of the resulting DWT coefficients.
  2. Threshold noisy wavelet coefficients using the universal Donoho & Johnstone threshold and one of the adaptive thresholding rules (e.g., SURE, FDR or CV). Plot thresholded wavelet coefficients and compare them with those from 2.1.
  3. Reconstruct de-noised estimates by inverse DWT. Compare the two estimates and explain the differences. Compare the wavelet estimators with a smoothing spline.
Try several noisy samples from Doppler function to strengthen your conclusions.

Question 3.

The file NMR.dat contains a nuclear magnetic resonance signal of length 1024.
  1. Plot the data and make some initial conclusions.
  2. Let's start with the Fourier cosine series. Perform the discrete cosine transform (DCT) of the data. Unfortunately, to the best of my knowledge (I'd be happy to be wrong!) there is no R function for DCT but you can easily write it by yourself (try to avoid explicit loops as much as you can otherwise your program might run slowly for such a large data set). The DCT of vector y is given by

    c0=mean(y), cj = (2/n)*Sumi[yi*cos(π*j*(i-1/2)/n)]

    while the inverse DCT is

    yi = c0 + Sumj[cj*cos(π*j*(i-1/2)/n)]

    i=1,...,n; j=0,...,n-1

    Look at the DCT of the original noisy signal and try to de-noise it - truncate series, select largest coefficients (by fixing the number of non-zero coefficients or by thresholding), etc. - improvise as much as you want. Give the resulting DCT estimator and comment the results.

  3. Now let's see what wavelets can do with this data. Perform the DWT of the signal, threshold the DWT coefficients by any method you have chosen (you can try several mother wavelets, various thresholds, etc.) and compare them with those of the original signal.
  4. Compare the resulting wavelet estimator with that of DCT estimator, smoothing spline and super smoother estimators. Comment the results.


Computational Note for R users.

To perform wavelet analysis you can use the R wavethresh package written by Guy Nason. In fact, there are 3 basic functions you will need in wavethresh:

See help files for details of their use. More details can be obtained from the WaveThresh Help or in the book "Wavelet Methods in Statistics with R" by Guy Nason available in the library.