Department of Statistics & Operations Research

Statistics Seminars

2017/2018

To subscribe to the list, please follow this link or send an email to 12345yekutiel@post.tau.ac.il54321 (remove the numbers unless you are a spammer…)


First Semester

24 October

Haim Avron, TAU

 

Random Fourier Features for Kernel Ridge Regression: Approximation Bounds and Statistical Guarantees

14 November

Roi Weiss, Weizmann Institute

 

Nearest-Neighbor Sample Compression: Efficiency, Consistency, Infinite Dimensions

21 November

Douglas Wiens, University of Alberta

 

Robustness of Design: A Survey

5 December

Regev Schweiger, TAU

 

 

12 December

Barak Sober, TAU

 

Approximation of Functions Over Manifolds by Moving Least Squares

26 December

Avi Dgani, Geocartography

 

 

9 January

Phil Reiss, Haifa University

 

 


Second Semester

6 March

Ruth Heller, TAU

 

 

20 March

Naomi Kaplan, HUJI

 

 

29 May

Donald Rubin, Harvard


Seminars are held on Tuesdays, 10.30 am, Schreiber Building, 309 (see the TAU map). The seminar organizer is Daniel Yekutieli.

To join the seminar mailing list, or for any other inquiries, please call (03)-6409612 or email 12345yekutiel@post.tau.ac.il54321 (remove the numbers unless you are a spammer…)

 


Seminars from previous years


 

 

 

ABSTRACTS


·         Haim Avron, TAU

 

Random Fourier Features for Kernel Ridge Regression: Approximation Bounds and Statistical Guarantees

 

Random Fourier features is one of the most popular techniques for scaling up kernel methods, such as kernel ridge regression. However, despite impressive empirical results, the statistical properties of random Fourier features are still not well understood. The talk is based on a recent paper in which we take steps toward filling this gap. Specifically, we approach random Fourier features from a spectral matrix approximation point of view, give tight bounds on the number of Fourier features required to achieve a spectral approximation, and show how spectral matrix approximation bounds imply statistical guarantees for kernel ridge regression.

 

Qualitatively, our results are twofold: on the one hand, we show that random Fourier feature approximation can provably speed up kernel ridge regression under reasonable assumptions. On the other hand, we show that the method is suboptimal, and that sampling from a modified distribution in Fourier space, given by the leverage function of the kernel, yields provably better performance. We study this optimal sampling distribution for the Gaussian kernel, achieving a nearly complete characterization for the case of low-dimensional bounded datasets. Based on this characterization, we propose an efficient sampling scheme with guarantees superior to random Fourier features in this regime.

 

This is joint work with Michael Kapralov (EPFL), Cameron Musco (MIT), Christopher Musco (MIT), Ameya Velingker (EPFL), and Amir Zandieh (EPFL).
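As a rough illustration of the baseline method the talk starts from, here is a minimal sketch of random Fourier features for kernel ridge regression with a Gaussian kernel, in Python with NumPy. This is the classic Rahimi–Recht construction, not the leverage-based sampling proposed in the paper; all function names and parameter values are illustrative.

```python
import numpy as np

def fit_rff_ridge(X, y, D=500, gamma=1.0, lam=1e-4, seed=0):
    """Kernel ridge regression with D random Fourier features
    approximating the Gaussian kernel exp(-gamma * ||x - y||^2).
    Solving a D x D system replaces the exact n x n kernel solve."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies drawn from the kernel's spectral density (Bochner's theorem)
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)

    def features(A):
        # E[features(x) @ features(y)] equals the Gaussian kernel value
        return np.sqrt(2.0 / D) * np.cos(A @ W + b)

    Z = features(X)
    w = np.linalg.solve(Z.T @ Z + lam * np.eye(D), Z.T @ y)
    return lambda A: features(A) @ w

# Toy usage: recover a smooth function from noiseless samples
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X[:, 0])
predict = fit_rff_ridge(X, y, D=500, gamma=5.0, lam=1e-4)
```

The point of the approximation is computational: the exact kernel solve costs O(n^3) in the sample size, while the feature-space solve costs O(nD^2 + D^3), which is the speedup regime the talk's spectral bounds address.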

 

 

 

·         Roi Weiss, Weizmann Institute

 

Nearest-Neighbor Sample Compression: Efficiency, Consistency, Infinite Dimensions

This talk deals with Nearest-Neighbor (NN) learning algorithms in metric spaces. This seemingly naive learning paradigm remains competitive against more sophisticated methods and, in its celebrated k-NN version, has been placed on a solid theoretical foundation. Although the classic 1-NN is well known to be inconsistent in general, in recent years a series of works has presented variations on the theme of margin-regularized 1-NN algorithms, as an alternative to the Bayes-consistent k-NN. These algorithms enjoy a number of statistical and computational advantages over the traditional k-NN. Salient among these are explicit data-dependent generalization bounds and considerable runtime and memory savings.

In this talk we examine a recently proposed compression-based 1-NN algorithm, which enjoys additional advantages in the form of tighter generalization bounds and increased efficiency in time and space. We show that this algorithm is strongly Bayes-consistent in metric spaces with finite doubling dimension; this is the first compression-based multi-class 1-NN algorithm proven to be both computationally efficient and Bayes-consistent. Rather surprisingly, we discover that this algorithm continues to be Bayes-consistent even in a certain infinite-dimensional setting, in which the basic measure-theoretic conditions on which classic consistency proofs hinge are violated. This is all the more surprising since it is known that k-NN is not Bayes-consistent in this setting, raising several interesting open problems.

Joint work with Aryeh Kontorovich and Sivan Sabato.
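The general idea of compressing a sample while preserving 1-NN behavior can be illustrated with Hart's classic condensed nearest-neighbor heuristic. This is a sketch only: the margin-regularized, compression-based algorithm analyzed in the talk is a different construction, with consistency guarantees this heuristic does not have.

```python
import numpy as np

def condense(X, y):
    """Hart's condensed NN: greedily grow a subset of the sample until
    1-NN over the subset labels every training point correctly."""
    keep = [0]
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            dists = np.linalg.norm(X[keep] - X[i], axis=1)
            if y[keep[int(np.argmin(dists))]] != y[i]:
                keep.append(i)  # misclassified point joins the prototype set
                changed = True
    return np.array(keep)

def predict_1nn(Xtr, ytr, Xte):
    """Plain 1-NN prediction: label of the closest retained point."""
    dists = np.linalg.norm(Xtr[None, :, :] - Xte[:, None, :], axis=2)
    return ytr[np.argmin(dists, axis=1)]

# Toy usage: two well-separated Gaussian clusters compress to a few prototypes
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(3.0, 0.3, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
keep = condense(X, y)
```

The retained subset plays the role of the "compressed sample": the classifier's memory and prediction cost scale with the subset size rather than with the full sample.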

 

 

·         Barak Sober, TAU

 

Approximation of Functions Over Manifolds by Moving Least Squares

 

We approximate a function defined over a $d$-dimensional manifold $M⊂R^n$ utilizing only noisy function values at noisy locations on the manifold. To produce the approximation we do not require any knowledge regarding the manifold other than its dimension $d$. The approximation scheme is based upon the Manifold Moving Least-Squares (MMLS) and is therefore resistant to noise in the domain $M$ as well. Furthermore, the approximant is shown to be smooth and of approximation order $O(h^{m+1})$ for non-noisy data, where $h$ is the mesh size w.r.t. $M$ and $m$ is the degree of the local polynomial approximation. In addition, the proposed algorithm is linear in time with respect to the ambient space dimension $n$, making it useful for cases where $d \ll n$. This assumption, that the high-dimensional data is situated on (or near) a significantly lower-dimensional manifold, is prevalent in many high-dimensional problems. We put our algorithm to numerical tests against state-of-the-art algorithms for regression over manifolds and show its potential.

This talk is based upon joint work with Yariv Aizenbud and David Levin.
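To make the moving least-squares idea concrete, here is a one-dimensional sketch in a Euclidean domain (not the manifold setting of the talk): at each query point, fit a degree-$m$ polynomial by weighted least squares, with weights that decay away from the query. Function names and parameter values are illustrative.

```python
import numpy as np

def mls_eval(x0, X, y, h=0.3, m=2):
    """Moving least squares at query x0: weighted degree-m polynomial
    fit with Gaussian weights of bandwidth h; returns the fitted
    polynomial's value at x0. For smooth data the pointwise error
    behaves like O(h^{m+1})."""
    sw = np.exp(-0.25 * ((X - x0) / h) ** 2)  # sqrt of Gaussian weights
    V = np.vander(X - x0, m + 1)              # polynomial basis centered at x0
    # Weighted least squares via sqrt-weighted design matrix
    coef, *_ = np.linalg.lstsq(V * sw[:, None], sw * y, rcond=None)
    return coef[-1]  # constant term = value of the local polynomial at x0

# Toy usage: approximate sin from noiseless samples on a grid
X = np.linspace(0.0, np.pi, 60)
y = np.sin(X)
approx = mls_eval(1.0, X, y)
```

Because the polynomial basis is centered at the query point, the constant coefficient of the local fit is exactly the approximant's value there; repeating the fit at each query point yields a smooth global approximation.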

 

 

·         Douglas Wiens, University of Alberta

 

 

Robustness of Design: A Survey

 

When an experiment is conducted for purposes which include fitting a particular model to the data, then the 'optimal' experimental design is highly dependent upon the model assumptions - linearity of the response function, independence and homoscedasticity of the errors, etc. When these assumptions are violated the design can be far from optimal, and so a more robust approach is called for. We should seek a design which behaves reasonably well over a large class of plausible models.

 

I will review the progress which has been made on such problems, in a variety of experimental and modelling scenarios - prediction, extrapolation, discrimination, survey sampling, dose-response, machine learning, etc.