-----------
Tel-Aviv University - Computer Science Colloquium

Sunday, May 28, 14:15-15:15
COFFEE at 14:00

Schreiber Building, Room 309
-----------

Meaningful pairwise data-clustering

Naftali Tishby

The Hebrew University

Abstract:

Data clustering is a fundamental data analysis problem, with numerous
applications to pattern recognition, signal processing, and learning.
Yet the problem itself is too often ill-defined, and there is confusion
between the details of particular algorithms and their goals. In this talk
we present a simple non-parametric pairwise clustering method which is
both principled and very general.

The algorithm is based on a novel information-theoretic method
for extracting relevant structures from complex data, the "information
bottleneck method". Given any two non-independent random variables, we
propose to compress one of the variables under a constraint on the mutual
information with the other one. This general principle yields, perhaps
surprisingly, an exact implicit solution which can be obtained via several
converging algorithms. It also provides a general and rich framework for
discussing various problems in signal and data analysis and in machine
learning. In this talk I will discuss the application of this principle to
pairwise data clustering and to joint sequence analysis.

Based partly on joint work with William Bialek and Noam Slonim.

-----------

For colloquium schedule, see http://www.math.tau.ac.il/~zwick/colloq.html