Workshop in Computer Science (Fall 2002):

Workshop on Data Synopses


Prof. Yossi Matias

(classes will be specified separately below)

A growing challenge in many application domains is dealing efficiently with massive data sets. Data repositories holding multiple gigabytes or terabytes of data are becoming common, and many applications need to deal with massive data sets, or with huge volumes of streaming data, such as:

Traditional data structures are not suited to handling massive data sets, since available memory and performance constraints allow only for data structures substantially smaller than such sets.

Synopsis data structures concisely capture appropriate characteristics of large data sets, using relatively small memory and allowing fast computation of approximate answers to queries. Devising effective data synopses is a fast-growing area of research.
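
As a rough illustration of the idea, the sketch below keeps a fixed-size uniform random sample of a stream (reservoir sampling) and uses it to answer approximate queries. It is only one simple example of a synopsis, not necessarily one of the techniques covered in this workshop; the class name ReservoirSample and its methods are hypothetical names chosen for this example.

```python
import random


class ReservoirSample:
    """Fixed-size uniform random sample of a stream (illustrative synopsis)."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity          # memory budget: number of items kept
        self.items = []                   # the synopsis itself
        self.seen = 0                     # number of stream items observed so far
        self._rng = random.Random(seed)   # private generator, seedable for repeatability

    def add(self, value):
        """Process one stream item in O(1) time."""
        self.seen += 1
        if len(self.items) < self.capacity:
            # The reservoir is not full yet: keep every item.
            self.items.append(value)
        else:
            # Keep the new item with probability capacity / seen,
            # replacing a uniformly chosen slot.
            j = self._rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = value

    def estimate_fraction(self, predicate):
        """Approximate the fraction of stream items satisfying predicate."""
        if not self.items:
            return 0.0
        hits = sum(1 for v in self.items if predicate(v))
        return hits / len(self.items)

    def estimate_count(self, predicate):
        """Approximate the number of stream items satisfying predicate."""
        return self.estimate_fraction(predicate) * self.seen


if __name__ == "__main__":
    sample = ReservoirSample(capacity=1000, seed=7)
    for x in range(1_000_000):
        sample.add(x)
    # Approximate share of even values, computed from 1000 stored items only.
    print(sample.estimate_fraction(lambda v: v % 2 == 0))
```

The memory used depends only on the chosen capacity, not on the length of the stream, and the accuracy of the estimates improves as the capacity grows.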

This workshop will focus on advanced techniques for synopsis data structures:

Method: The projects will be done in groups of up to three students. Together the projects are aimed at an integrated working system, yet each is designed so that it can be done independently.

