Seminar: Managing Information on the Web


Tova Milo, Winter 2010


Seminar Information

The seminar focuses on managing, analyzing, sharing, and integrating data and applications across multiple sources, whether on the Internet or within enterprises. This topic has received much attention in the database, AI, Web, IR, and verification communities. We shall read recent papers in this area, focusing on several specific issues, and then explore possible future directions. A tentative list of topics and papers follows.




         Probabilistic Data

1.       Scalable Probabilistic Databases with Factor Graphs and MCMC, Michael Wick, Andrew McCallum, Gerome Miklau, VLDB 2010 [Boris Kostenko 3/11]

2.       Querying Probabilistic Information Extraction, Daisy Zhe Wang, Michael Franklin, Minos Garofalakis, Joseph Hellerstein, VLDB 2010

3.       Lineage Processing over Correlated Probabilistic Databases, Bhargav Kanagal (University of Maryland), Amol Deshpande (University of Maryland), SIGMOD 2010

4.       Evaluation of Probabilistic Threshold Queries in MCDB, Luis Perez (Rice University), Subi Arumugam (University of Florida), Christopher Jermaine (Rice University), SIGMOD 2010

5.       MCDB-R: Risk Analysis in the Database



         Query Processing

1.      Why Not?, Adriane Chapman, H. V. Jagadish, SIGMOD 2009 [Alon Vekker 17/11]

2.      How to ConQueR Why-Not Questions, Quoc Trung Tran, Chee-Yong Chan, SIGMOD 2010 [Slava Novgorodov 24/11]



         Web, Recommendations and Social Networks

1.       Active Knowledge: Dynamically Enriching RDF Knowledge Bases by Web Services, Nicoleta Preda, Fabian Suchanek, Gjergji Kasneci, Thomas Neumann, Wenjun Yuan, Gerhard Weikum, SIGMOD 2010 [Ohad Greenshpan 8/12]

2.       Multiple Features Fusion for Social Media Applications, Bin Cui, Anthony Tung, Ce Zhang, Zhe Zhao, SIGMOD 2010 [Ori Folger 15/12]

3.       Recsplorer: Recommendation Algorithms based on Precedence Mining, Aditya Parameswaran, Georgia Koutrika, Benjamin Bercovitz, Hector Garcia-Molina, SIGMOD 2010 [Rubi Boim 22/12]

4.       Load-Balanced Query Dissemination in Democratic Communities, Emiran Curtmola, Alin Deutsch, K.K. Ramakrishnan, Divesh Srivastava, SIGMOD 2010



         Data Exchange, Extraction and Integration

1.       Towards The Web of Concepts: Extracting Concepts from Large Datasets, Aditya Parameswaran, Hector Garcia-Molina, Anand Rajaraman, VLDB 2010 [Tom Yam 5/1]

2.       MapMerge: Correlating Independent Schema Mappings, Bogdan Alexe, Mauricio Hernandez, Lucian Popa, Wang-Chiew Tan, VLDB 2010

3.       Evaluating Entity Resolution Results, David Menestrina, Steven Whang, Hector Garcia-Molina, VLDB 2010

4.       Entity Resolution with Evolving Rules, Steven Whang, Hector Garcia-Molina, VLDB 2010

5.       Exploiting Content Redundancy for Web Information Extraction, Pankaj Gulhane, Rajeev Rastogi, Srinivasan Sengamedu, Ashwin Tengli, VLDB 2010

6.       Automatic Rule Refinement for Information Extraction, Bin Liu, Laura Chiticariu, Vivian Chu, H. Jagadish, Frederick Reiss, VLDB 2010



         DBs and Flash

1.       FlashStore: High Throughput Persistent Key-Value Store, B. Debnath, S. Sengupta, J. Li, VLDB 2010 [Aviad Zuck 12/1]



         Provenance

1.       Efficient Querying and Maintenance of Network Provenance at Internet-Scale, Wenchao Zhou, Micah Sherr, Tao Tao, Xiaozhou Li, Boon Thau Loo, Yun Mao, SIGMOD 2010

2.       Querying Data Provenance, Grigoris Karvounarakis, Zachary Ives, Val Tannen, SIGMOD 2010

3.       TRAMP: Understanding the Behavior of Schema Mappings through Provenance, Boris Glavic, Gustavo Alonso, Renée Miller, Laura Haas, VLDB 2010



         Cloud Computing

1.       MRShare: Sharing Across Multiple Queries in MapReduce, Tomasz Nykiel, Michalis Potamias, Chaitanya Mishra, George Kollios, Nick Koudas, VLDB 2010

2.       Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing), Jens Dittrich, Jorge Quiane, Alekh Jindal, Yagiz Kargin, Vinay Setty, Jörg Schad, VLDB 2010