This table lists the three most-cited papers published at each VLDB conference from 2000 to 2010. It was first generated on July 12, 2014 and last updated on July 1, 2016. It may contain errors; comments are more than welcome!

# Author list Paper Title Citations (July 12, 2014) Citations (July 1, 2016)

VLDB 2000

1 Mehmet Altinel, Michael J. Franklin Efficient Filtering of XML Documents for Selective Dissemination of Information Cited by 771 Cited by 833
2 Dieter Pfoser, Christian S. Jensen, Yannis Theodoridis Novel Approaches to the Indexing of Moving Object Trajectories Cited by 690 Cited by 840
3 Michelangelo Diligenti, Frans Coetzee, Steve Lawrence, C. Lee Giles, Marco Gori Focused Crawling Using Context Graphs Cited by 669 Cited by 779

VLDB 2001

1 Jayant Madhavan, Philip A. Bernstein, Erhard Rahm Generic Schema Matching with Cupid Cited by 1576 Cited by 1746
2 Quanzhong Li, Bongki Moon Indexing and Querying XML Data for Regular Path Expressions Cited by 1112 Cited by 1214
3 Valter Crescenzi, Giansalvatore Mecca, Paolo Merialdo RoadRunner: Towards Automatic Data Extraction from Large Web Sites Cited by 1086 Cited by 1269

VLDB 2002

1 Gurmeet Singh Manku, Rajeev Motwani Approximate Frequency Counts over Data Streams Cited by 1166 Cited by 1362
2 Eamonn J. Keogh Exact Indexing of Dynamic Time Warping Cited by 1100 Cited by 1512
3 Hong Hai Do, Erhard Rahm COMA - A System for Flexible Combination of Schema Matching Approaches Cited by 1098 Cited by 1262

VLDB 2003

1 Charu C. Aggarwal, Jiawei Han, Jianyong Wang, Philip S. Yu A Framework for Clustering Evolving Data Streams Cited by 1176 Cited by 1524
2 Ryan Huebsch, Joseph M. Hellerstein, Nick Lanham, Boon Thau Loo, Scott Shenker, Ion Stoica Querying the Internet with PIER Cited by 634 Cited by 684
3 Sara Cohen, Jonathan Mamou, Yaron Kanza, Yehoshua Sagiv XSEarch: A Semantic Search Engine for XML Cited by 586 Cited by 671

VLDB 2004

1 Amol Deshpande, Carlos Guestrin, Samuel Madden, Joseph M. Hellerstein, Wei Hong Model-Driven Data Acquisition in Sensor Networks Cited by 1035 Cited by 1195
2 Zoltán Gyöngyi, Hector Garcia-Molina, Jan O. Pedersen Combating Web Spam with TrustRank Cited by 871 Cited by 1088
3 Nilesh N. Dalvi, Dan Suciu Efficient Query Evaluation on Probabilistic Databases Cited by 802 Cited by 968

VLDB 2005

1 Michael Stonebraker, Daniel J. Abadi, Adam Batkin, Xuedong Chen, Mitch Cherniack, Miguel Ferreira, Edmond Lau, Amerson Lin, Samuel Madden, Elizabeth J. O'Neil, Patrick E. O'Neil, Alex Rasin, Nga Tran, Stanley B. Zdonik C-Store: A Column-oriented DBMS Cited by 751 Cited by 1025
2 Spiros Papadimitriou, Jimeng Sun, Christos Faloutsos Streaming Pattern Discovery in Multiple Time-Series Cited by 403 Cited by 486
3 Charu C. Aggarwal On k-Anonymity and the Curse of Dimensionality Cited by 388 Cited by 496
3 Varun Kacholia, Shashank Pandit, Soumen Chakrabarti, S. Sudarshan, Rushi Desai, Hrishikesh Karambelkar Bidirectional Expansion For Keyword Search on Graph Databases Cited by 388 Cited by 479

VLDB 2006

1 Mohamed F. Mokbel, Chi-Yin Chow, Walid G. Aref The New Casper: Query Processing for Location Services without Compromising Privacy Cited by 708 Cited by 986
2 Xiaokui Xiao, Yufei Tao Anatomy: Simple and Effective Privacy Preservation Cited by 493 Cited by 637
3 Omar Benjelloun, Anish Das Sarma, Alon Y. Halevy, Jennifer Widom ULDBs: Databases with Uncertainty and Lineage Cited by 484 Cited by 528

VLDB 2007

1 Daniel J. Abadi, Adam Marcus, Samuel Madden, Katherine J. Hollenbach Scalable Semantic Web Data Management Using Vertical Partitioning Cited by 477 Cited by 623
2 Jian Pei, Bin Jiang, Xuemin Lin, Yidong Yuan Probabilistic Skylines on Uncertain Data Cited by 369 Cited by 442
3 Qin Lv, William Josephson, Zhe Wang, Moses Charikar, Kai Li Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search Cited by 278 Cited by 460

VLDB 2008

1 Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, Ramana Yerneni PNUTS: Yahoo!'s hosted data serving platform Cited by 625 Cited by 939
2 Ronnie Chaiken, Bob Jenkins, Per-Åke Larson, Bill Ramsey, Darren Shakib, Simon Weaver, Jingren Zhou SCOPE: easy and efficient parallel processing of massive data sets Cited by 442 Cited by 642
3 Hui Ding, Goce Trajcevski, Peter Scheuermann, Xiaoyue Wang, Eamonn J. Keogh Querying and mining of time series data: experimental comparison of representations and distance measures Cited by 392 Cited by 634

PVLDB 2009

1 Ashish Thusoo, Joydeep Sen Sarma, Namit Jain, Zheng Shao, Prasad Chakka, Suresh Anthony, Hao Liu, Pete Wyckoff, Raghotham Murthy Hive - A Warehousing Solution Over a Map-Reduce Framework Cited by 572 Cited by 1104
2 Azza Abouzeid, Kamil Bajda-Pawlikowski, Daniel J. Abadi, Alexander Rasin, Avi Silberschatz HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads Cited by 551 Cited by 833
3 Alan Gates, Olga Natkovich, Shubham Chopra, Pradeep Kamath, Shravan Narayanam, Christopher Olston, Benjamin Reed, Santhosh Srinivasan, Utkarsh Srivastava Building a High-Level Dataflow System on top of MapReduce: The Pig Experience Cited by 241 Cited by 384

PVLDB 2010

1 Yingyi Bu, Bill Howe, Magdalena Balazinska, Michael D. Ernst HaLoop: Efficient Iterative Data Processing on Large Clusters Cited by 383 Cited by 651
2 Jörg Schad, Jens Dittrich, Jorge-Arnulfo Quiané-Ruiz Runtime Measurements in the Cloud: Observing, Analyzing, and Reducing Variance Cited by 246 Cited by 420
3 Jens Dittrich, Jorge-Arnulfo Quiané-Ruiz, Alekh Jindal, Yagiz Kargin, Vinay Setty, Jörg Schad Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing) Cited by 223 Cited by 374