Hao Wu
Ph.D. Candidate
Database Research Group
Division of Computer Software
Department of Computer Science and Technology
Tsinghua University, Beijing 100084, China

Welcome to my homepage. I'm a Ph.D. candidate in the Database Research Group at Tsinghua University, Beijing. My supervisor is Prof. Lizhu Zhou. My current research topic is keyword search over relational databases.

Education 
  • Ph.D. Candidate: Department of Computer Science and Technology, Tsinghua University, Sep 2006 - present.
  • Bachelor: Department of Computer Science and Technology, Tsinghua University, Sep 2002 - Jul 2006.
Publications 
  • Guoliang Li, Ju Fan, Hao Wu, Jiannan Wang, and Jianhua Feng. DBease: Making Databases User-Friendly and Easily Accessible. CIDR, 2011.
  • Paper: PDF

    Abstract – Structured query language (SQL) is a classical way to access relational databases. Although SQL is powerful to query relational databases, it is rather hard for inexperienced users to pose SQL queries, as they are required to be familiar with SQL syntax and have a thorough understanding of the underlying schema. To provide an alternative search paradigm, keyword search and form-based search are proposed, which only need users to type in keywords in single or multiple input boxes and return answers after users submit a query with complete keywords. However users often feel left in the dark when they have limited knowledge about the underlying data, and have to use a try-and-see approach for finding information. A recent trend of supporting autocomplete in these systems is a first step towards solving this problem. In this paper, we propose a new search method DBease to make databases user-friendly and easily accessible. DBease allows users to explore data on the fly as they type in keywords, even in the presence of minor errors. DBease has the following unique features. Firstly, DBease can find answers as users type in keywords in single or multiple input boxes. Secondly, DBease can tolerate errors and inconsistencies between query keywords and the data. Thirdly, DBease can suggest SQL queries based on limited query keywords. We study research challenges in this framework for large amounts of data. We have deployed several real prototypes, which have been used regularly and well accepted by users due to its friendly interface and high efficiency.

  • Hao Wu, Guoliang Li, Chen Li, Lizhu Zhou. Seaform: Search-As-You-Type in Forms. VLDB (Demo), 2010.
  • Paper: PDF  |  Poster: PDF, JPEG (full size), JPEG (1024×1024)  |  Video: GIF

    Abstract – Form-style interfaces have been widely used to allow users to access information. In this demonstration paper, we develop a new search paradigm in form-style query interfaces, called Seaform (which stands for Search-As-You-Type in Forms), which computes answers on-the-fly as a user types in a query letter by letter and gives the user instant feedback. Seaform provides better user experiences compared with traditional form-based query systems by reducing the efforts for a user to compose a high-quality query to find relevant answers. Seaform can also enhance faceted search and allow users to on-the-fly explore the underlying data. This search paradigm requires high performance to achieve an interactive speed. We develop efficient techniques and use them to implement two systems on real datasets. We demonstrate the features of these systems.

  • Hao Wu. Search-As-You-Type in Forms: Leveraging the Usability and the Functionality of Search Paradigm in Relational Databases. VLDB (PhD Workshop), 2010.
  • Paper: PDF  |  Slides @ VLDB 2010 Conference: PDF

    Abstract – Querying, or searching, is one of the most important issues in relational databases. There are many search paradigms, such as Structured Query Language (SQL), keyword search, and form search, a.k.a. Query-By-Example (QBE). Among them, QBE is a good trade-off between usability and functionality. However, existing QBE systems are often inconvenient for users to compose high-quality queries quickly. In this PhD workshop paper we investigate the problem of improving the usability of form-based interfaces by enabling them to (1) respond to a query in real time and (2) tolerate the misplacing of keywords among input boxes. We give the research challenges for achieving high performance and scalability, and introduce two of our prototype systems.

  • Ju Fan, Hao Wu, Guoliang Li, Lizhu Zhou. Suggesting Topic-Based Query Terms as You Type. APWeb, 2010.
  • Paper: PDF

    Abstract – Query term suggestion that interactively expands the queries is an indispensable technique to help users formulate high-quality queries and has attracted much attention in the community of web search. Existing methods usually suggest terms based on statistics in documents as well as query logs and external dictionaries, and they neglect the fact that the topic information is very crucial because it helps retrieve topically relevant documents. To give users gratification, we propose a novel term suggestion method: as the user types in queries letter by letter, we suggest the terms that are topically coherent with the query and could retrieve relevant documents instantly. For effectively suggesting highly relevant terms, we propose a generative model by incorporating the topical coherence of terms. The model learns the topics from the underlying documents based on Latent Dirichlet Allocation (LDA). For achieving the goal of instant query suggestion, we use a trie structure to index and access terms. We devise an efficient top-k algorithm to suggest terms as users type in queries. Experimental results show that our approach not only improves the effectiveness of term suggestion, but also achieves better efficiency and scalability.
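
    Illustration (not the algorithm from the paper): the abstract above mentions indexing terms in a trie and returning the top-k suggestions as the user types. A minimal Python sketch of that general idea follows. The names `TrieNode` and `TermTrie`, the example scores, and the exhaustive scan of the matched subtree are all assumptions made for this sketch; the paper's generative model, LDA-based topic scores, and efficient top-k algorithm are not reproduced here.

```python
class TrieNode:
    """One character position in the trie (hypothetical helper)."""

    def __init__(self):
        self.children = {}   # maps a character to the next TrieNode
        self.score = None    # set only on nodes that end a complete term


class TermTrie:
    """Prefix index over scored terms (illustrative, not the paper's index)."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, term, score):
        # Walk or create one node per character, then mark the term's end.
        node = self.root
        for ch in term:
            node = node.children.setdefault(ch, TrieNode())
        node.score = score

    def suggest(self, prefix, k=3):
        # Step 1: descend to the node that represents the typed prefix.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []  # no indexed term starts with this prefix

        # Step 2: collect every complete term below that node.
        # (Assumption: a plain scan; a real system would prune the search.)
        found = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.score is not None:
                found.append((current.score, text))
            for ch, child in current.children.items():
                stack.append((child, text + ch))

        # Step 3: keep the k highest-scoring terms; ties break alphabetically.
        found.sort(key=lambda pair: (-pair[0], pair[1]))
        return [text for _, text in found[:k]]


# Example terms with made-up scores standing in for topical relevance.
trie = TermTrie()
for term, score in [("database", 0.9), ("data", 0.7),
                    ("datalog", 0.4), ("query", 0.8)]:
    trie.insert(term, score)

top_two = trie.suggest("dat", k=2)
```

    Calling `trie.suggest(...)` again after each keystroke gives the letter-by-letter behavior described in the abstract. The scan in step 2 touches every term under the prefix, so it only suits small vocabularies.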

Projects 
  • Tastier: A joint research project between Tsinghua University and UC Irvine. It focuses on efficient autocompletion and type-ahead search over large data sets of various types, such as relational data, documents, and semi-structured data. "Tastier" stands for type-ahead search techniques in large data sets.

  • DBease (Directed by Prof. Jianhua Feng and Prof. Guoliang Li): A research project focused on providing user-friendly interfaces for accessing and searching databases. "DBease" stands for making DataBases user-friendly and easily accessible.

Systems 

Seaform: A keyword search prototype system that can search a relational table in real time using a form-style user interface. It is now a part of our DBease project. The current version of Seaform can search 1.4 million publications in the DBLP dataset with lightning speed. Try Seaform >>

Collaborators & Friends 

© Copyright 2010 Hao Wu