WikiQuery
Computational Repeatability: The WikiQuery Case Study
The Experiment
WikiQuery is a system that supports both keyword-based and structured queries (c-queries) over Wikipedia documents. A novel feature of WikiQuery is that it returns answers that span multiple documents.
WikiQuery receives as input a set of Wikipedia documents stored in a MySQL database and uses these documents to create an in-memory index. Then, given a set of queries, it creates one file per query containing the list of answers for that query. The goal of our experimental evaluation was to assess the quality of these answers and compare them against the answers returned by the BANKS system. The comparison of the answers returned by the two systems is manual.
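The pipeline described above (documents in, in-memory index, one answer file per query) can be illustrated with a minimal sketch. This is not the WikiQuery implementation: the function names, the plain dictionary standing in for the MySQL database, the whitespace tokenizer, and the output file naming are all assumptions made for illustration, and the sketch returns only single documents, whereas WikiQuery answers can span multiple documents.

```python
import os
from collections import defaultdict


def build_index(documents):
    """Build an in-memory inverted index: token -> set of document ids.

    `documents` is a dict {doc_id: text}; it stands in for the rows that
    WikiQuery would read from MySQL (hypothetical layout).
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def answer_query(index, query):
    """Return the sorted ids of documents that contain every query keyword."""
    tokens = query.lower().split()
    if not tokens:
        return []
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return sorted(result)


def write_answers(index, queries, out_dir):
    """Write one file per query listing its answers, one document id per line."""
    paths = []
    for i, query in enumerate(queries):
        # File naming is an assumption, not the format WikiQuery uses.
        path = os.path.join(out_dir, f"query_{i}.txt")
        with open(path, "w", encoding="utf-8") as f:
            for doc_id in answer_query(index, query):
                f.write(f"{doc_id}\n")
        paths.append(path)
    return paths
```

Keeping index construction separate from query answering mirrors the description: the index is built once from the document set and then reused for every query in the batch.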
Besides the actual query results, for each answer we also include additional metadata, for example, the index entry and the scores derived by the WikiQuery system. For our experiments, human evaluators compared the accuracy of WikiQuery with that of BANKS by running the same input queries and examining the answers derived by the two systems. For each answer, they indicated whether it was "highly relevant," "relevant," "somewhat relevant," "undecided," or "not relevant."
For details, see the paper at http://www.cs.utah.edu/~juliana/pub/wikiquery-webdb2010.pdf.
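Since the comparison is manual, the evaluators' labels need to be aggregated per system before the two systems can be compared. The following is a minimal sketch of such a tally, assuming each judgment is recorded as a (system, query, label) tuple; that record layout and the function names are hypothetical and are not taken from the WikiQuery package.

```python
from collections import Counter

# The five labels available to evaluators, from most to least relevant.
LABELS = (
    "highly relevant",
    "relevant",
    "somewhat relevant",
    "undecided",
    "not relevant",
)


def tally_judgments(judgments):
    """Count how often each label was assigned to each system's answers.

    `judgments` is an iterable of (system, query, label) tuples
    (hypothetical layout). Returns {system: Counter({label: count})}.
    """
    counts = {}
    for system, _query, label in judgments:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label!r}")
        counts.setdefault(system, Counter())[label] += 1
    return counts


def format_table(counts):
    """Render one line per system with the count for every label."""
    lines = []
    for system in sorted(counts):
        cells = ", ".join(f"{label}: {counts[system][label]}" for label in LABELS)
        lines.append(f"{system}: {cells}")
    return "\n".join(lines)
```

Rejecting unknown labels early guards against typos in hand-entered judgments, which would otherwise be counted silently as a sixth category.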
Reproducing and Testing Workability
The package containing the files for running WikiQuery can be found at: http://www.cs.utah.edu/~juliana/WikiQuery/WikiQueryPkg.zip
A detailed description of the steps required to run the experiments is available at http://www.cs.utah.edu/~juliana/WikiQuery/WikiQuery_casestudy.pdf. This file also includes information for prospective authors on how to package their experiments.