Students write a program to process autocomplete queries for a given set of
suggestion strings and associated weights.
That is, given a query string, find all suggestion strings that start with
the query string, sorted by weight.
Autocomplete is a pervasive user-interface feature in many modern apps and services,
ranging from text messaging to IDEs to web search.
For example, Google uses autocomplete to display suggestions in real time,
as the user types a web search query.
In one version of the assignment, students implement autocomplete in two stages:
The system sort suffices for sorting. However,
the variant of binary search required is nonstandard,
so students must design and implement their own version.
Along the way, student must define three custom orders.
- Preprocess the suggestions: Sort the suggestions alphabetically.
- Process an autocomplete query: Binary search to find all suggestions
that start with the given query; sort the matching suggestions in descending
order by weight.
The assigment is nifty because autocomplete is a quintessential
application of sorting that is familiar, authentic, and engaging.
Autocomplete also exemplifies how applying a core HCI principle can dramatically
improve user-interface design.
Armed with only sorting, binary search, and a hundred lines of code,
students can compose an effective solution to an important problem.
- Assignment handout
(as offered in Spring 2015).
(culled from the Internet Movie Database, U.S. Social Security,
Google Trillion Word Corpus, Million Song Dataset, NASDAQ, and other sources).
Students compose a program to process autocomplete queries
for a set of suggestions.
sorting, binary search, arrays, string processing, HCI
||late CS1 or early CS2
easy to moderate
- Spotlights a familiar and important application, for which performance is critical.
- Brings sorting and binary search to life—two topics that are often dull
and insufficiently motivated in CS1 and CS2.
- The accompanying interactive GUI and real-world datasets add concreteness, authenticity,
- Requires only about 100 lines of code (and no starter code).
- Implementing binary search is tricky.
- Defining a custom order is awkward in Java 7.
Requires a basic I/O library to read text from a file,
ideally one that supports Unicode.
Many variations of this assignments are suitable for CS1 and CS2.
We also believe this assignment has great potential for adaption in several
upper-level courses (HCI, Networks, Algorithms, and ML).
- Simplify the assignment to allow use of system binary search.
- Refocus the assignment on lists/arrays and string processing by replacing
binary search with sequential search (suitable for small datasets or small numbers of queries).
- Achieve improved performance using fancier data
structures (such as tries, suffix arrays, segment trees, and Cartesian trees).
- Extend the autocomplete data type to support adding/removing suggestion strings
or changing the associated weights.
- Delve into Unicode and related internationalization issues (such as
using a diacritic-insensitive order).