RSS News Feed Aggregation
This is a mid-to-late CS2 assignment that gets students to download
a large collection of online news articles, extract all of the meaningful
words from each of them, and build a search engine sophisticated
enough that it can generate a list of news stories about any single-word
topic. The assignment gets students to build a non-trivial data structure
using the more common collection classes, and in the process they get
to write just the smallest bit of networking code to kick off the engine build.
If you cover any concurrency or Swing in your version of CS2, then there's
even more for you to add.
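The core data structure described above can be sketched as an inverted index: a map from each word to the set of articles that contain it. This is only a minimal illustration and not the assignment's reference solution. The class name NewsIndex, its method names, and the tiny stop-word list standing in for "meaningful word" filtering are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical inverted index: maps each word to the titles of the articles that mention it. */
class NewsIndex {
    // Assumed stop-word list; a real solution would likely use a much longer one.
    private static final Set<String> STOP_WORDS = new HashSet<String>(
            Arrays.asList("a", "an", "and", "in", "of", "on", "the", "to"));

    private final Map<String, Set<String>> titlesByWord = new HashMap<String, Set<String>>();

    /** Splits the article text into lowercase words and files the title under each meaningful one. */
    void addArticle(String title, String text) {
        for (String token : text.toLowerCase(Locale.ENGLISH).split("[^a-z0-9]+")) {
            if (token.length() == 0 || STOP_WORDS.contains(token)) {
                continue; // skip the empty leading token and any stop word
            }
            Set<String> titles = titlesByWord.get(token);
            if (titles == null) {
                titles = new LinkedHashSet<String>(); // keeps titles in insertion order
                titlesByWord.put(token, titles);
            }
            titles.add(title);
        }
    }

    /** Returns the titles of all articles containing the word, or an empty list if there are none. */
    List<String> search(String word) {
        Set<String> titles = titlesByWord.get(word.toLowerCase(Locale.ENGLISH));
        return titles == null ? new ArrayList<String>() : new ArrayList<String>(titles);
    }
}
```

A set is used per word so that an article mentioning a word many times is still listed once. Ranking stories by how often the word appears would need a count stored alongside each title.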
The assignment niche
I've used this assignment early in CS3, where we teach low-level generic programming in C. However, the assignment is much more easily implemented in Java, and is easily adapted to any CS2 curriculum. The primary goal of the assignment is to expose students to non-trivial data structure design issues, simple networking, and (optionally) some multithreading. They're also forced to deal thoughtfully and gracefully with the myriad of networking-related exceptions that can be thrown.
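One way to deal gracefully with networking failures is to treat a feed that cannot be downloaded as something to skip rather than a reason to abort the whole build. The sketch below assumes that policy. The class name FeedFetcher, the null return value, and the five-second timeouts are choices made for this example and do not come from the assignment handout.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

/** Hypothetical helper that downloads one document and reports failure instead of throwing. */
class FeedFetcher {
    /** Returns the text at the given address, or null if it cannot be retrieved. */
    static String fetch(String address) {
        URL url;
        try {
            // Newer JDKs mark this constructor as deprecated in favor of URI.create(address).toURL().
            url = new URL(address);
        } catch (MalformedURLException e) {
            System.err.println("Skipping malformed URL: " + address);
            return null;
        }
        BufferedReader reader = null;
        try {
            URLConnection connection = url.openConnection();
            connection.setConnectTimeout(5000); // assumed limit, in milliseconds
            connection.setReadTimeout(5000);
            reader = new BufferedReader(
                    new InputStreamReader(connection.getInputStream(), "UTF-8"));
            StringBuilder contents = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                contents.append(line).append('\n');
            }
            return contents.toString();
        } catch (IOException e) {
            // Unknown hosts, refused connections, timeouts, and missing documents
            // all surface as subclasses of IOException.
            System.err.println("Could not read " + address + ": " + e);
            return null;
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (IOException ignored) {
                    // nothing useful can be done if close fails
                }
            }
        }
    }
}
```

A caller building the index would check for null and move on to the next feed, so one dead server costs only the articles it hosted.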
Why I love the assignment (when we use Java!)
The first time I gave the assignment in my CS3 course, reactions were mixed. Everyone loved the networking and concurrency aspects, and everyone also loved the fact that they were building a data structure out of hordes of real information available from all over the planet. But they felt they were working against the C language while trying to implement an application that they felt was otherwise very simple to understand.
My feeling is that these problems go away when you implement the RSS News Feed Aggregator in Java. The Java 1.5 generic collection classes are fantastic, the java.net package is super easy to use, and the introduction of the Semaphore class in Java 1.5 makes concurrent programming that much easier to handle.
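As a rough illustration of where java.util.concurrent.Semaphore could fit, the sketch below starts one thread per download task but lets only a fixed number of them run at once. The class name ThrottledDownloader, the runAll method, and the peak counter (included only so the limit can be observed) are assumptions made for this example, not part of the assignment's own code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical throttle: one thread per task, with at most maxConcurrent tasks running at a time. */
class ThrottledDownloader {
    private final Semaphore permits;
    private final AtomicInteger active = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    ThrottledDownloader(int maxConcurrent) {
        permits = new Semaphore(maxConcurrent);
    }

    /** Runs every task to completion and returns the largest number seen running simultaneously. */
    int runAll(List<Runnable> tasks) throws InterruptedException {
        List<Thread> threads = new ArrayList<Thread>();
        for (final Runnable task : tasks) {
            Thread thread = new Thread(new Runnable() {
                public void run() {
                    permits.acquireUninterruptibly(); // wait for a free slot
                    try {
                        int now = active.incrementAndGet();
                        int seen;
                        // Raise the recorded peak if this thread pushed the count higher.
                        while (now > (seen = peak.get()) && !peak.compareAndSet(seen, now)) {
                            // another thread updated peak first; re-read and retry
                        }
                        task.run();
                    } finally {
                        active.decrementAndGet();
                        permits.release(); // free the slot even if the task threw
                    }
                }
            });
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        return peak.get();
    }
}
```

Releasing the permit in a finally block matters here: a download that fails with an unchecked exception would otherwise hold its slot forever and eventually stall every remaining thread.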
Resources
- Assignment Handout
- JAR file packaging all of the class files in a default package. To run it, just type: java -classpath nifty-rss.jar RSSNewsFeedAggregatorApplication. This version requires version 1.5 of the Java runtime. Feel free to email me if you'd like the Java source code, and I'll be happy to send it to you.
Contact info
Jerry Cain
Department of Computer Science
Stanford University
Email: jerry@cs.stanford.edu
This page is in support of Nick Parlante's ACM SIGCSE Symposia 2006 Panel on Nifty Assignments. Other nifty assignments are available!
Last updated: Friday, March 3, 2006.