Handwriting Recognizer

Author: Stephanie Valentine (valentine [at] unl.edu)

Description:

With the widespread adoption of smartphones, tablets, and other hand-held devices, the role of the traditional mouse and keyboard is in question. Will the mainstream devices of the future include those input modalities? Current trends are pointing toward "new" input devices that can take place of both mouse and keyboard simultaneously: stylus pens. Apple, Microsoft, Samsung, and many more tech powerhouses have pen-driven devices on the market. The pen-like styluses can be used for selection, gestures, as well as hand-written input.

The question is -- how do these devices understand what the user writes? Did the user write a letter? A number? Did they dot an eye or click a button? The engineers that write the software to power these devices use various artificially intelligent algorithms and techniques to tackle the very difficult problem of deciphering human intention.

In this assignment, students get to implement the technique of hand-drawn sketch recognition called template matching. In template matching, for each shape the system can recognize, the system maintains a library of examples called templates. When trying to recognize a new mystery shape, the system calculates a similarity score between the mystery shape and each template in the library. If the mystery shape has a high similarity score with a template known to contain the letter 'q', then it's likely that your mystery shape is also a letter 'q'.

I and my colleague Steve Cooper used this assignment with approximately 120 students enrolled in a CS1 course (late-objects, Java-based, for both CS and non-CS majors) in the fall of 2018, 2019, 2020, and 2021. The latest handout is here: PDF HTML Markdown

Metadata

Summary This assignment asks students to write a simple (yet surprising effective) handwriting recognizer using template-matching methods. Students can view the results of their recognition algorithms in a colorful user interface that they get to draw themselves.
Audience

This assignment is the capstone assignment in my honors CS1 course. It incorporates nearly every topic we discuss in the course, but is assigned as during the module on searching and sorting. My students are fairly pracitced in code-reading, API usage, file I/O, modularity, object-oriented programming, and unit testing prior to this assignment. Students with less practice in those areas (especially in code reading) may struggle with this assignment.

In my opinion, this project makes an excellent bookend to the semester, especially for the non-majors in my course and for those with little programming experience. My students start the course with "Hello, World" and console interaction and end the course with research-based, artificially intelligent handwriting recognition. I think this assignment illustrates the enormity of a student's growth trajectory, which is a great way to keep students hooked on computing.

Difficulty

Students found this assignment moderately difficult but also fun and rewarding. Much of the tedious work is done for the students, leaving them the interesting and gratifying tasks of performing recognition and drawing the graphics.

I gave my students one week (and about 15 hours of TA/instructor office hours) to complete the assignment. I think this was appropriate for my honors students, but I would likely lengthen the timeline to two weeks (or omit the drawing requirements) for courses with high concentrations of struggling students or courses with fewer student contact hours.

Topics The assignment focuses on searching, sorting with comparators, and computer graphics (drawing on a canvas), but touches many other topics: strings, arrays, loops, functions, objects, file I/O, unit testing, etc.
Strengths

The greatest strength of this assignment isn't the starter code, it's the idea and the preprocessed dataset. These items can be ported to nearly any language and output format (console ASCII, .txt file, image generation, interactive web app, etc.).

The recognition algorithm that students must write is rather simple to write, but takes time to conceptualize and design. The algorithm does not lend itself to a single-function architecture, so students get experience breaking up an algorithm into many cooperating functions.

Students get to work on the visualization before the algorithm, so when they complete the algorithm, they can see their results visually. The instant gratification of the visualization motivates students to get started.

Whether rendering graphically (as my students do), or simply outputting the sketches in ASCII, students can see the effectiveness of their searching and sorting algorithms clearly. It's easy to see when your algorithm is calibrated correctly and when it's not.

Students are provided a basic algorithm to implement, but are encouraged to experiment with the algorithm to improve its accuracy. Standard searching and sorting exercises lead to a single "right" answer, but the handwriting recognition domain gives students agency to iteratively tweak and improve their algorithms to achieve better results.

Weaknesses

My version of this assignment (in its current form) is what I call a pancake assignment - it's broad and delicous but not very deep. Students touch (and thus must understand) a lot of different pieces of the application, but don't author many from scratch. This is very reminiscent of industry (adding features to an existing application), but limits student practice of architectural design.

Generating new template files is a time-intensive process involving many mathematical transformations of a sketch's data (scaling, translating, interpolating, etc.). There are files provided for the lower-case English alphabet, but introducing a new family of shapes would be nontrivial. Adjusting the dataset would also require a recalibration of the various provided thresholds (bounding box, maximum distance to be "overlapping", etc.).

Dependencies

The starter code is written in Java, but the data files are saved in XML. The idea can easily be ported to just about any CS1-type language.

The starter code uses built-in Java libraries for graphics (Swing and AWT), so no special libraries are required. Because Swing and AWT are less-than pleasant to work with, though, the starter code provides basic GUI setup and functions for drawing text, points, lines, and rectangles.

Variants

The assignment could be adjusted to use ASCII rendering, rather than graphics. Check out example output of such a variant here. To produce this kind of output, students can use a 2-dimensional char array (perhaps 50 chars wide and 20 rows tall) can represent the coordinate plane. Students can initialize the array with space characters, then add difinitive characters for each point. Then, students can simply print the 2d char array to the console or to a file stream. Students would need to do some scaling of the data files, since all the provided sketches are scaled to 300x300, but the outputted results would still be recognizable.

The starter code could be extended to include a sketching canvas. This would allow students to sketch their own shapes and have these new shapes analyzed. I imagine students would find This incredibly engaging.

The process of creating new template files would make for an interesting exercise in mathematical transformations of sketch data (scaling, translation, interpolation, etc.), as well as in serialization methods.