CIS 22 - Data Structures
Part I - A Bare-Bones HTML Displayer
Overview
A Web browser is used to display HTML files and to allow a user to
navigate the Web by entering specific addresses or following hypertext links.
For this assignment, you will write a very simplified version of a browser
that will work only on local files (not on remote Web pages).
Your HTML Displayer will be text-based, not graphical. (If you have never
used a text-based browser, try out lynx.)
The Displayer
Your displayer should open an HTML file and display its contents
appropriately -- i.e., extra white space and line breaks in the input
should be ignored. You should provide a default setting for the
number of rows and number of columns to be displayed. The text should
be displayed accordingly.
You should keep displaying text on one line until there is no more room on
that line (depending on the number of columns). Be careful not
to insert a line break in the middle of a word!
If the input text contains a < BR > tag, you should insert a newline.
Similarly, if the input text contains a < P > tag, you should insert a
newline plus an additional blank line. For this assignment, you can ignore
all other HTML tags except < A >, which will be described below.
For example, suppose you have set the number of columns to 40, and
you have an input file that looks like this:
One, two
Buckle my shoe.
< P >
Three, four
Shut the door.
Five, six
Pick up sticks.
It would be displayed as:
One, two Buckle my shoe.

Three, four Shut the door. Five, six
Pick up sticks.
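The wrapping rules above can be sketched in C++ roughly as follows. This is only one possible approach, not a required design: the function name layoutLines and its interface are made up for illustration, it returns the display lines instead of printing them, and it assumes that a '<' with no closing '>' is shown as ordinary text. Tag names are matched with surrounding spaces ignored, so both <P> and < P > are recognized.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the assignment): lays out HTML text as
// display lines of at most `columns` characters. Only BR and P are acted on;
// every other tag is skipped.
std::vector<std::string> layoutLines(const std::string& html, std::size_t columns) {
    std::vector<std::string> lines;
    std::string line;  // display line currently being filled
    std::string word;  // word currently being read

    // Move the pending word onto the line, wrapping first if it does not fit.
    auto flushWord = [&]() {
        if (word.empty()) return;
        if (!line.empty() && line.size() + 1 + word.size() > columns) {
            lines.push_back(line);
            line.clear();
        }
        if (!line.empty()) line += ' ';
        line += word;
        word.clear();
    };

    for (std::size_t i = 0; i < html.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(html[i]);
        if (c == '<') {
            std::size_t close = html.find('>', i);
            if (close != std::string::npos) {
                flushWord();
                // Tag name: skip leading spaces, read up to the next space.
                std::string name;
                for (std::size_t j = i + 1; j < close; ++j) {
                    unsigned char t = static_cast<unsigned char>(html[j]);
                    if (std::isspace(t)) {
                        if (name.empty()) continue;
                        break;
                    }
                    name += static_cast<char>(std::toupper(t));
                }
                if (name == "BR" || name == "P") {
                    lines.push_back(line);  // end the current line
                    line.clear();
                    if (name == "P") lines.push_back("");  // extra blank line
                }
                i = close;  // resume after '>'
                continue;
            }
            // Assumption: a '<' with no closing '>' is shown as ordinary text.
        }
        if (std::isspace(c)) flushWord();
        else word += static_cast<char>(c);
    }
    flushWord();
    if (!line.empty()) lines.push_back(line);
    return lines;
}
```

With the number of columns set to 40, the example input above produces four lines, the second of which is blank. A single word longer than the column width is placed on a line of its own and allowed to overflow; you may prefer a different policy.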
Hypertext Links
Hypertext links are embedded in HTML files using the < A > tag.
When a browser encounters a link, it needs to do several things:
- display it in some manner so that it stands out
(usually underlined and in blue)
- associate the link with the URL it refers to, so that when
the user clicks on the link, the browser jumps to the right place.
You have a similar task. First, you have to display the text
so that it stands out. For our purposes, we will simply set off
the text of each link with brackets: a numbered marker such as [1]
before the link text and an empty marker [] after it.
For example, suppose you have the following input:
Here is some text.
And now < A HREF="one.html" > here is the first link. < /A >
And now we have a < A HREF ="two.html" > second link < /A >
It would be displayed as:
Here is some text. And now [1] here is
the first link. [] And now we have a
[2]second link []
The other thing you have to do is "remember" the target of each link.
You can do that by maintaining an array corresponding to the links.
Each entry of the array should contain the name of the file that
is being linked to.
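One way to combine the link markers with the array of targets is sketched below. The names LinkedText, hrefValue and markLinks are hypothetical, and the sketch assumes that the HREF value is always enclosed in double quotes. It uses a std::vector in place of a fixed-size array, so that link number n is stored at index n - 1. Tags other than A are copied through unchanged for a later stage to handle.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

struct LinkedText {
    std::string text;                  // input with <A ...> shown as [n], </A> as []
    std::vector<std::string> targets;  // targets[n - 1] is the file for link n
};

// Hypothetical helper: returns the double-quoted value that follows HREF in a
// tag body, or an empty string if none is found.
std::string hrefValue(const std::string& tag) {
    std::string upper;
    for (unsigned char c : tag) upper += static_cast<char>(std::toupper(c));
    std::size_t pos = upper.find("HREF");
    if (pos == std::string::npos) return "";
    std::size_t open = tag.find('"', pos);
    if (open == std::string::npos) return "";
    std::size_t close = tag.find('"', open + 1);
    if (close == std::string::npos) return "";
    return tag.substr(open + 1, close - open - 1);
}

// Hypothetical helper: replaces link tags by markers and records the targets.
LinkedText markLinks(const std::string& html) {
    const char* ws = " \t\r\n";
    LinkedText result;
    std::size_t i = 0;
    while (i < html.size()) {
        if (html[i] != '<') {
            result.text += html[i++];
            continue;
        }
        std::size_t close = html.find('>', i);
        if (close == std::string::npos) {
            result.text += html.substr(i);  // assumption: keep an unterminated tag as text
            break;
        }
        std::string tag = html.substr(i + 1, close - i - 1);
        std::size_t k = tag.find_first_not_of(ws);
        bool closing = false;
        if (k != std::string::npos && tag[k] == '/') {
            closing = true;
            k = tag.find_first_not_of(ws, k + 1);
        }
        // The tag is an A tag if its name is exactly "A" (either case).
        bool isA = k != std::string::npos && (tag[k] == 'A' || tag[k] == 'a') &&
                   (k + 1 == tag.size() ||
                    std::isspace(static_cast<unsigned char>(tag[k + 1])));
        if (isA && closing) {
            result.text += "[]";
        } else if (isA) {
            result.targets.push_back(hrefValue(tag));
            result.text += "[" + std::to_string(result.targets.size()) + "]";
        } else {
            result.text += html.substr(i, close - i + 1);  // not a link: leave as is
        }
        i = close + 1;
    }
    return result;
}
```

For the example input above, targets would hold "one.html" and "two.html", so a request for link 2 looks up targets[1].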
User Interface
After the file has been displayed, the user should be able to make
several choices:
- enter a link number, such as 2. In the example above, link 2
should result in file two.html being displayed.
- enter a file name explicitly. For example, the user might want
to enter three.html (This corresponds to entering a
URL in the Location field of a browser.)
- choose to display the "next page", if the entire file did
not fit on one display page. For example, the number of rows may have been
set to 25 and the file might end up as 40 lines.
- quit
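The four choices above could be told apart by a small parsing function such as the one sketched here. The assignment does not prescribe any particular keys, so the use of "n" for the next page and "q" for quit is purely an assumption, as are the names CommandKind, Command and parseCommand. Input made up only of digits is treated as a link number, and anything else is taken to be a file name.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

enum class CommandKind { FollowLink, OpenFile, NextPage, Quit, Invalid };

struct Command {
    CommandKind kind;
    std::size_t linkIndex;  // zero-based index, meaningful only for FollowLink
    std::string fileName;   // meaningful only for OpenFile
};

// Hypothetical helper: classifies one line of user input. `linkCount` is the
// number of links on the page currently displayed.
Command parseCommand(const std::string& input, std::size_t linkCount) {
    const char* ws = " \t\r\n";
    std::size_t first = input.find_first_not_of(ws);
    if (first == std::string::npos) return {CommandKind::Invalid, 0, ""};
    std::size_t last = input.find_last_not_of(ws);
    std::string s = input.substr(first, last - first + 1);

    // Assumed key bindings; the assignment leaves these up to you.
    if (s == "q") return {CommandKind::Quit, 0, ""};
    if (s == "n") return {CommandKind::NextPage, 0, ""};

    bool allDigits = true;
    for (unsigned char c : s) {
        if (!std::isdigit(c)) allDigits = false;
    }
    if (allDigits) {
        std::size_t number = 0;
        for (char c : s) {
            number = number * 10 + static_cast<std::size_t>(c - '0');
            if (number > linkCount) return {CommandKind::Invalid, 0, ""};
        }
        if (number == 0) return {CommandKind::Invalid, 0, ""};
        return {CommandKind::FollowLink, number - 1, ""};
    }
    return {CommandKind::OpenFile, 0, s};
}
```

A caller would print an error message for Invalid and otherwise act on the command. One consequence of these particular key choices is that files literally named "n" or "q", or with names consisting only of digits, cannot be opened by name.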
Extra Credit
- The only HTML tags that you are required to implement are BR, P and A.
You may extend your "browser" to implement many other tags, such as
heading tags (H1, H2), CENTER, HR, UL, OL, etc.
- Implement a resize feature that allows the
user to change the number of rows and columns in the display area.
- Provide a more sophisticated display, by displaying links in color.
(This may be system-dependent. Be sure to indicate what system requirements you
are working with.)
Testing
Your program must be thoroughly tested.
An incomplete test will result in the assignment being returned for
resubmission. You can only resubmit once; after that the assignment will no
longer be accepted, so be sure you perform thorough tests the first time.
You should make sure that you test both for successful and
for unsuccessful operations. For example, you should include a test
which attempts to access a file that does not exist, as well as an invalid
link number. You should display an appropriate error message in these cases.
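As an illustration of the missing-file case, the following sketch reads a whole file into a string and reports failure through a return value, so that the caller can print the message and let the user try again. The name loadFile and the wording of the message are assumptions, not requirements.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: reads the whole file at `path` into `contents`.
// Returns false and fills `error` if the file cannot be opened.
bool loadFile(const std::string& path, std::string& contents, std::string& error) {
    std::ifstream in(path);
    if (!in) {
        error = "Cannot open file: " + path;
        return false;
    }
    std::ostringstream buffer;
    buffer << in.rdbuf();  // copy the entire stream into the buffer
    contents = buffer.str();
    error.clear();
    return true;
}
```

Returning a status instead of exiting keeps the program running after a bad file name, which makes the unsuccessful cases easy to include in a test run.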
You should also deal with some kinds of badly-formed HTML:
what if a tag doesn't have a closing '>'? Or a < A > doesn't
have a matching < /A >? Just as any other browser does, your code
should do its best to display this HTML; you should not generate
error messages for HTML syntax errors.
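One possible recovery strategy is sketched below as a tolerant tokenizer. The policies it applies are assumptions chosen for simplicity, not the behavior of any particular browser: a '<' with no closing '>' is kept as ordinary text, an < A > that is still open at the end of the input (or when another < A > starts) is closed automatically, and a stray < /A > is dropped. The names Token, tagName and tokenize are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

struct Token {
    bool isTag;        // true for a tag, false for plain text
    std::string text;  // tag body without '<' and '>', or the text itself
};

// Upper-cased tag name such as "A", "/A" or "BR"; spaces before the name and
// after a leading '/' are ignored.
std::string tagName(const std::string& body) {
    std::string name;
    for (unsigned char c : body) {
        if (std::isspace(c)) {
            if (name.empty() || name == "/") continue;
            break;
        }
        name += static_cast<char>(std::toupper(c));
    }
    return name;
}

// Splits the input into text and tag tokens without ever reporting an error.
std::vector<Token> tokenize(const std::string& html) {
    std::vector<Token> tokens;
    std::string pending;    // text collected since the last tag
    bool linkOpen = false;  // true while inside <A> ... </A>

    auto flushText = [&]() {
        if (!pending.empty()) {
            tokens.push_back({false, pending});
            pending.clear();
        }
    };

    std::size_t i = 0;
    while (i < html.size()) {
        std::size_t close =
            (html[i] == '<') ? html.find('>', i) : std::string::npos;
        if (html[i] != '<' || close == std::string::npos) {
            pending += html[i++];  // ordinary character, or an unterminated '<'
            continue;
        }
        flushText();
        std::string body = html.substr(i + 1, close - i - 1);
        std::string name = tagName(body);
        i = close + 1;
        if (name == "A") {
            if (linkOpen) tokens.push_back({true, "/A"});  // close the earlier link
            linkOpen = true;
        } else if (name == "/A") {
            if (!linkOpen) continue;  // stray closing tag: drop it
            linkOpen = false;
        }
        tokens.push_back({true, body});
    }
    flushText();
    if (linkOpen) tokens.push_back({true, "/A"});  // link never closed
    return tokens;
}
```

Because the token stream always has balanced link tags, the display code that follows it does not need any special cases for badly-formed links.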
One set of input
files can be downloaded here.
You must also make up your own input files to use for testing.
What to Submit
I do not want to look at your output on the screen. Dump the screen output,
print it and submit a hardcopy. Also submit all of your input source files.
(You do not need to submit the files you downloaded from here.)
Obviously, submit all the program and header files.