created 05/08/00
Modify the first example in the chapter so that it uses the C-style input loop.
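The chapter's first example is not reproduced here, so the following is only a sketch of the idiom, assuming that "C-style input loop" means a loop whose condition both reads a line and tests for end of input. The class name CStyleLoop and the method countLines are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;

class CStyleLoop {
    // Echoes every line from the reader and returns how many lines were read.
    static int countLines(BufferedReader in) throws IOException {
        String line;
        int n = 0;
        // The assignment and the end-of-input test share one expression:
        // readLine() returns null when there is no more input.
        while ((line = in.readLine()) != null) {
            System.out.println(line);
            n++;
        }
        return n;
    }
}
```

The extra parentheses around the assignment are needed because != binds more tightly than =.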
The file copy example program in the chapter is longer than it needs to be. Make it shorter by using fewer try{} blocks. (This will result in less specific error messages.)
Write a file copy program that uses a graphical user interface. The user enters the name of the file to copy and the name of the destination file in TextFields and clicks a button to perform the copy. Write error messages into another TextField.
For a nicer program, look in your documentation for the FileDialog class. Include it in the GUI so the user can choose the source file graphically.
Write a class that reads data from a file containing integers in character format. This class will be a software tool that other programs can use to simplify their own input. The constructor for the class will use the name of the input file as a parameter, and will create the appropriate streams. Write a close() method and a method int getNextInt() that returns the value of the next integer from the stream. When an error is encountered, these methods will write an error message and stop the program.
The input file may have none, one, or several integers per line. A suitable file could be created by the program for Exercise 1 of the previous chapter. The class will work with any number of integers total, and a varying number of integers per line.
Use BufferedReader, FileReader, parseInt(), and StringTokenizer.
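One possible shape for such a class is sketched below. The class name IntFileReader and the helper fail() are hypothetical. The exercise does not say how end of file should be reported, so this sketch assumes that asking for an integer when none remain counts as an error.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.StringTokenizer;

class IntFileReader {
    private BufferedReader in;
    // Tokens left over from the line most recently read.
    private StringTokenizer tokens = new StringTokenizer("");

    IntFileReader(String fileName) {
        try {
            in = new BufferedReader(new FileReader(fileName));
        } catch (IOException e) {
            fail("Cannot open " + fileName);
        }
    }

    // Returns the next integer, reading further lines as needed
    // (lines may hold none, one, or several integers).
    int getNextInt() {
        try {
            while (!tokens.hasMoreTokens()) {
                String line = in.readLine();
                if (line == null) fail("No more integers in the file");
                tokens = new StringTokenizer(line);
            }
            return Integer.parseInt(tokens.nextToken());
        } catch (IOException e) {
            fail("Error reading the file");
        } catch (NumberFormatException e) {
            fail("Input is not an integer");
        }
        return 0;   // never reached; fail() stops the program
    }

    void close() {
        try {
            in.close();
        } catch (IOException e) {
            fail("Cannot close the file");
        }
    }

    // Assumption: "stop the program" is implemented with System.exit().
    private static void fail(String message) {
        System.err.println(message);
        System.exit(1);
    }
}
```

With this design a caller needs some other way to know when to stop asking for integers; a method that reports whether more input remains would be a natural addition.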
Test your class by using it in a program that writes the integers from the input file on the monitor, one per line. Once the class is debugged there are many other programs you can write that use it:
Write a program that inputs a Java source code file and outputs a copy of that file with Java keywords surrounded with HTML tags for bold type. For example, this input:
public class JavaSource
{
  public static void main ( String[] args )
  {
    if ( args.length == 3 )
      new BigObject();
    else
      System.out.println("Too few arguments.");
  }
}
will be transformed into:
<b>public</b> <b>class</b> JavaSource
{
  <b>public</b> <b>static</b> <b>void</b> main ( String[] args )
  {
    <b>if</b> ( args.length == 3 )
      <b>new</b> BigObject();
    <b>else</b>
      System.out.println("Too few arguments.");
  }
}
In a browser the code will look like this, with the keywords in bold type:
public class JavaSource
{
  public static void main ( String[] args )
  {
    if ( args.length == 3 )
      new BigObject();
    else
      System.out.println("Too few arguments.");
  }
}
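The core of the transformation could be sketched as follows for a single line of source text. The class name KeywordBolder, the deliberately partial keyword list, and the choice of delimiters are all assumptions made for illustration. StringTokenizer is asked to return the delimiters as tokens so that spacing and punctuation are copied to the output unchanged.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.StringTokenizer;

class KeywordBolder {
    // Partial keyword list, for illustration only.
    static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
        "public", "class", "static", "void", "if", "else", "new",
        "return", "int", "for", "while"));

    // Returns the line with each keyword wrapped in <b>...</b>.
    static String boldKeywords(String line) {
        StringBuilder out = new StringBuilder();
        // true = return the delimiters themselves as tokens
        StringTokenizer st = new StringTokenizer(line, " \t(){}[];.,", true);
        while (st.hasMoreTokens()) {
            String tok = st.nextToken();
            if (KEYWORDS.contains(tok)) {
                out.append("<b>").append(tok).append("</b>");
            } else {
                out.append(tok);
            }
        }
        return out.toString();
    }
}
```

This sketch makes no attempt to recognize string literals or comments, so a keyword that appears inside one would also be marked bold.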
Any text editor, such as Notepad, can be used to create web pages. Unfortunately, these editors usually do not check spelling. Word processors can open a text file and check its spelling. But when a file is sprinkled with HTML tags, the tags are all flagged as errors and the real spelling errors are hard to see. This exercise is to write a utility that strips the HTML tags from a text file.
Write a program that reads in a text file and writes out another text file. The input file may have any number of HTML tags per line. The output file will be a copy of the input file but with spaces substituted for each HTML tag. The program will not check HTML syntax; it looks at the file as a stream of tokens and substitutes spaces for each token that is a tag. For this program, an HTML tag is any token that looks like one of these:
<Word> </Word>
Assume that Word is a single word (perhaps just one letter or no letters) and that there are no spaces between the left and right angle brackets. With this definition, the following are tags:
<p> </p> <em> </em> <rats> </1234> <blockquote> </>
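The definition can be expressed as a small test on a single token. This is a sketch that assumes tokens are delimited by whitespace and that tags are separated from the surrounding text by whitespace; the names SimpleTag and isTag are hypothetical.

```java
class SimpleTag {
    // True when the token looks like <Word> or </Word>: it starts with '<',
    // ends with '>', and has no other angle bracket or space in between.
    static boolean isTag(String token) {
        int n = token.length();
        if (n < 2 || token.charAt(0) != '<' || token.charAt(n - 1) != '>') {
            return false;
        }
        for (int i = 1; i < n - 1; i++) {
            char c = token.charAt(i);
            if (c == '<' || c == '>' || Character.isWhitespace(c)) {
                return false;
            }
        }
        return true;
    }
}
```

The contents of Word are not examined, which matches the exercise's statement that the program does not check HTML syntax.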
With this definition, the following are NOT tags (although some are tags in real HTML):
< p> </ p> <em > </e m> <table border cellpadding=5> <block quote> < /em>
Challenging Exercise: Write the program to filter out any tag that looks like one of these:
<Word .... > </Word ... >
Now Word is a single word that immediately follows the left angle bracket, but may be followed by more text which may include spaces. A tag ends with a right angle bracket, which might or might not be preceded by a space. Assume that a tag starts and ends on the same line. With this definition, the following are tags:
<p> </p> <em > </em > <table border cellpadding=5> <word another word> </x y z>
Start by setting a boolean flag to false. Now look at the input stream of tokens one by one. When a token starts a tag, set the flag to true. While the flag is true, discard tokens until encountering a tag end, then set the flag back to false.
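The flag technique might look like the sketch below for one line of input. The names TagFilter and stripTags are hypothetical. The sketch assumes whitespace-delimited tokens, treats any token of two or more characters that begins with '<' as the start of a tag, and joins the surviving tokens with single spaces rather than preserving the original spacing.

```java
import java.util.StringTokenizer;

class TagFilter {
    // Returns the line with every tag removed.
    static String stripTags(String line) {
        StringBuilder out = new StringBuilder();
        boolean inTag = false;                 // the flag starts out false
        StringTokenizer st = new StringTokenizer(line);
        while (st.hasMoreTokens()) {
            String tok = st.nextToken();
            // A word must immediately follow '<', so a lone "<" starts nothing.
            if (!inTag && tok.length() > 1 && tok.charAt(0) == '<') {
                inTag = true;                  // this token starts a tag
            }
            if (inTag) {
                if (tok.endsWith(">")) {
                    inTag = false;             // this token ends the tag
                }
                continue;                      // discard tokens inside a tag
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(tok);
        }
        return out.toString();
    }
}
```

Because a tag is assumed to start and end on the same line, the flag can be reset for each new line.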
First Part:
Letters of the alphabet occur in text at different frequencies. Write a program that confirms this phenomenon. Your program will be invoked from the command line like this:
C:\mydir> java freqCount avonlea.txt avonlea.rept -all
It will then read through the first text file on the command line (in this case "avonlea.txt"), accumulating the counts for each letter. When it reaches the end of the file, it will write a report (in this case "avonlea.rept") that displays the total number of alphabetic characters "a-zA-Z" and, for each character, the number of times it occurred and the relative frequency with which it occurred. In counting characters, regard lower case "a-z" and upper case "A-Z" characters as identical.
You will need an array of 26 long integers, one per character. To increment the count for a particular character you will have to convert it into an index in the range 0..25. Do this by first determining which range the character belongs in, "a-z" or "A-Z", and then subtracting 'a' or 'A' from it, as appropriate:
int inx = (int)ch - (int)'A' ;
count[inx]++ ;
Discard characters not in either range without increasing any count.
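Put together, the range test and the index conversion might be sketched like this. The names LetterCount and countLetters are hypothetical, and the sketch counts the letters of a string rather than reading a file.

```java
class LetterCount {
    // Returns 26 counts, index 0 for 'a'/'A' through index 25 for 'z'/'Z'.
    static long[] countLetters(String text) {
        long[] count = new long[26];
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch >= 'a' && ch <= 'z') {
                count[ch - 'a']++;     // lower case range
            } else if (ch >= 'A' && ch <= 'Z') {
                count[ch - 'A']++;     // upper case range, same slots
            }
            // any other character is discarded
        }
        return count;
    }
}
```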
Second Part:
Do the relative frequencies of the initial letters of words differ from the relative frequencies for all letters in a text? Add logic to the program so that it examines only the first character in each word. Allow the user to choose between the two options with a switch on the command line:
C:\mydir>java freqCount avonlea.txt avonlea.rept -first
For this option it will be convenient to use the Java class StringTokenizer to deliver individual words one at a time. In the string of delimiters passed to StringTokenizer include whitespace and all punctuation that might be at the start or end of a word. This is not quite good enough for an accurate count because some words will be split between lines:
It is often true that handling the an-
noying details makes up the large maj-
ority of the statements in a pro-
gram.
So, if the last token in a line (returned by StringTokenizer) ends with '-', don't include the first letter of the first token on the next line in the count.
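The hyphen rule could be sketched as below. The names FirstLetterCount, countFirstLetters, and DELIMS are hypothetical, and the delimiter string is only one plausible choice. The hyphen is deliberately left out of the delimiters so that a token split across lines still ends with '-'.

```java
import java.util.StringTokenizer;

class FirstLetterCount {
    // Assumed delimiters: whitespace plus common punctuation, but not '-'.
    static final String DELIMS = " \t\n\r.,;:!?\"'()[]";

    // Counts the first letter of each word in the given lines.
    static long[] countFirstLetters(String[] lines) {
        long[] count = new long[26];
        boolean continued = false;   // did the previous line end with '-'?
        for (String line : lines) {
            StringTokenizer st = new StringTokenizer(line, DELIMS);
            String last = null;
            boolean first = true;
            while (st.hasMoreTokens()) {
                String tok = st.nextToken();
                // Skip the first token when it finishes a split word.
                if (!(first && continued)) {
                    char ch = tok.charAt(0);
                    if (ch >= 'a' && ch <= 'z') {
                        count[ch - 'a']++;
                    } else if (ch >= 'A' && ch <= 'Z') {
                        count[ch - 'A']++;
                    }
                }
                first = false;
                last = tok;
            }
            continued = (last != null) && last.endsWith("-");
        }
        return count;
    }
}
```

A blank line resets the flag in this sketch; whether that is the right choice for real texts is a judgment call.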
Testing:
For testing, create some really simple files that demonstrate that your program is working. For instance:
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
!*$#)#%$#) @##$%))!__ !#4241-432 !_#*%_@*(* !@%#*#.,?+
and:
AAAAA AAAAA-
BBBBB BBBBB-
CCCCC CCCCC-
DDDDD-DDDDD EEEEE-EEEEE
The first draft of your program will write its count to the monitor for easy debugging. Add text file output later. It is probably wise to write the first part of the program and debug it before moving on to the second option.
Download a text file of a novel of at least 400K bytes from Project Gutenberg. Use a file that does not use HTML formatting tags (which would confuse the count). Delete the text at the beginning of the file that is not part of the novel (the legalese and documentation). Run both options of the program on the text.
Example:
Here is a sample run of my program with the text "Anne of Avonlea" from Project Gutenberg.
C:\mydir>java freqCount avonlea.txt avonlea.rept -all
C:\mydir>type avonlea.rept
Total alphabetical characters: 373267
A: 31840 8.53%
B: 5942 1.59%
C: 7627 2.04%
D: 17541 4.69%
E: 45614 12.22%
F: 7191 1.92%
G: 7960 2.13%
H: 22500 6.02%
I: 25095 6.72%
J: 733 0.19%
K: 3443 0.92%
L: 17534 4.69%
M: 9324 2.49%
N: 26516 7.1%
O: 27344 7.32%
P: 6083 1.62%
Q: 275 0.07%
R: 21285 5.7%
S: 23398 6.26%
T: 32579 8.72%
U: 10720 2.87%
V: 4201 1.12%
W: 9063 2.42%
X: 546 0.14%
Y: 8745 2.34%
Z: 168 0.04%
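The percentages in this sample appear to be truncated to two decimal places rather than rounded: Z occurs 168 times out of 373267, which is about 0.045%, yet the report shows 0.04%. One way to get the same figures is integer division, as in this sketch. The names Frequency and percent are hypothetical, and this is not necessarily how the sample program computes its values.

```java
class Frequency {
    // Relative frequency as a percentage, truncated to two decimal places.
    static double percent(long count, long total) {
        if (total == 0) {
            return 0.0;                       // avoid dividing by zero
        }
        // Integer division drops everything past hundredths of a percent.
        return (count * 10000 / total) / 100.0;
    }
}
```

Printing the resulting double directly also explains why the report shows 7.1% and 5.7% without a trailing zero.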
End of Exercises.