Due: Thursday, 8 November 2018
Part I: File I/O, Strings, and Error Handling in C
The purpose of the first task is to give you experience with File I/O and strings in C. The task is to implement a word frequency counter that determines the number of occurrences of each unique word in a text file. The task requirements are as follows.
- The word counter should be case-insensitive. For example, there are two occurrences of the word "the" in the sentence "The events in question occurred in the early days of my association with Holmes."
- The word counter should ignore punctuation. For instance, "Holmes." and "Holmes," should both be counted as "holmes".
- The word counter should take the filename as a command-line argument.
- The word counter should print the 20 most frequent words in descending order of frequency.
- You can use your linked list from Project 5 if you wish. Implementing a binary tree or hash map is an excellent and significant extension.
- Use this file to test your word counter.
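The case and punctuation requirements above can be handled by normalizing each token before it is counted. The following is a minimal sketch, not a required design: `normalize_word` is a hypothetical helper name, and it assumes that a "word" consists only of letters and digits, so every other character is dropped.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (the name is an assumption, not part of the assignment).
 * Lowercases `word` in place and removes every character that is not a
 * letter or digit. Returns the length of the normalized word. */
size_t normalize_word(char *word)
{
    size_t out = 0;

    for (size_t in = 0; word[in] != '\0'; in++) {
        /* Cast to unsigned char: passing a negative char to the
         * <ctype.h> functions is undefined behavior. */
        unsigned char c = (unsigned char)word[in];

        if (isalnum(c)) {
            word[out++] = (char)tolower(c);
        }
    }
    word[out] = '\0';
    return out;
}
```

With this sketch, "Holmes." and "Holmes," both become "holmes". A token made up entirely of punctuation normalizes to an empty string, which the caller would need to skip. Note that an apostrophe is also removed, so "don't" becomes "dont"; whether that is acceptable is a design choice.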
Your output should look like the following.
the 17
of 7
and 6
was 5
in 4
with 4
a 3
windows 3
broken 2
wings 2
central 2
portion 2
had 2
been 2
but 2
up 2
were 2
these 1
blocked 1
wooden 1
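One way to produce output in descending order of frequency is to copy the counts into an array and sort it with the standard `qsort` function. The sketch below assumes a hypothetical `struct word_count` record and a hypothetical `sort_by_frequency` function; neither name comes from the assignment, and a linked list would first need to be copied into an array for this approach to apply.

```c
#include <stdlib.h>

/* Hypothetical record type: one entry per unique word. */
struct word_count {
    const char *word;
    int count;
};

/* Comparator for qsort: larger counts sort first.
 * Uses comparisons rather than subtraction to avoid integer overflow. */
static int by_count_desc(const void *a, const void *b)
{
    const struct word_count *x = a;
    const struct word_count *y = b;

    return (y->count > x->count) - (y->count < x->count);
}

/* Sorts `entries` in place, most frequent word first. */
void sort_by_frequency(struct word_count *entries, size_t n)
{
    qsort(entries, n, sizeof entries[0], by_count_desc);
}
```

After sorting, printing the first 20 entries (or fewer, if the file has fewer unique words) gives the required list. `qsort` is not guaranteed to be stable, so words with equal counts may appear in any order relative to each other.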
For this task you will create examples of signal handling in C. First, read the man pages on the signal package. In the terminal, you can use man to get the signal man pages.
man 3 signal
- Write a program that can respond to a Ctrl-C (interrupt). Your main function should use the signal function to set up a handler for the SIGINT signal and then enter an infinite loop. Your handler function just needs to print out a message (e.g. "Interrupted!") and then call exit().
- Write a program that can handle a floating point exception. Your main function should use the signal function to set up a handler for the SIGFPE signal, then do something inappropriate with floating point values, such as dividing by zero. See if you can continue the program's execution after the exception takes place.
- Write a program that can handle a segmentation fault error. Your main function should set up a handler for the SIGSEGV signal then execute code that results in a segmentation fault (access illegal memory). Your handler should print something appropriate and exit.
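The common pattern in all three programs is registering a handler with `signal`. The sketch below shows only that registration step for SIGINT, and it deliberately differs from the task: instead of printing and exiting, the handler records the signal in a flag so the effect can be checked from ordinary code. The names `on_sigint`, `install_sigint_handler`, and `sigint_seen` are hypothetical.

```c
#include <signal.h>

/* Set by the handler; sig_atomic_t is the type the C standard
 * allows a signal handler to write to safely. */
static volatile sig_atomic_t got_sigint = 0;

/* Handler: called with the signal number when SIGINT is delivered. */
static void on_sigint(int signo)
{
    (void)signo;      /* unused in this sketch */
    got_sigint = 1;
}

/* Registers the handler. Returns 0 on success, -1 on failure. */
int install_sigint_handler(void)
{
    return signal(SIGINT, on_sigint) == SIG_ERR ? -1 : 0;
}

/* Returns nonzero once SIGINT has been received. */
int sigint_seen(void)
{
    return got_sigint != 0;
}
```

A main function could call `install_sigint_handler()` and then loop until `sigint_seen()` returns nonzero. For experimenting without the keyboard, the standard `raise(SIGINT)` call delivers the signal to the running program itself. The same registration pattern applies to SIGFPE and SIGSEGV by changing the signal constant.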
Part II: File I/O, Strings and Error Handling in Selected Languages
- Language A: repeat the word-frequency task from Part I. Make use of built-in data types (like a dictionary), if possible, to make the task easier.
- In both languages, show how the language is able to read from and write to a file.
- How do you open, close, and read from a (text) file? Is there support for binary files?
- Does your language have built-in support for I/O, or is it part of a set of standard libraries?
- Does your language support opening web locations (URLs) as well as files on the local disk?
- Can the user input information into the program interactively?
- In both languages, show how the language supports error handling. Are there built-in programming structures like try-catch blocks? Give examples that handle several different types of exceptions. You may want to integrate some of this discussion with the file I/O examples. If your language has no specific error handling support, explain if there are any conventions for handling and communicating errors.
- Language B: give examples of how the language handles strings. Are there built-in string operations? Are strings a special type? How would you split a string on spaces or other characters? Is there anything analogous to the Java toString function that a programmer can use to override the default string conversion?
- Undertake one or more of the tasks for a third language.
- Make your word counter more robust so that it is able to handle invalid command-line inputs and invalid files.
- Implement additional data structures, such as a binary tree or hash map to support the word frequency task.
- Implement the word counter in your B language.
- Write a compilable haiku on functions or file I/O in the selected languages.
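For the robustness extension in C, one possible starting point is to validate the argument count and check the result of `fopen` before doing any counting. This is a sketch under assumptions: `open_input` is a hypothetical function name, and the error message relies on `fopen` setting `errno` on failure, which POSIX guarantees but ISO C does not.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: validates the command line and opens the input file.
 * Returns an open FILE* on success, or NULL after printing a message
 * to stderr. The caller is responsible for calling fclose(). */
FILE *open_input(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <textfile>\n",
                argc > 0 ? argv[0] : "wordcount");
        return NULL;
    }

    FILE *fp = fopen(argv[1], "r");
    if (fp == NULL) {
        /* Assumes errno was set by fopen (true on POSIX systems). */
        fprintf(stderr, "%s: cannot open '%s': %s\n",
                argv[0], argv[1], strerror(errno));
    }
    return fp;
}
```

A main function could call `open_input(argc, argv)` first and return a nonzero exit status if it yields NULL, so that a missing argument or an unreadable file produces a message rather than a crash.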
Submission for this project has three components:
- Code: The source code for the word counter in C and the selected languages should be submitted to the fileserver along with code for any extensions. Please make the filenames of your source code meaningful. Note that the quality of your comments counts toward your grade.
- README: Submit a README file to the fileserver for your C code and the code for your selected languages. The README file can be a .txt file. No matter what, it should be readable from the terminal (no Word or pdf files). It should be well-organized such that readers can easily understand the usage and the outputs of the C code and the code for your selected languages. In addition, any known bugs should be in the README file. Follow the format and content instructions from the writing guide.
- Wiki Report: The non-C language pages for this project should have the following elements.
- Title of the project and your name. Include your partner's name if you worked with someone for the non-C tasks.
- A section for each task of part II. Each section should contain appropriate snippets of your sample programs, the outputs, and your explanations. Write these as tutorials for other students in the class. Aim for the right level of detail, and define anything that may not be clear. Presenting examples without explanation is insufficient.
- If you complete extensions in your selected languages, please include a section for each extension, providing sample programs, outputs, and explanations. Be sure to indicate credit, if necessary.
Please note that it is your responsibility to explicitly indicate the extensions you have undertaken. If you complete extensions for the C tasks, please indicate that explicitly in the code and README file. If you complete an extension in a selected language, please indicate this explicitly in your wiki report. Along with each extension, indicate if you completed it with a partner.