Elementary Text Analysis
Write a C program that takes the name of a file as a command-line argument, opens the file, and reads through it to determine the number of words in each sentence. The program should display the number of words in each sentence, followed by the total number of words and sentences and the average number of words per sentence. The results should be printed as a table on standard output, as shown below:
./analyze-text comp.text
This program counts words and sentences in file "comp.text".
Sentence: 1 Words: 29
Sentence: 2 Words: 41
Sentence: 3 Words: 16
Sentence: 4 Words: 22
Sentence: 5 Words: 44
Sentence: 6 Words: 14
Sentence: 7 Words: 32
File "comp.text" contains 198 words in 7 sentences
for an average of 28.3 words per sentence.
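The summary lines above can be sketched as follows. This is only an illustration of the arithmetic and formatting, not a required design: the function names average_words and print_summary are hypothetical, and returning 0.0 for a file with no sentences is an assumption the assignment does not state.

```c
#include <stdio.h>

/* Hypothetical helper: average words per sentence.
   Assumption: returns 0.0 when there are no sentences, so the
   division is never performed with a zero denominator. */
double average_words(int total_words, int sentences)
{
    return sentences > 0 ? (double)total_words / sentences : 0.0;
}

/* Hypothetical helper: writes the two summary lines in the format
   of the sample output, with the average rounded to one decimal. */
void print_summary(FILE *out, const char *filename,
                   int total_words, int sentences)
{
    fprintf(out, "File \"%s\" contains %d words in %d sentences\n",
            filename, total_words, sentences);
    fprintf(out, "for an average of %.1f words per sentence.\n",
            average_words(total_words, sentences));
}
```

For the sample run, print_summary(stdout, "comp.text", 198, 7) would produce the last two lines of the table, since 198 / 7 rounds to 28.3.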
General notes
- A word is defined as any contiguous sequence of letters. Apostrophes at the beginning or the end of a word should be ignored. Apostrophes with letters immediately before and after are considered part of a word. For example, "O'Henry", "I've", "you're", "friend's" and "friends'" should each be considered as one word.
- A sentence is defined as any sequence of words that ends with a period, exclamation point, or question mark. However, a period that follows a single capital letter (e.g., an initial) or that is embedded within digits (e.g., a real number) should not be counted as the end of a sentence.
- Digits and punctuation (other than apostrophes, periods, exclamation points, and question marks) should be treated the same as white space. Thus,
After they walked, talked, and ate, the first person said, "I'd like to swim: crawl, side stroke, and butterfly."
should be considered the same as
After they walked talked and ate the first person said I'd like to swim crawl side stroke and butterfly
- White space characters (e.g., spaces, tabs, line feeds, and return characters) are considered equivalent. Multiple white space characters are considered the same as one space character. Thus, the above passage would be equivalent to the following:
After they walked talked and ate the first person said I'd like to swim crawl side stroke and butterfly
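One possible way to apply the rules in these notes is a single pass over the text that tracks whether it is currently inside a word. The sketch below is a minimal illustration, not the required solution. The function name count_sentences and its parameters are hypothetical, it works on a string already in memory rather than on a file, and it makes two assumptions the notes leave open: a terminator with no words before it does not count as a sentence, and words after the last terminator are not reported.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts the words in each sentence of the
   NUL-terminated string s. Stores at most max counts in counts[]
   and returns the number of sentences found. */
size_t count_sentences(const char *s, int counts[], size_t max)
{
    size_t n = 0;            /* sentences completed so far */
    int words = 0;           /* words in the current sentence */
    size_t len = 0;          /* letters in the current word; 0 = not in a word */
    unsigned char first = 0; /* first letter of the current word */

    for (size_t i = 0; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];
        unsigned char next = (unsigned char)s[i + 1];
        unsigned char prev = i > 0 ? (unsigned char)s[i - 1] : 0;

        if (isalpha(c)) {
            if (len == 0) {  /* a new word starts here */
                words++;
                first = c;
            }
            len++;
            continue;
        }
        /* An apostrophe with a letter on both sides stays inside the word. */
        if (c == '\'' && len > 0 && isalpha(next))
            continue;

        /* Any other character ends the current word, if there is one. */
        int initial = (len == 1 && isupper(first));
        len = 0;

        int ends = (c == '!' || c == '?');
        /* A period ends a sentence unless it follows a single capital
           letter or sits between two digits. */
        if (c == '.' && !initial && !(isdigit(prev) && isdigit(next)))
            ends = 1;

        /* Assumption: a terminator with no preceding words is ignored. */
        if (ends && words > 0) {
            if (n < max)
                counts[n] = words;
            n++;
            words = 0;
        }
    }
    return n;
}
```

Digits, other punctuation, and all white space fall through to the same branch, so they act only as word separators, which matches the equivalence described above. A complete program would still need to read the file named on the command line and feed its contents through logic of this kind.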
