Project 2
This project tests your ability to
- work with command line arguments;
- work with files and directories (folders);
- search text files using regular expressions;
- use file test operators to distinguish files from directories;
- use hashes to store results and sort hashes to display results in the right order.
Please write a script named search.pl that:
- takes a number of arguments from the command line:
- the first argument is always a word or combination of words to search for (this is the only required argument),
- the following arguments can be script options: -k, -i, or both, where
the -i option makes the search case-insensitive, and the -k option specifies that the given word
combination must appear inside one of the following HTML tags:
- <h1> ... </h1>
- <h2> ... </h2>
- ...
- <h7> ... </h7>
- <b> ... </b>
In this case the search is to be performed only against .html files.
- the last argument is the name of the directory where the search starts. If this argument is not given, the search
starts in the current directory. In either case, the search covers all files in that directory and all its sub-directories.
- prints the names of the files and the number of matches found in each. The output should be sorted by the number
of matches, from highest to lowest.
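The argument handling, pattern building, and output ordering described above might be sketched roughly as follows. This is only an illustration under stated assumptions, not a complete solution: the sub names parse_args, build_pattern, and count_matches are hypothetical, any argument that is not -i or -k is assumed to be the start directory, and the -k pattern assumes that no other tags are nested inside the heading or bold element.

```perl
use strict;
use warnings;

# Hypothetical helper: split the command-line arguments into the search
# phrase, the two options, and the start directory (default: ".").
sub parse_args {
    my @args = @_;
    die "Usage: search.pl PHRASE [-i] [-k] [DIRECTORY]\n" unless @args;
    my $phrase = shift @args;
    my %opt = (i => 0, k => 0);
    my $dir = ".";
    for my $arg (@args) {
        if    ($arg eq "-i") { $opt{i} = 1 }
        elsif ($arg eq "-k") { $opt{k} = 1 }
        else                 { $dir = $arg }   # assumption: this is the directory
    }
    return ($phrase, \%opt, $dir);
}

# Hypothetical helper: build the regular expression for the phrase.
sub build_pattern {
    my ($phrase, $opt) = @_;
    my $core = quotemeta $phrase;   # treat the phrase as literal text
    # With -k, require the phrase between a matching pair of the listed tags.
    # Simplification: no other tag may appear inside the element.
    $core = '<(h[1-7]|b)>[^<]*' . $core . '[^<]*</\1>' if $opt->{k};
    # Note: with -i the tag names are matched case-insensitively as well.
    return $opt->{i} ? qr/$core/i : qr/$core/;
}

# Hypothetical helper: count non-overlapping matches in a string.
sub count_matches {
    my ($text, $re) = @_;
    my $count = 0;
    $count++ while $text =~ /$re/g;
    return $count;
}

# Sorting a hash of results by value, highest first (made-up sample data).
# Ties are broken by file name so the order is stable.
my %matches = ("a.html" => 2, "b.txt" => 5, "c.txt" => 1);
for my $file (sort { $matches{$b} <=> $matches{$a} or $a cmp $b } keys %matches) {
    print "$file: $matches{$file}\n";
}
```

In the sample data, b.txt is printed first (5 matches) and c.txt last (1 match). In a real script the hash would be filled by calling count_matches on the contents of each file.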
Additional requirements are:
- Inside the script you should define an array of directory names where you do not want to perform the search.
For example, if the script is searching inside a directory www that has sub-directories ist334,
img, ist263, and js, and your array is defined as:
my @not_search_in = ("img", "js");
then the script should search only inside the www, www/ist334, and www/ist263 directories.
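One possible way to walk the directory tree while honoring the exclusion array is a small recursive sub built on opendir and the -d and -f file tests. The sub name collect_files is hypothetical, and the sketch assumes that an excluded name is matched against the directory's own name rather than its full path.

```perl
use strict;
use warnings;

my @not_search_in = ("img", "js");

# Hypothetical helper: return every plain file under $dir, skipping any
# sub-directory whose name is listed in @not_search_in.
sub collect_files {
    my ($dir) = @_;
    my %skip = map { $_ => 1 } @not_search_in;
    my @files;
    opendir(my $dh, $dir) or do { warn "Cannot open $dir: $!\n"; return };
    # Drop the "." and ".." entries so the recursion terminates.
    my @entries = sort grep { $_ ne "." && $_ ne ".." } readdir $dh;
    closedir $dh;
    for my $entry (@entries) {
        my $path = "$dir/$entry";
        if (-d $path) {
            # Descend only into directories that are not excluded.
            push @files, collect_files($path) unless $skip{$entry};
        }
        elsif (-f $path) {
            push @files, $path;
        }
    }
    return @files;
}
```

Note that -d follows symbolic links, so a tree containing a link back to one of its parent directories would need an extra check (for example with -l) to avoid endless recursion.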
- Inside the script you should also define an array of file extensions for the files you need to search in. For example,
if the array is defined as
my @ext = ("txt", "html", "doc");
then your script should search only inside files with extensions .txt, .html, and .doc.
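The extension check could be sketched with a short regular expression, as below. The sub name has_wanted_ext is hypothetical, and comparing extensions case-insensitively (so that INDEX.HTML also qualifies) is an assumption the assignment does not state.

```perl
use strict;
use warnings;

my @ext = ("txt", "html", "doc");

# Hypothetical helper: true if the file name ends in one of the wanted
# extensions. Files without any extension are rejected.
sub has_wanted_ext {
    my ($file, @wanted) = @_;
    my ($suffix) = $file =~ /\.([^.\/]+)$/ or return 0;
    return scalar grep { lc $suffix eq lc $_ } @wanted;
}

my @files = ("notes.txt", "index.html", "logo.png", "README");
my @to_search = grep { has_wanted_ext($_, @ext) } @files;
print "@to_search\n";   # prints "notes.txt index.html"
```

When the -k option is given, the same sub could be called with ("html") in place of @ext, since that search is limited to .html files.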
This project is due by 11:59pm on Monday, February 23, 2004.