I am trying to write a Perl script that takes three arguments.
So far it recursively searches through directories, counts every occurrence of each word in the files it finds, and prints the counts to the console.
How can I write these counts to an output file? Also, how do I use the second argument, a number (say 5), to print only that many of the most frequent words to the console while still writing every word to the output file?
The following is what I have so far:
#!/usr/bin/perl -w
use strict;
search(shift);
my $input = $ARGV[0];
my $output = $ARGV[1];
my %count;
my $file = shift or die "ERROR: $0 FILE\n";
open my $filename, '<', $file or die "ERROR: Could not open file!";
if ( -f $filename ) {
    print("This is a file!\n");
    while ( my $line = <$filename> ) {
        chomp $line;
        foreach my $str ( $line =~ /\w+/g ) {
            $count{$str}++;
        }
    }
    foreach my $str ( sort keys %count ) {
        printf "%-20s %s\n", $str, $count{$str};
    }
}
close($filename);
if ( -d $input ) {
    sub search {
        my $path = shift;
        my @dirs = glob("$path/*");
        foreach my $filename (@dirs) {
            if ( -f $filename ) {
                open( FILE, $filename ) or die "ERROR: Can't open file";
                while ( my $line = <FILE> ) {
                    chomp $line;
                    foreach my $str ( $line =~ /\w+/g ) {
                        $count{$str}++;
                    }
                }
                foreach my $str ( sort keys %count ) {
                    printf "%-20s %s\n", $str, $count{$str};
                }
            }
            # Recursive search
            elsif ( -d $filename ) {
                search($filename);
            }
        }
    }
}
I have figured it out. The following is my solution. I'm not sure if it's the best way to do it, but it works.
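# Check that there are three arguments on the command line
if (@ARGV < 3) {
    die "ERROR: There must be three arguments!\n";
}
# Assumed argument order: INPUT NUMBER OUTPUT
my ($input, $num, $output) = @ARGV;
# Word => number of occurrences
my %count;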
# If INPUT is a plain file, count its words
if (-f $input) {
    open my $fh, '<', $input or die "ERROR: Could not open '$input': $!\n";
    print("This is a file!\n");
    # Go through each line
    while (my $line = <$fh>) {
        chomp $line;
        # Count each word; /g returns every match on the line
        # (without it the list is just a success flag)
        foreach my $str ($line =~ /\b[[:alpha:]]+\b/g) {
            $count{$str}++;
        }
    }
    # Close the file
    close($fh);
}
# If INPUT is a directory, search it recursively
elsif (-d $input) {
    search_dir($input);
}
my $high_count = 0;
# Open the output file
open my $fileh, '>', $output or die "ERROR: Could not open '$output': $!\n";
# Sort words from most to least frequent; print the top $num to the
# console and every word to the output file
foreach my $str (sort { $count{$b} <=> $count{$a} } keys %count) {
    $high_count++;
    if ($high_count <= $num) {
        printf "%-31s %s\n", $str, $count{$str};
    }
    printf $fileh "%-31s %s\n", $str, $count{$str};
}
close($fileh) or die "ERROR: Could not close '$output': $!\n";
exit;
# Subroutine to search through each directory recursively
sub search_dir {
    my $path = shift;
    # Note: glob splits its pattern on whitespace and skips dotfiles
    my @dirs = glob("$path/*");
    # Loop through filenames
    foreach my $filename (@dirs) {
        # Check if it is a file
        if (-f $filename) {
            # Open the file with a lexical handle and three-argument open
            open(my $fh, '<', $filename) or die "ERROR: Can't open '$filename': $!\n";
            # Go through each line
            while (my $line = <$fh>) {
                chomp $line;
                # Count the occurrences of each word (/g returns every match)
                foreach my $str ($line =~ /\b[[:alpha:]]+\b/g) {
                    $count{$str}++;
                }
            }
            # Close the file
            close($fh);
        }
        # Recurse into subdirectories
        elsif (-d $filename) {
            search_dir($filename);
        }
    }
}
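Since I wasn't sure this is the best way, here is a rough sketch of an alternative that lets the core File::Find module do the recursive walk instead of a hand-written glob loop. The helper names count_words and top_words are my own for illustration, and the demo builds a small throwaway tree with File::Temp so it can be run as-is. Treat it as one possible approach under those assumptions, not a definitive version.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical helper: count alphabetic words in every plain file under $root.
sub count_words {
    my ($root) = @_;
    my %count;
    find(
        {
            no_chdir => 1,    # keep $_ as the full path so open() works directly
            wanted   => sub {
                my $path = $_;
                return unless -f $path;
                open my $fh, '<', $path
                    or do { warn "Skipping $path: $!\n"; return };
                while ( my $line = <$fh> ) {
                    # /g makes the match return every word on the line
                    $count{$_}++ for $line =~ /\b[[:alpha:]]+\b/g;
                }
                close $fh;
            },
        },
        $root
    );
    return \%count;
}

# Hypothetical helper: the $n most frequent words, ties broken alphabetically.
sub top_words {
    my ( $count, $n ) = @_;
    my @sorted =
        sort { $count->{$b} <=> $count->{$a} or $a cmp $b } keys %$count;
    $n = @sorted if $n > @sorted;
    return @sorted[ 0 .. $n - 1 ];
}

# Demo on a temporary tree: one file at the top, one in a subdirectory.
my $dir = tempdir( CLEANUP => 1 );
mkdir "$dir/sub" or die "mkdir failed: $!";
for my $spec (
    [ "$dir/a.txt",     "apple banana apple\n" ],
    [ "$dir/sub/b.txt", "apple cherry banana\n" ],
  )
{
    open my $out, '>', $spec->[0] or die "Cannot write $spec->[0]: $!";
    print {$out} $spec->[1];
    close $out;
}

my $count = count_words($dir);
printf "%-31s %s\n", $_, $count->{$_} for top_words( $count, 2 );
```

In the demo, apple is counted three times and banana twice, so those are the two words printed. Writing the full list to an output file would work the same way as in my solution above: open a handle with '>' and printf to it inside the loop.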