
Downloads in Firefox using Perl WWW::Mechanize::Firefox


I have a list of URLs of PDF files, hosted on different sites, that I want to download.

In Firefox, I have chosen the option to save PDF files directly to a particular folder.

My plan was to use WWW::Mechanize::Firefox in Perl to download each file in the list, one by one, through Firefox and to rename each file after it is downloaded.

I used the following code to do it:

    use WWW::Mechanize::Firefox;
    use File::Copy;

    # @list contains the list of links to pdf files
    foreach my $x (@list) {
        my $mech = WWW::Mechanize::Firefox->new(autoclose => 1);
        $mech->get($x);  # This downloads the file using Firefox into the desired folder

        opendir(my $dir, "output/download") or die "Cannot open output/download: $!";
        my @files = grep { !/^\.\.?$/ } readdir($dir);  # skip '.' and '..'
        closedir($dir);
        my $old = "output/download/$files[0]";
        move($old, $new);  # $new is the new file path (set elsewhere)
    }

When I run the script, it opens the first link in Firefox, and Firefox downloads the file to the desired directory. After that, however, the new tab is not closed, the file is not renamed, and the script keeps running (as if it had hit an endless loop) without downloading any further files.

What is going on here? Why isn't the code working? How do I close the tab and make the code process all the files in the list? Is there an alternate way to download them?


Solution

  • Solved the problem.

    The function,

    $mech->get() 
    

    waits for Firefox to fire the 'DOMContentLoaded' event when a page loads. Because I had set Firefox to download the files automatically, no page was ever loaded, so the 'DOMContentLoaded' event never fired and my code blocked at this call.

    I told the function not to wait for the page to load by using the following option:

    $mech->get($x, synchronize => 0);
    

    After this, I added a 60-second delay to give Firefox time to download the file before the code continues:

    sleep 60;
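
    A fixed delay can be too long for small files and too short for large ones. As a rough alternative, the code could poll the download folder until a new, finished file appears. The sketch below is only an illustration: `wait_for_download` is a hypothetical helper, not part of WWW::Mechanize::Firefox, and it assumes Firefox keeps an in-progress download next to a file with a `.part` suffix.

```perl
use strict;
use warnings;

# Hypothetical helper: wait until a new, completed file shows up in $dir.
# $seen is a hash ref of file names that existed before the download began.
# Assumption: Firefox marks in-progress downloads with a ".part" file.
sub wait_for_download {
    my ($dir, $seen, $timeout) = @_;
    my $deadline = time() + $timeout;

    while (time() < $deadline) {
        opendir(my $dh, $dir) or die "Cannot open $dir: $!";
        my @entries = grep { !/^\.\.?$/ } readdir($dh);  # skip '.' and '..'
        closedir($dh);

        # Count partial downloads and collect names not seen before
        my $partial = grep { /\.part$/ } @entries;
        my @new     = sort grep { !$seen->{$_} } @entries;

        return $new[0] if !$partial && @new;
        sleep 1;
    }
    return;  # timed out without finding a finished download
}
```

    To use it, record the names already in the folder in a hash before calling `$mech->get($x, synchronize => 0)`, then call something like `wait_for_download("output/download", \%seen, 300)` and rename the file it returns.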
    

    Thus, my final code looks like this:

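    use WWW::Mechanize::Firefox;
    use File::Copy;
    
    # @list contains the list of links to pdf files
    foreach my $x (@list) {
        my $mech = WWW::Mechanize::Firefox->new(autoclose => 1);
    
        # Do not wait for a page-load event: the PDF is saved, not rendered
        $mech->get($x, synchronize => 0);
        sleep 60;  # give Firefox time to finish the download
    
        opendir(my $dir, "output/download") or die "Cannot open output/download: $!";
        my @files = grep { !/^\.\.?$/ } readdir($dir);  # skip '.' and '..'
        closedir($dir);
        my $old = "output/download/$files[0]";
        move($old, $new) or die "Cannot move $old: $!";  # $new is the new file path (set elsewhere)
    }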
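
    On the question of an alternate way to download: if the PDFs do not depend on the browser session (logins, cookies, JavaScript), they could be fetched without Firefox at all. This is a minimal sketch using the core HTTP::Tiny module, not something I tested against these sites. `fetch_pdf` is a hypothetical helper name, and `https` URLs additionally need IO::Socket::SSL installed.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical helper: save $url directly to $path, no browser involved.
# Returns a true value on success and a false value otherwise.
sub fetch_pdf {
    my ($url, $path) = @_;

    # mirror() performs a GET and writes the response body to $path
    my $res = HTTP::Tiny->new(timeout => 30)->mirror($url, $path);

    warn "Failed to fetch $url: $res->{status} $res->{reason}\n"
        unless $res->{success};
    return $res->{success};
}
```

    Since the target path is chosen per URL, no rename step is needed afterwards.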