javascript jquery greasemonkey mashup

Can Greasemonkey fetch values from a paginated sequence of URLs?


I would like to fetch a value from https://play.google.com/store/account*, which makes the user page through its output. For example:
/store/account?start=0&num=40, then /store/account?start=40&num=40, etc.

Now, when I visit https://play.google.com/apps, I'd like Greasemonkey to total up the values from the /store/account pages and then display the final value on that page.

The code listed below can total the value that I want from the /store/account pages. However, I want to put that logic in a script that runs on the second URL, so that I can prepend the total to that same page.

// ==UserScript==
// @name        Google Play 
// @require     http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js
// @grant       GM_setValue   
// @grant       GM_getValue  
// ==/UserScript==

var startParam      = location.search.match (/\bstart=(\d+)/i);
    if (startParam) {
        var totalPrice  = 0;
        var startNum    = parseInt (startParam[1], 10);
        if (startNum    === 0) {
            GM_setValue ("TotalPrice", "0");
        }
        else {
            totalPrice  = parseFloat (GM_getValue ("TotalPrice", 0) );
        }

        $("#tab-body-account .rap-link").each( function () {
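            // data-docprice may be missing; fall back to "" so .replace() cannot throw.
            var price   = ($(this).attr ("data-docprice") || "").replace (/[^\d\.]/g, "");
            if (price) {
                price   = parseFloat (price);
                // parseFloat() returns NaN (still typeof "number") on bad input.
                if ( ! isNaN (price) ) {
                    totalPrice += price;
                }
            }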
        } );
        //console.log ("totalPrice: ", totalPrice.toFixed(2) );

        $('.tabbed-panel-tab').before (
            '<div id="SumTotal">*Combined Value: $'+ totalPrice.toFixed(2) +'</div>'
        );

        GM_setValue ("TotalPrice", "" + totalPrice);

        if ( $(".snippet.snippet-tiny").length ) {
            startNum       += 40;
            var nextPage    = location.href.replace (
                /\bstart=\d+/i, "start=" + startNum
            );
            location.assign (nextPage);
        }
    }

Solution

  • The basic approaches for getting data from a page/site for a mashup are:

    1. Scraping via AJAX:
      This works on almost all pages, though it won't work with pages that load the content you want via AJAX. Occasionally, it can also get tricky for sites that require authentication or that restrict referrers.
      In most cases, use GM_xmlhttpRequest(), which allows cross-domain requests. This approach is detailed below.

    2. Loading the resource page(s) in an <iframe>:
      This approach works on AJAX-ified pages, and can be coded to let the user deal with sign-in problems manually. However, it is slower, more resource-intensive, and more complicated to code.

      It doesn't seem to be needed for this question's particulars. For more information on this technique, see "How to get an AJAX get-request to wait for the page to be rendered before returning a response?".

    3. Use the site's API, if it has one:
      Alas, most sites don't have an API, so this is probably not an option for you, but it is worth making sure that no API is offered. An API is usually the best approach, if it is available. Do a new search/question for more details about this approach.

    4. Mimicking the site's AJAX calls, if it makes such calls for the kind of info you want:
      This option is also not applicable to most sites, but it can be a clean, efficient technique when it is. Do a new search/question for more details about this approach.


    Fetching value(s) from a sequence of web pages via cross-domain-capable AJAX:

    Use GM_xmlhttpRequest() to load the pages, and jQuery to process their HTML.
    Use GM_xmlhttpRequest()'s onload function to request the next page, if necessary. Do not attempt to use synchronous AJAX calls.
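
The chaining idea can be sketched on its own, without Greasemonkey. This is only an illustration: sumAllPages, fetchPage, and fakePages are hypothetical names, and fetchPage stands in for whatever wrapper you would write around GM_xmlhttpRequest(). The point it shows is that the next page is requested from inside the previous page's callback, so nothing ever blocks.

```javascript
// Sketch only: sumAllPages and fetchPage are hypothetical names.
// fetchPage (start, callback) is assumed to load one page and then call
// callback (values, hasMore), e.g. from GM_xmlhttpRequest's onload handler.
function sumAllPages (fetchPage, pageSize, onDone) {
    var total = 0;

    function requestPage (start) {
        fetchPage (start, function (values, hasMore) {
            values.forEach (function (val) { total += val; } );

            if (hasMore) {
                // Only now, inside the callback, is the next page requested.
                requestPage (start + pageSize);
            }
            else {
                onDone (total);
            }
        } );
    }
    requestPage (0);
}

// Stand-in data so that the sketch runs without a browser or network.
var fakePages = { 0: [1.5, 2], 40: [3] };

function fakeFetchPage (start, callback) {
    callback (fakePages[start] || [], (start + 40) in fakePages);
}

var grandTotal = null;
sumAllPages (fakeFetchPage, 40, function (total) { grandTotal = total; } );
console.log ("Total: " + grandTotal.toFixed (2) );
```

In a real script, the body of fakeFetchPage would be replaced by a GM_xmlhttpRequest() call whose onload handler parses the response and then invokes the callback.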

    The core logic from your original script moves into the onload function, except that there is no longer a need to remember values between Greasemonkey runs.
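
The per-page part of that core logic can be sketched as a plain function. This is a minimal sketch, assuming the data-docprice values are strings such as "$1.99"; sumDocPrices is a hypothetical helper name, not something from the original script. Because the whole scrape now happens in one script run, the running total can live in an ordinary variable rather than in GM_setValue().

```javascript
// Sketch only: sumDocPrices is a hypothetical helper, not part of the original script.
// rawPrices holds data-docprice attribute strings, e.g. "$1.99"; entries may be missing.
function sumDocPrices (rawPrices) {
    var total = 0;
    rawPrices.forEach (function (raw) {
        var cleaned = String (raw || "").replace (/[^\d\.]/g, "");
        var price   = parseFloat (cleaned);
        // parseFloat ("") is NaN, and typeof NaN is "number", so test with isNaN().
        if ( ! isNaN (price) ) {
            total += price;
        }
    } );
    return total;
}

var pageTotal = sumDocPrices ( ["$1.50", "Free", undefined, "US$2.25"] );
console.log (pageTotal.toFixed (2) );
```

Note that a typeof check cannot filter out bad values here, since NaN is itself of type "number"; isNaN() is what actually rejects them.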

    Here's a complete Greasemonkey script, with some status and error reporting thrown in:

    // ==UserScript==
    // @name        _Total-value mashup
    // @include     https://play.google.com/apps*
    // @require     http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js
    // @grant       GM_addStyle
    // @grant       GM_xmlhttpRequest
    // ==/UserScript==
    
    var startNum        = 0;
    var totalValue      = 0;
    
    //--- Scrape the first account-page for item values:
    $("body").prepend (
        '<div id="gm_statusBar">Fetching total value, please wait...</div>'
    );
    scrapeAccountPage ();
    
    function scrapeAccountPage () {
        var accntPage   = 'https://play.google.com/store/account?start=0&num=40';
        accntPage       = accntPage.replace (/start=\d+/i, "start=" + startNum);
    
        $("#gm_statusBar").append (
            '<span class="gmStatStart">Fetching page ' + accntPage + '...</span>'
        );
    
        GM_xmlhttpRequest ( {
            method:     'GET',
            url:        accntPage,
            //--- getTotalValuesFromPage() also gets the next page, as appropriate.
            onload:     getTotalValuesFromPage,
            onabort:    reportAJAX_Error,
            onerror:    reportAJAX_Error,
            ontimeout:  reportAJAX_Error
        } );
    }
    
    function getTotalValuesFromPage (respObject) {
        if (respObject.status != 200  &&  respObject.status != 304) {
            reportAJAX_Error (respObject);
            return;
        }
    
        $("#gm_statusBar").append ('<span class="gmStatFinish">done.</span>');
    
        var respDoc     = $(respObject.responseText);
        var targetElems = respDoc.find ("#tab-body-account .rap-link");
    
        targetElems.each ( function () {
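            //--- data-docprice may be missing; default to "" so .replace() can't throw.
            var itmVal  = ($(this).attr ("data-docprice") || "").replace (/[^\d\.]/g, "");
            if (itmVal) {
                itmVal   = parseFloat (itmVal);
                //--- parseFloat() yields NaN (also typeof "number") on bad input.
                if ( ! isNaN (itmVal) ) {
                    totalValue += itmVal;
                }
            }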
        } );
        console.log ("totalValue: ", totalValue.toFixed(2) );
    
        if ( respDoc.find (".snippet.snippet-tiny").length ) {
            startNum       += 40;
            //--- Scrape the next page.
            scrapeAccountPage ();
        }
        else {
            //--- All done!  report the total.
            $("#gm_statusBar").empty ().append (
                'Combined Value: $' + totalValue.toFixed(2)
            );
        }
    }
    
    function reportAJAX_Error (respObject) {
        $("#gm_statusBar").append (
            '<span class="gmStatError">Error ' + respObject.status + '! &nbsp; '
            + '"' + respObject.statusText + '" &nbsp; &nbsp;'
            + 'Total value, so far, was: ' + totalValue.toFixed(2)
            + '</span>'
        );
    }
    
    //--- Make it look "purty".
    GM_addStyle ( multilineStr ( function () {/*!
        #gm_statusBar {
            margin:         0;
            padding:        1.2ex;
            font-family:    trebuchet ms,arial,sans-serif;
            font-size:      18px;
            border:         3px double gray;
            border-radius:  1ex;
            box-shadow:     1ex 1ex 1ex gray;
            color:          black;
            background:     lightgoldenrodyellow;
        }
        #gm_statusBar .gmStatStart {
            font-size:      0.5em;
            margin-left:    3em;
        }
        #gm_statusBar .gmStatFinish {
            font-size:      0.5em;
            background:     lime;
        }
        #gm_statusBar .gmStatError {
            background:     red;
            white-space:    nowrap;
        }
    */} ) );
    
    function multilineStr (dummyFunc) {
        var str = dummyFunc.toString ();
        str     = str.replace (/^[^\/]+\/\*!?/, '') // Strip function() { /*!
                .replace (/\s*\*\/\s*\}\s*$/, '')   // Strip */ }
                .replace (/\/\/.+$/gm, '') // Double-slash comments wreck CSS. Strip them.
                ;
        return str;
    }
    



    Important: Don't forget the @include, @exclude, and/or @match directives, so your script does not run on every page and iframe!
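
For example, a metadata block restricted to the target page might look like the following. This is a sketch, not the only valid form: @match is used here in place of @include, and @noframes is an assumption that your script engine supports that directive (recent Greasemonkey and Tampermonkey versions do).

```javascript
// ==UserScript==
// @name        _Total-value mashup
// @match       https://play.google.com/apps*
// @noframes
// @require     http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js
// @grant       GM_addStyle
// @grant       GM_xmlhttpRequest
// ==/UserScript==
```

If @noframes is not available, a check such as `if (window.top !== window.self) return;` at the top of the script serves the same purpose.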