javascript wikipedia-api mediawiki-api

403 error when using MediaWiki REST API to `compare` revisions


NOTE: The bug in the API has now been fixed, and this issue no longer arises.


I'm building a simple tool for myself that displays the difference between revisions on Wikipedia by sending a request to the MediaWiki REST API, as described in the API's docs here: Reference#Compare revisions.

Here is an example of code which retrieves the JSON I want:

const url = "https://en.wikipedia.org/w/rest.php/v1/revision/1157633768/compare/1157658266";
let json = await fetch(url).then(resp=>{ return resp.json(); }); 
console.log(json)

This code works in a local Node REPL, and it also works if I point my browser to <en.wikipedia.org> and run the code in the in-browser console. I can also simply curl that URL and get the data printed in my terminal.

However, I get a 403 error if I use that same code on my site (or, equivalently, if I run the code in the in-browser console while on a site other than Wikipedia). Here's the error I get:

GET https://en.wikipedia.org/w/rest.php/v1/revision/1157633768/compare/1157658266 403

{error: 'rest-cross-origin-anon-write', httpCode: 403, httpReason: 'Forbidden'}

The error code seems to indicate that I'm being forbidden access because I'm trying to write from a cross-origin site, but the request I'm making is a simple read-only request, isn't it? According to the reference page linked above, error 403 is returned when "Revision not publicly accessible". However, it's clearly publicly accessible, since I can access it from any browser or with curl without authentication.
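To see the API's error code rather than only the status line, the response body can be read even when the request fails. The sketch below is a minimal, hypothetical helper: the name `fetchJsonOrThrow` and the injectable `fetchImpl` parameter are illustrative and not part of any API, and it assumes the error body has the shape shown above.

```javascript
// Hypothetical helper: fetch JSON and surface the API's error code on failure.
// `fetchImpl` defaults to the global fetch; it is injectable only to ease testing.
async function fetchJsonOrThrow(url, fetchImpl = fetch) {
  const resp = await fetchImpl(url);
  // Try to parse the body either way; the error response shown above is JSON too.
  const body = await resp.json().catch(() => null);
  if (!resp.ok) {
    // Assumption: error bodies carry an `error` field, as in the 403 above.
    const code = body && body.error ? body.error : "unknown-error";
    throw new Error(`HTTP ${resp.status}: ${code}`);
  }
  return body;
}
```

Called as `fetchJsonOrThrow(url).catch(e => console.error(e.message))`, a response like the one above would be reported as `HTTP 403: rest-cross-origin-anon-write`.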

[Note, if I make any other requests using this API, like getting the details for a single revision (as in docs here: Reference#Get_revision) with url https://en.wikipedia.org/w/rest.php/v1/revision/1157633768/bare, I do not get this error. It's only when requesting the {from}/compare/{to} route.]

I've tried adding ?origin=* to the URL, as the site for the MediaWiki Action API suggests here. I know this is a different API, but it was still worth a shot.
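For reference, this is roughly how the parameter was appended. It is only a sketch: the function name `withAnonymousOrigin` is made up for illustration, and `origin=*` is documented for the Action API, so there is no claim here that the REST API honours it.

```javascript
// Append origin=* to a URL without clobbering any existing query string.
// `withAnonymousOrigin` is an illustrative name, not part of any MediaWiki API.
function withAnonymousOrigin(url) {
  const u = new URL(url);
  u.searchParams.set("origin", "*");
  return u.toString();
}
```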

I don't know much about cross-origin requests, but I feel like this should be simple. I am not trying to do anything that requires authentication.


Solution

  • The bug is triggered by the "Origin" header (no pun intended), which the browser may or may not add to a request. Unfortunately, "Origin" is on the list of forbidden header names, so your script can neither alter nor delete it.

    To work around that bug, I'd go with what @PCDSandwichMan suggested in the comments and (temporarily) go through a server side proxy. If your server supports PHP and "allow_url_fopen" is set to true, you could add the following proxy.php to your web root:

    <?php
    // Fetch the requested REST API path server-side and stream the response back to the client.
    readfile('https://en.wikipedia.org/w/rest.php/v1' . $_GET['path']);
    

    and change your JS code to

    const url = "/proxy.php?path=" + encodeURIComponent("/revision/1157633768/compare/1157658266");
    const json = await fetch(url).then(resp => { return resp.json(); });
    console.log(json);
    

    If your web host does not enable "allow_url_fopen" but has the PHP cURL module enabled, change the contents of proxy.php to:

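    <?php
    // Request the REST API path server-side via cURL.
    $ch = curl_init('https://en.wikipedia.org/w/rest.php/v1' . $_GET['path']);
    // Return the response as a string instead of printing it directly.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    echo curl_exec($ch);
    curl_close($ch);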
    

    I've tested both options successfully.
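If you call the proxy from several places, a small client-side wrapper keeps the URL building in one spot. This is only a sketch under stated assumptions: `PROXY_ENDPOINT`, `buildCompareProxyUrl` and `compareRevisions` are hypothetical names, proxy.php is assumed to sit in your web root as described above, and the relative URL assumes the code runs in a browser on your own site.

```javascript
// Hypothetical wrapper around the proxy.php workaround above.
const PROXY_ENDPOINT = "/proxy.php"; // assumption: proxy.php sits in the web root

// Build the proxy URL for a compare request.
// Rejects anything that is not a positive integer revision ID.
function buildCompareProxyUrl(fromRev, toRev) {
  for (const id of [fromRev, toRev]) {
    if (!Number.isInteger(id) || id <= 0) {
      throw new TypeError(`Invalid revision ID: ${id}`);
    }
  }
  const path = `/revision/${fromRev}/compare/${toRev}`;
  return `${PROXY_ENDPOINT}?path=${encodeURIComponent(path)}`;
}

// Fetch the comparison JSON through the proxy.
// `fetchImpl` is injectable only to make the function easy to test.
async function compareRevisions(fromRev, toRev, fetchImpl = fetch) {
  const resp = await fetchImpl(buildCompareProxyUrl(fromRev, toRev));
  if (!resp.ok) {
    throw new Error(`Proxy request failed with HTTP ${resp.status}`);
  }
  return resp.json();
}
```

For example, `compareRevisions(1157633768, 1157658266).then(console.log)` requests the same comparison as the snippet above. Restricting the arguments to integer IDs also means the page never sends an arbitrary path to the proxy.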