MediaWiki API to upload images to commons.wikipedia.org


Currently I am able to upload images to test.commons.wikipedia using the sample code from MediaWiki,

but the images are being uploaded to https://upload.wikimedia.org/wikipedia/test

I want them to be uploaded to https://upload.wikimedia.org/wikipedia/commons

When I changed the upload URL from URL = "https://test.wikipedia.org/w/api.php" to URL = "https://commons.wikipedia.org/w/api.php",

I get an error when R.json() is called, where R = S.post(URL, files=FILE, data=PARAMS_4), FILE = {'file':('file_1.jpg', open(FILE_PATH, 'rb'), 'multipart/form-data')} and PARAMS_4 holds the parameters for uploading the image:

PARAMS_4 = {
    "action": "upload",
    "filename": "file_11a.jpg",
    "format": "json",
    "token": CSRF_TOKEN,
    "ignorewarnings": 1
}
# CSRF_TOKEN was successfully retrieved, but for commons it is only "+\", whereas for test it is an alphanumeric CSRF token ending with "+"
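
MediaWiki hands out `+\` as the CSRF token to sessions that are not logged in, so a token of exactly that value suggests the request was made anonymously. Below is a minimal sketch of a guard for this case; `is_anonymous_token` and `require_logged_in_token` are made-up helper names, not part of `requests` or the MediaWiki API.

```python
# The placeholder token is the two characters "+" and "\".
ANONYMOUS_CSRF_TOKEN = "+\\"


def is_anonymous_token(token: str) -> bool:
    """Return True if the token is the placeholder given to logged-out sessions."""
    return token == ANONYMOUS_CSRF_TOKEN


def require_logged_in_token(token: str) -> str:
    """Return the token unchanged, or raise if it is the anonymous placeholder."""
    if is_anonymous_token(token):
        raise RuntimeError(
            "CSRF token is the anonymous placeholder; log in before requesting it"
        )
    return token
```

Calling `require_logged_in_token(CSRF_TOKEN)` before building PARAMS_4 would stop the script early instead of sending an upload that cannot succeed.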

Calling R.json() raises the following error:

File "AppData\Local\Programs\Python\Python310\lib\site-packages\requests\models.py", line 978, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

As far as I can tell, the response status is 200, which I take to mean the image has been uploaded, but the body that json() tries to parse is null or empty, which causes the error.
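
A 200 status only says that the server answered; the body can still be something other than JSON (for example an HTML page), and that is what makes json() fail at char 0. One way to see what actually came back is to check the Content-Type before decoding. This is a rough sketch; `decode_api_body` is a hypothetical helper, and it assumes the API answers `format=json` requests with an `application/json` Content-Type.

```python
import json


def decode_api_body(status_code: int, content_type: str, body: str) -> dict:
    """Decode an API response body, failing loudly if it is not JSON."""
    if "application/json" not in content_type.lower():
        # Show the start of the body so an HTML page or redirect notice is obvious.
        raise ValueError(
            f"HTTP {status_code} but Content-Type is {content_type!r}; "
            f"body starts with: {body[:200]!r}"
        )
    return json.loads(body)


# Usage with the objects from the question (R is the requests response):
# result = decode_api_body(R.status_code, R.headers.get("Content-Type", ""), R.text)
```

Printing `R.url` and `R.history` as well would show whether the request was redirected to a different host before the body was returned.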

I think the params need to be tweaked a little so that they work for commons and I get a JSON response that contains the URL or location of the image uploaded to the server.

Now I want the URL of the image that was uploaded. How do I find that? Is there any other method?

Also, what is the difference between https://upload.wikimedia.org/wikipedia/test and https://upload.wikimedia.org/wikipedia/commons, apart from the different purposes they serve? For example, do images in test stay there permanently until they are explicitly deleted or moved, or are they only available for a limited time? And does the link change, or does it stay the same as when the image was uploaded?

Basically, if there are no major differences in storage, access and performance, I will stick with the current location on the server, which is test.

I am okay with either approach. If there isn't much difference between test and commons, the current setup is fine; otherwise I want a way to do it for commons (upload the image file to commons and get its public URL).

Or is there any other way I can upload images to Wikipedia using the API and get the image's public URL?

Or how do I get the list of images uploaded by me (my account) to Wikipedia using the MediaWiki API?

I also tried https://<language>.wikipedia.org/wiki/Special:FilePath/<filename> on the uploaded images. I can find the images uploaded through https://test.wikipedia.org/w/api.php, but not the ones uploaded through https://commons.wikipedia.org/w/api.php, which means the images are not actually being uploaded through https://commons.wikipedia.org/w/api.php. But then why does it return a response with status code 200?

Any help is appreciated. Please keep answers focused on the question; additional considerations such as licensing are already taken care of.


Solution

  • Okay, I figured it out: there was an issue with the CSRF token retrieval, and the user was not logged in.

    It worked for https://test.wikipedia.org/w/api.php, I don't know why, but for https://commons.wikipedia.org/w/api.php I had to change the order of the login and CSRF steps so that the login comes first. Note that the code below points at commons.wikimedia.org, which is the host Commons is served from.

    import requests
    
    API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"
    USERNAME = "xxxxxxxxxx"  # Replace with your username
    PASSWORD = "xxxxxxxxxx"  # Replace with your password
    FILE_PATH = "xxxxxxxxx"  # Replace with the actual path to your image
    
    FILENAME = "XXXXXXXX"  # Replace with the name you want on Commons
    
    # Make sure the filename is unique and does not already exist on Commons.
    
    
    def upload_image(file_path, filename):
        session = requests.Session()
    
        # Step 1: get a login token.
    
        params = {
            'action': 'query',
            'meta': 'tokens',
            'type': 'login',
            'format': 'json'
        }
        response = session.get(API_ENDPOINT, params=params)
        data = response.json()
        token = data['query']['tokens']["logintoken"]
    
        # Step 2: log in to your account with the login token.
    
        login_payload = {
            'action': 'login',
            'lgname': USERNAME,
            'lgpassword': PASSWORD,
            'lgtoken': token,
            'format': 'json'
        }
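        login_response = session.post(API_ENDPOINT, data=login_payload)
        # Stop early if the login did not succeed; otherwise the CSRF token
        # requested next would be the anonymous placeholder.
        login_result = login_response.json().get('login', {}).get('result')
        if login_result != 'Success':
            raise RuntimeError(f"Login failed: {login_result}")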
    
        # Step 3: get a CSRF token (only after logging in).
    
        params = {
            'action': 'query',
            'meta': 'tokens',
            'type': 'csrf',
            'format': 'json'
        }
        response = session.get(API_ENDPOINT, params=params)
        data = response.json()
        csrftoken = data['query']['tokens']["csrftoken"]
    
        with open(file_path, 'rb') as f:
            file_data = f.read()
        files = {'file': (filename, file_data, 'image/*')}  # Adjust MIME type as needed
    
        # Step 4: upload the image.
    
        payload = {
            'action': 'upload',
            'filename': filename,
            'token': csrftoken,
            'format': 'json',
            "ignorewarnings": 1,
        }
        response = session.post(API_ENDPOINT, files=files, data=payload)
        return response.json()
    
    
    if __name__ == "__main__":
        upload_result = upload_image(FILE_PATH, FILENAME)
        if 'upload' in upload_result and upload_result['upload']['result'] == 'Success':
            image_url = upload_result['upload']['imageinfo']['url']
            print(f"Image uploaded successfully. Public URL: {image_url}")
        elif 'error' in upload_result:
            print(f"Upload failed. Error: {upload_result['error']['info']}")
        else:
            print("Upload failed. Unexpected response.")
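
The question also asked how to look up the public URL of a file later and how to list the files uploaded by an account. A possible sketch follows, using the API's `prop=imageinfo` and `list=allimages` query modules; the helper names (`imageinfo_params`, `uploads_by_user_params`, `extract_image_urls`) are made up for illustration, and the response shape assumed in `extract_image_urls` is the default `format=json` layout.

```python
def imageinfo_params(filename: str) -> dict:
    """Query parameters asking for the public URL of one file."""
    return {
        "action": "query",
        "titles": f"File:{filename}",
        "prop": "imageinfo",
        "iiprop": "url",
        "format": "json",
    }


def uploads_by_user_params(username: str, limit: int = 50) -> dict:
    """Query parameters listing files uploaded by one user, with their URLs."""
    return {
        "action": "query",
        "list": "allimages",
        "aiuser": username,
        "aisort": "timestamp",  # aiuser is only accepted with timestamp sorting
        "aiprop": "url|timestamp",
        "ailimit": limit,
        "format": "json",
    }


def extract_image_urls(query_response: dict) -> list:
    """Collect every imageinfo URL from a prop=imageinfo response."""
    pages = query_response.get("query", {}).get("pages", {})
    return [
        info["url"]
        for page in pages.values()
        for info in page.get("imageinfo", [])
    ]
```

With the session from the code above, something like `extract_image_urls(session.get(API_ENDPOINT, params=imageinfo_params(FILENAME)).json())` should return the file's URL, and the entries under `['query']['allimages']` in the response to `uploads_by_user_params(USERNAME)` each carry a `url` field.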