
How to get the Facebook Public Page Content Access just to extract data?


For a university project I need to extract data such as posts and reviews from some Facebook pages. Everything worked a couple of months ago, but now you need the Public Page Content Access feature to get data from pages.

In order to get my app reviewed I need to add:

As a student who just needs to extract some data for an exam, I don't have any website or platform where I'd use the app. I'm using the Facebook Graph API with Python.
I looked on this website for a Privacy Policy Generator, but I don't have a website or a mobile app where I'd use the API...

Is there some way in my situation to extract data through the API without these requirements, or would it be better to find another solution, such as web scraping?


Solution

  • To extract data from Facebook with Python code, you need to register as a developer on Facebook and obtain an access token. Here are the steps.

    1. Go to developers.facebook.com and create an account there.
    2. Go to developers.facebook.com/tools/explorer. Open the “My apps” drop-down in the top right corner and select “add a new app”. Choose a display name and a category, then select “Create App ID”.
    3. Go back to developers.facebook.com/tools/explorer. You will see “Graph API Explorer” below “My Apps” in the top right corner. From the “Graph API Explorer” drop-down, select your app.
    4. Select “Get Token”, and from that drop-down select “Get User Access Token”. Select permissions from the menu that appears, then select “Get Access Token”.
    5. Go to developers.facebook.com/tools/accesstoken. Select “Debug” next to “User Token”, then go to “Extend Token Access”. This ensures that your token does not expire every two hours.
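
The token extension step can also be done from code. The following is a minimal sketch, assuming the Graph API `oauth/access_token` endpoint with `grant_type=fb_exchange_token`; the function name `build_token_exchange_url` is hypothetical, and the app ID, app secret, and short-lived token are values you supply from your own app. It only builds the URL, so nothing is sent until you request it yourself.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"


def build_token_exchange_url(app_id, app_secret, short_lived_token):
    """Build the URL that exchanges a short-lived user token for a long-lived one.

    Hypothetical helper: all three arguments are placeholders for your own values.
    """
    query = urlencode({
        "grant_type": "fb_exchange_token",
        "client_id": app_id,
        "client_secret": app_secret,
        "fb_exchange_token": short_lived_token,
    })
    return f"{GRAPH_BASE}/oauth/access_token?{query}"
```

Requesting that URL with a GET (for example with `requests.get(url).json()`) should return a JSON object containing the new `access_token`, provided the app credentials are valid.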

    Python Code to Access Facebook Public Data: To collect data on anything that is publicly available, start from https://developers.facebook.com/docs/graph-api and see the reference at https://developers.facebook.com/docs/graph-api/reference/v2.7/. In that documentation, choose the object you want to extract data from, such as “groups” or “pages”. Then open the code examples for it and select “facebook graph api” to get hints on how to extract the information. This blog is primarily about getting events data. First, import ‘urllib3’, ‘facebook’, and ‘requests’, installing these libraries beforehand if they are not already available. Define a variable token and set its value to the “User Access Token” you obtained above.

    token = 'aiufniqaefncqiuhfencioaeusKJBNfljabicnlkjshniuwnscslkjjndfi'
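
Rather than pasting the token into the script, you may prefer to read it from the environment so it does not end up in shared code. This is a small sketch; the variable name `FB_ACCESS_TOKEN` and the helper `load_token` are assumptions chosen for illustration, not something the Graph API requires.

```python
import os


def load_token(var_name="FB_ACCESS_TOKEN"):
    """Read the access token from an environment variable (name is an assumption)."""
    token = os.environ.get(var_name, "").strip()
    if not token:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return token
```

With this in place, `token = load_token()` can replace the hard-coded assignment.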
    

    Getting the list of events: To find events matching a search term, say “Poetry”, and limit the number of results to 10000:

    import facebook
    import requests
    graph = facebook.GraphAPI(access_token=token, version="2.7")
    events = graph.request("/search?q=Poetry&type=event&limit=10000")
    

    This returns a dictionary of the events created on Facebook that have the string “Poetry” in their names. To get the list of events, do:

    eventList = events['data']
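
A single response may not contain every result. Graph API responses normally carry a `paging` object whose `next` field is the URL of the following page. The sketch below follows those links until none is left; `fetch_json` and `collect_all` are hypothetical helper names, and the `max_pages` cap is an assumption added to avoid looping indefinitely.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Download one page of results and decode it as JSON."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def collect_all(first_page, fetch=fetch_json, max_pages=50):
    """Gather the 'data' items of first_page and of every following page."""
    items = []
    page = first_page
    for _ in range(max_pages):
        items.extend(page.get("data", []))
        # 'paging.next' is absent on the last page.
        next_url = page.get("paging", {}).get("next")
        if not next_url:
            break
        page = fetch(next_url)
    return items
```

Here `collect_all(events)` would return a list in the same shape as `events['data']`, extended with the items from later pages.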
    

    Extracting all information for an event from the list of events extracted above: Get the EventID of the first event in the list with

    eventid = eventList[0]['id']
    

    For this EventID, get all the information and set a few variables that will be used later:

    event1 = graph.get_object(id=eventid, fields='attending_count,can_guests_invite,category,cover,declined_count,description,end_time,guest_list_enabled,interested_count,is_canceled,is_page_owned,is_viewer_admin,maybe_count,noreply_count,owner,parent_group,place,ticket_uri,timezone,type,updated_time')
    attenderscount = event1['attending_count']
    declinerscount = event1['declined_count']
    interestedcount = event1['interested_count']
    maybecount = event1['maybe_count']
    noreplycount = event1['noreply_count']
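
Direct indexing raises a `KeyError` if a field is missing from the response, which can happen when a field is not returned for a given event or token. As a hedged alternative, the sketch below collects the same counters with a default of 0; `COUNT_FIELDS` and `summarize_counts` are hypothetical names introduced for illustration.

```python
COUNT_FIELDS = (
    "attending_count",
    "declined_count",
    "interested_count",
    "maybe_count",
    "noreply_count",
)


def summarize_counts(event):
    """Return the RSVP counters of an event dict, using 0 for missing fields."""
    return {field: int(event.get(field, 0)) for field in COUNT_FIELDS}
```

For example, `summarize_counts(event1)['attending_count']` gives the same value as `attenderscount` when the field is present.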
    

    Getting the list of everyone attending an event and converting the response to JSON:

    attenders = requests.get("https://graph.facebook.com/v2.7/" + eventid + "/attending?access_token=" + token + "&limit=" + str(attenderscount))
    attenders_json = attenders.json()
    

    Getting the admins of the event:

    admins = requests.get("https://graph.facebook.com/v2.7/" + eventid + "/admins?access_token=" + token)
    admins_json = admins.json()
    

    Similarly, you can extract other information such as the photos, videos, or feed of that event if you want. Go to https://developers.facebook.com/docs/graph-api/reference/event/ and see the “Edges” section of the documentation.
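
Since every edge follows the same URL pattern as the attending and admins requests, one small helper can build all of them. This is a sketch under that assumption; `edge_url` is a hypothetical name, and the default version string simply mirrors the `v2.7` used in this post.

```python
from urllib.parse import urlencode


def edge_url(object_id, edge, token, version="v2.7", **params):
    """Build the URL for one edge of a Graph API object.

    Extra keyword arguments (for example limit=25) become query parameters.
    """
    query = urlencode({"access_token": token, **params})
    return f"https://graph.facebook.com/{version}/{object_id}/{edge}?{query}"
```

A call such as `requests.get(edge_url(eventid, "photos", token, limit=25)).json()` would then fetch the photos edge in the same way as the attending list.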