In an attempt to study Benford's law, I'm trying to collect the like counts of recommended Instagram Reels. The plan is simply to open Reels, fetch the like count, swipe to the next Reel, and repeat until I have enough data.
I'm trying to do this in Python with Selenium WebDriver:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import pyautogui
import time
driver = webdriver.Chrome()
driver.get("https://www.instagram.com/") # open instagram
time.sleep(2)
driver.find_element(By.XPATH, '//button[text()="Allow all cookies"]').click() # allow cookies
time.sleep(3)
user_field = driver.find_element(By.NAME, "username") # enter username
user_field.send_keys("my_username")
user_field.send_keys(Keys.ENTER)
password_field = driver.find_element(By.NAME, "password") # enter password
password_field.send_keys("my_password")
password_field.send_keys(Keys.ENTER)
time.sleep(5)
driver.get('https://www.instagram.com/reels') # go to reels
for i in range(1, 5): # swipe through the reels (range(1, 5) gives 4 iterations), then just wait
    time.sleep(5)
    # get how many likes the reel has (this doesn't seem to update)
    div_element = driver.find_element(By.CSS_SELECTOR, '.html-div.xe8uvvx.xdj266r.x11i5rnm.x1mh8g0r.xexx8yu.x4uap5.x18d9i69.xkhd6sd.x6s0dn4.x1ypdohk.x78zum5.xdt5ytf.xieb3on')
    # get likes by xpath (doesn't work, crashes the program)
    # div_element = driver.find_element(By.XPATH, '//*[@id="mount_0_0_vZ"]/div/div/div[2]/div/div/div[1]/div[1]/div[2]/section/main/div[2]/div[9]/div/div[2]/div[1]/div/div/div/span/span')
    # get all the buttons (works, but there is a lot of other unnecessary data)
    # div_element = driver.find_element(By.XPATH, '//div[@role="button"]')
    like_text = div_element.text
    print(like_text) # print out the likes
    time.sleep(3)
    pyautogui.press('down') # swipe to next reel
time.sleep(5000)
In the code I'm trying to access the highlighted span element in the picture:
This span block seems to be identical for all other reels with the exception of the like count.
But if I run the code it fails to update the like count, so it outputs:
165K
165K
165K
165K
I've tried to access this element by all sorts of methods (XPATH, CSS_SELECTOR, NAME, ID, ...); some of them crash, others return nothing. Any ideas what to do?
As mentioned in the comments, the old reel may not be removed from the DOM when you scroll. Because find_element returns only the first match, the value it gives you never changes while earlier reels remain in the DOM. Instead, use find_elements, which returns a list of all matches, and pick the entry that belongs to the reel currently on screen.
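One way this could look is sketched below. It is only a rough sketch under some assumptions: that several like-count elements really do coexist in the DOM, and that the one for the current reel is the one whose bounding box sits inside the browser viewport. The helper names `pick_in_viewport` and `current_like_text` are made up for this example, and the selector you pass in is whatever matches the like count on your page (Instagram's generated class names change often).

```python
def pick_in_viewport(rects, viewport_height):
    """Return the index of the first (top, bottom) pair whose vertical
    centre lies inside the viewport, or None if there is no such pair."""
    for index, (top, bottom) in enumerate(rects):
        centre = (top + bottom) / 2
        if 0 <= centre <= viewport_height:
            return index
    return None


def current_like_text(driver, css_selector):
    """Return the text of the matching element that is currently on screen.

    Assumption: css_selector matches one like-count element per reel that
    is still present in the DOM.
    """
    # Imported here so pick_in_viewport can be tried without Selenium installed.
    from selenium.webdriver.common.by import By

    # find_elements returns a list of all matches (empty if there are none).
    elements = driver.find_elements(By.CSS_SELECTOR, css_selector)

    # Ask the browser where each match sits relative to the viewport.
    rects = []
    for element in elements:
        top, bottom = driver.execute_script(
            "const r = arguments[0].getBoundingClientRect();"
            "return [r.top, r.bottom];",
            element,
        )
        rects.append((top, bottom))

    viewport_height = driver.execute_script("return window.innerHeight;")
    index = pick_in_viewport(rects, viewport_height)
    return elements[index].text if index is not None else None
```

In the loop from the question, the `find_element` line and the `.text` lookup would then be replaced by something like `like_text = current_like_text(driver, selector)`, where `selector` is the CSS selector you were already using. If no match is inside the viewport, the helper returns `None`, which is a hint that the selector or the timing needs adjusting.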