I am trying to use Selenium with Python to open a TikTok user page and scroll down to load all of the user's videos. I can open the URL and get the page source, including the data for the videos that load initially. But when I scroll down, sleep for a while, and get the source again, it is the same as before, with the same videos and nothing new loaded:
from selenium import webdriver
from selenium.webdriver.common.by import By
import re
import json
from bs4 import BeautifulSoup
import time
options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
# open it, go to a website, and get results
wd = webdriver.Chrome('chromedriver',options=options)
wd.get("https://www.tiktok.com/@tiktok")
time.sleep(20)
#wd.implicitly_wait(10)
#print(wd.page_source)
SCROLL_PAUSE_TIME = 20
# Get scroll height
last_height = wd.execute_script("return document.body.scrollHeight")
while True:
    # Scroll down to bottom
    wd.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)
    # Calculate new scroll height and compare with last scroll height
    new_height = wd.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        # Height did not change, so nothing more was loaded
        break
    last_height = new_height
print(wd.page_source)
I also tried this simpler code to scroll down:
wd.execute_script("window.scrollTo(0, document.body.scrollHeight);")
time.sleep(10)
print(wd.page_source)
But again nothing new is loaded in the page source. I am using Google Colab. Any help?
Installation (run in a Colab cell):
# install chromium, its driver, and selenium
!apt-get update
!apt install chromium-chromedriver
!cp /usr/lib/chromium-browser/chromedriver /usr/bin
!pip install selenium
I had the same problem, although I am using playwright with playwright-stealth, not selenium.
The problem is that TikTok detects that the browser is headless and throttles you, or at least that was my problem.
Simply adding the browser flag "--headless=new" fixed it. This argument switches Chromium to its newer, recently released headless mode, which is much harder to detect. Just make sure you use a recent version of Chromium.
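For the Selenium setup in the question, the same idea might look roughly like the sketch below. I only tested the flag with Playwright, so treat this as an assumption rather than a confirmed fix: it swaps `--headless` for `--headless=new`, keeps the other flags from the question, and assumes Selenium 4 with `chromedriver` on the PATH. The names `CHROME_ARGS`, `build_driver`, `scroll_to_bottom` and `fetch_profile_source` are made up for this example, and `max_rounds` is an added safety cap that the original loop does not have.

```python
import time

# Flags from the question, with "--headless=new" replacing "--headless".
CHROME_ARGS = ["--headless=new", "--no-sandbox", "--disable-dev-shm-usage"]


def build_driver():
    """Start Chrome in the new headless mode (needs a recent Chrome/Chromium)."""
    # Imported here so the scroll helper below does not depend on Selenium.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in CHROME_ARGS:
        options.add_argument(arg)
    # Assumes Selenium 4 and a chromedriver that can be found on the PATH.
    return webdriver.Chrome(options=options)


def scroll_to_bottom(driver, pause=20.0, max_rounds=50):
    """Scroll until the page height stops growing.

    Returns the number of scrolls that made the page taller.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the page time to load more videos
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded
        last_height = new_height
        rounds += 1
    return rounds


def fetch_profile_source(url, initial_wait=20.0):
    """Open the page, scroll to the end, and return the final page source."""
    wd = build_driver()
    try:
        wd.get(url)
        time.sleep(initial_wait)
        scroll_to_bottom(wd)
        return wd.page_source
    finally:
        wd.quit()
```

Calling `fetch_profile_source("https://www.tiktok.com/@tiktok")` would then replace the script in the question. If `scroll_to_bottom` still returns 0, the page is probably still being throttled and the flag alone was not enough in your environment.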