Tags: python, selenium, google-chrome, selenium-webdriver

selenium.WebDriverException: unknown error: session deleted because of page crash from tab crashed (Not on docker)


Good morning,

This is similar to an existing Stack Overflow post, but none of the answers there solved the problem for me.

For the last couple of days, my Python Selenium script, which uses ChromeDriver 104, has been failing while scrolling down infinite-scroll, dynamically loaded pages. The script scrolls Facebook and performs certain RPA actions such as sending messages. (I have attached only the snippet related to the error.)

In summary, the user enters a number of posts to reach, and the script scrolls until that many posts are loaded (for example, the first 1000 posts) and then performs certain actions (this doesn't violate the Facebook TOS).

This script is NOT running in a Docker instance or any other kind of container; it has my PC's full resources. It has been tested on:

1- Windows 11 PC with 16 GB RAM and an i7 processor

2- MacBook, 16 GB RAM

3- Windows Server 2019, 32 GB RAM, i7 processor

4- Linux Ubuntu 22.0 Server, 16 GB RAM (increased /dev/shm to 30 GB on this server)

5- Google Colab kernel (increased /dev/shm)

All of the above produced exactly the same error and trace: the session is deleted because of a page crash.

When the script reaches around 800-900 posts (the number varies, though: it once reached 1.2k posts and then failed at 400 on the next run), the page becomes really slow and then crashes. Note that I CAN scroll well past 1500 posts manually on my PC, and it definitely DOES NOT crash. So I am fairly sure this is a bug in my script rather than a hardware limitation (perhaps a memory leak in the script). When the script breaks, RAM usage is not even close to 80% of the total.
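
One way to check whether page-side memory is growing is to log the DOM size and the JavaScript heap from inside the tab while scrolling. The following is a minimal sketch and not part of the original script: `dom_stats` is a hypothetical helper, `driver` is assumed to be the existing Selenium WebDriver instance, and `performance.memory` is a non-standard, Chrome-only API, so the heap figures come back as None in other browsers.

```python
# Hypothetical diagnostic helper; assumes `driver` is a Selenium WebDriver.
DOM_STATS_SCRIPT = """
const mem = window.performance && performance.memory;  // Chrome-only, non-standard
return {
    nodes: document.getElementsByTagName('*').length,
    usedHeapMB: mem ? Math.round(mem.usedJSHeapSize / 1048576) : null,
    heapLimitMB: mem ? Math.round(mem.jsHeapSizeLimit / 1048576) : null,
};
"""


def dom_stats(driver):
    """Return the element count and, where available, JS heap usage in MB."""
    return driver.execute_script(DOM_STATS_SCRIPT)


# Example use inside the scroll loop (not executed here):
#     stats = dom_stats(driver)
#     print(stats["nodes"], stats["usedHeapMB"], stats["heapLimitMB"])
```

If the node count and used heap climb steadily toward the heap limit before each crash, the tab itself is running out of memory, independent of how much system RAM is free.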

If I run the script in non-headless mode, Chrome shows an error message that says:

"Oh Snap, Chrome out of memory"

To save you time: I have already read the following Stack Overflow posts, and they didn't help:

1- unknown error: session deleted because of page crash from unknown error: cannot determine loading status from tab crashed with ChromeDriver Selenium

2- selenium.WebDriverException: unknown error: session deleted because of page crash from tab crashed

3- Python Selenium session deleted because of page crash from unknown error: cannot determine loading status from tab crashed

4- Getting "org.openqa.selenium.WebDriverException: unknown error: session deleted because of page crash" error when executing automation scripts (it uses Java, but I read it anyway)

5- Selenium error with simple driver.get() method : session deleted because of page crash from unknown error: cannot determine loading status

What I tried in order to solve the issue (none of it worked):

1- Resized the window, according to this post.

2- Used the Chrome options --no-sandbox and --disable-dev-shm-usage

3- Tried --js-flags (--max_old_space_size=8096)

4- Disabled all notifications, geolocation prompts, and images

5- Made sure /dev/shm on Mac and Linux is large enough, as well as the temp folder on Windows

6- Added a LOT of time.sleep() calls between the scrolls.

7- Tried a different scrolling method (jumping to the bottom of the page with JavaScript via driver.execute_script())

8- Used Firefox GeckoDriver as well as Edge and Opera.

9- Used different ways to count the posts on the page (bs4, lxml); this doesn't seem to be the issue, as the crash happens in the scroll part.

The snippet that causes the issue (the Chrome options aren't in the code because I load them from a separate file; I list them after the code):


import time  # needed for the time.sleep() calls below

# Start Selenium Imports
from selenium import webdriver
from selenium.webdriver.chrome.options import  Options
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
# Selenium Imports Finished
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.action_chains import ActionChains

def login(email, password):
    driver.get('https://www.facebook.com/')
    #Email
    driver.find_element(By.NAME,'email').send_keys(email)
    #Password
    driver.find_element(By.NAME,'pass').send_keys(password, Keys.RETURN)
    time.sleep(2)

def reachPosts(noOfPosts = 50) -> None:
    posts = driver.find_element(By.XPATH,"//div[@role='feed']").find_elements(By.CSS_SELECTOR, ".g4tp4svg.mfclru0v.om3e55n1.p8bdhjjv")
    
    postsNo = len(posts) 
    posts = None
    while  postsNo < noOfPosts+1:
        scroll_down()
        posts = driver.find_element(By.XPATH,"//div[@role='feed']").find_elements(By.CSS_SELECTOR, ".g4tp4svg.mfclru0v.om3e55n1.p8bdhjjv")
        time.sleep(1)
        print(len(posts))
        postsNo = len(posts)
        if postsNo >= 1000:
            time.sleep(10)
        posts = None
    
    posts = None
#----------------Scroll Function!-----------------------------#
def scroll_down():
    """A method for scrolling the page."""

    # Scroll down to the bottom.
    #driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    for i in range(3):
        actions.send_keys(Keys.SPACE).perform()


#-----------------End-----------------------------------------#

def openGroup(facebookUrl, inputDate):
    print("Opening Facebook Link")
    driver.get(f'{facebookUrl}?sorting_setting=CHRONOLOGICAL')
    time.sleep(2)
    
    reachPosts(creds["Number of posts"])
   
    posts = driver.find_element(By.XPATH,"//div[@role='feed']").find_elements(By.CSS_SELECTOR, ".g4tp4svg.mfclru0v.om3e55n1.p8bdhjjv")
   
    noOfPosts = creds["Number of posts"]
   
    
def main():
    global creds
    creds = openCredentials()
    
    login(creds["email"], creds["password"])
    
    
    for group in creds['Facebook Groups']:
        openGroup(group, creds["Date"])
        time.sleep(3)
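
The loop in reachPosts fetches every post as a WebElement on each pass only to count them, and it never exits if the feed stops growing. A minimal sketch of an alternative follows, under stated assumptions: `count_posts` and `reach_posts` are hypothetical names, the CSS selector is a translation of the XPath plus class selector used in the snippet (it matches inside every feed container, not just the first), and `scroll` is any callable such as the scroll_down function. This only reduces WebDriver traffic and adds a stall guard; it is not expected to prevent the crash by itself.

```python
import time

# Assumed CSS equivalent of the XPath + class lookup in the snippet above.
FEED_POST_SELECTOR = "div[role='feed'] .g4tp4svg.mfclru0v.om3e55n1.p8bdhjjv"


def count_posts(driver, selector=FEED_POST_SELECTOR):
    """Count matching posts inside the browser, returning only an integer."""
    return driver.execute_script(
        "return document.querySelectorAll(arguments[0]).length;", selector
    )


def reach_posts(driver, scroll, target, max_stalls=10, pause=1.0):
    """Scroll until `target` posts are loaded or the count stops growing.

    Returns the number of posts present when the loop ends.
    """
    count = count_posts(driver)
    stalls = 0
    while count < target and stalls < max_stalls:
        scroll()
        time.sleep(pause)
        new_count = count_posts(driver)
        # Reset the stall counter whenever new posts appear.
        stalls = 0 if new_count > count else stalls + 1
        count = new_count
    return count


# Example use (not executed here):
#     reached = reach_posts(driver, scroll_down, creds["Number of posts"])
```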

Chrome options used:

    "--disable-extensions",
    "--disable-application-cache",
    "--headless",
    "window-size=600,450",
    "--disable-blink-features=AutomationControlled",
    "--enable-javascript",
    "disable-infobars",
    "--js-flags='--max_old_space_size=8196'",
    "--max_old_space_size=4096",
    "max_old_space_size=9000",
    "--disable-dev-shm-usage",
    "--incognito",
    "--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36"

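For reference, a list of flags like this is usually applied with `Options.add_argument`, one call per flag. The following is a minimal sketch with hypothetical helper names (`clean_flags`, `build_options`); it is not the loader from the original project. When such a list lives in Python source, note that adjacent string literals are concatenated, so a missing comma between two entries silently produces one fused argument.

```python
def clean_flags(flags):
    """Strip whitespace and drop empty or repeated entries, keeping order."""
    seen, cleaned = set(), []
    for flag in flags:
        flag = flag.strip()
        if flag and flag not in seen:
            seen.add(flag)
            cleaned.append(flag)
    return cleaned


def build_options(flags):
    """Build a ChromeOptions object from a list of command-line flags."""
    # Imported here so clean_flags stays usable without Selenium installed.
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for flag in clean_flags(flags):
        options.add_argument(flag)
    return options


# Pitfall: without a comma, Python joins the two literals into one string.
fused = ["--headless" "window-size=600,450"]
separate = ["--headless", "window-size=600,450"]
print(len(fused), len(separate))  # 1 2
```
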
The error

Traceback (most recent call last):
  File "D:\Work & Projects\Work\Upwork\Facebook Groups Scraper\src\facebookScraperIndiv.py", line 313, in <module>
    main()
  File "D:\Work & Projects\Work\Upwork\Facebook Groups Scraper\src\facebookScraperIndiv.py", line 302, in main
    openGroup(group, creds["Date"])
  File "D:\Work & Projects\Work\Upwork\Facebook Groups Scraper\src\facebookScraperIndiv.py", line 254, in openGroup
    reachPosts(creds["Number of posts"])
  File "D:\Work & Projects\Work\Upwork\Facebook Groups Scraper\src\facebookScraperIndiv.py", line 84, in reachPosts
    scroll_down()
  File "D:\Work & Projects\Work\Upwork\Facebook Groups Scraper\src\facebookScraperIndiv.py", line 104, in scroll_down
    actions.send_keys(Keys.SPACE).perform()
  File "D:\Work & Projects\Work\Upwork\Facebook Groups Scraper\lib\site-packages\selenium\webdriver\common\action_chains.py", line 78, in perform
    self.w3c_actions.perform()
  File "D:\Work & Projects\Work\Upwork\Facebook Groups Scraper\lib\site-packages\selenium\webdriver\common\actions\action_builder.py", line 88, in perform
    self.driver.execute(Command.W3C_ACTIONS, enc)
  File "D:\Work & Projects\Work\Upwork\Facebook Groups Scraper\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 434, in execute
    self.error_handler.check_response(response)
  File "D:\Work & Projects\Work\Upwork\Facebook Groups Scraper\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 243, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: session deleted because of page crash
from unknown error: cannot determine loading status
from tab crashed
  (Session info: chrome=105.0.5195.102)
Stacktrace:
Backtrace:
        Ordinal0 [0x0024DF13+2219795]
        Ordinal0 [0x001E2841+1779777]
        Ordinal0 [0x000F4100+803072]
        Ordinal0 [0x000E6F18+749336]
        Ordinal0 [0x000E5F94+745364]
        Ordinal0 [0x000E6528+746792]
        Ordinal0 [0x000EF42F+783407]
        Ordinal0 [0x000FA938+829752]
        Ordinal0 [0x0014F3CF+1176527]
        Ordinal0 [0x0013E616+1107478]
        Ordinal0 [0x00117F89+950153]
        Ordinal0 [0x00118F56+954198]
        GetHandleVerifier [0x00542CB2+3040210]
        GetHandleVerifier [0x00532BB4+2974420]
        GetHandleVerifier [0x002E6A0A+565546]
        GetHandleVerifier [0x002E5680+560544]
        Ordinal0 [0x001E9A5C+1808988]
        Ordinal0 [0x001EE3A8+1827752]
        Ordinal0 [0x001EE495+1827989]
        Ordinal0 [0x001F80A4+1867940]
        BaseThreadInitThunk [0x76236739+25]
        RtlGetFullPathName_UEx [0x774D90AF+1215]
        RtlGetFullPathName_UEx [0x774D907D+1165]
        (No symbol) [0x00000000]

Solution

  • For anyone facing the same issue in the future: the problem isn't with ChromeDriver. The DOM grows so large that the V8 JavaScript heap eventually runs out of memory (OOM) and the tab crashes.

    To fix this, I tried the mobile version of Facebook, and it worked. The mobile site is much lighter and has a far simpler DOM, which let me reach 5k+ posts.

    I hope this helps anyone wondering the same thing. If you have a similar issue, look for ways to simplify the DOM or for an alternative view of the site with a simpler DOM; I found that some extensions help with that as well.
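
The two ideas in this answer can be sketched roughly as follows. Both helpers are hypothetical and rest on assumptions the answer does not state: `to_mobile_url` assumes the mobile site is served from m.facebook.com, and `prune_old_posts` assumes it is acceptable to delete already-processed post nodes from the page. Removing nodes from a page managed by a front-end framework may break it or shift the scroll position, and the total post count would then have to be tracked in Python, so treat the second helper as an experiment rather than a fix.

```python
from urllib.parse import urlsplit, urlunsplit


def to_mobile_url(url, mobile_host="m.facebook.com"):
    """Rewrite a desktop Facebook URL to the (assumed) mobile host."""
    parts = urlsplit(url)
    if parts.netloc.lower() in ("facebook.com", "www.facebook.com"):
        parts = parts._replace(netloc=mobile_host)
    return urlunsplit(parts)


# Removes all but the last `keep` matching nodes and reports how many went.
PRUNE_SCRIPT = """
const posts = document.querySelectorAll(arguments[0]);
const keep = arguments[1];
let removed = 0;
for (let i = 0; i < posts.length - keep; i++) {
    posts[i].remove();
    removed++;
}
return removed;
"""


def prune_old_posts(driver, selector, keep_last=20):
    """Delete older post nodes from the DOM; returns the number removed."""
    return driver.execute_script(PRUNE_SCRIPT, selector, keep_last)


print(to_mobile_url("https://www.facebook.com/groups/123?sorting_setting=CHRONOLOGICAL"))
# https://m.facebook.com/groups/123?sorting_setting=CHRONOLOGICAL
```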