python selenium

Selenium works on the first loop iteration but not the second: "Element not found in the cache - perhaps the page has changed"


I have asked about three questions about this script here. Several errors later, I have got this far. I have been trying to work my way through the script, but I'm stuck at this part and don't know how to fix it.

Basically, I would like to find unread messages on a website and answer them later. I'm stuck at the part where a for loop goes through every unread message and keeps the id of the conversation so that I can use it later in the URL.

Here is the code:

#!/usr/bin/python
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import re

email = "xxx@gmail.com"
password = "xxxxx"

print "Openning Browser"
browser = webdriver.Firefox()
browser.get("https://olx.pt/account/?ref[0][action]=myaccount&ref[0][method]=index")
print "Logging into OLX"
elem = browser.find_element_by_name("login[email]")
elem.send_keys(email)
elem = browser.find_element_by_name("login[password]")
elem.send_keys(password)
elem.send_keys(Keys.RETURN)
print "Loged into OLX"
time.sleep(5)
browser.get("https://olx.pt/myaccount/answers/")

while browser.find_elements_by_css_selector("tr.unreaded"):
   print "Unreaded messages!"
   unread_answers = browser.find_elements_by_css_selector("tr.unreaded") 
   for unread_row in unread_answers:
    row_id = unread_row.get_attribute("id")
    m = re.search('answer_row_(\d+)', row_id)
    row_number = m.group(1)
    print row_number
    print "First loop"
    browser.refresh()
    time.sleep(5) 
else:
    print "All read!"

Here is the output:

Openning Browser
Logging into OLX
Loged into OLX
Unreaded messages!
315911723
First loop
Traceback (most recent call last):
File "loginolxbackup.py", line 28, in <module>
row_id = unread_row.get_attribute("id")
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webelement.py", line 113, in get_attribute
resp = self._execute(Command.GET_ELEMENT_ATTRIBUTE, {'name': name})
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webelement.py", line 469, in _execute
return self._parent.execute(command, params)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 201, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.StaleElementReferenceException: Message: Element not found in the cache - perhaps the page has changed since it was looked up
Stacktrace:
at fxdriver.cache.getElementAt (resource://fxdriver/modules/web-element-cache.js:9407)
at Utils.getElementAt (file:///tmp/tmpvdAiKH/extensions/fxdriver@googlecode.com/components/command-processor.js:8992)
at WebElement.getElementAttribute (file:///tmp/tmpvdAiKH/extensions/fxdriver@googlecode.com/components/command-processor.js:12099)
at DelayedCommand.prototype.executeInternal_/h (file:///tmp/tmpvdAiKH/extensions/fxdriver@googlecode.com/components/command-processor.js:12614)
at DelayedCommand.prototype.executeInternal_ (file:///tmp/tmpvdAiKH/extensions/fxdriver@googlecode.com/components/command-processor.js:12619)
at DelayedCommand.prototype.execute/< (file:///tmp/tmpvdAiKH/extensions/fxdriver@googlecode.com/components/command-processor.js:12561)

The HTML of the page I'm looking at is something like this:

<tr id="answer_row_3121238" class="bla bla bla">
...
<tr id="answer_row_3121428" class="bla bla bla">
...
<tr id="answer_row_3124238" class="bla bla bla">

I have tried printing out m and I saw it has 3 objects, which means it's fetching all of the unread messages.

I'm banging my head against the wall without any luck. Any advice/help would be much appreciated.


Solution

  • When you call browser.refresh(), the DOM is re-rendered and WebDriver loses every element it located before the refresh; that is what causes the exception. The error message even says so: "perhaps the page has changed since it was looked up".

    Either avoid refreshing the page (I don't see any need for it here) or re-locate all the messages on every iteration of the for loop.

    Example of re-locating on every iteration:

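    unread_answers = browser.find_elements_by_css_selector("tr.unreaded")
    messages_len = len(unread_answers)
    # range(messages_len) covers every index, including the last row
    for x in range(messages_len):
        # Re-locate the rows: references found before the refresh are stale
        unread_answers = browser.find_elements_by_css_selector("tr.unreaded")
        if x >= len(unread_answers):
            break  # fewer unread rows than before the refresh
        row_id = unread_answers[x].get_attribute("id")
        m = re.search(r'answer_row_(\d+)', row_id)
        row_number = m.group(1)
        print row_number
        print "First loop"
        browser.refresh()
        time.sleep(5)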
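
The other option mentioned above, not refreshing inside the loop, could look roughly like the sketch below: read every id string in one pass while the element references are still valid, then work only with plain strings. The helper name extract_row_numbers is hypothetical, and the sample ids are taken from the HTML in the question. The Selenium calls appear only in comments so that the snippet runs without a browser.

```python
import re

# Anchored pattern for ids of the form "answer_row_<digits>"
ROW_ID_PATTERN = re.compile(r"answer_row_(\d+)$")


def extract_row_numbers(row_ids):
    """Return the numeric part of every id shaped like 'answer_row_<digits>'."""
    numbers = []
    for row_id in row_ids:
        match = ROW_ID_PATTERN.match(row_id or "")
        if match:  # skip empty or unrelated ids instead of raising
            numbers.append(match.group(1))
    return numbers


# With Selenium, the ids would be collected once, before any refresh:
#   rows = browser.find_elements_by_css_selector("tr.unreaded")
#   row_ids = [row.get_attribute("id") for row in rows]
row_ids = ["answer_row_3121238", "answer_row_3121428", "answer_row_3124238"]
row_numbers = extract_row_numbers(row_ids)
print(row_numbers)
```

Because row_numbers holds ordinary strings rather than WebElement references, a later browser.refresh() or browser.get() cannot make them stale, so they can be used to build the conversation URLs afterwards.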