python selenium-webdriver web-scraping

Python/Selenium: dynamically generated elements with same URL - html doesn't respond to clicking


I'm fairly new to web scraping and may have gotten in over my head, but I am trying to scrape apartment information from a dynamically generated website (https://noveatknox.com/floorplans/). I can already scrape the information I need from the default view of the page (it defaults to the 19th floor). I want Selenium to click on each floor so I can pull the available units and their information, and I have isolated the link to click for each floor. However, after the click the page's inner HTML always shows the "no apartments available" message, so my code cannot find any unit information.

I suspect something is wrong with the click, since the page loads with no apartments listed. The dynamically generated HTML makes the data hard to pull, and I cannot find a single place that holds all of the information.

Here's what I have so far (a[11] refers to a specific floor which I know has available apartments). I plan to apply a range to loop through all floors once I nail down the base code:

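from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

# `driver` is a WebDriver that has already loaded the floorplans page.
xpath = '//*[@id="mobile-floor-carousel-list"]/a[11]'

# Wait for the floor link to be present in the DOM, then click it.
# click() returns None, so its result is not assigned to anything.
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, xpath))).click()

print("clicked on page")

# Wait for the unit list container to be present in the DOM.
floorplans = WebDriverWait(driver, 10).until(
    EC.presence_of_all_elements_located((By.XPATH, "//div[@id='unit-list-items']"))
)
print("waited for page to load")

# Dump the whole page so the unit list can be inspected.
print(driver.find_element(By.XPATH, "//html").get_attribute('innerHTML'))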

The HTML for the floor link is shown below:

<a href="#" class="skylease__toolbar-floor-link swiper-slide skylease__mobile-floor-link--has-units" data-js-hook="mobile-floor-selector" data-floor="14" style="width: 50.6154px; margin-right: 5px;"><div><span>Floor</span><span>14</span><span class="skylease-avail-count"><span data-floor="14" data-js-hook="available-unit-count">4</span> avail</span></div></a>

The inner HTML shows only this, where the apartment information should follow:

</div>
   <div id="unit-list" class="skylease__unit-list">
   <p id="unit-list-message" class="skylease__unit-message">There are no available apartments on this floor.</p>
   <div class="skylease__unit-list-items" id="unit-list-items"></div>
      </div>

It should look something like this (what I see when I manually do this):

    <div id="unit-list" class="skylease__unit-list">
                <p id="unit-list-message" class="skylease__unit-message" style="display: none;">There are no available apartments on this floor.</p>    
    <div class="skylease__unit-list-items" id="unit-list-items"><a href="#" class="skylease__unit-list-item skylease__unit-list-item--alt" data-unit="1504" style="display: block;">
        ....edited out for simplicity
        
                <div class="skylease__unit-list-item-info-wrap">
                    <div class="skylease__unit-list-item-info">
                        <div class="skylease__unit-list-item-details skylease__unit-list-item-details--unit">
                            #1504
        ...edited out for simplicity
                    <div class="skylease__unit-list-item-info">
                        <p class="skylease__unit-list-item-details">Studio</p>
                        <p class="skylease__unit-list-item-details">1 bath</p>
                        <p class="skylease__unit-list-item-details"></p>
                                            <p class="skylease__unit-list-item-details">512 sq. ft.</p>
                                        <p class="skylease__unit-list-item-details"></p>
                    </div>
        
                                    <div class="skylease__unit-list-item-info">
                            <p class="skylease__unit-list-item-details skylease__unit-list-item-details--price">
                                                        <span>1,807</span>
                                                            <span>3683</span>
                                                    </p>
                        </div>
                            </div>

Solution

  • Here's a simpler way to do this.

    1. Loop through the floors across the top of the floorplan.
    2. If the number of units available is not 0, then print the floor and available unit count, then click it.
    3. Loop through each available unit in the right panel and print it.

    The working code is below.

    from selenium import webdriver
    from selenium.webdriver.support.wait import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
    url = 'https://noveatknox.com/floorplans/'
    driver = webdriver.Chrome()
    driver.maximize_window()
    driver.get(url)
    
    wait = WebDriverWait(driver, 10)
    
    # Each floor link contains a span whose text is the available-unit count.
    floors = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "#mobile-floor-carousel-list > a span[data-js-hook]")))
    for floor in floors:
        # Skip floors with no available units.
        if floor.text != "0":
            print("Floor: " + floor.get_attribute("data-floor") + ", Available units: " + floor.text)
            floor.click()
            # Wait for the unit links in the right panel to become visible, then print each one.
            for unit in wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "#unit-list-items a"))):
                print(unit.text)
                print("")
            print("")
    
    # Close the browser when finished.
    driver.quit()
    

    and it outputs...

    Floor: 4, Available units: 1
    #418       
    Studio     
    1 bath     
    605 sq. ft.
    1,801      
    3815       
    
    
    Floor: 5, Available units: 2
    #503  
    2 bed 
    2 bath
    1166 sq. ft.
    3,826
    7896
    

    and so on...
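
If you want the units as structured records rather than printed text, each `unit.text` block can be split into fields. This is only a sketch: the helper name `parse_unit_text` is made up for illustration, and the field order (unit number, layout, baths, area, then the price figures) is an assumption taken from the sample output above. The page does not label the two trailing numbers, so they are kept as a plain list.

```python
def parse_unit_text(text):
    """Split one unit's printed text into labeled fields (hypothetical helper)."""
    # Drop padding and blank lines from the text of one unit link.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # Assumed order, based on the sample output: unit, layout, baths, area,
    # followed by any number of price figures.
    unit, layout, baths, area, *price_figures = lines
    return {
        "unit": unit.lstrip("#"),
        "layout": layout,
        "baths": baths,
        "area": area,
        # Remove thousands separators so "1,801" becomes 1801.
        "price_figures": [int(p.replace(",", "")) for p in price_figures],
    }


sample = "#418\nStudio\n1 bath\n605 sq. ft.\n1,801\n3815"
print(parse_unit_text(sample))
```

In the loop above you could call `parse_unit_text(unit.text)` in place of `print(unit.text)` and append the result to a list. A unit with fewer than four lines of text raises a `ValueError` at the unpacking step, which at least makes an unexpected layout obvious.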